
Commit

Merge pull request #24 from avantonder/dsl2
Dsl2
  • Loading branch information
2 parents c7f9659 + 52fddc4 commit a64b5b2
Showing 8 changed files with 306 additions and 322 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,16 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v1.2 - [30/01/24]

- Remove `--brackendb` parameter as redundant. Bracken will now use the database location specified with `--krakendb`.
- Documentation updated.

## v1.1 - [16/03/23]

- Fix check_samplesheet.py bug
- Various other fixes

## v1.0 - [15/11/22]

- Initial release of avantonder/bacQC, created with the [nf-core](https://nf-co.re/) template.
2 changes: 0 additions & 2 deletions README.md
@@ -65,7 +65,6 @@ Alternatively the samplesheet.csv file created by nf-core/fetchngs can also be used
-profile <docker/singularity/podman/conda/institute> \
--input samplesheet.csv \
--kraken2db minikraken2_v1_8GB \
--brackendb minikraken2_v1_8GB \
--genome_size 4300000 \
--outdir <OUTDIR>
```
@@ -77,7 +76,6 @@ Alternatively the samplesheet.csv file created by nf-core/fetchngs can also be used
-profile <docker/singularity/podman/conda/institute> \
--input samplesheet.csv \
--kraken2db minikraken2_v1_8GB \
--brackendb minikraken2_v1_8GB \
--genome_size 4300000 \
--kraken_extract \
--tax_id <TAXON_ID> \
3 changes: 1 addition & 2 deletions docs/parameters.md
@@ -9,8 +9,7 @@ Define where the pipeline should find input data and save output data.
| Parameter | Description | Type | Default | Required | Hidden |
|-----------|-----------|-----------|-----------|-----------|-----------|
| `input` | Path to comma-separated file containing information about the samples in the experiment. <details><summary>Help</summary><small>You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. </small></details>| `string` | | | |
| `kraken2db` | | `string` | None | | |
| `brackendb` | | `string` | None | | |
| `kraken2db` | Path to Kraken 2 database | `string` | None | | |
| `outdir` | The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure. | `string` | | | |
| `email` | Email address for completion summary. <details><summary>Help</summary><small>Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the
workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to specify this on the command line for every run.</small></details>| `string` | | | |
18 changes: 2 additions & 16 deletions docs/usage.md
@@ -22,9 +22,6 @@ Use the `--input` parameter to specify the location of `samplesheet.csv`. It has
--input '[path to samplesheet file]'
```

```console
--input '[path to samplesheet file]'
```
### Full samplesheet

The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you like, but the first 3 columns must match those defined in the table below.
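
For illustration, a minimal samplesheet might look like this. The column names follow the usual nf-core convention of `sample`, `fastq_1` and `fastq_2`; check the example samplesheet bundled with the pipeline for the exact headers it expects:

```console
sample,fastq_1,fastq_2
sample1,sample1_R1.fastq.gz,sample1_R2.fastq.gz
sample2,sample2_R1.fastq.gz,
```

Here `sample2` has no `fastq_2` entry, so it would be treated as single-end.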
@@ -47,11 +44,10 @@ An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.

## Kraken 2 database

The pipeline can be provided with a path to a Kraken 2 database which is used, along with Bracken, to assign sequence reads to a particular taxon. Use the `--kraken2db` and `--brackendb` parameters to specify the location of the Kraken 2 database:
The pipeline can be provided with a path to a Kraken 2 database which is used, along with Bracken, to assign sequence reads to a particular taxon. Use the `--kraken2db` parameter to specify the location of the Kraken 2 database:

```console
--kraken2db '[path to Kraken 2 database]'
--brackendb '[path to Kraken 2 database]'
```

The Kraken 2 and Bracken steps can be skipped by specifying the `--skip_kraken2` parameter.
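
For example, a run that skips read classification entirely might look like this (a sketch only; the paths and parameter values are illustrative):

```console
nextflow run avantonder/bacQC \
    -profile singularity \
    --input samplesheet.csv \
    --genome_size 4300000 \
    --skip_kraken2 \
    --outdir <OUTDIR>
```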
@@ -73,7 +69,6 @@ nextflow run avantonder/bacQC \
--input samplesheet.csv \
-profile singularity \
--kraken2db path/to/kraken2/dir \
--bracken path/to/kraken2/dir/ \
--genome_size <ESTIMATED GENOME SIZE> \
--outdir <OUTDIR> \
-resume
@@ -102,7 +97,7 @@ nextflow pull avantonder/bacQC

It's a good idea to specify a pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since.

First, go to the [avantonder/bacQC releases page](https://github.com/avantonder/bacQC/releases) and find the latest version number - numeric only (eg. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`.
First, go to the [avantonder/bacQC releases page](https://github.com/avantonder/bacQC/releases) and find the latest version number - numeric only (eg. `1.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.1`.

This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future.
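
For example, to pin a run to a specific release (the release tag and other parameters here are illustrative):

```console
nextflow run avantonder/bacQC \
    -r 1.2 \
    -profile singularity \
    --input samplesheet.csv \
    --outdir <OUTDIR>
```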

@@ -118,22 +113,17 @@ Several generic profiles are bundled with the pipeline which instruct the pipeline

> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility; however, when this is not possible, Conda is also supported.

The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to see if your system is available in these configs please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation).

Note that multiple profiles can be loaded, for example: `-profile test,docker` - the order of arguments is important!
They are loaded in sequence, so later profiles can overwrite earlier profiles.
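
The overwrite behaviour can be pictured as an ordered map merge. A minimal Python sketch of the semantics (this is an illustration only, not Nextflow's actual implementation, and the settings shown are made up):

```python
def merge_profiles(*profiles):
    """Apply profiles left to right; later values overwrite earlier ones."""
    merged = {}
    for profile in profiles:
        merged.update(profile)
    return merged

# Illustrative settings only -- not the pipeline's real config values.
test_profile = {"input": "test_samplesheet.csv", "docker.enabled": False}
docker_profile = {"docker.enabled": True}

# Equivalent of `-profile test,docker`: docker wins where the two overlap.
print(merge_profiles(test_profile, docker_profile))
# → {'input': 'test_samplesheet.csv', 'docker.enabled': True}
```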

If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended.

* `docker`
* A generic configuration profile to be used with [Docker](https://docker.com/)
* Pulls software from Docker Hub: [`nfcore/bacqc`](https://hub.docker.com/r/nfcore/bacqc/)
* `singularity`
* A generic configuration profile to be used with [Singularity](https://sylabs.io/docs/)
* Pulls software from Docker Hub: [`nfcore/bacqc`](https://hub.docker.com/r/nfcore/bacqc/)
* `podman`
* A generic configuration profile to be used with [Podman](https://podman.io/)
* Pulls software from Docker Hub: [`nfcore/bacqc`](https://hub.docker.com/r/nfcore/bacqc/)
* `conda`
* Please only use Conda as a last resort i.e. when it's not possible to run the pipeline with Docker, Singularity or Podman.
* A generic configuration profile to be used with [Conda](https://conda.io/docs/)
@@ -168,10 +158,6 @@ process {

See the main [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html) for more information.
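
As a sketch, a custom config passed with `-c` could override resources for a single process. The process name `KRAKEN2` below is hypothetical; check the pipeline's actual module names before relying on a `withName` selector:

```console
process {
    withName: 'KRAKEN2' {
        cpus   = 8
        memory = 72.GB
    }
}
```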

If you are likely to be running `nf-core` pipelines regularly it may be a good idea to request that your custom config file is uploaded to the `nf-core/configs` git repository. Before you do this, please test that the config file works with your pipeline of choice using the `-c` parameter (see definition above). You can then create a pull request to the `nf-core/configs` repository with the addition of your config file, associated documentation file (see examples in [`nf-core/configs/docs`](https://github.com/nf-core/configs/tree/master/docs)), and amending [`nfcore_custom.config`](https://github.com/nf-core/configs/blob/master/nfcore_custom.config) to include your custom profile.

If you have any questions or issues please send us a message on [Slack](https://nf-co.re/join/slack) on the [`#configs` channel](https://nfcore.slack.com/channels/configs).

### Running in the background

Nextflow handles job submissions and supervises the running jobs. The Nextflow process must run until the pipeline is finished.
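
One way to keep the run going after you log out (a sketch; file names are illustrative) is Nextflow's built-in `-bg` option, which detaches the run from your terminal. Alternatively, start the pipeline inside a `screen` or `tmux` session:

```console
nextflow run avantonder/bacQC \
    -profile singularity \
    --input samplesheet.csv \
    --outdir <OUTDIR> \
    -bg > bacqc.log
```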
5 changes: 3 additions & 2 deletions modules/local/samplesheet_check.nf
@@ -1,8 +1,9 @@
process SAMPLESHEET_CHECK {
tag "$samplesheet"

executor 'local'
memory 100.MB
label 'process_low'
//executor 'local'
//memory 100.MB

conda (params.enable_conda ? "conda-forge::python=3.8.3" : null)
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
8 changes: 4 additions & 4 deletions nextflow.config
@@ -10,11 +10,11 @@
params {

// Input options
input = null
input = null
genome_size = null

// Databases
kraken2db = null
brackendb = null
kraken2db = null

// MultiQC options
multiqc_config = null
@@ -171,7 +171,7 @@ manifest {
description = 'Pipeline for running QC on bacterial sequence data'
mainScript = 'main.nf'
nextflowVersion = '!>=22.04.3'
version = '1.0'
version = '1.2'
}

// Load modules.config for DSL2 module specific options