Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it work without specifying GTF file #322

Merged
merged 3 commits into from
May 2, 2024

Conversation

grst
Copy link
Member

@grst grst commented Apr 30, 2024

Wanted to run cellranger multi with just specifying the cellranger index. Likely affected other workflows, too.

@grst grst requested a review from fmalmeida April 30, 2024 13:55
Copy link

github-actions bot commented Apr 30, 2024

nf-core lint overall result: Failed ❌

Posted for pipeline commit b660d19

+| ✅ 168 tests passed       |+
#| ❔   4 tests were ignored |#
!| ❗   4 tests had warnings |!
-| ❌   2 tests failed       |-

❌ Test failures:

  • files_unchanged - docs/images/nf-core-scrnaseq_logo_light.png does not match the template
  • files_unchanged - docs/images/nf-core-scrnaseq_logo_dark.png does not match the template

❗ Test warnings:

  • nextflow_config - Config manifest.version should end in dev: 2.6.0
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.13.1
  • Run at 2024-05-02 08:00:49

Comment on lines -31 to -32
ch_genome_fasta = params.genome ? file( getGenomeAttribute('fasta'), checkIfExists: true ) : []
ch_gtf = params.genome ? file( getGenomeAttribute('gtf'), checkIfExists: true ) : []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it required to be removed from the main.nf. Was it it the "main"-template functions that were complaining about not providing a fasta?

If that is the case, I would say this is a ticket that should be taken entirely at once, in a separate one, because it would affect all of the workflows.

Thus having a single issue to take a look at this matter for all workflows would be easier to track.

Issues that could be optionally merged together for it are: #313 and #277 .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #313 is already part of the cellranger multi PR.

I wasn't aware that there's a linting check for that, but I'd argue that as it is currently implemented is suboptimal. Either

  • all parameters should be evaluated outside the scrnaseq workflow and passed as arguments
  • all parameters should be evaluated within the workflow and not arguments should be passed.

Currently it is just inconsistent with fastq/gtf being passed as arguments and to me it's highly confusing that different parts of the workflow partly evaluate some parameters. It also doesn't help because you anyway can't include the workflow somewhere else and run it as it is currently implemented.

For the sake of simplicitly I'd vote for sticking with what we currently have (i.e. evaluate parameters within the workflow) and if necessary ignore the respective linting checks. The solution intended by the template authors would probably be to move all parameter checks outside the workflow and pass them as arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linting doesn't seem to fail because of this btw.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, every other subworkflow currently, have the input assertion happening inside the sub-workflow itself:

e.g. alevin

assert (genome_fasta && gtf && salmon_index && txp2gene) || (genome_fasta && gtf)  || (genome_fasta && gtf && transcript_fasta && txp2gene):
        """Must provide a genome fasta file ('--fasta') and a gtf file ('--gtf'), or a genome fasta file
        and a transcriptome fasta file ('--transcript_fasta`) if no index and txp2gene is given!""".stripIndent()

However, they are indeed seeming to rely on gtf/fasta that had been file() loaded outside. So, we should also bring the loaders to the inside the sub-workflow on the others as well.

Is that the correct understanding?

Copy link
Member Author

@grst grst May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think the subworkflows (for each aligner) should be fully abstracted (i.e. not rely on params anywhere, but just consume input channels/values via take).

It's just about whether to evaluate params in main.nf or in workflows/scrnaseq.nf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah!,
Now I can see it clearly that the changes were happening in these two.
For some reason I was reading as if they were included inside the cellrangermulti sub-workflow.
🤯

@fmalmeida fmalmeida merged commit 5cdb691 into 247-support-for-10x-ffpe-scrna May 2, 2024
11 of 12 checks passed
@fmalmeida fmalmeida deleted the fix-without-gtf branch May 2, 2024 08:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants