modify nfcore workflow retries

Sven Willger edited this page Sep 3, 2024 · 2 revisions
author: SW
date: 2024-09-03
tags: nextflow, workflow, nf-core, setup

Observation

nf-core Nextflow workflows do not retry a process after it has failed. This also applies when the process ran out of memory or exceeded its time limit: it is not restarted with increased limits, even though the process was designed to allocate {x * task.attempt}.
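For reference, the {x * task.attempt} pattern mentioned above typically looks like the following sketch. The process name EXAMPLE_PROCESS and the base values are hypothetical placeholders, not taken from any specific nf-core workflow; the point is only that the requested resources grow with each attempt, which has no effect if the process is never retried.

```groovy
process {
    // 'EXAMPLE_PROCESS' is a hypothetical process name used for illustration
    withName: 'EXAMPLE_PROCESS' {
        // task.attempt is 1 on the first run, 2 on the first retry, and so on
        memory = { 8.GB * task.attempt }   // assumed base value: 8 GB
        time   = { 4.h  * task.attempt }   // assumed base value: 4 hours
    }
}
```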

Reason for this behavior

The base.config for each nf-core workflow is coded as follows:

    errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' }
    maxRetries    = 1
    maxErrors     = '-1'

This means that a process is retried only if it fails with exitStatus (= exit code) 104 or 130-145, and then only once (maxRetries = 1). Any other exitStatus triggers 'finish', which shuts down the entire workflow in an orderly way once the already submitted jobs have completed. In the cases observed here, running out of memory or time raised exit codes outside this range, so no retry with increased memory/time was started.
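To see which exit codes the default condition accepts, the expression can be evaluated in plain Groovy. This is a minimal sketch: the names retryCodes and strategy are made up for illustration, and the closure takes the exit status as an argument instead of reading task.exitStatus as Nextflow does.

```groovy
// Same expression as in base.config: a range plus one extra code gives a list
def retryCodes = (130..145) + 104

// Stand-in for the errorStrategy closure (hypothetical helper, not Nextflow API)
def strategy = { int exitStatus -> exitStatus in retryCodes ? 'retry' : 'finish' }

println strategy(137)   // retry  (inside 130-145)
println strategy(104)   // retry  (explicitly added)
println strategy(1)     // finish (generic failure, outside the list)
println strategy(127)   // finish (outside the list)
```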

Solution

As described in setup-nfcore-workflow, section "2.4 Optional: Modify resources for specific processes", you can override process settings in the process scope of a separate run.config file.

To let exit codes outside the default range trigger a retry as well, add the following section to run.config:

process {
    errorStrategy = { task.exitStatus in (1..200) ? 'retry' : 'finish' }
    maxRetries    = 5
}

With this setting, all exit codes from 1 to 200 trigger a retry, and each failed process is retried up to 5 times. If a process still fails after that, the workflow is finished. (For some exit codes a retry makes no sense, e.g. unknown command or permission denied, because the process will always fail no matter how often it is retried. In the future this code block will be updated to exclude those exit codes from retries.)
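Until that update is available, one possible way to exclude such exit codes is sketched below. The list [126, 127] is an assumption based on the usual shell conventions (126: command found but not executable, 127: command not found); it is not a tested recommendation from this wiki, so adjust it to the codes seen on your system.

```groovy
process {
    // Retry on exit codes 1-200, except codes where a retry cannot help.
    // [126, 127] is an assumed exclusion list; extend or change as needed.
    errorStrategy = {
        (task.exitStatus in (1..200) && !(task.exitStatus in [126, 127])) ? 'retry' : 'finish'
    }
    maxRetries    = 5
}
```

As with the block above, this would go into run.config; any exit code in the exclusion list then leads to 'finish' on the first failure instead of using up the retries.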

More information at tuning-workflow-resources