modify nfcore process resources

Sven Willger edited this page Sep 3, 2024 · 1 revision
author: SW
date: 2024-09-03
tags: nextflow, workflow, nf-core, setup, resources

Observation

Nextflow processes fail because too little memory, too few CPUs or too little run time was allocated to them.

Reason for this behavior

There are two possible reasons for this behavior:

  1. The process failed due to insufficient resources but was not retried, because its exit code did not trigger a retry (see the description and solution for this problem at Modify nf-core workflows retries).
  2. The specified max_memory, max_cpus or max_time values are too small for the process to finish.

Solution

As described in setup-nfcore-workflow, section 2.4 "Optional: Modify resources for specific processes", you can change the resources of specific processes in the process scope of a separate run.config file.

The resource requirements of Nextflow processes are specified through process labels, which are defined in the base.config file of the workflow. If a process exits because it lacks resources, Nextflow automatically retries it with the request scaled by the attempt number (doubled on the second attempt) until the request reaches the specified max_memory, max_cpus or max_time value. Hence, if these limits are the bottleneck, you can increase these parameters and restart the workflow.
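As a minimal sketch of raising these limits, the three parameters can be set in the params scope of run.config. The values below are illustrative assumptions, not recommendations; choose them to match what your server or cluster nodes actually provide.

```groovy
// run.config: raise the global resource ceilings of the workflow.
// All three values are assumed examples; adapt them to your hardware.
params {
    max_memory = '500.GB'   // upper limit for the memory request of a single process
    max_cpus   = 64         // upper limit for the CPU request of a single process
    max_time   = '240.h'    // upper limit for the run time of a single process
}
```

The file is passed to the run with Nextflow's -c option (for example, -c run.config); adding -resume lets the restarted workflow reuse the results of tasks that already finished.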

To avoid long runtimes, e.g., because Nextflow has to retry several times or because too few CPU cores are assigned for big datasets, you can also increase the resource requirements of specific processes in the process scope of a separate run.config file.

Each process carries a label that specifies its resource requirements (defined in base.config, which is loaded from the nextflow.config file). These requirements can be overwritten, for example for the process_high label:

process {
    // replace 'process_high' with the label of the process that has issues
    withLabel: 'process_high' {
        cpus   = { 12 * task.attempt }
        memory = { 200.GB * task.attempt }
        time   = { 16.h * task.attempt }
    }
}

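In this case each allocated resource value is multiplied by the attempt number of the process (task.attempt). The first attempt requests 12 CPUs, 200 GB of memory and 16 h; the second attempt requests 24 CPUs, 400 GB and 32 h.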
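If only a single process needs more resources, rather than every process that shares a label, Nextflow's withName selector can be used instead of withLabel. The following is a sketch only: HYPOTHETICAL_PROCESS is a placeholder name and the values are assumptions.

```groovy
// run.config: override the resources of one process selected by name.
process {
    // HYPOTHETICAL_PROCESS is a placeholder; use the process name
    // reported in the error message or in the Nextflow log.
    withName: 'HYPOTHETICAL_PROCESS' {
        cpus   = { 8 * task.attempt }      // assumed value
        memory = { 64.GB * task.attempt }  // assumed value
        time   = { 12.h * task.attempt }   // assumed value
    }
}
```

Settings in a withName selector take priority over those in a withLabel selector, so a process matched by both uses the withName values.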

IMPORTANT NOTE!! Be aware that you can't just copy and paste the check_max() function from base.config into your new custom configuration file. It does not work outside of the main pipeline config files (see Error when modifying base.config#158).
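Because check_max() is not available in a custom config, requests that grow with task.attempt are not capped automatically there. One possible workaround, shown here as an untested sketch and not as the pipeline's own method, is to cap each value by hand with plain Groovy. The ceilings (48 CPUs, 600 GB, 72 h) are assumed values, and the sketch relies on Nextflow's memory and duration values being comparable, which is the same property check_max() uses.

```groovy
// run.config: scale the request with each attempt, but never exceed a fixed ceiling.
process {
    withLabel: 'process_high' {
        // Math.min picks the smaller of the scaled request and the assumed ceiling
        cpus   = { Math.min(12 * task.attempt, 48) }
        // for memory and time, take the smaller element of a two-item list
        memory = { [200.GB * task.attempt, 600.GB].min() }
        time   = { [16.h * task.attempt, 72.h].min() }
    }
}
```

With these assumed ceilings the fourth attempt would request 48 CPUs, 600 GB and 64 h instead of 48 CPUs, 800 GB and 64 h.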

If you want to address both reasons listed under "Reason for this behavior" above, you can add this code block to your custom run.config:

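process {
    // retry on any exit status from 1 to 200, up to 5 times; otherwise finish
    errorStrategy = { task.exitStatus in (1..200) ? 'retry' : 'finish' }
    maxRetries    = 5

    // replace 'process_high' with the label of the process that has issues
    withLabel: 'process_high' {
        cpus   = { 12 * task.attempt }
        memory = { 200.GB * task.attempt }
        time   = { 16.h * task.attempt }
    }
}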

More information at tuning-workflow-resources