-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more tools from Galaxy Australia TPV config #69
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @cat-bro!
I'm not sure how to review those properly. I hope our cgroups2 setup is soon back and we can contribute numbers.
I tried to flag a few tools that IMHO would waste a lot of resources in the happy-case and where a size-based rule might be good to have.
Thanks again!
tools.yml
Outdated
@@ -293,6 +305,8 @@ tools: | |||
mem: 12 | |||
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_multijoin_tool/.*: | |||
mem: 8 | |||
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/.*: | |||
mem: 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like way too much for the general case. The tool is not doing much afaik.
Maybe we should include here a rule?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for reviewing! This rarely uses above the default 3.8GB and only when the inputs are large, so I'll add a rule here.
tools.yml
Outdated
@@ -399,6 +421,8 @@ tools: | |||
toolshed.g2.bx.psu.edu/repos/devteam/cuffquant/cuffquant/.*: | |||
cores: 6 | |||
mem: 20 | |||
toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS.*: | |||
mem: 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks also very generous. Most tools will not need more than a few 100 MB for normal files sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For needle this needs more than 3.8GB fairly regularly for smallish inputs so I'll target this tool ID on its own. As a general rule it will only need extra memory for large input sizes.
@@ -618,10 +664,15 @@ tools: | |||
toolshed.g2.bx.psu.edu/repos/galaxyp/eggnog_mapper/eggnog_mapper/.*: | |||
cores: 8 | |||
mem: 12 | |||
toolshed.g2.bx.psu.edu/repos/galaxyp/eggnog_mapper/eggnog_mapper_search/.*: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this in place.
toolshed.g2.bx.psu.edu/repos/galaxyp/eggnog_mapper/eggnog_mapper/.*:
cores: 8
mem: 24
env:
EGGNOG_DBMEM: "--dbmem"
params:
singularity_run_extra_arguments: "--env EGGNOG_DBMEM=--dbmem"
scheduling:
require:
- singularity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bgruening ! That's a separate tool ID and I'll try this out on Australia. eggnog_mapper was very popular a year ago and is still used regularly. Is there a difference between {params: {singularity_run_extra_arguments: "--env EGGNOG_DBMEM=--dbmem"}} and {env: {SINGULARITYENV_EGGNOG_DBMEM: dbmem}}? We have been using environment variables prefixed with SINGULARITYENV_
and right now I can't remember where I got that from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the eggnog mapper readme Admins can choose to set the environment variable ``EGGNOG_DBMEM=--dbmem`` which will copy the complete EggNOG annotation DB into memory which is usually much faster than using multiple cores (but needs approx. 37GB of RAM).
tools.yml
Outdated
@@ -1090,6 +1154,8 @@ tools: | |||
cores: 7 | |||
toolshed.g2.bx.psu.edu/repos/iuc/bcftools_norm/bcftools_norm/.*: | |||
cores: 2 | |||
toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools.*: | |||
mem: 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bedtools are used a lot and most of the time 4GB is enough as it seems?
Maybe a rule here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This varies a lot between tools. genomecoveragebed regularly exceeds 3.8 with no input, intersectbed is known to need lots. I might leave this out for now and come back with tool-by-tool estimates.
tools.yml
Outdated
@@ -1265,8 +1353,13 @@ tools: | |||
toolshed.g2.bx.psu.edu/repos/iuc/gemini_load/gemini_load/.*: | |||
cores: 10 | |||
mem: 40 | |||
toolshed.g2.bx.psu.edu/repos/iuc/getorganelle/get_organelle_from_reads/.*: | |||
mem: 70 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks strange, but its just a gut feeling :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is also hard to generalise. For inputs less that 0.8GB we might be safe with 24GB but it's a steep increase after that. We have jobs running out of memory on Galaxy Australia with 4GB of input - there must be other variables at play. Jobs that try to use too much memory are killed, so we're better off with the local rule in this case.
18689c3
to
7e34546
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cat-bro
Some tools that have been added to AU's TPV config in the last year or so