MUMmer4/MUMmer3 versions reported differently/SLURM job limits #325
Comments
Hi, @genomesandMGEs . Thanks for your interest in Based on the error, and the dependency output you have provided, I believe the issue is that If you are unsure whether you have installed which nucmer If this does not return a file location, then |
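For anyone who prefers to do the `which nucmer` check from Python, here is a minimal sketch using the standard library. The helper name `find_executable` is made up for illustration and is not part of pyani.

```python
import shutil
from typing import Optional


def find_executable(name: str = "nucmer") -> Optional[str]:
    """Return the full path to `name` if it is on PATH, otherwise None.

    Hypothetical helper for illustration; it mirrors `which <name>`.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable("nucmer")
    if path is None:
        print("nucmer was not found on PATH")
    else:
        print(f"nucmer found at {path}")
```

If this reports that nucmer was not found, the shell's `which nucmer` would also return nothing in the same environment.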
Hi @baileythegreen, many thanks for the quick reply. I have nucmer installed (v4.0.0rc1); have installed mummer with conda. Running I tried following your suggestion and ran Traceback (most recent call last): |
Hi @genomesandMGEs, It looks like the issue might be the move to MUMmer4 - what's failing there is the check for the reported version number, which appears to be reported differently by the new MUMmer version. If you're working in a conda environment, the quick solution would be to step back to MUMmer3 until that check gets updated in pyani. But that version check is something we need to see to. Cheers, L. |
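To illustrate the failure mode in the traceback (calling `.group()` on a regex match that turned out to be `None`), here is a hedged sketch of a more defensive version parse. This is not pyani's actual code: the pattern, the function name `parse_version`, and the sample strings are all assumptions chosen for illustration.

```python
import re
from typing import Optional

# Assumed pattern: a dotted version number with an optional "rcN" suffix.
# The real output of a given MUMmer release may differ.
VERSION_RE = re.compile(r"\d+(?:\.\d+)+(?:rc\d+)?")


def parse_version(output: str) -> Optional[str]:
    """Return the first version-like string in `output`, or None.

    Checking the match before calling .group() avoids the
    "'NoneType' object has no attribute 'group'" error.
    """
    match = VERSION_RE.search(output)
    if match is None:
        return None
    return match.group()


if __name__ == "__main__":
    # Illustrative strings only, not captured tool output.
    for text in ("NUCmer (NUCleotide MUMmer) version 3.1", "4.0.0rc1", ""):
        print(repr(text), "->", parse_version(text))
```

The design choice here is to return `None` and let the caller decide how to report an unrecognised version, instead of raising from inside the parser.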
Hi @widdowquinn, thanks for the reply. So, I created a new conda env, cloned the repository and installed pyani with |
I see what's happening there - the check for dependencies is baulking because you don't have the SGE scheduler installed locally. I think we were on top of this with #232 and #276 but the PR hasn't been merged, yet. @baileythegreen - might you please be able to take a quick look at the conflict? |
@widdowquinn, I actually think the issue is a bit different. @genomesandMGEs To clarify: When you ran it in the new environment with The scheduler option is primarily intended for use on compute clusters, so if you are trying to use (assuming you just ran the command you sent originally) Try running this: pyani anim -i . -o genomes_ANIm If you have a database file with the default name already created, this should work. (If you don't, it'll give you a pretty arcane SQL error.) Let me know if that works. |
Thanks you two. Yes @baileythegreen, I ran with the I'm now running the command you suggested, and the same for the command I ran before using the |
@genomesandMGEs, you're right; running it locally for ~2k bacterial genomes probably isn't feasible. As far as using SGE on your WSL machine, this isn't a question of a workaround to make it functional, it's a question of installing the SGE scheduler. This is theoretically possible, but is not something I have any experience with. There are various sites with instructions, though: search results. However, I must stress that installing a scheduler on a local machine is not going to help. A scheduler's purpose is to efficiently assign computing tasks to individual nodes in a large computing cluster, such as a supercomputer. If you don't have a cluster, a scheduler can't do anything to help speed computation. It just allocates resources. I'm sorry I don't have better news for you. |
Glad to be of help. There are a few different issues here, so I'll try to take them in turn.
|
Many thanks to you both for trying to help me with this. @baileythegreen |
From conversations here, it seems like - to avoid issues with overloading the SLURM queue - the recommendation is to have each array task handle multiple jobs, and submit a single array job (which will be limited, e.g. to 10k tasks). The individual tasks run longer (because there are more comparisons per task), but there is then less overhead on the scheduler. With an array of 10k tasks, and ≈4m comparisons, that would mean 400 comparisons per task. This is likely the model we will go with for the backend (as the total number of comparisons gets large). It might even be convenient for our plans regarding asynchronous population of the database in version 3. |
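The arithmetic above can be sketched as follows. This is only an illustration of the batching idea, not the planned pyani backend; the function names `comparisons_per_task` and `batch` are hypothetical, and the ≈4m comparisons and 10k task limit are taken from the comment as given.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def comparisons_per_task(n_comparisons: int, max_tasks: int) -> int:
    """Smallest batch size that fits all comparisons into at most max_tasks tasks."""
    # Integer ceiling division, avoiding floating-point rounding.
    return (n_comparisons + max_tasks - 1) // max_tasks


def batch(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of up to `size` items; one batch per array task."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


if __name__ == "__main__":
    size = comparisons_per_task(4_000_000, 10_000)
    print(f"{size} comparisons per array task")

    # Small demonstration with ten placeholder comparisons.
    for task_id, jobs in enumerate(batch(list(range(10)), 4)):
        print(task_id, jobs)
```

Each array task would then work through its own batch in sequence, which is why individual tasks run longer while the scheduler sees only one array job.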
@widdowquinn Please advise as to whether we want to keep this issue open, rename it, create a new issue, et cetera given that the current topic of discussion is no longer at all related to the issue title. |
Summary:
There seems to be a problem with
pyani anim
Description:
When running
pyani anim --scheduler SGE -i . -o genomes_ANIm
on my collection of ~2k genomes, I get an AttributeError (please see below). I tried to run this locally, without the --scheduler, and I get the same error. Thanks for looking into this!
Current Output:
Traceback (most recent call last):
File "/home/jbotelho/anaconda3/bin/pyani", line 11, in
load_entry_point('pyani', 'console_scripts', 'pyani')()
File "/home/jbotelho/pyani/pyani/scripts/pyani_script.py", line 117, in run_main
returnval = args.func(args)
File "/home/jbotelho/pyani/pyani/scripts/subcommands/subcmd_anim.py", line 168, in subcmd_anim
nucmer_version = anim.get_version(args.nucmer_exe)
File "/home/jbotelho/pyani/pyani/anim.py", line 110, in get_version
version = match.group() # type: ignore
AttributeError: 'NoneType' object has no attribute 'group'
pyani Version:
0.3.0
installed dependencies
System information
Platorm==Linux-4.4.0-19041-Microsoft-x86_64-with-debian-buster-sid
Python==3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]
Installed pyani Python dependendencies...
Pillow==6.2.0 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
biopython==1.78 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
matplotlib==3.2.1 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
namedlist==1.8 (/home/jbotelho/anaconda3/lib/python3.7/site-packages/namedlist-1.8-py3.7.egg)
networkx==2.3 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
numpy==1.17.2 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
openpyxl==3.0.0 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
pandas==0.25.1 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
scipy==1.6.3 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
seaborn==0.9.0 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
sqlalchemy==1.3.9 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
tqdm==4.36.1 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
Installed pyani development dependendencies...
bandit==Not Installed (-)
black==Not Installed (-)
codecov==Not Installed (-)
coverage==Not Installed (-)
doc8==Not Installed (-)
flake8==Not Installed (-)
jinja2==2.10.3 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
mypy==Not Installed (-)
pydocstyle==Not Installed (-)
pylint==2.4.2 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
pytest==5.2.1 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
pytest-cov==Not Installed (-)
sphinx==2.2.0 (/home/jbotelho/anaconda3/lib/python3.7/site-packages)
Installed pyani pip-install dependendencies...
pre-commit==Not Installed (-)
pytest-ordering==Not Installed (-)
sphinx-rtd-theme==Not Installed (-)
Installed third-party tool versions...
blast+==Linux_2.9.0+
Traceback (most recent call last):
File "/home/jbotelho/anaconda3/bin/pyani", line 11, in
load_entry_point('pyani', 'console_scripts', 'pyani')()
File "/home/jbotelho/pyani/pyani/scripts/pyani_script.py", line 117, in run_main
returnval = args.func(args)
File "/home/jbotelho/pyani/pyani/scripts/subcommands/subcmd_listdeps.py", line 86, in subcmd_listdeps
for tool, version in get_tool_versions():
File "/home/jbotelho/pyani/pyani/dependencies.py", line 117, in get_tool_versions
yield (name, func())
File "/home/jbotelho/pyani/pyani/anim.py", line 110, in get_version
version = match.group() # type: ignore
AttributeError: 'NoneType' object has no attribute 'group'
Python Version:
3.7.4
Operating System:
WSL