Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes issue 232: Pyani throws an error when it fails to find 3rd-party tool versions (but they are installed) #276

Merged
merged 87 commits into from
Sep 4, 2021

Conversation

baileythegreen
Copy link
Contributor

@baileythegreen baileythegreen commented Jun 6, 2021

Please include a summary of the change and which issue is fixed. Please also include the motivation and context, and note the tests that apply to these changes.

Fixes #232. A user reported Pyani failing to find a version for blastall when pyani listdeps was run, but blastall was installed.

It appeared to be caused by OS-dependent differences in the output of the command to obtain blastall's version. (User was using Linux.)

Just removing the "-version" part of the command line then caused the error on MacOS.

This PR switches the command line used based on OS, and also gently reports failure to obtain a version number for any third-party tool.

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality not to work as expected)
  • This change requires a documentation update

Action Checklist

  • Work on a single issue/concept (if there are multiple separate issues to address, please use a separate pull request for each)
  • Fork the pyani repository under your own account (please allow write access for repository maintainers)
  • Set up an appropriate development environment (please see CONTRIBUTING.md)
  • Create a new branch with a short, descriptive name
  • Work on this branch
    • style guidelines have been followed
    • new code is commented, especially in hard-to-understand areas
    • corresponding changes to documentation have been made
    • tests for the change have been added that demonstrate the fix or feature works
  • Test locally with pytest -v non-passing code will not be merged
  • Rebase against origin/master
  • Check changes with flake8 and black before submission
  • Commit branch
  • Submit pull request, describing the content and intent of the pull request
  • Request a code review
  • Continue the discussion at Pull requests section in the pyani repository

I tried just removing the `"-version"` part of the commandline build, as 
discussed in #232, but then I get an error on MacOS.
@codecov
Copy link

codecov bot commented Jun 6, 2021

Codecov Report

Merging #276 (2931b4f) into master (d18421c) will increase coverage by 0.16%.
The diff coverage is 81.48%.

@@            Coverage Diff             @@
##           master     #276      +/-   ##
==========================================
+ Coverage   75.66%   75.82%   +0.16%     
==========================================
  Files          52       52              
  Lines        3291     3338      +47     
==========================================
+ Hits         2490     2531      +41     
- Misses        801      807       +6     

@baileythegreen baileythegreen added the bug something isn't working how it should label Jun 7, 2021
Copy link
Owner

@widdowquinn widdowquinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have get_<PROGRAM>_version() return the "could not retrieve version" string.

@baileythegreen
Copy link
Contributor Author

Are you going to say the same thing about #218? I get that handling errors at a lower level is potentially desirable, but so is not repeating code if we don't have to (like, if we wanted to handle it differently depending on the tool).

I don't know if there may also be more errors we will want to handle at this particular spot in the code, but if we handle them in get_<tool>_version(), whatever error handling we do there will need to be added into new third-party tools, as well.

@widdowquinn
Copy link
Owner

Are you going to say the same thing about #218?

Yes.

I don't know if there may also be more errors we will want to handle at this particular spot in the code, but if we handle them in get_<tool>_version(), whatever error handling we do there will need to be added into new third-party tools, as well.

More than one approach could make sense here, but here's my reasoning:

  • Catch an AttributeError in dependencies.py assumes that an AttributeError is thrown, but this is not defined behaviour - so we'd want to ensure that called methods do throw an AttributeError, which is an extra definition of expected behaviour and still requires editing of individual get_<PROGRAM>_version() functions.

  • With similar effort, we could standardise responses along the lines of:

    • no executable found: raise response 1
    • executable found but version not retrieved: raise response 2

and then test specifically for those responses. This is potentially clearer and self-documenting.

@baileythegreen
Copy link
Contributor Author

Okay, well, I will still need to catch an error at the lower level. flake8 doesn't like bare except:s, so can I catch the AttributeError (and FileNotFoundError for 218, there) or do you want me to do that a different way?

The nucmer/BLAST+/legacy BLAST `get_version()` methods now return
useful strings for the following events:

- no executable at passed path
- non-executable file at passed path
- no version info returned
- version info (path to executable) [intended to mimic library output]

The blastall get_version() also catches cases where the blastall
executable exists, but is not compiled for the current OS. This is
now the case for macOS 11+ with legacy blastall executables
@widdowquinn
Copy link
Owner

I've added some cases that catch the various types of error we might encounter. Here I've chosen to test for the specific circumstances of the kind of error we might encounter, to aid with diagnostics. This helps when the same class of error (e.g. OSError) might be thrown by more than one kind of event. This code catches:

  • no file at path
  • file at path, but not executable
  • executable doesn't return a version we can use
  • returned version

Also, there's a new issue since macOS 11 in that the compiled legacy BLAST executable won't run on Big Sur and later. The added code catches OSError and returns a string noting that the executable is there, but we can't run it.

I've also modified the "success" string so it has a similar format to that returned for the libraries. See what you think.

@widdowquinn
Copy link
Owner

widdowquinn commented Jun 16, 2021

Re an earlier conversation - this is an occasion where squashing commits can be useful. There's no reason to keep cd3997d, 94c71d8, 57e5097 as separate commits. They are conceptually "atomic" - code changes, docstring changes, and a logger message all for the "same thing." As they are consecutive, we can squash these into a single commit which will have a new hash, and will combine the commit messages. See, e.g. https://gitbetter.substack.com/p/how-to-squash-git-commits

This can be done with the "squash and merge" option above, though it's not fine-grained and I think it would squash all the commits in the PR. It doesn't change the code, really - just keeps logs tidy and short.

@baileythegreen
Copy link
Contributor Author

Looks good. It definitely goes against the "it's better to ask for forgiveness, than permission", Python philosophy, but it's more informative for the user, so it can't be bad.

I've gone and added the equivalent code to the fastANI branch, so that doesn't get overlooked later.

baileythegreen added a commit that referenced this pull request Jun 16, 2021
@widdowquinn
Copy link
Owner

The EAFP approach is fine a lot of the time, but sometimes there are side effects, and sometimes it contradicts the "explicit is better than implicit" guideline. Here I think it's better to be explicit ("this is what the problem is") than implicit ("there's an attribute error, and one of the several ways in which this could have failed happened").

It's a judgement call.

@baileythegreen
Copy link
Contributor Author

Experimenting with mocking to test the new get_version() checks in anib.py. The test is passing, so I guess it works, but I'll see what the coverage looks like.

If everything looks good, I would plan on doing the equivalent things for the other subcommands.

@baileythegreen
Copy link
Contributor Author

anib.py coverage now looks good. Will work on the other files.

@baileythegreen
Copy link
Contributor Author

@widdowquinn

Can you share the output (traceback and all) that you get if you try to run pyani blastall on your newer version of macOS? I'm trying to write a test for that bit, but so far failing to mock it properly, and think seeing the actual behaviour might help.

Specifically, I'm having trouble getting an OSError to be raised.

@baileythegreen
Copy link
Contributor Author

Nevermind! As often happens, immediately after requesting help/insight/what have you, the solution has become apparent. (Or, rather, the reason the solution wasn't quite working has.)

New commits to follow shortly.

Each test case is now its own test. I tried combining them together, but 
have not found the proper way to do this with monkeypatch.
And make error messages match
@baileythegreen
Copy link
Contributor Author

I suspect you'll want some of the things in conftest.py to be renamed or moved around, but I'll leave them as they are for now.

@widdowquinn
Copy link
Owner

Can you share the output (traceback and all) that you get if you try to run pyani blastall on your newer version of macOS?

Sadly, no - legacy BLAST is an unmaintained compiled binary and does not run on MacOS 11.

widdowquinn and others added 20 commits September 3, 2021 15:55
Fix was using Path object directly, not just the
filename (which discards the path information).

Could alternatively have built or converted expected
list into filenames only.

Also turns the list of Path objects into a set for
a slight speed up (addresses a TODO comment).
Should avoid any one folder contents scaling with N squared.
Following one user's recent confusion I have combined the first phrase of the Readme with the 'About' description for the repo to make the intended domain clear(er).
The nucmer/BLAST+/legacy BLAST `get_version()` methods now return
useful strings for the following events:

- no executable at passed path
- non-executable file at passed path
- no version info returned
- version info (path to executable) [intended to mimic library output]

The blastall get_version() also catches cases where the blastall
executable exists, but is not compiled for the current OS. This is
now the case for macOS 11+ with legacy blastall executables
And make error messages match
@widdowquinn widdowquinn merged commit 075568f into master Sep 4, 2021
@widdowquinn widdowquinn deleted the issue_232 branch September 4, 2021 19:14
@baileythegreen
Copy link
Contributor Author

I'm restoring the branch for this because I had not yet pulled the more recent changes, and I want to study the tests.

@baileythegreen baileythegreen restored the issue_232 branch September 4, 2021 19:22
@baileythegreen baileythegreen deleted the issue_232 branch September 4, 2021 19:24
@widdowquinn
Copy link
Owner

They should be on master after merging… (e.g. https://github.com/widdowquinn/pyani/blob/master/tests/test_aniblastall.py)

baileythegreen added a commit that referenced this pull request Sep 4, 2021
@baileythegreen baileythegreen removed the PR of Supreme Importance The PR Bailey really, really wants merged right now label Sep 6, 2021
@baileythegreen
Copy link
Contributor Author

In writing tests for PR #335, I have begun to question this bit of code:

 nucmer_path = Path(shutil.which(nucmer_exe))  # type:ignore

 if nucmer_path is None:
    return f"{nucmer_exe} is not found in $PATH"

I don't think if nucmer_path is None is ever going to evaluate to True, or even run (because None is not what shutil.which() returns when it fails to find a file). If I set the nucmer_exe value in pyani_config.py to an arbitrary value (e.g., NUCMER_DEFAULT = Path("ehfjmoijh")), then try to run pyani listdeps, what I get is:

Traceback (most recent call last):
  File "/Users/baileythegreen/Software/miniconda3/bin/pyani", line 11, in <module>
    load_entry_point('pyani', 'console_scripts', 'pyani')()
  File "/Users/baileythegreen/Software/pyani/pyani/scripts/pyani_script.py", line 126, in run_main
    returnval = args.func(args)
  File "/Users/baileythegreen/Software/pyani/pyani/scripts/subcommands/subcmd_listdeps.py", line 86, in subcmd_listdeps
    for tool, version in get_tool_versions():
  File "/Users/baileythegreen/Software/pyani/pyani/dependencies.py", line 117, in get_tool_versions
    yield (name, func())
  File "/Users/baileythegreen/Software/pyani/pyani/anim.py", line 113, in get_version
    nucmer_path = Path(shutil.which(nucmer_exe))  # type:ignore
  File "/Users/baileythegreen/Software/miniconda3/lib/python3.8/pathlib.py", line 1038, in __new__
    self = cls._from_parts(args, init=False)
  File "/Users/baileythegreen/Software/miniconda3/lib/python3.8/pathlib.py", line 679, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/Users/baileythegreen/Software/miniconda3/lib/python3.8/pathlib.py", line 663, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I propose modifying this to the following:

    try:
        nucmer_path = Path(shutil.which(nucmer_exe))  # type:ignore
        # if nucmer_path is None:
    except TypeError:
        return f"{nucmer_exe} is not found in $PATH"

which then allows pyani listdeps to exit gracefully (and informatively):

Installed third-party tool versions...
	blast+==Darwin_2.6.0+ (/Users/baileythegreen/Software/miniconda3/bin/blastn)
	nucmer==ehfjmoijh is not found in $PATH
	blastall==Darwin_2.2.26 (/Users/baileythegreen/Software/miniconda3/bin/blastall)

I will incorporate these changes into #335, where I am adding other tests for get_version().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something isn't working how it should
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pyani listdeps broken :: pyani/aniblastall.py :: blastall doe not have --version option
3 participants