Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial ordering by reldep and reldep-trackfeatures for conda #457

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

wolfv
Copy link
Contributor

@wolfv wolfv commented Jun 28, 2021

I went back to the drawing board and now came up with this solution for the problem of preferring variants for conda packages.

There are two additional sorts:

  • by trackfeature of a given dependency selection
  • by evr of a given dependency selection

For a given two solvables that are "variants" (same version & build number), this code compares all dependencies. If the dependency specifier between the two solvables is different, it will resolve the first level of deps to check:

  • which one selects the higher dependency version (evr)
  • which one exclusively selects packages that have a track_feature

For example:

numpy-1.20-py3.8

  • python >=3.8,<3.9 --> this dep wins over the other two variants since it selects the higher python version
    numpy-1.20-py3.7
  • python >=3.7,<3.8 --> this numpy wins over the pypy3.7 variant since it selects a python without track feature
    numpy-1.20-pypy3.7
  • python >=3.7,<3.8 *pypy --> this goes to the bottom since all python packages that are selected here have a track_feature

I would be happy to get some tips on how to slim down the code further :) specifically I was thinking about using the selection_solvables and selection_filter functions but didn't really know if they'd be efficient at doing what I needed.

One of the corner cases I tried to solve was when we have the same reldep->name twice, e.g. the warpx package depends on

warpx-21.05-py36ha753790_0.tar.bz2
- openpmd-api * nompi_*
- openpmd-api 0.13.3.*

For this, I am trying here to select the matching solvables for both reldep strings, then compute the intersection of the two sets.

From the matching solvables I compare if they (all) have track features, and finally I find the best EVR of all matching solvables and return it.

Then I check the second list of dependencies, and compare both "best EVRs" and existence of track_features.

src/policy.c Outdated Show resolved Hide resolved
src/policy.c Outdated

int i = 0, j = 0;
// set intersection, assuming sorted arrays
if (prev)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this assumes that the Id's returned by pool_whatprovides are sorted... is that correct @mlschroe ?

@wolfv
Copy link
Contributor Author

wolfv commented Jun 30, 2021

Hi @mlschroe I am reasonably confident in this PR now (and it definitely improves the situation on the mamba/conda side). It would be great to get your review and improvement suggestions. I am sure things can be further simplified with more in-depth knowledge of libsolv.

Some test cases:

  • mamba install rubin-env --> does not select mpich = = external anymore
  • mamba install numpy --> now selects numpy + python 3.9
  • mamba install boost gazebo --> now selects boost gazebo and python 3.9

@mlschroe
Copy link
Member

I added support for trackfeature minimization in the "trackfeature" branch. This implements what I described in the other pull request, i.e. backtracking if a new tracked feature needs to be added.

Can you please check if that helps with your testcases?

@wolfv
Copy link
Contributor Author

wolfv commented Jun 30, 2021

Hi @mlschroe awesome, yes, it seems to work fine!

The ordering by dependencies seems to be missing though, right? Basically in your branch it "works" by accident since the build strings are often ordererable (e.g. py39hashval > py38hashval...).

However, sometimes that breaks ... (especially when people are pushing a new build without updating the build number, which unfortunately does happen).

E.g. for this combination of install specs: CONDA_SUBDIR=linux-64 micromamba create -n tete --dry-run scipy scikit-learn python=3.6 -c conda-forge the trackfeature branch of yours resolves to libgfortran=7.5.0 and with the branch in this PR it resolves to libgfortran=9.3.0.

But anyways, maybe it makes sense to combine both?

@mlschroe
Copy link
Member

Yes, something like your order by dependency patch is still needed.

@wolfv wolfv force-pushed the order_by_deps_and_trackfeatures_take_two branch from 7ef80f4 to e7c6718 Compare July 7, 2021 10:38
@wolfv
Copy link
Contributor Author

wolfv commented Jul 23, 2021

@mlschroe we've released this code into our conda package and it's been released for a while now. We're not getting any more complaints so I think this would be good to go now :)

Do you want me to reformat anything?

@mlschroe
Copy link
Member

What about my "trackfeature" branch? Is that also live in your released conda package? Can/Should I merge it?

@wolfv
Copy link
Contributor Author

wolfv commented Jul 23, 2021

the track feature branch is not yet live, no. I should probably run more tests with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants