Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dependencies to achieve better performance #25

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

Alexsandruss
Copy link

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Target of this pull request is to change dependencies from GNU OpenMP and OpenBLAS to LLVM OpenMP and MKL on Linux to improve Faiss performance like it's done in original "pytorch" conda channel recipe.

Small performance comparison of dependencies:

Number of neighbors Number of rows Number of columns searching time (stock: gomp + openblas), s searching time (llvm-omp + mkl), s Speedup with llvm-omp + mkl
5 10000 16 0.407 0.202 2.01
5 10000 64 0.427 0.178 2.40
5 20000 16 1.262 0.567 2.23
5 20000 64 1.262 0.572 2.20
5 50000 16 7.371 3.521 2.09
5 50000 64 7.438 3.444 2.16
32 10000 16 0.504 0.253 1.99
32 10000 64 0.485 0.255 1.90
32 20000 16 1.509 0.726 2.08
32 20000 64 1.484 0.730 2.03
32 50000 16 7.767 3.693 2.10
32 50000 64 8.058 3.773 2.14

OS: CentOS Linux 7
SW: Python 3.8.6, numpy 1.19.5, faiss 1.6.5, MKL 2020.4, LLVM-OpenMP 11.0.1, GNU OpenMP 9.3.0, OpenBLAS 3.9.0.
HW: Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz, 2 sockets X 28 cores per socket, 376 GB of RAM.

Benchmark script:

import numpy as np
import faiss
from timeit import default_timer as timer

def measure_function(f, *args):
    times = []
    for i in range(5):
        t0 = timer()
        f(*args)
        t1 = timer()
        times.append(t1 - t0)
    return sum(times) / len(times)

rows_range = [10000, 20000, 50000]
columns_range = [16, 64]
k_range = [5, 32]

print("k,rows,cols,time[s]")
for k in k_range:
    for rows in rows_range:
        for cols in columns_range:
            data = np.random.uniform(size=(rows * 2, cols)).astype(np.float32)
            train_data, test_data = data[:rows,:], data[rows:,:]
            index = faiss.IndexFlatL2(cols)
            index.add(train_data)
            time = measure_function(index.search, test_data, k)
            print("{},{},{},{}".format(k, rows, cols, round(time, 6)))

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@isuruf
Copy link
Member

isuruf commented Jan 26, 2021

MKL can be selected at install time. See https://conda-forge.org/docs/maintainer/knowledge_base.html#switching-blas-implementation

@h-vetinari
Copy link
Member

I'd be fine with either of:

  • using track_features to weigh down non-MKL
  • prioritizing mkl via buildnumber bump like e.g. here
  • adding a post-link script (c.f. e.g. here) with a message that recommends installing the mkl-version of blas/lapack.
    @jakirkham also suggested this a while ago, but I haven't gotten around to it yet.

Don't think it's possible to switch the openmp-implementation in the environment like for blas, right @isuruf?

@jakirkham
Copy link
Member

@jakirkham
Copy link
Member

As conda-forge already allows users to choose between different BLAS and OpenMP implementations, I'm not sure I see the need to restrict this on a feedstock basis. Instead we should focus on educating users on these options they have available and help them make informed choices about those options based on their needs

@h-vetinari
Copy link
Member

@Alexsandruss
Thanks again for trying to improve the packaging here! 🙃

Would you like to try any of the options I suggested? "Pinning" blas/openmp directly is not the best approach, as explained by the various comments in this thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants