Threaded sparse matrix - vector multiplication #74

Closed
BacAmorim opened this issue Jul 7, 2020 · 7 comments

Comments

@BacAmorim

This might be useful for KPM calculations:
https://github.com/jagot/ThreadedSparseArrays.jl

While not as efficient as a specialized method like the one in #66, it might still be of use.

@pablosanjose
Owner

pablosanjose commented Jul 7, 2020

Hey @BacAmorim, thanks for the pointer! Yes, we had that package on our radar. @fernandopenaranda has been looking hard into parallelizing KPM, and has found an excellent approach using MKLSparse.jl. The multithreaded mul! calls to MKL are incredibly efficient, and scale nicely with the number of threads. He's able to gain almost a 10x performance boost over a naive Base.Threads approach, and even over a custom KPM-specific mul! that computes moments at the same time as it does the matrix-vector multiply. I actually wonder now if any hand-tuned matrix-free approach as in KITE could actually be as performant as the MKL libraries, so the interest of #66 for me is currently lower than it was. Note that matrix-free approaches have a disadvantage in that they need to apply any non-periodic element of the system on the fly for each multiplication, unlike when you first build your sparse matrix in memory. The key issue that determines the ideal strategy is whether you are really memory limited or not.
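For readers who want to see where the matrix-vector multiply enters, here is a minimal sketch of the standard KPM Chebyshev recursion with a stored sparse matrix. It is not Quantica's implementation: the function name `kpm_moments` is hypothetical, and it assumes the spectrum of `h` has already been rescaled into [-1, 1]. The single `mul!` call inside the loop is the hot spot this discussion is about.

```julia
using SparseArrays, LinearAlgebra

# Chebyshev moments μ_n = ⟨v|T_n(h)|v⟩ for n = 0..order.
# Assumption: the spectrum of h lies within [-1, 1].
# `kpm_moments` is a hypothetical name used only for this illustration.
function kpm_moments(h::AbstractMatrix, v::AbstractVector, order::Integer)
    order >= 1 || throw(ArgumentError("order must be at least 1"))
    T = promote_type(eltype(h), eltype(v))
    μ = zeros(T, order + 1)
    ket0 = Vector{T}(v)   # T_0(h) v = v
    ket1 = h * ket0       # T_1(h) v = h v
    μ[1] = dot(v, ket0)
    μ[2] = dot(v, ket1)
    for n in 2:order
        # Recursion T_n = 2 h T_{n-1} - T_{n-2}, done in place over ket0.
        # mul!(C, A, B, α, β) computes C = α*A*B + β*C.
        mul!(ket0, h, ket1, 2, -1)
        μ[n + 1] = dot(v, ket0)
        ket0, ket1 = ket1, ket0   # swap buffers, no new allocations
    end
    return μ
end
```

Only two work vectors are kept alive, so the memory cost is dominated by the stored matrix itself, which is the trade-off against a matrix-free approach.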

@pablosanjose
Owner

Regarding this, with #72 plus the recently merged JuliaSparse/MKLSparse.jl#20 you should automatically get multithreaded KPM just by running `using MKLSparse`.

@pablosanjose
Owner

In any case, let me also say that ThreadedSparseArrays.jl is very impressive. It shows how to use Base.Threads efficiently to get pretty close to what MKL does using pure Julia. Some comparisons using Julia 1.6 with four threads:

julia> using SparseArrays, BenchmarkTools, ThreadedSparseArrays

julia> sp = sprand(ComplexF64, 10^6,10^6, 10^-5); v = rand(ComplexF64, 10^6); spt = ThreadedSparseMatrixCSC(sp);

OpenBLAS (default)

julia> @btime sp'v;
  99.862 ms (3 allocations: 15.26 MiB)

ThreadedSparseArrays.jl

julia> @btime spt'v;
  29.970 ms (39 allocations: 15.26 MiB)

MKLSparse.jl

julia> @btime sp'v;
  22.143 ms (3 allocations: 15.26 MiB)

@BacAmorim
Author

BacAmorim commented Jul 7, 2020

Note that matrix-free approaches have a disadvantage in that they need to apply any non-periodic element of the system on the fly for each multiplication, unlike when you first build your sparse matrix in memory. The key issue that determines the ideal strategy is whether you are really memory limited or not.

I agree: memory being the limiting factor or not seems to be the key here. I am of the opinion that if memory is the limiting factor, you might be investing your time in the wrong problem :p

Those comparisons between ThreadedSparseArrays.jl and MKLSparse.jl seem pretty cool! But MKLSparse still seems the best option when it comes to performance. I only see two reasons why a pure Julia version could be desirable:

  1. Not having to install and configure MKL
  2. ThreadedSparseArrays.jl uses @spawn, which enables the new smart composable parallelism of Julia: otherwise one might have competition between Julia and MKL threads (but that can also happen with BLAS threads...)
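To make the second point concrete, here is a rough sketch of a task-based adjoint product in pure Julia. It is not the code from ThreadedSparseArrays.jl: `threaded_adjoint_mul` and its `ntasks` keyword are hypothetical names, and the even split of columns is an assumption made for simplicity. For a CSC matrix, entry j of A'x only reads column j of A, so chunks of columns can run as independent tasks without locks.

```julia
using SparseArrays, LinearAlgebra
using Base.Threads: @spawn, nthreads

# y = A' * x for a CSC matrix A, split over tasks by column ranges.
# `threaded_adjoint_mul` is a hypothetical name, not part of any package.
function threaded_adjoint_mul(A::SparseMatrixCSC, x::AbstractVector;
                              ntasks::Integer = nthreads())
    size(A, 1) == length(x) ||
        throw(DimensionMismatch("A has $(size(A, 1)) rows, x has length $(length(x))"))
    T = promote_type(eltype(A), eltype(x))
    n = size(A, 2)
    y = zeros(T, n)
    rows, vals = rowvals(A), nonzeros(A)
    chunklen = max(1, cld(n, ntasks))
    # One task per chunk of columns; @sync waits for all of them.
    @sync for cols in Iterators.partition(1:n, chunklen)
        @spawn for j in cols
            acc = zero(T)
            for k in nzrange(A, j)
                # Adjoint: conjugate the stored entry A[rows[k], j].
                acc += conj(vals[k]) * x[rows[k]]
            end
            y[j] = acc   # each task writes a disjoint set of entries
        end
    end
    return y
end
```

Because the work is expressed as tasks rather than a fixed pool of library threads, a caller that is itself running inside `@spawn` can use this function without oversubscribing cores, which is the composability being referred to.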

@pablosanjose
Owner

Not having to install and configure MKL

Soon you won't have to :-) : JuliaSparse/MKLSparse.jl#22
(The magic of BinaryBuilder and Artifacts!)

new smart composable parallelism

Yes, this one is a big deal. Perhaps enough to add a dependency on ThreadedSparseArrays.jl in Quantica. Of course the ideal would be to have it in Base: JuliaLang/julia#29525

@BacAmorim
Author

Soon you won't have to :-) : JuliaSparse/MKLSparse.jl#22
(The magic of BinaryBuilder and Artifacts!)

That is pretty sweet!

@pablosanjose
Owner

JuliaSparse/MKLSparse.jl#22 has been merged. I'll be closing this for now, as it is arguably solved by MKLSparse.
