docs(docs): update performance section
1 parent fe7151f, commit f05ec38
Showing 1 changed file with 164 additions and 0 deletions.
@@ -1,3 +1,167 @@
# Performance

This page briefly discusses the performance of typed matrices.

## Linear Algebra Properties on Typed Matrices

The `LinearAlgebra.jl` package provides several linear algebra operations. By utilizing Julia's type system, we can improve the performance of these operations. For example, on a generic matrix the `issymmetric` function checks the elements one by one, whereas the matrix `Minij` is known to be symmetric by construction, so the check does not need to inspect the elements. The following example shows that `issymmetric` takes a median of **10.310 ns** on the `Minij` typed matrix and **873.400 μs** on the equivalent `Matrix` typed matrix.

```julia-repl
julia> a = Minij(1000)
1000×1000 Minij{Int64}:
...
julia> b = Matrix(Minij(1000))
1000×1000 Matrix{Int64}:
...
julia> @benchmark issymmetric(a)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 9.810 ns … 89.790 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 10.310 ns ┊ GC (median): 0.00%
Time (mean ± σ): 10.798 ns ± 2.083 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▅▇▅▆▆▃▄▄▂ ▃▂▁▃ ▁▂▁▂▂▂▁▁▂ ▁ ▂
███████████████████████████▇▆▇▇▇▆▆▄▆▃▄▆▄▄▅▃▅▅▅▄▆▄▄▅▄▄▅▅▄▅▂▅ █
9.81 ns Histogram: log(frequency) by time 17.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark issymmetric(b)
BenchmarkTools.Trial: 4883 samples with 1 evaluation.
Range (min … max): 593.700 μs … 13.507 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 873.400 μs ┊ GC (median): 0.00%
Time (mean ± σ): 1.009 ms ± 515.315 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█▂ ▁▁
▂▆██▇▆████▇▅▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
594 μs Histogram: frequency by time 2.69 ms <
Memory estimate: 0 bytes, allocs estimate: 0.
```
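
As a minimal sketch of how a typed matrix can short-circuit such a check, the block below defines a hypothetical `LazyMinij` type and adds an `issymmetric` method for it through multiple dispatch. This is not the package's `Minij` implementation, whose internals are not shown on this page; the type name, its single field, and the `min(i, j)` entry formula are assumptions made for illustration.

```julia
using LinearAlgebra

# Hypothetical stand-in for a typed "minij" matrix with A[i, j] = min(i, j).
# Only the dimension is stored; entries are computed on access.
struct LazyMinij{T<:Integer} <: AbstractMatrix{T}
    n::Int
end

# Assumed convenience constructor that defaults the element type to Int.
LazyMinij(n::Integer) = LazyMinij{Int}(n)

# Minimal AbstractMatrix interface: size and scalar indexing.
Base.size(A::LazyMinij) = (A.n, A.n)

function Base.getindex(A::LazyMinij{T}, i::Int, j::Int) where {T}
    @boundscheck checkbounds(A, i, j)
    return T(min(i, j))
end

# The structure guarantees symmetry, so no element needs to be read.
LinearAlgebra.issymmetric(::LazyMinij) = true

a = LazyMinij(1000)   # typed matrix: dispatches to the method above
b = Matrix(a)         # dense copy: uses the generic element-wise check

issymmetric(a)
issymmetric(b)
```

Because dispatch selects the method from the argument type, `issymmetric(a)` returns without reading any element, while `issymmetric(b)` falls back to the generic comparison over the stored entries.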

## Known Algorithm Working on `Hilbert`

The following example shows a known algorithm that works on `Hilbert` matrices. Here `a` is a `Hilbert` typed matrix, and `b` is the same matrix converted to the `Matrix` type. For the `det` operation, the `Hilbert` typed matrix takes about **0.4%** of the time that the normal matrix takes (median 564.100 μs versus 158.327 ms), and the memory usage is **69.02 KiB** versus **66.22 MiB**.

```julia-repl
julia> a = Hilbert{BigFloat}(100)
100×100 Hilbert{BigFloat}:
...
julia> b = Matrix(Hilbert{BigFloat}(100))
100×100 Matrix{BigFloat}:
...
julia> t3 = @benchmark det(a)
BenchmarkTools.Trial: 6985 samples with 1 evaluation.
Range (min … max): 334.500 μs … 740.291 ms ┊ GC (min … max): 0.00% … 68.80%
Time (median): 564.100 μs ┊ GC (median): 0.00%
Time (mean ± σ): 706.671 μs ± 8.853 ms ┊ GC (mean ± σ): 10.32% ± 0.82%
▅█▅▂▂▅▄▁
▃█████████▇▆▆▄▄▄▄▅▅▅▅▆▆▇███▇▆▇▅▅▄▅▅▄▃▃▃▃▃▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
334 μs Histogram: frequency by time 1.23 ms <
Memory estimate: 69.02 KiB, allocs estimate: 3233.
julia> t4 = @benchmark det(b)
BenchmarkTools.Trial: 32 samples with 1 evaluation.
Range (min … max): 127.925 ms … 229.261 ms ┊ GC (min … max): 8.86% … 4.76%
Time (median): 158.327 ms ┊ GC (median): 9.49%
Time (mean ± σ): 160.932 ms ± 23.576 ms ┊ GC (mean ± σ): 9.48% ± 3.52%
▃▃ █ ▃ ▃█
▇▁▁▁██▁█▁█▁▇▇▇▁▇▇▇▇██▁▁▁▁▇▇▇▁▁▇▇▁▇▁▁▁▁▇▁▁▁▁▁▁▇▁▁▇▁▁▁▁▁▁▁▁▁▁▁▇ ▁
128 ms Histogram: frequency by time 229 ms <
Memory estimate: 66.22 MiB, allocs estimate: 1333851.
```
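
The algorithm behind `det` for `Hilbert` is not spelled out on this page. As one illustration of the kind of structure a typed matrix can exploit, the sketch below uses the classical closed form `det(H_n) = c(n)^4 / c(2n)`, where `c(n) = 1! * 2! * ... * (n-1)!`. Whether the package uses this exact formula is an assumption, and the helper names `c`, `hilbert_det`, and `hilbert_dense` are hypothetical.

```julia
using LinearAlgebra

# c(n) = 1! * 2! * ... * (n-1)!, computed exactly with BigInt.
# For n = 1 the product is empty and c(1) = 1.
function c(n::Integer)
    result = big(1)
    fact = big(1)
    for i in 1:n-1
        fact *= i        # fact == i!
        result *= fact   # result == 1! * 2! * ... * i!
    end
    return result
end

# Closed-form determinant of the n×n Hilbert matrix as an exact rational.
hilbert_det(n::Integer) = c(n)^4 // c(2n)

# Dense Hilbert matrix with exact rational entries H[i, j] = 1 / (i + j - 1),
# used only to compare against the generic `det`.
hilbert_dense(n::Integer) = [1 // big(i + j - 1) for i in 1:n, j in 1:n]

# The closed form needs O(n) multiplications, while the generic `det`
# factorizes the full matrix.
hilbert_det(4)
det(hilbert_dense(4))
```

The closed form never builds the matrix at all, which is the same kind of saving, in both time and memory, that the benchmark above reports for the typed `det`.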

## Trade-off between Performance and Memory

This approach saves substantial memory at some cost in performance. The following example shows that generating the `Cauchy` typed matrix `a` takes a mean of **63.229 μs** and allocates **114.16 KiB**, while generating the `Matrix` typed matrix `b` takes a mean of **3.862 ms** and allocates **7.74 MiB**. In addition, `a` occupies **16 bytes** in memory, whereas `b` occupies **8000040 bytes**.

```julia-repl
julia> @benchmark a = Cauchy{Float64}(1000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 27.100 μs … 191.819 ms ┊ GC (min … max): 0.00% … 99.94%
Time (median): 32.100 μs ┊ GC (median): 0.00%
Time (mean ± σ): 63.229 μs ± 1.919 ms ┊ GC (mean ± σ): 35.05% ± 4.29%
▅█▇▅▄▃▃▂▂▃▂▃▃▅▅▃▁▁ ▁▁ ▂
███████████████████▇▇███▇▇▆▇▇▇▇▆▆▇████▇▇▆▆▄▄▃▂▂▄▅▅▅▆▆▇▇▆▆▇▅▅ █
27.1 μs Histogram: log(frequency) by time 125 μs <
Memory estimate: 114.16 KiB, allocs estimate: 36.
julia> @benchmark b = Matrix(Cauchy{Float64}(1000))
BenchmarkTools.Trial: 1288 samples with 1 evaluation.
Range (min … max): 2.413 ms … 18.386 ms ┊ GC (min … max): 0.00% … 48.63%
Time (median): 3.271 ms ┊ GC (median): 0.00%
Time (mean ± σ): 3.862 ms ± 1.674 ms ┊ GC (mean ± σ): 15.96% ± 19.84%
▂█▄ ▁▂
████▄▆██▆▅▅▅▃▃▃▃▄▄▄▃▃▃▃▃▃▃▃▃▃▃▃▂▂▂▂▃▃▃▃▃▂▂▂▂▂▃▂▁▂▂▁▂▂▂▂▂▂▂ ▃
2.41 ms Histogram: frequency by time 9.53 ms <
Memory estimate: 7.74 MiB, allocs estimate: 38.
julia> Base.summarysize(a)
16
julia> Base.summarysize(b)
8000040
```
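
One way such a memory gap can arise is by storing a compact description of the matrix instead of all of its entries. The sketch below assumes a hypothetical `LazyCauchy` type that keeps only two generating vectors and computes `1 / (x[i] + y[j])` on access. The type, its fields, the default vectors `1:n`, and this entry convention are illustrative assumptions, not the package's `Cauchy` implementation.

```julia
# Hypothetical lazy Cauchy-like matrix: stores 2n numbers instead of n^2.
struct LazyCauchy{T<:AbstractFloat} <: AbstractMatrix{T}
    x::Vector{T}
    y::Vector{T}
end

# Assumed convenience constructor with x = y = [1, 2, ..., n].
LazyCauchy{T}(n::Integer) where {T<:AbstractFloat} =
    LazyCauchy{T}(T.(1:n), T.(1:n))

# Minimal AbstractMatrix interface: size and scalar indexing.
Base.size(A::LazyCauchy) = (length(A.x), length(A.y))

function Base.getindex(A::LazyCauchy, i::Int, j::Int)
    @boundscheck checkbounds(A, i, j)
    return inv(A.x[i] + A.y[j])   # entry is recomputed on every access
end

a = LazyCauchy{Float64}(1000)   # keeps two vectors of length 1000
b = Matrix(a)                   # materializes all 1000 × 1000 entries

lazy_bytes = Base.summarysize(a)
dense_bytes = Base.summarysize(b)
```

In this sketch `lazy_bytes` grows linearly with `n` while `dense_bytes` grows quadratically. The price is that every read of `a[i, j]` performs an addition and a division instead of a plain memory load.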

This saving trades performance for memory. Accessing the elements of the `Cauchy` typed matrix takes more time than accessing those of the `Matrix` typed matrix, which is expected: in the benchmarks below, `sum(a)` has a median of 1.400 ms compared with 329.800 μs for `sum(b)`, and `det(a)` has a median of 34.151 ms compared with 26.317 ms for `det(b)`. In return, machines with insufficient memory can run, more slowly, computations that were previously impossible to run.

```julia-repl
julia> @benchmark det(a)
BenchmarkTools.Trial: 111 samples with 1 evaluation.
Range (min … max): 20.537 ms … 353.410 ms ┊ GC (min … max): 0.00% … 90.93%
Time (median): 34.151 ms ┊ GC (median): 0.00%
Time (mean ± σ): 45.104 ms ± 42.894 ms ┊ GC (mean ± σ): 7.81% ± 9.53%
▄█▅▃ ▁
▅████▇█▁▆▅▁▁▁▅▅▁▁▅▁▁▅▁▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▅
20.5 ms Histogram: log(frequency) by time 301 ms <
Memory estimate: 7.64 MiB, allocs estimate: 4.
julia> @benchmark det(b)
BenchmarkTools.Trial: 175 samples with 1 evaluation.
Range (min … max): 18.639 ms … 314.529 ms ┊ GC (min … max): 0.00% … 91.89%
Time (median): 26.317 ms ┊ GC (median): 0.00%
Time (mean ± σ): 28.670 ms ± 22.610 ms ┊ GC (mean ± σ): 7.81% ± 8.48%
▂ ▂▂ ██ ▂▂ ▃ ▅ ▅▂ ▃▂
▅▁▃▇█▇██████▆██▅█▅█▅███████▇▆▁▁▆▆▃▁▆▅▅▅▁▁▁▁▁▁▁▃▅▃▁▁▅▁▁▁▁▃▃▁▃ ▃
18.6 ms Histogram: frequency by time 44.1 ms <
Memory estimate: 7.64 MiB, allocs estimate: 4.
julia> @benchmark sum(a)
BenchmarkTools.Trial: 3104 samples with 1 evaluation.
Range (min … max): 1.124 ms … 7.772 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.400 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.604 ms ± 579.750 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█▃▁█▄ ▂
█████▇██▇█▆▃▄▄▃▄▃▄▃▄▃▃▃▃▃▃▃▂▃▃▂▂▂▂▂▂▃▃▂▃▂▂▃▃▃▂▃▂▂▂▂▂▂▂▂▂▂▁▂ ▃
1.12 ms Histogram: frequency by time 3.56 ms <
Memory estimate: 16 bytes, allocs estimate: 1.
julia> @benchmark sum(b)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 243.900 μs … 2.106 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 329.800 μs ┊ GC (median): 0.00%
Time (mean ± σ): 355.504 μs ± 91.684 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▅█▇▅▃▃▂▁▁▁▁
▁▄██████████████▇▆▆▆▆▆▆▆▄▄▄▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁ ▃
244 μs Histogram: frequency by time 647 μs <
Memory estimate: 16 bytes, allocs estimate: 1.
```