-
Hey Hartmut(@hg747), I've made 2 modifications to your entry:
You now have to synchronise your repository with the main one in order to have these changes in your own. Here are the automation results:
$ ./test_all.sh hgrosser
******** Test ********
===== Hartmut Grosser ======
4256d19d3e134d79cc6f160d428a1d859ce961167bd01ca528daca8705163910 /home/gcarreno/Programming/1brc-ObjectPascal/results/hgrosser.output
4256d19d3e134d79cc6f160d428a1d859ce961167bd01ca528daca8705163910 Official Output Hash
===========
$ ./run_all.sh hgrosser
******** Run ********
===== Hartmut Grosser ======
-- SSD --
Benchmark 1: hgrosser
Time (mean ± σ): 73.390 s ± 0.880 s [User: 71.909 s, System: 1.469 s]
Range (min … max): 71.792 s … 74.810 s 10 runs
===========
These are just preliminary results. The official ones will be posted after the weekly Saturday full run of the automation. Good work! This is an amazing entry, especially for a single-threaded one!
Cheers,
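The test output above compares the hash of the generated output file with an official hash. The contents of `test_all.sh` are not shown here, so the following is only a minimal sketch of such a check. It assumes the 64-hex-digit values are SHA-256 digests and that `sha256sum` is available; `check_hash` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not taken from test_all.sh):
# succeeds only if the SHA-256 digest of FILE equals EXPECTED.
check_hash() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <path>"; keep only the digest.
  actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  [ "$actual" = "$expected" ]
}

# Usage (path and hash are supplied by the caller):
#   check_hash results/hgrosser.output "$OFFICIAL_HASH" && echo "match"
```

Comparing digests instead of the files themselves keeps the check cheap even when the output is large.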
-
Thanks a lot for this enjoyable news.
-
Hello Gus, thank you for merging. Did you see that the 2nd command line parameter has changed and needs some optimizing? Please have a look at the chapter "Optimizing the 2nd command line parameter" in README.md and try the values mentioned there (maybe in a for-loop). If it is not too much work for you, I would be interested to see the different time measurements. Thanks a lot.
Cheers,
-
Hello Gus, I've seen the new results and am glad about mine. Did you optimize my 2nd command line parameter as I asked? Please have a look at the chapter "Optimizing the 2nd command line parameter" in README.md and try the values mentioned there (maybe in a for-loop). If it is not too much work for you, I would be interested to see the different time measurements. Thanks a lot. And which value did you use for the new official results?
Cheers,
-
Hey Hartmut(@hg747), Results on a not so quiet system:
$ ./hgrosser.sh
===== Hartmut Grosser ======
-- 14 --
Benchmark 1: hgrosser-14
Time (mean ± σ): 63.620 s ± 1.023 s [User: 62.311 s, System: 1.295 s]
Range (min … max): 62.628 s … 65.093 s 5 runs
===========
===== Hartmut Grosser ======
-- 15 --
Benchmark 1: hgrosser-15
Time (mean ± σ): 57.649 s ± 0.448 s [User: 56.317 s, System: 1.324 s]
Range (min … max): 57.099 s … 58.290 s 5 runs
===========
===== Hartmut Grosser ======
-- 16 --
Benchmark 1: hgrosser-16
Time (mean ± σ): 52.353 s ± 0.671 s [User: 51.078 s, System: 1.269 s]
Range (min … max): 51.493 s … 52.913 s 5 runs
===========
===== Hartmut Grosser ======
-- 17 --
Benchmark 1: hgrosser-17
Time (mean ± σ): 51.011 s ± 1.317 s [User: 49.722 s, System: 1.286 s]
Range (min … max): 49.731 s … 53.242 s 5 runs
Warning: The first benchmarking run for this command was significantly slower than the rest (53.242 s). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--warmup' option which helps to fill these caches before the actual benchmark. You can either try to increase the warmup count further or re-run this benchmark on a quiet system in case it was a random outlier. Alternatively, consider using the '--prepare' option to clear the caches before each timing run.
===========
===== Hartmut Grosser ======
-- 18 --
Benchmark 1: hgrosser-18
Time (mean ± σ): 49.669 s ± 0.698 s [User: 48.353 s, System: 1.306 s]
Range (min … max): 48.948 s … 50.763 s 5 runs
===========
===== Hartmut Grosser ======
-- 19 --
Benchmark 1: hgrosser-19
Time (mean ± σ): 51.537 s ± 0.813 s [User: 50.209 s, System: 1.313 s]
Range (min … max): 50.496 s … 52.471 s 5 runs
===========
===== Hartmut Grosser ======
-- 20 --
Benchmark 1: hgrosser-20
Time (mean ± σ): 53.155 s ± 0.739 s [User: 51.852 s, System: 1.302 s]
Range (min … max): 52.336 s … 54.081 s 5 runs
===========
===== Hartmut Grosser ======
-- 21 --
Benchmark 1: hgrosser-21
Time (mean ± σ): 55.467 s ± 0.753 s [User: 54.056 s, System: 1.399 s]
Range (min … max): 54.522 s … 56.185 s 5 runs
===========
===== Hartmut Grosser ======
-- 22 --
Benchmark 1: hgrosser-22
Time (mean ± σ): 56.560 s ± 0.530 s [User: 55.232 s, System: 1.321 s]
Range (min … max): 55.919 s … 57.316 s 5 runs
===========
===== Hartmut Grosser ======
-- 23 --
Benchmark 1: hgrosser-23
Time (mean ± σ): 58.742 s ± 0.490 s [User: 57.362 s, System: 1.378 s]
Range (min … max): 57.949 s … 59.261 s 5 runs
===========
===== Hartmut Grosser ======
-- 24 --
Benchmark 1: hgrosser-24
Time (mean ± σ): 60.038 s ± 0.680 s [User: 58.602 s, System: 1.434 s]
Range (min … max): 59.066 s … 60.974 s 5 runs
===========
===== Hartmut Grosser ======
-- 25 --
Benchmark 1: hgrosser-25
Time (mean ± σ): 61.331 s ± 1.518 s [User: 59.726 s, System: 1.602 s]
Range (min … max): 59.546 s … 62.752 s 5 runs
===========
===== Hartmut Grosser ======
-- 26 --
Benchmark 1: hgrosser-26
Time (mean ± σ): 62.185 s ± 1.472 s [User: 60.292 s, System: 1.888 s]
Range (min … max): 60.404 s … 64.012 s 5 runs
===========
===== Hartmut Grosser ======
-- 27 --
Benchmark 1: hgrosser-27
Time (mean ± σ): 64.932 s ± 0.614 s [User: 62.414 s, System: 2.505 s]
Range (min … max): 64.357 s … 65.966 s 5 runs
===========
===== Hartmut Grosser ======
-- 28 --
Benchmark 1: hgrosser-28
Time (mean ± σ): 68.924 s ± 1.050 s [User: 65.100 s, System: 3.815 s]
Range (min … max): 67.564 s … 70.230 s 5 runs
===========
From this preliminary run, I'm guessing that 18 hits the sweet spot, right?
Cheers,
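The contents of `hgrosser.sh` are not shown in this thread. As a minimal sketch, a sweep like the one above could be generated as follows, assuming the hyperfine options that appear elsewhere in this discussion (`-w 1 -r 5 -N`). The binary path `./bin/hgrosser`, the `results/` directory, and the function name `sweep_cmds` are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sketch, not the real hgrosser.sh.
# BIN and INPUT are assumed defaults; override them from the environment.
BIN="${BIN:-./bin/hgrosser}"
INPUT="${INPUT:-/tmp/measurements-1_000_000_000.txt}"

# Print one hyperfine command per value of the 2nd command line parameter,
# from $1 to $2 inclusive. The results/ directory must already exist.
sweep_cmds() {
  n="$1"
  while [ "$n" -le "$2" ]; do
    printf "hyperfine -w 1 -r 5 -N -n 'hgrosser-%s' --export-json 'results/hgrosser-%s.json' '%s %s %s'\n" \
      "$n" "$n" "$BIN" "$INPUT" "$n"
    n=$((n + 1))
  done
}

# Print the commands for review; pipe the output to `sh` to run them.
sweep_cmds 14 28
```

Printing the commands first makes it easy to check the parameter range before committing to a run that takes several minutes per value.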
-
Hello Gus, I made a new PR for a new version (1.60), which is (hopefully) a little faster. I'm sorry that this version needs a new optimization run for the 2nd command line parameter (but not as many values as before; only 16 to 22, please). It would be great if you could do this optimization run with the old input file (with CRs), because then the timings are comparable (the new input file is smaller and therefore faster to read). I hope I'm not asking too much. Thanks a lot. Please note that if the input file no longer contains CRs, the new conditional "noCR" must be set (see chapter "How to compile" in readme.md).
Cheers,
-
Hey Hartmut(@hg747), Fine tuning:
$ ./hgrosser.sh
===== Hartmut Grosser ======
-- 16 --
Benchmark 1: hgrosser-16
Time (mean ± σ): 50.980 s ± 0.222 s [User: 49.551 s, System: 1.426 s]
Range (min … max): 50.675 s … 51.292 s 5 runs
===========
===== Hartmut Grosser ======
-- 17 --
Benchmark 1: hgrosser-17
Time (mean ± σ): 49.199 s ± 0.394 s [User: 47.710 s, System: 1.486 s]
Range (min … max): 48.602 s … 49.559 s 5 runs
===========
===== Hartmut Grosser ======
-- 18 --
Benchmark 1: hgrosser-18
Time (mean ± σ): 48.047 s ± 0.722 s [User: 46.565 s, System: 1.479 s]
Range (min … max): 47.027 s … 48.792 s 5 runs
===========
===== Hartmut Grosser ======
-- 19 --
Benchmark 1: hgrosser-19
Time (mean ± σ): 49.529 s ± 0.192 s [User: 48.012 s, System: 1.514 s]
Range (min … max): 49.314 s … 49.784 s 5 runs
===========
===== Hartmut Grosser ======
-- 20 --
Benchmark 1: hgrosser-20
Time (mean ± σ): 51.348 s ± 0.615 s [User: 49.790 s, System: 1.550 s]
Range (min … max): 50.291 s … 51.782 s 5 runs
===========
===== Hartmut Grosser ======
-- 21 --
Benchmark 1: hgrosser-21
Time (mean ± σ): 53.108 s ± 0.513 s [User: 51.619 s, System: 1.485 s]
Range (min … max): 52.207 s … 53.473 s 5 runs
===========
===== Hartmut Grosser ======
-- 22 --
Benchmark 1: hgrosser-22
Time (mean ± σ): 54.413 s ± 0.952 s [User: 52.891 s, System: 1.518 s]
Range (min … max): 53.471 s … 55.762 s 5 runs
===========
hgrosser-22.json
I think that we've proven that 18 is still the sweet spot.
Cheers,
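To make the comparison explicit, the mean times from the fine-tuning run can be sorted to find the fastest value. This is a minimal sketch; `best_param` is a hypothetical helper, and the numbers are the means reported above.

```shell
#!/bin/sh
# Hypothetical helper: reads "param mean_seconds" pairs (one per line)
# on stdin and prints the parameter with the lowest mean.
best_param() {
  # LC_ALL=C keeps "." as the decimal separator for the numeric sort.
  LC_ALL=C sort -k2,2n | head -n 1 | cut -d ' ' -f 1
}

# Mean wall-clock times copied from the run above.
best_param <<'EOF'
16 50.980
17 49.199
18 48.047
19 49.529
20 51.348
21 53.108
22 54.413
EOF
# prints 18
```

This only ranks the means; it does not account for the reported standard deviations or for noise from a busy machine.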
-
Don't we have a new results table for last Saturday?
-
Hello Gus, I want to improve and optimize one of my 2 programs for 400 stations only and full name comparisons. But I can't decide which one until I know whether my new version with threads works on your computer with 32 threads and how long it takes. As I said, it is my very first multi-threaded program ever. Do you think you could run the optimization tests I asked for? For instructions, please have a look at README.md. Thanks a lot.
Cheers,
-
Hey Hartmut(@hg747),
$ hyperfine -w 1 -r 5 -N -n 'hgrosser-th' --export-json 'results/hgrosser-th.json' './bin/hgrosser-th /tmp/measurements-1_000_000_000.txt 18'
Benchmark 1: hgrosser-th
Time (mean ± σ): 10.155 s ± 0.514 s [User: 122.707 s, System: 10.017 s]
Range (min … max): 9.863 s … 11.067 s 5 runs
$ hyperfine -w 1 -r 5 -N -n 'hgrosser-th-400' --export-json 'results/hgrosser-th-400.json' './bin/hgrosser-th /tmp/measurements-400-1_000_000_000.txt 18'
Benchmark 1: hgrosser-th-400
Time (mean ± σ): 3.083 s ± 0.059 s [User: 45.178 s, System: 2.064 s]
Range (min … max): 3.019 s … 3.145 s 5 runs
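A rough way to read these numbers is to divide the total CPU time (user + system) by the wall-clock time, which gives the average number of busy cores during the run. This is a minimal sketch using the values reported above; `busy_cores` is a hypothetical helper, and the figure is an approximation, not a precise measure of how well the threads scale.

```shell
#!/bin/sh
# Hypothetical helper: average number of busy cores,
# computed as (user + system) / wall-clock time.
busy_cores() {
  awk -v wall="$1" -v user="$2" -v sys="$3" \
    'BEGIN { printf "%.1f\n", (user + sys) / wall }'
}

busy_cores 10.155 122.707 10.017   # hgrosser-th      -> 13.1
busy_cores 3.083 45.178 2.064      # hgrosser-th-400  -> 15.3
```

On a machine with 32 threads, as mentioned earlier in the thread, both runs kept fewer than half of them busy on average.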
-
Hey Hartmut(@hg747), I just ran a two-level loop on a not-so-quiet machine. Please have a look at these results and let me know what the defaults should be.
$ ./hgrosser.sh
===== Hartmut Grosser ======
-- 32 16 64 --
Benchmark 1: hgrosser-th-32-16-64
Time (mean ± σ): 4.883 s ± 0.081 s [User: 136.897 s, System: 4.060 s]
Range (min … max): 4.792 s … 4.972 s 5 runs
-- 32 16 96 --
Benchmark 1: hgrosser-th-32-16-96
Time (mean ± σ): 5.092 s ± 0.118 s [User: 142.227 s, System: 3.844 s]
Range (min … max): 4.995 s … 5.274 s 5 runs
-- 32 16 128 --
Benchmark 1: hgrosser-th-32-16-128
Time (mean ± σ): 5.143 s ± 0.052 s [User: 144.513 s, System: 3.895 s]
Range (min … max): 5.089 s … 5.199 s 5 runs
-- 32 16 192 --
Benchmark 1: hgrosser-th-32-16-192
Time (mean ± σ): 5.239 s ± 0.055 s [User: 148.639 s, System: 3.977 s]
Range (min … max): 5.173 s … 5.322 s 5 runs
-- 32 16 256 --
Benchmark 1: hgrosser-th-32-16-256
Time (mean ± σ): 5.405 s ± 0.102 s [User: 153.700 s, System: 4.027 s]
Range (min … max): 5.305 s … 5.570 s 5 runs
-- 32 17 64 --
Benchmark 1: hgrosser-th-32-17-64
Time (mean ± σ): 6.052 s ± 0.135 s [User: 168.629 s, System: 4.103 s]
Range (min … max): 5.811 s … 6.117 s 5 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
-- 32 17 96 --
Benchmark 1: hgrosser-th-32-17-96
Time (mean ± σ): 6.106 s ± 0.110 s [User: 170.842 s, System: 3.972 s]
Range (min … max): 5.959 s … 6.220 s 5 runs
-- 32 17 128 --
Benchmark 1: hgrosser-th-32-17-128
Time (mean ± σ): 6.079 s ± 0.102 s [User: 168.348 s, System: 3.724 s]
Range (min … max): 5.983 s … 6.239 s 5 runs
-- 32 17 192 --
Benchmark 1: hgrosser-th-32-17-192
Time (mean ± σ): 6.315 s ± 0.113 s [User: 178.316 s, System: 3.860 s]
Range (min … max): 6.180 s … 6.429 s 5 runs
-- 32 17 256 --
Benchmark 1: hgrosser-th-32-17-256
Time (mean ± σ): 6.471 s ± 0.095 s [User: 182.470 s, System: 3.852 s]
Range (min … max): 6.333 s … 6.592 s 5 runs
-- 32 18 64 --
Benchmark 1: hgrosser-th-32-18-64
Time (mean ± σ): 6.934 s ± 0.126 s [User: 192.952 s, System: 4.145 s]
Range (min … max): 6.776 s … 7.122 s 5 runs
-- 32 18 96 --
Benchmark 1: hgrosser-th-32-18-96
Time (mean ± σ): 7.197 s ± 0.183 s [User: 204.000 s, System: 4.022 s]
Range (min … max): 6.974 s … 7.343 s 5 runs
-- 32 18 128 --
Benchmark 1: hgrosser-th-32-18-128
Time (mean ± σ): 7.198 s ± 0.127 s [User: 200.920 s, System: 3.881 s]
Range (min … max): 6.991 s … 7.314 s 5 runs
-- 32 18 192 --
Benchmark 1: hgrosser-th-32-18-192
Time (mean ± σ): 7.290 s ± 0.077 s [User: 206.812 s, System: 3.786 s]
Range (min … max): 7.167 s … 7.349 s 5 runs
-- 32 18 256 --
Benchmark 1: hgrosser-th-32-18-256
Time (mean ± σ): 7.605 s ± 0.168 s [User: 214.971 s, System: 3.915 s]
Range (min … max): 7.332 s … 7.784 s 5 runs
-- 32 19 64 --
Benchmark 1: hgrosser-th-32-19-64
Time (mean ± σ): 7.825 s ± 0.119 s [User: 223.646 s, System: 4.332 s]
Range (min … max): 7.638 s … 7.950 s 5 runs
-- 32 19 96 --
Benchmark 1: hgrosser-th-32-19-96
Time (mean ± σ): 7.906 s ± 0.138 s [User: 225.110 s, System: 4.156 s]
Range (min … max): 7.663 s … 7.996 s 5 runs
-- 32 19 128 --
Benchmark 1: hgrosser-th-32-19-128
Time (mean ± σ): 7.950 s ± 0.134 s [User: 227.847 s, System: 4.099 s]
Range (min … max): 7.786 s … 8.077 s 5 runs
-- 32 19 192 --
Benchmark 1: hgrosser-th-32-19-192
Time (mean ± σ): 8.079 s ± 0.122 s [User: 230.096 s, System: 3.966 s]
Range (min … max): 7.890 s … 8.216 s 5 runs
-- 32 19 256 --
Benchmark 1: hgrosser-th-32-19-256
Time (mean ± σ): 8.336 s ± 0.127 s [User: 239.489 s, System: 3.975 s]
Range (min … max): 8.181 s … 8.500 s 5 runs
-- 32 20 64 --
Benchmark 1: hgrosser-th-32-20-64
Time (mean ± σ): 8.352 s ± 0.060 s [User: 240.793 s, System: 4.584 s]
Range (min … max): 8.251 s … 8.395 s 5 runs
-- 32 20 96 --
Benchmark 1: hgrosser-th-32-20-96
Time (mean ± σ): 8.421 s ± 0.117 s [User: 243.560 s, System: 4.509 s]
Range (min … max): 8.295 s … 8.561 s 5 runs
-- 32 20 128 --
Benchmark 1: hgrosser-th-32-20-128
Time (mean ± σ): 8.460 s ± 0.199 s [User: 241.307 s, System: 4.391 s]
Range (min … max): 8.136 s … 8.672 s 5 runs
-- 32 20 192 --
Benchmark 1: hgrosser-th-32-20-192
Time (mean ± σ): 8.485 s ± 0.139 s [User: 242.975 s, System: 4.414 s]
Range (min … max): 8.271 s … 8.605 s 5 runs
-- 32 20 256 --
Benchmark 1: hgrosser-th-32-20-256
Time (mean ± σ): 8.787 s ± 0.201 s [User: 253.167 s, System: 4.345 s]
Range (min … max): 8.498 s … 8.987 s 5 runs
-- 32 21 64 --
Benchmark 1: hgrosser-th-32-21-64
Time (mean ± σ): 8.850 s ± 0.126 s [User: 251.338 s, System: 5.494 s]
Range (min … max): 8.643 s … 8.985 s 5 runs
-- 32 21 96 --
Benchmark 1: hgrosser-th-32-21-96
Time (mean ± σ): 8.839 s ± 0.094 s [User: 254.251 s, System: 5.404 s]
Range (min … max): 8.733 s … 8.927 s 5 runs
-- 32 21 128 --
Benchmark 1: hgrosser-th-32-21-128
Time (mean ± σ): 8.897 s ± 0.118 s [User: 254.693 s, System: 5.263 s]
Range (min … max): 8.790 s … 9.029 s 5 runs
-- 32 21 192 --
Benchmark 1: hgrosser-th-32-21-192
Time (mean ± σ): 9.077 s ± 0.111 s [User: 259.670 s, System: 5.230 s]
Range (min … max): 8.965 s … 9.243 s 5 runs
-- 32 21 256 --
Benchmark 1: hgrosser-th-32-21-256
Time (mean ± σ): 9.185 s ± 0.114 s [User: 263.032 s, System: 5.226 s]
Range (min … max): 9.053 s … 9.274 s 5 runs
-- 32 22 64 --
Benchmark 1: hgrosser-th-32-22-64
Time (mean ± σ): 9.277 s ± 0.185 s [User: 260.456 s, System: 7.713 s]
Range (min … max): 9.108 s … 9.563 s 5 runs
-- 32 22 96 --
Benchmark 1: hgrosser-th-32-22-96
Time (mean ± σ): 9.275 s ± 0.112 s [User: 260.175 s, System: 7.744 s]
Range (min … max): 9.125 s … 9.423 s 5 runs
-- 32 22 128 --
Benchmark 1: hgrosser-th-32-22-128
Time (mean ± σ): 9.226 s ± 0.098 s [User: 260.527 s, System: 7.499 s]
Range (min … max): 9.125 s … 9.353 s 5 runs
-- 32 22 192 --
Benchmark 1: hgrosser-th-32-22-192
Time (mean ± σ): 9.520 s ± 0.124 s [User: 270.825 s, System: 7.602 s]
Range (min … max): 9.395 s … 9.716 s 5 runs
-- 32 22 256 --
Benchmark 1: hgrosser-th-32-22-256
Time (mean ± σ): 9.570 s ± 0.100 s [User: 267.501 s, System: 7.585 s]
Range (min … max): 9.482 s … 9.719 s 5 runs
===========
hgrosser-th-32-22-256.json
Cheers,
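The two-level loop behind the run above is not shown in the thread. A minimal sketch of how the 35 combinations (7 values × 5 values) could be enumerated follows. The label format matches the benchmark names above, but `grid_labels` and the loop variable names are hypothetical, and the thread does not state what each parameter means.

```shell
#!/bin/sh
# Hypothetical sketch: enumerate the benchmark labels of a two-level sweep.
# $1 is the first value in the label (32 in the run above).
grid_labels() {
  first="$1"
  for second in 16 17 18 19 20 21 22; do
    for third in 64 96 128 192 256; do
      echo "hgrosser-th-$first-$second-$third"
    done
  done
}

# Prints 35 labels, from hgrosser-th-32-16-64 to hgrosser-th-32-22-256.
grid_labels 32
```

Each label could then be passed to hyperfine via `-n`, with the three values appended to the program's command line in whatever order the program expects.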
-
Hey Hartmut(@hg747),
This will be your very own discussion so we can discuss it further.
Welcome to the family 😄
Cheers,
Gus