From 82a51849d67b3399c259fe633ca5af086703b4b4 Mon Sep 17 00:00:00 2001
From: nathan-lc0 <98367568+nathan-lc0@users.noreply.github.com>
Date: Mon, 28 Feb 2022 18:16:25 -0600
Subject: [PATCH 1/5] Update Benchmarks.md

---
 content/dev/wiki/Benchmarks.md | 46 ++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/content/dev/wiki/Benchmarks.md b/content/dev/wiki/Benchmarks.md
index 42d4a6a..bd5af56 100644
--- a/content/dev/wiki/Benchmarks.md
+++ b/content/dev/wiki/Benchmarks.md
@@ -4,25 +4,33 @@ weight: 500
 wikiname: "Benchmarks"
 # Warning: File is automatically generated from GitHub wiki, do not edit by hand.
 ---
-Run go infinite from start position and abort after depth 26 and report NPS output.
+Run `lc0.exe benchmark --nncache=2000000` and report nps output. Please use latest release.
 
-_I put some sample ones from memory. Please put your own bench scores here in sorted NPS order if you can. If you don't know what engine type, gpu is opencl and cpu is openblas_
+Google docs of bench results here. Easier to maintain/prettier? https://docs.google.com/spreadsheets/d/1i4ymeCO7SH1vQ5gS7ZcBChjQaiv1dNItwXwLqvNC-r4/preview
 
-Google docs of bench results here. Easier to maintain/prettier? https://docs.google.com/spreadsheets/d/1lGFf6PLGmBUSMan-YP7Vul4DpRNfn6K8oeCjBILe6uA/edit#gid=0
-
-# GPU
-| GPU @ stock or OC frequency| Engine version/type | Neural Net size | Username | Speed |
+# Ampere Cards
+| GPU model | Engine version | Neural Net size | Backend | Speed |
+| ------------- | ---- | ------------- | ------------- | ------------- |
+|A100 40GB | v0.28.2 | 30x384 | cuda-fp16 | 71560 nps|
+|RTX 3090 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|
+|RTX 3090 | v0.28.0 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 3080 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 3070 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 3060 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|
+|RTX 3060 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+# Turing Cards
+| GPU model | Engine version | Neural Net size | Backend | Speed |
+| ------------- | ---- | ------------- | ------------- | ------------- |
+|Tesla V100 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 2080 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 2070 | v0.28.0 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 2060 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+# CPUs
+| CPU model + # of threads | Engine version | Neural Net size | Backend | Speed |
 | ------------- | ---- | ------------- | ------------- | ------------- |
-|GTX 1060 @ stock -t 3 | v7 Linux openCL | 10x128 | | 2650 nps|
-|1080 ti @ 2ghz -t 3 | v7 Windows openCL | 10x128 | | 2500 nps|
-|GTX 1050 Ti @ stock | v7 Windows openCL | 20x256 | go infinite | 2300 nps|
-|GTX 1050 Ti @ stock | v7 Windows openCL | 20x256 | benchmark | 1690 nps|
-|GTX 470 @ stock -t 2 | v7 Windows openCL | 10x128 | | 600 nps|
-# CPU
-| CPU @ stock or OC frequency| Engine version/type | Neural Net size | Username | Speed |
-| ------------- | ---- | ------------- | ------------- |------------- |
-|i7-6800K @ 3.6GHz -t 12 | v7 Linux openblas | 10x128 | | 1010 nps|
-|i7-8700 stock -t 12 | v7 Windows openblas | 10x128 | | 818 nps|
-|i6700 stock -t 4 | v7 Windows intel_mkl | 10x128 | | 500 nps|
-|i6700 stock -t 4 | v7 Windows openblas | 10x128 | | 320 nps|
-|Ryzen 3 1200 stock -t 4 | v7 Windows openblas | 10x128 | | 300 nps|
+|3990x -128th | v0.28.2 | 10x128 | DNNL-BLAS | xxx nps|
+|5950x -32th | v0.28.2 | 15x192 | Open-BLAS | xxx nps|
+|11900k -16th | v0.28.2 | 15x192 | onednn | xxx nps|
+|5600x -12th | v0.28.2 | 10x128 | onednn | xxx nps|
+
+

From bda7ddaeb2f44a815778694859a95a8ae344a730 Mon Sep 17 00:00:00 2001
From: nathan-lc0 <98367568+nathan-lc0@users.noreply.github.com>
Date: Sat, 5 Mar 2022 17:34:05 -0600
Subject: [PATCH 2/5] Update Benchmarks.md

---
 content/dev/wiki/Benchmarks.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/content/dev/wiki/Benchmarks.md b/content/dev/wiki/Benchmarks.md
index bd5af56..70cac1f 100644
--- a/content/dev/wiki/Benchmarks.md
+++ b/content/dev/wiki/Benchmarks.md
@@ -4,17 +4,16 @@ weight: 500
 wikiname: "Benchmarks"
 # Warning: File is automatically generated from GitHub wiki, do not edit by hand.
 ---
-Run `lc0.exe benchmark --nncache=2000000` and report nps output. Please use latest release.
-
+Run `lc0.exe benchmark --nncache=2000000` and report nps output and binary version, please use latest release or current master.
 Google docs of bench results here. Easier to maintain/prettier? https://docs.google.com/spreadsheets/d/1i4ymeCO7SH1vQ5gS7ZcBChjQaiv1dNItwXwLqvNC-r4/preview
 
 # Ampere Cards
 | GPU model | Engine version | Neural Net size | Backend | Speed |
 | ------------- | ---- | ------------- | ------------- | ------------- |
 |A100 40GB | v0.28.2 | 30x384 | cuda-fp16 | 71560 nps|
-|RTX 3090 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|
 |RTX 3090 | v0.28.0 | 30x384 | cuda-fp16 | xxx nps|
-|RTX 3080 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 3080 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|
+|RTX 3080 | v0.28.2 | 30x384 | cuda-fp16 | 32289 nps|
 |RTX 3070 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
 |RTX 3060 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|
 |RTX 3060 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|

From 25ffc7a5b5cf713e38e4223556b1487c3ebe4f0f Mon Sep 17 00:00:00 2001
From: nathan-lc0 <98367568+nathan-lc0@users.noreply.github.com>
Date: Sat, 5 Mar 2022 17:48:21 -0600
Subject: [PATCH 3/5] Update Benchmarks.md

---
 content/dev/wiki/Benchmarks.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/dev/wiki/Benchmarks.md b/content/dev/wiki/Benchmarks.md
index 70cac1f..6c72057 100644
--- a/content/dev/wiki/Benchmarks.md
+++ b/content/dev/wiki/Benchmarks.md
@@ -11,8 +11,8 @@ Google docs of bench results here. Easier to maintain/prettier? https://docs.goo
 | GPU model | Engine version | Neural Net size | Backend | Speed |
 | ------------- | ---- | ------------- | ------------- | ------------- |
 |A100 40GB | v0.28.2 | 30x384 | cuda-fp16 | 71560 nps|
-|RTX 3090 | v0.28.0 | 30x384 | cuda-fp16 | xxx nps|
-|RTX 3080 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|
+|RTX 3090 | v0.29 Mar 5| 30x384 | cuda-fp16 | xxx nps|
+|RTX 3080 | v0.29 | 40x512 | cuda-fp16 | 15159 nps|
 |RTX 3080 | v0.28.2 | 30x384 | cuda-fp16 | 32289 nps|
 |RTX 3070 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
 |RTX 3060 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|

From 4e819c3aa1b6d953df37b268d373ec26632824c9 Mon Sep 17 00:00:00 2001
From: nathan-lc0 <98367568+nathan-lc0@users.noreply.github.com>
Date: Sat, 5 Mar 2022 17:49:15 -0600
Subject: [PATCH 4/5] Update Benchmarks.md

---
 content/dev/wiki/Benchmarks.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/dev/wiki/Benchmarks.md b/content/dev/wiki/Benchmarks.md
index 6c72057..ba85c1a 100644
--- a/content/dev/wiki/Benchmarks.md
+++ b/content/dev/wiki/Benchmarks.md
@@ -11,8 +11,8 @@ Google docs of bench results here. Easier to maintain/prettier? https://docs.goo
 | GPU model | Engine version | Neural Net size | Backend | Speed |
 | ------------- | ---- | ------------- | ------------- | ------------- |
 |A100 40GB | v0.28.2 | 30x384 | cuda-fp16 | 71560 nps|
-|RTX 3090 | v0.29 Mar 5| 30x384 | cuda-fp16 | xxx nps|
-|RTX 3080 | v0.29 | 40x512 | cuda-fp16 | 15159 nps|
+|RTX 3090 | v0.28| 30x384 | cuda-fp16 | xxx nps|
+|RTX 3080 | v0.29.0-dev 3/5 | 40x512 | cuda-fp16 | 15159 nps|
 |RTX 3080 | v0.28.2 | 30x384 | cuda-fp16 | 32289 nps|
 |RTX 3070 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
 |RTX 3060 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|

From 1ddf6f133c722b843f32e205f7169c857e041880 Mon Sep 17 00:00:00 2001
From: nathan-lc0 <98367568+nathan-lc0@users.noreply.github.com>
Date: Tue, 8 Mar 2022 01:02:40 -0600
Subject: [PATCH 5/5] Update Benchmarks.md

---
 content/dev/wiki/Benchmarks.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/content/dev/wiki/Benchmarks.md b/content/dev/wiki/Benchmarks.md
index ba85c1a..6d6d37a 100644
--- a/content/dev/wiki/Benchmarks.md
+++ b/content/dev/wiki/Benchmarks.md
@@ -12,11 +12,11 @@ Google docs of bench results here. Easier to maintain/prettier? https://docs.goo
 | ------------- | ---- | ------------- | ------------- | ------------- |
 |A100 40GB | v0.28.2 | 30x384 | cuda-fp16 | 71560 nps|
 |RTX 3090 | v0.28| 30x384 | cuda-fp16 | xxx nps|
-|RTX 3080 | v0.29.0-dev 3/5 | 40x512 | cuda-fp16 | 15159 nps|
+|RTX 3080 | v0.29.0-dev 3/3 | 40x512 | cuda-fp16 | 15159 nps|
 |RTX 3080 | v0.28.2 | 30x384 | cuda-fp16 | 32289 nps|
 |RTX 3070 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
-|RTX 3060 | v0.28.2 | 40x512 | cuda-fp16 | xxx nps|
-|RTX 3060 | v0.28.2 | 30x384 | cuda-fp16 | xxx nps|
+|RTX 3060 | v0.29.0-dev 3/3 | 40x512 | cuda-fp16 | 6659 nps|
+|RTX 3060 | v0.29.0-dev 3/3 | 30x384 | cuda-fp16 | 14639 nps|
 # Turing Cards
 | GPU model | Engine version | Neural Net size | Backend | Speed |
 | ------------- | ---- | ------------- | ------------- | ------------- |