// MIT License
// Modifications Copyright (c) 2024 Advanced Micro Devices, Inc.
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.

Queries

--list, -l
- List all devices and benchmarks without running them.
--help, -h
- Print usage information and exit.
--help-axes, --help-axis
- Print axis specification documentation and exit.
--version
- Print information about the version of NVBench used to build the executable.

Device Modification

--persistence-mode <state>, --pm <state>
- Sets persistence mode for one or more GPU devices.
- Applies to the devices described by the most recent --devices option, or all devices if --devices is not specified.
- This option requires root / admin permissions.
- This option is only supported on Linux.
- This option must precede all other device modification options, if any.
- Note that persistence mode is deprecated and will be removed at some point in favor of the new persistence daemon. See the following link for more details: https://docs.nvidia.com/deploy/driver-persistence/index.html
- Valid values for state are:
  - 0: Disable persistence mode.
  - 1: Enable persistence mode.
--lock-gpu-clocks <rate>, --lgc <rate>
- Lock GPU clocks for one or more devices to a particular rate.
- Applies to the devices described by the most recent --devices option, or all devices if --devices is not specified.
- This option requires root / admin permissions.
- This option is only supported on Volta+ (sm_70+) devices.
- Valid values for rate are:
  - reset, unlock, none: Unlock the GPU clocks.
  - base, tdp: Lock clocks to base frequency (best for stable results).
  - max, maximum: Lock clocks to max frequency (best for fastest results).
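
The accepted <rate> spellings above collapse to three actions. The following is a minimal sketch of that mapping, not the tool's own parser: `RATE_ALIASES` and `normalize_rate` are hypothetical names, and exact (case-sensitive) matching is an assumption.

```python
# Hypothetical helper: maps each documented <rate> spelling to one canonical action.
# Exact, case-sensitive matching is an assumption, not documented behavior.
RATE_ALIASES = {
    "reset": "unlock",
    "unlock": "unlock",
    "none": "unlock",
    "base": "base",
    "tdp": "base",
    "max": "max",
    "maximum": "max",
}


def normalize_rate(rate: str) -> str:
    """Return "unlock", "base", or "max" for a documented <rate> value."""
    try:
        return RATE_ALIASES[rate]
    except KeyError:
        raise ValueError(f"unsupported rate: {rate!r}") from None
```
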

Output

--csv <filename/stream>
- Write CSV output to a file, or "stdout" / "stderr".
--json <filename/stream>
- Write JSON output to a file, or "stdout" / "stderr".
--markdown <filename/stream>, --md <filename/stream>
- Write markdown output to a file, or "stdout" / "stderr".
- Markdown is written to "stdout" by default.
--quiet, -q
- Suppress output.
--color
- Use color in output (markdown + stdout only).

Benchmark / Axis Specification

--benchmark <benchmark name/index>, -b <benchmark name/index>
- Execute a specific benchmark.
- Argument is a benchmark name or index, taken from --list.
- If not specified, all benchmarks will run.
- --benchmark may be specified multiple times to run several benchmarks.
- The same benchmark may be specified multiple times with different configurations.
--axis <axis specification>, -a <axis specification>
- Override an axis specification.
- See --help-axis for details on axis specifications.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
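
The scoping rule above ("the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments") can be illustrated with a small sketch. This is an assumption-laden model of the rule, not the actual option parser: `scope_axes` is a hypothetical name, and the axis values are opaque placeholders rather than real axis specifications (see --help-axis for the real syntax).

```python
def scope_axes(argv):
    """Group --axis values by the benchmark they apply to.

    Returns (global_axes, benchmarks), where global_axes holds axis values
    given before any --benchmark (they apply to all benchmarks) and
    benchmarks is a list of (name_or_index, axes) in command-line order.
    """
    global_axes = []
    benchmarks = []
    args = iter(argv)
    for arg in args:
        if arg in ("--benchmark", "-b"):
            # Each occurrence starts a new scope, even for a repeated name.
            benchmarks.append((next(args), []))
        elif arg in ("--axis", "-a"):
            value = next(args)
            # Attach to the most recent benchmark, or to the global scope.
            target = benchmarks[-1][1] if benchmarks else global_axes
            target.append(value)
        else:
            raise ValueError(f"option not handled by this sketch: {arg!r}")
    return global_axes, benchmarks
```

For example, `scope_axes(["-a", "SPEC_A", "-b", "0", "-a", "SPEC_B", "-b", "1"])` yields `SPEC_A` in the global scope, `SPEC_B` attached to benchmark `0`, and no overrides specific to benchmark `1`.
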

Benchmark Properties

--devices <device ids>, --device <device ids>, -d <device ids>
- Limit execution to one or more devices.
- <device ids> is a single id, a comma-separated list, or the string "all".
- Device ids can be obtained from --list.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
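
The three accepted forms of <device ids> can be sketched as follows. `parse_device_ids` is a hypothetical helper, and `available` stands in for the ids reported by --list; rejecting unknown ids with an error is an assumption about behavior, not something this document states.

```python
def parse_device_ids(spec, available):
    """Resolve a <device ids> string against the ids reported by --list."""
    if spec == "all":
        return list(available)
    # A single id is just a one-element comma-separated list.
    ids = [int(token) for token in spec.split(",")]
    unknown = [i for i in ids if i not in available]
    if unknown:
        # Assumption: unknown ids are treated as an error.
        raise ValueError(f"unknown device ids: {unknown}")
    return ids
```
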
--min-samples <count>
- Gather at least <count> samples per measurement.
- Default is 10 samples.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
--min-time <seconds>
- Accumulate at least <seconds> of execution time per measurement.
- Default is 0.5 seconds.
- If both GPU and CPU times are gathered, this applies to GPU time only.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
--max-noise <value>
- Gather samples until the error in the measurement drops below <value>.
- Noise is specified as the percent relative standard deviation.
- Default is 0.5% (--max-noise 0.5).
- Only applies to Cold measurements.
- If both GPU and CPU times are gathered, this applies to GPU noise only.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
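
Since noise is expressed as the percent relative standard deviation, a short sketch of that quantity may help when choosing a <value>. The function names are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption; the tool may compute it differently.

```python
import statistics


def relative_stdev_percent(samples):
    """Percent relative standard deviation: 100 * stdev / mean."""
    mean = statistics.fmean(samples)
    # Assumption: sample standard deviation (n - 1 denominator).
    return 100.0 * statistics.stdev(samples) / mean


def noise_satisfied(samples, max_noise=0.5):
    """True once the measured noise has dropped below max_noise (in percent)."""
    return relative_stdev_percent(samples) < max_noise
```

With timings of 9, 10, and 11 ms the standard deviation is 1 ms around a 10 ms mean, so the noise is 10%, far above the 0.5% default.
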
--skip-time <seconds>
- Skip a measurement when a warmup run executes in less than <seconds>.
- Default is -1 seconds (disabled).
- Intended for testing / debugging only.
- Very fast kernels (<5us) often require an extremely large number of samples to converge to the --max-noise threshold. This option allows them to be skipped to save time during testing.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
--timeout <seconds>
- Measurements will time out after <seconds> have elapsed.
- Default is 15 seconds.
- <seconds> is walltime, not accumulated sample time.
- If a measurement times out, the default markdown log will print a warning to report any outstanding termination criteria (min samples, min time, max noise).
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
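
The interplay of --min-samples, --min-time, --max-noise, and --timeout can be sketched as below. The defaults are the ones documented above; everything else is an assumption for illustration: the names `Criteria`, `outstanding`, and `should_stop` are hypothetical, and the rule that all three criteria must be met before a measurement ends is a guess at the intended behavior, not a description of the implementation.

```python
from dataclasses import dataclass


@dataclass
class Criteria:
    min_samples: int = 10    # --min-samples default
    min_time: float = 0.5    # --min-time default, seconds of accumulated time
    max_noise: float = 0.5   # --max-noise default, percent
    timeout: float = 15.0    # --timeout default, seconds of walltime


def outstanding(c, samples, accumulated_time, noise_percent):
    """Names of the termination criteria that are not yet satisfied."""
    pending = []
    if samples < c.min_samples:
        pending.append("min samples")
    if accumulated_time < c.min_time:
        pending.append("min time")
    if noise_percent >= c.max_noise:
        pending.append("max noise")
    return pending


def should_stop(c, samples, accumulated_time, noise_percent, walltime):
    """Stop on timeout (walltime) or once no criteria remain outstanding."""
    if walltime >= c.timeout:
        return True
    return not outstanding(c, samples, accumulated_time, noise_percent)
```

In this model, a measurement that hits the timeout with a non-empty `outstanding(...)` list corresponds to the warning described above.
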
--run-once
- Only run the benchmark once, skipping any warmup runs and batched measurements.
- Intended for use with external profiling tools.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
--disable-blocking-kernel
- Don't use the blocking_kernel.
- Intended for use with external profiling tools.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
--profile
- Implies --run-once and --disable-blocking-kernel.
- Intended for use with external profiling tools.
- Applies to the most recent --benchmark, or all benchmarks if specified before any --benchmark arguments.
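
The relationship between --profile and the two options it implies can be written down in a few lines. `expand_flags` is a hypothetical helper used only to illustrate the implication; it does not reflect how the tool stores its options.

```python
def expand_flags(flags):
    """Return the effective flag set, expanding --profile into what it implies."""
    expanded = set(flags)
    if "--profile" in expanded:
        expanded |= {"--run-once", "--disable-blocking-kernel"}
    return expanded
```
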
