N/A: Benchmark not present in a round
X: Change in benchmark. Submission results can be compared across rounds when there has been no change in the benchmark.
| Model | 0.5 | 0.6 | 0.7 | 1.0 | 1.1 | 2.0 | 2.1 | 3.0 | 3.1 | 4.0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 v1.5 | X | X | | | | | | | | |
| SSD-ResNet34 | X | X | | | | N/A | | | | |
| RetinaNet-ResNeXt50 | N/A | | | | | X | | | | |
| MaskRCNN | X | X | | | | | | | | N/A |
| NCF | X | N/A | | | | | | | | |
| NMT | X | X | | N/A | | | | | | |
| Transformer | X | X | | N/A | | | | | | |
| MiniGo | X | X | X | | | | | N/A | | |
| DLRM | N/A | | X | | | | | N/A | | |
| DLRM-dcnv2 | N/A | | | | | | | X | | |
| BERT | N/A | | X | | | | | | | |
| RNN-T | N/A | | | X | X | | | | | N/A |
| 3D U-Net | N/A | | | X | | | | | | |
| GPT-3 | N/A | | | | | | | X | | |
| Llama 2 70B LoRA | N/A | | | | | | | | | X |
| RGAT | N/A | | | | | | | | | X |
Metric: Time-to-train (measured in minutes)
Note: In v0.6, ResNet-50 v1.5, SSD-ResNet34, and NMT increased accuracy targets and all benchmarks changed initialization timing; in v0.7, MiniGo moved to a 19x19 board
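To make the comparability rule in the legend concrete, here is a minimal Python sketch (an illustration, not an MLCommons tool) that applies it to one table row; the row encoding and round labels are assumptions made for this example.

```python
# Minimal sketch (not an MLCommons tool) of the comparability rule in the
# legend above: two rounds are comparable for a benchmark only if it is
# present in both rounds and no change ("X") falls between them.
# The row encoding below is an assumption made for this illustration.

ROUNDS = ["0.5", "0.6", "0.7", "1.0", "1.1", "2.0", "2.1", "3.0", "3.1", "4.0"]

def comparable(row, round_a, round_b):
    """row maps a round to 'X', 'N/A', or '' (blank cells carry the
    previous status forward, as in the tables in this section)."""
    i, j = sorted((ROUNDS.index(round_a), ROUNDS.index(round_b)))
    status = "N/A"
    present = [False] * len(ROUNDS)
    changed_between = False
    for k, r in enumerate(ROUNDS):
        mark = row.get(r, "")
        if mark:                       # an explicit marker overrides the carried status
            status = mark
        present[k] = status != "N/A"   # 'X' or a carried 'X' means the benchmark is present
        if i < k <= j and mark == "X":
            changed_between = True     # benchmark changed after the earlier round
    return present[i] and present[j] and not changed_between

# Example using the ResNet-50 v1.5 row above (changed in v0.6):
resnet = {"0.5": "X", "0.6": "X"}
print(comparable(resnet, "0.5", "0.7"))  # False: the v0.6 change intervenes
print(comparable(resnet, "0.7", "4.0"))  # True: no change after v0.6
```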
| Model | 0.7 | 1.0 | 2.0 |
| --- | --- | --- | --- |
| CosmoFlow | X | X | X |
| DeepCAM | X | X | |
| Open Catalyst | N/A | X | X |
Metrics: Time-to-train (measured in minutes) and throughput (weak scaling, measured in models/minute)
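As a rough illustration of the two metrics above, the following sketch shows the arithmetic; the function and parameter names are assumptions, not an official MLCommons definition.

```python
# Illustrative arithmetic only; function and parameter names are assumptions,
# not an official MLCommons definition.

def time_to_train_minutes(start_s: float, converged_s: float) -> float:
    """Wall-clock time for one model instance to reach target quality, in minutes."""
    return (converged_s - start_s) / 60.0

def weak_scaling_throughput(num_models: int, wall_clock_minutes: float) -> float:
    """Models trained per minute when num_models instances train concurrently."""
    return num_models / wall_clock_minutes

# Example: 16 concurrent CosmoFlow instances finishing in 120 minutes
# give a throughput of 16 / 120 = 0.13 models/minute (rounded).
print(weak_scaling_throughput(16, 120.0))
```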
| Model | 0.5 | 0.7 | 1.0 | 1.1 | 2.0 | 2.1 | 3.0 | 3.1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNet-v1 | X | N/A | | | | | | |
| ResNet-50 v1.5 | X | | | | | | | |
| SSD-MobileNets | X | | | | | | | |
| SSD-ResNet34 | X | | | | | N/A | | |
| RetinaNet-ResNeXt50 | N/A | | | | | X | | |
| NMT | X | N/A | | | | | | |
| DLRM | N/A | X | | | | | | N/A |
| DLRM-v2 | N/A | | | | | | | X |
| BERT | N/A | X | | | | | | |
| RNN-T | N/A | X | | | | | | |
| 3D U-Net | N/A | X | | | | | | |
| GPT-J | N/A | | | | | | | X |
Metrics: Queries/second (server), Samples/second (offline), Latency (measured in milliseconds) (single stream), Streams (multi-stream v0.5-v1.1), Latency (measured in milliseconds) (multi-stream 2.0+)
Additional power metrics: System power (measured in watts) (server and offline), system energy per stream (measured in joules) (single stream and multi-stream)
Note: Performance metrics from inference submissions and power submissions are not comparable
Note: Multi-stream results from v0.5-v1.1 are not comparable with v2.0 and newer
Note: Inference over Network scenario introduced in v2.1
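The following sketch illustrates how the scenario metrics above relate to raw measurements; the helper names and the percentile choice are assumptions made for illustration. Actual submissions use the MLCommons LoadGen harness, which issues the queries and reports these metrics itself.

```python
# Illustrative sketch of how the scenario metrics above relate to raw
# measurements; helper names and the percentile choice are assumptions.

def offline_samples_per_second(total_samples: int, run_seconds: float) -> float:
    # Offline: throughput over one large batch of samples.
    return total_samples / run_seconds

def server_queries_per_second(completed_queries: int, run_seconds: float) -> float:
    # Server: sustained query rate achieved while meeting the latency bound.
    return completed_queries / run_seconds

def single_stream_latency_ms(latencies_ms, percentile=90.0):
    # Single stream: a tail-latency percentile over back-to-back queries.
    ordered = sorted(latencies_ms)
    rank = min(len(ordered) - 1, int(len(ordered) * percentile / 100.0))
    return ordered[rank]

def energy_per_stream_joules(avg_system_power_w: float, stream_seconds: float) -> float:
    # Power submissions: joules = watts x seconds over the measured stream.
    return avg_system_power_w * stream_seconds
```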
| Model | 0.7 | 1.0 | 1.1 | 2.0 | 2.1 | 3.0 |
| --- | --- | --- | --- | --- | --- |
| MobileNetEdge | X | | | | | |
| SSD-MobileNetsV2 | X | N/A | | | | |
| MobileDET | N/A | X | | | | |
| DeeplabV3 | X | | | N/A | | |
| MOSAIC | N/A | | | X | | |
| MobileBERT | X | | | | | |
| EDSR | N/A | | | | X | |
Primary metrics: Latency (measured in milliseconds) (single stream), Samples/second (offline)
Note: A submission must include all benchmarks in single stream, and MobileNetEdge in both single stream and offline
| Model | 0.5 | 0.7 | 1.0 |
| --- | --- | --- | --- |
| MobileNetV1 | X | X | |
| ResNet-V1 | X* | X | |
| DSCNN | X | X | |
| FC Autoencoder | X | X | |
Primary metric: Latency (measured in milliseconds)
Secondary metric: Energy per inference (measured in microjoules)
*Latency comparable, but not accuracy: v0.5 and v0.7 use the same model, but the evaluation set changed to improve balance.
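As a worked unit check for the Tiny metrics above (with assumed numbers): a milliwatt times a millisecond is one microjoule, so average power over the inference window times latency gives energy per inference.

```python
# Worked unit check with assumed numbers: 1 mW x 1 ms = 1 microjoule, so
# average power over the inference window times latency gives energy per inference.

def energy_per_inference_uj(avg_power_mw: float, latency_ms: float) -> float:
    return avg_power_mw * latency_ms  # mW * ms = uJ

print(energy_per_inference_uj(12.0, 40.0))  # 12 mW for 40 ms -> 480 uJ per inference
```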