How can I use dynamic batch? #5401
-
How can I use dynamic batching? I loaded the densenet_onnx example model and want to enable dynamic batching for it. I tried to do this by editing the model's config.pbtxt, but afterwards the model no longer loaded normally. I wish there were example code that even a beginner could follow.

Triton server docker command (Triton version == 22.12):

docker run --gpus=all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -it \
  -p8012:8012 -p8013:8013 -p8014:8014 \
  -v ${PWD}/model_repository:/models \
  triton_server:inference_server \
  tritonserver --model-repository=/models --http-port 8012 --log-verbose 1 \
    --allow-http True --strict-readiness True \
    --allow-grpc True --grpc-port 8013 \
    --allow-metrics True --allow-gpu-metrics True --metrics-port 8014
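Roughly, the edit I'm attempting in config.pbtxt looks like this (just a sketch; the max_batch_size and queue-delay values are arbitrary examples, and the input/output sections are omitted):

# Sketch of the dynamic-batching edit to config.pbtxt (example values only)
max_batch_size: 8
dynamic_batching {
  # wait briefly so the server can group incoming requests into larger batches
  max_queue_delay_microseconds: 100
}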
Replies: 5 comments
-
There's currently a pull request to create a conceptual walkthrough that goes into more detail about dynamic batching here. The dynamic batcher documentation is here. Your config looks correct. If you're getting an error, it's likely because that model does not support batching; you need a model that supports batching. By enabling dynamic batching here, you add an extra dimension before the others, which the model is not expecting. When creating a model for use with server-side batching, you want the first dimension to be the batch dimension. Depending on what kind of model you want to create, you can see some example model generation scripts we use for tests in this folder. Many of them have batching and non-batching variants, like this function.
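As a rough sketch (the tensor names, data types, and shapes below are placeholders, not the densenet_onnx model's actual values), a config.pbtxt for a model that supports batching would look something like this. The key points are that max_batch_size is greater than zero and that dims describe a single request without the batch dimension, since Triton adds that leading dimension itself:

# Sketch of a batching-capable model config (placeholder names and shapes)
name: "my_batching_model"
platform: "onnxruntime_onnx"
max_batch_size: 8            # > 0: the model accepts a leading batch dimension
input [
  {
    name: "INPUT__0"         # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]    # per-request shape; no batch dimension listed here
  }
]
output [
  {
    name: "OUTPUT__0"        # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
dynamic_batching {
  max_queue_delay_microseconds: 100
}

A non-batching model, by contrast, would set max_batch_size: 0 and list its full, fixed input shape in dims.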
-
Thank you for the reply, @dyastremsky.
-
The easiest way would be to use Perf Analyzer. You could also do some testing similar to what we do in the L0_batcher test, if you want to do your own detailed validation.
-
Thank you for always responding kindly. I tried Perf Analyzer, but it seems I'm doing it wrong. I have questions about the difference between the dynamic-batch model and the non-dynamic-batch model (Question 2, Question 3); it seems I'm not understanding this correctly.

# non-batch_model perf_analyzer command
perf_analyzer -m densenet_onnx -u localhost:8013 -i grpc --concurrency-range 1000:1005 -f non-batch_model.csv

# batch_model perf_analyzer command, batch 1
perf_analyzer -m densenet_onnx_batch -u localhost:8013 -i grpc --concurrency-range 1000:1005 -f batch_model.csv

# batch_model perf_analyzer command, batch 8
perf_analyzer -m densenet_onnx_batch -u localhost:8013 -i grpc --concurrency-range 1000:1005 -f result.csv -b 8

Result:
-
These results don't make sense to me. You shouldn't be seeing much of a difference between batch size 1 and non-batch, yet your batch size 1 run has more than 3x higher throughput. Your batch size 8 run should typically perform better than batch size 1, assuming you're not exhausting your system resources (using all of your CPU/GPU/etc.). Can you run …? Otherwise, we would need you to provide the models so that we can see whether we can reproduce the issue on our side. Depending on what you mean by auto scaling, you'll likely need to incorporate outside tools like Kubernetes to do so. A couple of blog posts about doing so: