Sending multiple shapes binary input data #863

Open
eladamittai opened this issue Apr 20, 2024 · 5 comments

Comments

@eladamittai

eladamittai commented Apr 20, 2024

Hey, I'm using a model with a dynamic-shape float16 input, and I want to test it over gRPC, so I have to use binary input data. Is there a way to send multiple requests with different shapes, as with JSON input data, but using binary data? Also, is there a way to send the requests in a given ratio, for example twice as many requests of shape 16000 as of shape 32000?

@tgerdesnv
Collaborator

tgerdesnv commented Apr 24, 2024

Hi @eladamittai,

Under the hood, Model Analyzer uses Perf Analyzer. You can find documentation for passing in input data here: https://github.com/triton-inference-server/client/blob/main/src/c%2B%2B/perf_analyzer/docs/input_data.md

When using Model Analyzer, any args that you want passed on to Perf Analyzer would go after a --. For example:
model_analyzer <model analyzer args> -- --input-data /path/to/file

edit: My recollection was wrong. You will need to pass Perf Analyzer args via the perf_analyzer_flags section of the config yaml file. Let me know if you need help with this.
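As a rough illustration of the corrected advice, the sketch below renders a minimal Model Analyzer config that forwards an input data file to Perf Analyzer through perf_analyzer_flags. The helper name render_config, the model name, and all paths are hypothetical placeholders, and the layout (model_repository, profile_models, a per-model perf_analyzer_flags block with an input-data key) is an assumption based on the Model Analyzer config documentation; check it against the release you are using.

```python
def render_config(model_repository, model_name, input_data_path):
    """Return YAML text for a minimal Model Analyzer profile config.

    Assumption: Perf Analyzer flags are given under `perf_analyzer_flags`
    without the leading `--`, so `--input-data` becomes `input-data`.
    """
    return (
        f"model_repository: {model_repository}\n"
        "profile_models:\n"
        f"  {model_name}:\n"
        "    perf_analyzer_flags:\n"
        f"      input-data: {input_data_path}\n"
    )


if __name__ == "__main__":
    # Hypothetical paths and model name, for illustration only.
    print(render_config("/path/to/model_repository", "my_model", "/path/to/file"))
```

The printed text can be saved to a YAML file and passed to Model Analyzer as its config file.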

Now then, you had a number of specific asks. I'm trying to wrap my head around whether they are possible.

You need binary data:
Is this actually true? I see you mention gRPC, but are you sure it doesn't work if you supply normal fp16 data? I believe everything should work under the hood. Perf Analyzer should convert the data as needed before sending it to Triton.

If you do need binary data, then there are a few possible options, although I'm not sure they are all compatible with the rest of your asks.

You want different shaped requests
Go here and search for optional "shape". That paragraph and following example show how to provide data with different shapes.

You want a ratio of different shapes
We have stories in our backlog to try to support cases like this, but for now you would need to do it yourself. If you wanted a 2:1 ratio of shape X to shape Y to be sent, then your input data file would need 3 entries: 2 with shape X and 1 with shape Y.
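A minimal sketch of building such a file by hand, assuming the JSON input data format with a per-entry "shape" field and assuming Perf Analyzer draws from the entries evenly. The input name "INPUT", the zero-filled placeholder content, and the helper names are hypothetical; each shape is simply repeated according to its weight.

```python
import json


def make_entry(input_name, length, fill=0.0):
    """One input-data entry: a flat content list plus an explicit shape."""
    return {input_name: {"content": [fill] * length, "shape": [length]}}


def build_ratio_data(input_name, shape_weights):
    """Repeat each shape `weight` times so the file reflects the desired ratio."""
    entries = []
    for length, weight in shape_weights:
        entries.extend(make_entry(input_name, length) for _ in range(weight))
    return {"data": entries}


if __name__ == "__main__":
    # 2:1 ratio of shape [16000] to shape [32000] -> 3 entries in total.
    data = build_ratio_data("INPUT", [(16000, 2), (32000, 1)])
    with open("input_data.json", "w") as f:
        json.dump(data, f)
```

Other ratios follow the same pattern: a 3:2 ratio would need 5 entries, 3 with one shape and 2 with the other.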

@eladamittai
Author

eladamittai commented Apr 24, 2024

Hey, thank you for answering. I checked the Perf Analyzer documentation, and using a JSON file I managed to send requests in multiple shapes for a float32-compiled version of the model. However, as you can see from this older issue I opened about sending float16 input using JSON, that is not possible when using gRPC, unless something changed in later releases of Model Analyzer or Perf Analyzer. From your response I didn't understand whether I can send multiple shapes using binary data. I tried to combine the binary directory with the JSON file, like this:
{ Data: [ Input name: { Content: binary_input_dir Shape: [16000] } ] }
But it didn't work. Is there a different way to send multiple binary input files in multiple shapes?

@eladamittai
Author

Hey, is there an answer?

@tgerdesnv
Collaborator

Apologies for the delay. I'm looking into this.

@tgerdesnv
Collaborator

I believe you can use base64 for binary data. Then you can stick to the normal input_data format and provide shapes.
There is an example on this page, although I can't link directly to it. You'll have to scroll down. I've cut and pasted it here:

{
  "data":
    [
      {
        "INPUT":
          {
            "content": {"b64": "/9j/4AAQSkZ(...)"},
            "shape": [7964]
          }
      },
      {
        "INPUT":
          {
            "content": {"b64": "/9j/4AAQSkZ(...)"},
            "shape": [7964]
          }
      }
    ]
}

Using that as a basis, you could provide 3 inputs, 2 of one shape and 1 of another, to accomplish the goal of a 2:1 ratio of input shapes.
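A hedged sketch of what that could look like for the float16 case from the original question: three entries, two of shape [16000] and one of shape [32000], with the tensor bytes base64-encoded. It assumes Perf Analyzer treats the decoded "b64" bytes as the raw little-endian tensor data, which has not been confirmed in this thread. The input name "INPUT", the zero-filled placeholder values, and the helper names are hypothetical.

```python
import base64
import json
import struct


def fp16_b64(values):
    """Pack floats as little-endian IEEE 754 half precision, then base64-encode."""
    raw = struct.pack(f"<{len(values)}e", *values)
    return base64.b64encode(raw).decode("ascii")


def make_entry(input_name, length):
    """One entry with base64 content and an explicit shape."""
    values = [0.0] * length  # placeholder data; substitute real samples
    return {input_name: {"content": {"b64": fp16_b64(values)}, "shape": [length]}}


def build_data(input_name="INPUT"):
    """Two entries of shape [16000] and one of shape [32000] (2:1 ratio)."""
    lengths = [16000, 16000, 32000]
    return {"data": [make_entry(input_name, n) for n in lengths]}


if __name__ == "__main__":
    with open("input_data_fp16.json", "w") as f:
        json.dump(build_data(), f)
```

Each float16 value takes 2 bytes, so the decoded content of an entry should be exactly twice its shape in bytes; that is a quick sanity check before pointing Perf Analyzer at the file.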
