Add model card in response for ModelMetadata API #5750

Open
wants to merge 1 commit into
base: main

yeahdongcn
Contributor

@yeahdongcn yeahdongcn commented May 6, 2023

Model cards are a concept from Hugging Face: files that accompany models and provide useful information about them. Under the hood, model cards are simple Markdown files with additional metadata.

Because the Triton server can integrate with various file-system providers such as local storage, S3, and GCS, it is straightforward to extend the model directory hierarchy to support model cards.

My proposal is to add a README.md for each model served by the Triton server, so that a ModelMetadata API call returns the model card in its response.

Taking a minimal model repository for a TorchScript model as an example:

  <model-repository-path>/
    <model-name>/
      config.pbtxt
      README.md
      1/
        model.pt
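
The layout above can be sketched as a lookup that returns the card text when README.md is present and nothing otherwise. This is a minimal Python illustration, not the code in the core PR: the function name `read_model_card` is hypothetical, and it only covers a local file system (S3 and GCS would need their own readers).

```python
from pathlib import Path
from typing import Optional


def read_model_card(repository: str, model_name: str) -> Optional[str]:
    """Return the text of <repository>/<model_name>/README.md, or None if absent.

    Hypothetical helper for illustration only; local file system only.
    """
    card_path = Path(repository) / model_name / "README.md"
    if not card_path.is_file():
        # Models without a README.md are still valid, so this is not an error.
        return None
    return card_path.read_text(encoding="utf-8")
```

A caller would pass the repository root and the model directory name, for example `read_model_card("/models", "my_model")`, where both values are placeholders.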

Other PRs for supporting this feature:

Testing done:

  • Build passes.
  • Invoked the ModelMetadata API using a Golang client and the Python HTTP/gRPC SDK; everything works for models both with and without a README.md.
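
For readers who want to reproduce the check over plain HTTP, here is a rough Python sketch using only the standard library. The `v2/models/<name>` path is the model metadata endpoint of the KServe v2 HTTP protocol. The response field name `model_card` is an assumption made for illustration; this thread does not state which field carries the card, so adjust it to whatever the server returns.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import urlopen


def metadata_url(base_url: str, model_name: str, version: str = "") -> str:
    """Build the model metadata URL of the KServe v2 HTTP protocol."""
    url = f"{base_url.rstrip('/')}/v2/models/{quote(model_name, safe='')}"
    if version:
        url += f"/versions/{quote(version, safe='')}"
    return url


def extract_model_card(metadata: dict, field: str = "model_card") -> Optional[str]:
    """Return the card text from a metadata response, or None if absent.

    ASSUMPTION: "model_card" is a placeholder field name, not confirmed here.
    """
    card = metadata.get(field)
    return card if isinstance(card, str) and card else None


def fetch_model_card(base_url: str, model_name: str) -> Optional[str]:
    """GET the model metadata and pull out the (assumed) model card field."""
    with urlopen(metadata_url(base_url, model_name)) as response:
        return extract_model_card(json.load(response))
```

Assuming a server listening on `http://localhost:8000`, `fetch_model_card("http://localhost:8000", "my_model")` would return the card for a model that has one and `None` for a model that does not.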

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
@yeahdongcn
Contributor Author

@GuanLuo Could you please also review this proposal? Thanks.

@yeahdongcn
Contributor Author

Any chance to get this discussed/reviewed? @dyastremsky

@dyastremsky
Contributor

Sure! Thank you for your contribution. I created a ticket; someone will look into this soon.

@yeahdongcn
Contributor Author

We are trying to turn the Triton server into an on-premises Hugging Face by building a web UI and a small aggregation server. This is still at an early stage, and the model card (README.md) is optional when adding a model.

By leveraging the well-structured API implementation, we can easily fetch the contents of README.md and process them further in the aggregation server.
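
The processing done in the aggregation server is not described here. As one plausible first step, the sketch below separates a card's metadata block from its Markdown body. It assumes the Hugging Face convention of a front-matter block delimited by `---` lines at the top of the file; the function name `split_model_card` is hypothetical, and the metadata is returned as raw text rather than parsed.

```python
from typing import Tuple


def split_model_card(text: str) -> Tuple[str, str]:
    """Split a model card into (raw front matter, Markdown body).

    Assumes a metadata block delimited by lines containing only '---'
    at the very top of the file. Returns ("", text) when there is none.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            front_matter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return front_matter, body
    # No closing delimiter: treat the whole file as the body.
    return "", text
```

The raw front matter could then be handed to a YAML parser, and the body to a Markdown renderer in the web UI.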

Here are 2 screenshots of our hub:

[Two screenshots: "models" and "model"]

@GuanLuo
Contributor

GuanLuo commented Jul 19, 2023

Hi @yeahdongcn, thanks for submitting the PR. It brought to the attention of the KServe Open Inference Protocol group that there is a need to extend the model metadata protocol (Triton's ModelMetadata API is an implementation of that protocol).

The extension proposal is still in progress; it aims to be a generic solution for exposing model properties, not only as Hugging Face model cards but also in other formats. That said, I think the changes needed to this PR will be minimal, although we will need to wait until the protocol has been relaxed.

@yeahdongcn
Contributor Author

yeahdongcn commented Jul 19, 2023

@GuanLuo Thanks for letting me know. It would be great if they can extend the protocol.

I would also like to know your thoughts on how to store this metadata. Would you prefer to put everything in config.pbtxt?

@GuanLuo
Contributor

GuanLuo commented Jul 20, 2023

I think in this case (HF model card), it would be better to store the card as a separate file in the model directory and reference its relative path in config.pbtxt (i.e. parameters [ {key: "HuggingFaceModelCard", value: "README.md"}]). When model metadata is requested, Triton would read the file content and put it into the response. That is basically the change you made in the core PR, except that the file name would be read from the config instead of being a fixed string (README.md).
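
The config-driven lookup suggested above could look roughly like the Python sketch below. It is an illustration under stated assumptions, not Triton's implementation: the config parameters are modelled as a flat string mapping, the helper name is hypothetical, and the check that rejects paths escaping the model directory is an extra precaution that is not part of the suggestion.

```python
from pathlib import Path
from typing import Mapping, Optional

# Parameter key taken from the suggestion above.
CARD_PARAMETER = "HuggingFaceModelCard"


def read_configured_model_card(
    model_dir: str, parameters: Mapping[str, str]
) -> Optional[str]:
    """Read the model card named by the config parameters, if any.

    Hypothetical helper: `parameters` stands in for the parsed
    config.pbtxt parameters, simplified to a flat str -> str mapping.
    """
    relative = parameters.get(CARD_PARAMETER)
    if not relative:
        # No card configured for this model.
        return None
    root = Path(model_dir).resolve()
    card_path = (root / relative).resolve()
    # ASSUMPTION (added precaution): refuse paths outside the model directory.
    if root not in card_path.parents:
        return None
    if not card_path.is_file():
        return None
    return card_path.read_text(encoding="utf-8")
```

With this shape, a model that sets the parameter to `README.md` behaves as in the fixed-name approach, while a model without the parameter simply returns no card.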
