Add model card in response for ModelMetadata API #5750
base: main
Conversation
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
@GuanLuo Could you please also review this proposal? Thanks.
Any chance to get this discussed/reviewed? @dyastremsky
Sure! Thank you for your contribution. Created a ticket; someone will look into this soon.
Hi @yeahdongcn, thanks for submitting the PR. It brought to the attention of the KServe Open Inference Protocol group that there is a need to extend the model metadata protocol (Triton's ModelMetadata API is an implementation of that protocol). The extension proposal is still being worked on, aiming for a generic solution that can provide model properties not only as a Hugging Face model card but also in other formats. That being said, I think the change to the existing PR will be minimal, although we will need to wait until the protocol has been relaxed.
@GuanLuo Thanks for letting me know. It would be great if they can extend the protocol. I just want to know your thoughts on how to store this metadata: do you prefer to put everything in
I think in this case (HF model card), it would be better stored as a separate file in the model directory, with the relative path linked in the
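For illustration only, the "separate file referenced by relative path" idea might be carried through the existing generic `parameters` section of a model's `config.pbtxt`. This is a sketch under assumptions: the key name `model_card` is hypothetical, not a finalized API, and the actual mechanism depends on the pending protocol extension.

```
name: "resnet50"
platform: "pytorch_libtorch"
# Hypothetical parameter linking the model card by relative path
# within the model directory. The key name is an assumption.
parameters {
  key: "model_card"
  value: { string_value: "README.md" }
}
```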
Model cards are a concept from Hugging Face: they accompany models and provide handy information. Under the hood, model cards are simple Markdown files with additional metadata.
Since the Triton server can integrate with various file-system providers (local, S3, GCS, etc.), extending the model hierarchy to support model cards is straightforward.
My proposal is to add a `README.md` for each model being served by the Triton server; from a `ModelMetadata` API call, one can then get the model card in the response. Use a minimal model repository for a TorchScript model as an example:
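The original example layout is not reproduced in this excerpt, so the following is a sketch of a typical Triton model repository for a TorchScript model, with the proposed `README.md` added alongside the model configuration (the model name `resnet50` is assumed for illustration):

```
model_repository/
  resnet50/
    config.pbtxt
    README.md        # proposed model card, served via ModelMetadata
    1/
      model.pt
```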
Other PRs for supporting this feature:
Testing done: called the `ModelMetadata` API using the Golang client and the Python HTTP/gRPC SDKs; everything works as expected for models with and without a `README.md`.
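Since the protocol extension is not finalized, the exact response shape is an assumption. The sketch below shows how a client might read a model card out of a hypothetical extended `ModelMetadata` JSON response; the field name `model_card` and the sample values are assumptions, not the finalized API.

```python
import json

# Hypothetical ModelMetadata response extended with a model card field.
# The field name "model_card" is an assumption; the final name depends on
# the KServe Open Inference Protocol extension discussed in this PR.
response_body = json.dumps({
    "name": "resnet50",
    "versions": ["1"],
    "platform": "pytorch_libtorch",
    "inputs": [{"name": "INPUT__0", "datatype": "FP32", "shape": [3, 224, 224]}],
    "outputs": [{"name": "OUTPUT__0", "datatype": "FP32", "shape": [1000]}],
    "model_card": "# ResNet-50\n\nA convolutional network for image classification.",
})


def extract_model_card(body: str):
    """Return the model card Markdown if the server provided one, else None."""
    metadata = json.loads(body)
    return metadata.get("model_card")


card = extract_model_card(response_body)
print(card.splitlines()[0])  # prints "# ResNet-50"
```

A model without a `README.md` would simply omit the field, and `extract_model_card` would return `None`, so existing clients remain unaffected.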