[Feature] Implement Health Check Endpoint for Delayed Service Startup #764

Open
isaacncz opened this issue Oct 7, 2024 · 7 comments
Labels
aitce DEV features enhancement New feature or request

Comments

@isaacncz

isaacncz commented Oct 7, 2024

OS type
Ubuntu

Description
When running the Translation example with Docker Compose, one of the images takes additional time to pull a model from Hugging Face at startup. During this period, the service cannot handle HTTP requests, and they fail with HTTP 500 errors.

To improve reliability, I would like to propose adding a health check endpoint that verifies when the service is ready to handle requests. This would let other services and users know when the service is up and running, avoiding unnecessary errors and improving the user experience.

Expected Behavior:

1. Docker Compose starts all services.
2. The service in question takes some time to pull the model.
3. A health check endpoint will be available to verify when the model has finished loading and the service is ready.

Proposed Solution:

Add a /health endpoint that returns:
- 200 OK when the service is fully operational.
- 503 Service Unavailable or similar status when the service is still initializing or loading the model.

Optionally, provide a message or a status code that indicates the estimated time remaining for startup.
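
To make the proposal concrete, here is a minimal sketch of such an endpoint using only the Python standard library. It is not the project's implementation: the `model_ready` flag, the `load_model` placeholder, the JSON bodies, and port 8080 are all assumptions chosen for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag: set once the slow model download/load has finished.
model_ready = threading.Event()


def health_response(ready: bool) -> tuple:
    """Map readiness to an (HTTP status, JSON body) pair."""
    if ready:
        return 200, {"status": "ready"}
    return 503, {"status": "loading"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_response(model_ready.is_set())
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def load_model():
    """Placeholder for the real model download; marks the service ready."""
    # ... pull and load the model here ...
    model_ready.set()


def run(port: int = 8080):
    """Start loading in the background and serve /health immediately."""
    threading.Thread(target=load_model, daemon=True).start()
    ThreadingHTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `run()` starts the server. Because the loader runs in a background thread, `/health` can answer 503 from the first second instead of leaving callers with connection errors or HTTP 500 responses.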

@louie-tsai
Collaborator

@louie-tsai louie-tsai added the enhancement New feature or request label Oct 7, 2024
@louie-tsai
Collaborator

@isaacncz
Here is one example of a health check.
Health check:
curl http://localhost:3007/v1/health_check -X GET -H 'Content-Type: application/json'
Response from the microservice:
{"Service Title":"opea_service@llm_tgi/MicroService","Service Description":"OPEA Microservice Infrastructure"}
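
A caller can poll this endpoint before sending real traffic. The sketch below is one way to do that in Python. The helper names `http_probe` and `wait_until_ready`, the timeout, and the polling interval are illustrative assumptions; only the URL comes from the curl command above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an error status such as 500/503.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

For example, `wait_until_ready(lambda: http_probe("http://localhost:3007/v1/health_check"))` blocks until the microservice answers 200 or five minutes pass. Note that this only tells the caller that the microservice itself responds, not that the model behind it has finished downloading.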

Khangf added a commit to Khangf/GenAIComps that referenced this issue Oct 9, 2024
…pea-project#764

Signed-off-by: Foong, Khang Sheong <khang.sheong.foong@intel.com>
@louie-tsai
Collaborator

@isaacncz
Do you have any further questions?
If not, we will close the ticket.

@isaacncz
Author

@louie-tsai I have tested the health check, and it works. However, for the LLM microservice, I cannot check whether the model has finished downloading.

@louie-tsai
Collaborator

@isaacncz
Indeed, there is no check for model download completion yet.
@kevinintel
There is a need to show the LLM model download completion status via health_check or statistics.
Please help evaluate this feature.

@kevinintel
Collaborator

It depends on the serving framework; we only know whether the service is ready or not.
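
If a serving framework does expose its own readiness probe (an assumption, since that varies by framework and is not confirmed in this thread), the microservice could fold that probe into its health check. The sketch below shows the aggregation logic only. The function name `aggregate_readiness` and the check names are hypothetical.

```python
from typing import Callable, Dict, Tuple


def aggregate_readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every named check; report 200 only if all of them pass."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A probe that raises (e.g. backend unreachable) counts as not ready.
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, {"ready": status == 200, "checks": results}
```

A health handler could call this with one check for the microservice itself and one for the serving backend, and return the resulting status code and body. The per-check map tells a caller which dependency is still starting.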

@louie-tsai
Collaborator

@kevinintel
I will let you handle this feature request, which asks about serving framework readiness.

@louie-tsai louie-tsai removed their assignment Oct 29, 2024