Commit
publish latest
This file was deleted.
# RAG Agent

This agent is designed to improve answer quality over conventional RAG.
The agent strategy consists of the following steps:

1. QueryWriter
   This is an LLM with tool-calling capability. It decides whether tool calls are needed to answer the user query or whether it can answer from the LLM's parametric knowledge.

   - Yes: Rephrase the query in the form of a tool call to the Retriever tool, and send the rephrased query to 'Retriever'. The rephrasing is important because user queries may not be clear, and using the raw user query may not retrieve relevant documents.
   - No: Complete the query with a final answer.

2. Retriever

   - Get related documents from a retrieval tool, then send the documents to 'DocumentGrader'. Note: The retrieval tool here is meant in a broad sense; it can be a text retriever over a proprietary knowledge base, a web search API, a knowledge graph API, a SQL database API, etc.

3. DocumentGrader
   Judge whether the retrieved information is relevant to the user query.

   - Yes: Go to TextGenerator.
   - No: Go back to QueryWriter to rewrite the query.

4. TextGenerator
   - Generate an answer based on the query and the last retrieved context.
   - After generation, go to END.

Note:

- Currently, the performance of this RAG agent has been tested and validated with only one retrieval tool. If you want to use multiple retrieval tools, we recommend a hierarchical multi-agent system in which a supervisor agent dispatches requests to multiple worker RAG agents, each of which uses one type of retrieval tool.
- The maximum number of retrieves is set to 3.
- You can specify a small `recursion_limit` to stop early or a large `recursion_limit` to make full use of the 3 retrieves.
- The TextGenerator only looks at the last retrieved docs.
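
The control flow described in the steps above can be sketched in plain Python. This is only an illustration of the loop, not the agent's actual implementation: `run_rag_agent` and its callback parameters are hypothetical names, and falling through to TextGenerator once the 3 retrieves are used up is an assumption rather than something stated above.

```python
from typing import Callable, List, Optional

MAX_RETRIEVES = 3  # the limit stated in the notes above


def run_rag_agent(
    user_query: str,
    write_query: Callable[[str, int], Optional[str]],
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, List[str]], bool],
    generate: Callable[[str, List[str]], str],
    answer_directly: Callable[[str], str],
) -> str:
    """Hypothetical sketch of QueryWriter -> Retriever -> DocumentGrader -> TextGenerator."""
    docs: List[str] = []
    for attempt in range(MAX_RETRIEVES):
        # QueryWriter: return a rephrased query, or None if no tool call is needed.
        rewritten = write_query(user_query, attempt)
        if rewritten is None:
            return answer_directly(user_query)
        # Retriever: fetch documents for the rephrased query.
        docs = retrieve(rewritten)
        # DocumentGrader: stop rewriting as soon as the documents are relevant.
        if grade(user_query, docs):
            break
    # TextGenerator: only the last retrieved documents are used.
    # Assumption: generation also runs when all retrieves were graded irrelevant.
    return generate(user_query, docs)
```

Each callback stands in for one node of the agent, so a stub such as `lambda q, attempt: None` for `write_query` exercises the "No" branch of QueryWriter without any model or retrieval backend.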
# Telemetry for OPEA

OPEA Comps currently provides telemetry functionality for metrics and tracing using Prometheus, Grafana, and Jaeger. Here is a basic introduction to these tools:

![opea telemetry](https://raw.githubusercontent.com/Spycsh/assets/main/OPEA%20Telemetry.jpg)

## Metrics

OPEA microservice metrics are exported in Prometheus format and are divided into two categories: general metrics and specific metrics.

General metrics, such as `http_requests_total` and `http_request_size_bytes`, are exposed by every microservice endpoint using the [prometheus-fastapi-instrumentator](https://github.com/trallnag/prometheus-fastapi-instrumentator).

Specific metrics are the built-in metrics exposed under `/metrics` by individual microservices such as TGI, vLLM, TEI, and others. Both types of metrics adhere to the Prometheus format.

### General Metrics

To access the general metrics of each microservice, you can use `curl` as follows:

```bash
curl localhost:{port of your service}/metrics
```

The metrics are then printed in Prometheus format, for example:

```yaml
# HELP http_requests_total Total number of requests by method, status and handler.
# TYPE http_requests_total counter
http_requests_total{handler="/metrics",method="GET",status="2xx"} 3.0
http_requests_total{handler="/v1/chatqna",method="POST",status="2xx"} 2.0
...
# HELP http_request_size_bytes Content length of incoming requests by handler. Only value of header is respected. Otherwise ignored. No percentile calculated.
# TYPE http_request_size_bytes summary
http_request_size_bytes_count{handler="/metrics"} 3.0
http_request_size_bytes_sum{handler="/metrics"} 0.0
http_request_size_bytes_count{handler="/v1/chatqna"} 2.0
http_request_size_bytes_sum{handler="/v1/chatqna"} 128.0
...
```

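If you want to read these values programmatically rather than with `curl`, a small parser is enough for simple cases. The sketch below is an illustration only: `parse_metrics` and `fetch_metrics` are hypothetical helper names, and the parser handles just the plain `name{labels} value` lines shown above (it ignores comments and skips lines with timestamps or anything it cannot match).

```python
import re
import urllib.request
from typing import Dict, Tuple

LabelSet = Tuple[Tuple[str, str], ...]

# One sample per line: metric name, optional {labels}, then the value.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> Dict[Tuple[str, LabelSet], float]:
    """Map (metric name, sorted label pairs) to the sample value."""
    samples: Dict[Tuple[str, LabelSet], float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # anything this simplified parser does not understand
        labels = tuple(sorted(LABEL_PAIR.findall(match.group("labels") or "")))
        samples[(match.group("name"), labels)] = float(match.group("value"))
    return samples


def fetch_metrics(url: str) -> Dict[Tuple[str, LabelSet], float]:
    """Download a /metrics page and parse it; the URL is supplied by the caller."""
    with urllib.request.urlopen(url) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

For instance, `fetch_metrics("http://localhost:{port of your service}/metrics")`, with the port filled in, would return a dictionary keyed by metric name and labels. For anything beyond quick checks, a full Prometheus client or server is the more robust choice.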
### Specific Metrics

To access the metrics exposed by a specific microservice, check its port and your port mapping so that you reach the `/metrics` endpoint correctly.

For example, you can run `curl localhost:6006/metrics` to retrieve the TEI embedding metrics. The output should look like the following:

```yaml
# TYPE te_embed_count counter
te_embed_count 7

# TYPE te_request_success counter
te_request_success{method="batch"} 2

# TYPE te_request_count counter
te_request_count{method="single"} 2
te_request_count{method="batch"} 2

# TYPE te_embed_success counter
te_embed_success 7

# TYPE te_queue_size gauge
te_queue_size 0

# TYPE te_request_inference_duration histogram
te_request_inference_duration_bucket{le="0.000015000000000000002"} 0
te_request_inference_duration_bucket{le="0.000022500000000000005"} 0
te_request_inference_duration_bucket{le="0.00003375000000000001"} 0
```

These metrics can be scraped by the Prometheus server into a time-series database and further visualized using Grafana.

Below are the default metrics endpoints for some specific microservices:

| component     | port  | endpoint | metrics doc                                                                                             |
| ------------- | ----- | -------- | ------------------------------------------------------------------------------------------------------- |
| TGI           | 80    | /metrics | [link](https://huggingface.co/docs/text-generation-inference/en/basic_tutorials/monitoring)             |
| milvus        | 9091  | /metrics | [link](https://milvus.io/docs/monitor.md)                                                               |
| vLLM          | 18688 | /metrics | [link](https://docs.vllm.ai/en/v0.5.0/serving/metrics.html)                                             |
| TEI embedding | 6006  | /metrics | [link](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/metrics) |
| TEI reranking | 8808  | /metrics | [link](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/metrics) |

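When scripting against several services, it can help to keep the defaults from the table in one place. The following is a minimal sketch: `DEFAULT_METRICS_PORTS` and `metrics_url` are hypothetical names, the ports are simply copied from the table, and your own port mapping may differ.

```python
from typing import Dict

# Default ports copied from the table above; adjust to your deployment.
DEFAULT_METRICS_PORTS: Dict[str, int] = {
    "TGI": 80,
    "milvus": 9091,
    "vLLM": 18688,
    "TEI embedding": 6006,
    "TEI reranking": 8808,
}


def metrics_url(component: str, host: str = "localhost") -> str:
    """Build the /metrics URL for a component listed in the table."""
    port = DEFAULT_METRICS_PORTS[component]  # raises KeyError for unknown components
    return f"http://{host}:{port}/metrics"
```

For example, `metrics_url("TEI embedding")` yields `http://localhost:6006/metrics`, which matches the `curl` example for the TEI embedding service.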
## Tracing

OPEA uses OpenTelemetry to trace function call stacks. To trace a function, add the `@opea_telemetry` decorator to either an async or a sync function. The call stacks and time span data are exported by OpenTelemetry, and you can use the Jaeger UI to visualize this tracing data.

By default, tracing data is exported to `http://localhost:4318/v1/traces`. This endpoint can be customized by setting the `TELEMETRY_ENDPOINT` environment variable.

```py
from comps import opea_telemetry


@opea_telemetry
async def your_async_func():
    pass


@opea_telemetry
def your_sync_func():
    pass
```

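To make it clearer how one decorator can serve both async and sync functions, here is a simplified stand-in. It is not the `opea_telemetry` implementation and does not use OpenTelemetry: `traced` and the in-memory `SPANS` list are hypothetical, and the sketch only records a name and a duration for each call.

```python
import asyncio
import functools
import inspect
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink; a real tracer would export spans instead.
SPANS: List[Dict[str, Any]] = []


def traced(func: Callable) -> Callable:
    """Record the name and duration of each call, for sync or async functions."""

    def record(start: float) -> None:
        SPANS.append({"name": func.__name__, "duration_s": time.perf_counter() - start})

    if inspect.iscoroutinefunction(func):

        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return await func(*args, **kwargs)
            finally:
                record(start)  # runs even if the coroutine raises

        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            record(start)  # runs even if the function raises

    return sync_wrapper


@traced
async def your_async_func() -> str:
    await asyncio.sleep(0)
    return "async done"


@traced
def your_sync_func() -> str:
    return "sync done"
```

The branch on `inspect.iscoroutinefunction` is the key design choice: an async function needs a wrapper that awaits it, otherwise the recorded duration would only cover the creation of the coroutine object.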
## Visualization

### Visualize metrics

Refer to [OPEA grafana](https://github.com/opea-project/GenAIEval/tree/main/evals/benchmark/grafana) for details on setting up the Prometheus and Grafana servers. The same directory also provides Grafana dashboard JSON files for visualizing the metrics.

### Visualize tracing

Run the following command to start the Jaeger server:

```bash
docker run -d --rm \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 9411:9411 \
  jaegertracing/all-in-one:latest
```

Access the dashboard UI at `localhost:16686`.