OPEA Release Notes v0.8

What’s New in OPEA v0.8

Broaden functionality
- Add a GenAI example for frequently asked questions (FAQ) generation
- Expand LLM support to models such as Llama3.1 and Qwen2, and add support for LVMs such as LLaVA
- Enable end-to-end performance and accuracy benchmarking
- Support the experimental Agent microservice
- Support LLM serving on Ray
Multi-platform support
- Release the Docker images of GenAI components under the OPEA Docker Hub organization and support deployment with Docker
- Support cloud-native deployment through Kubernetes manifests and GenAI Microservices Connector (GMC)
- Enable experimental authentication and authorization support using JWT tokens
- Validate ChatQnA on multiple platforms such as Xeon, Gaudi, AIPC, Nvidia, and AWS
OPEA Docker Hub: https://hub.docker.com/u/opea

Details

GenAIExamples

ChatQnA
- Add ChatQnA instructions for AIPC(26d4ff)
- Adapt vLLM response format (034541)
- Update tgi version(5f52a1)
- Update README.md(f9312b)
- Update ChatQnA docker compose for Dataprep Update(335362)
- [Doc] Add valid micro-service details(e878dc)
- Updates for running ChatQnA + Conversational UI on Gaudi(89ddec)
- Fix Windows PC issues(ba6541)
- [Doc] Add ChatQnA Flow Chart(97da49)
- Add guardrails in the ChatQnA pipeline(955159)
- Fix a minor bug for chatqna in docker-compose(b46ae8)
- Support vLLM/vLLM-on-Ray/Ray Serve for ChatQnA(631d84)
- Added ChatQnA example using Qdrant retriever(c74564)
- Update TEI version v1.5 for better performance(f4b4ac)
- Update ChatQnA upload feature(598484)
- Add auto truncate for embedding and rerank(8b6094)
Deployment
- Add Kubernetes manifest files for deploying DocSum(831463)
- Update Kubernetes manifest files for CodeGen(2f9397)
- Add Kubernetes manifest files for deploying CodeTrans(c9548d)
- Updated READMEs for Kubernetes example pipelines(c37d9c)
- Update all examples yaml files of GMC in GenAIExample(290a74)
- Doc: fix minor issue in GMC doc(d99461)
- README for installing 4 workloads using helm chart(6e797f)
- Update Kubernetes manifest files for deploying ChatQnA(665c46)
- Add new example of SearchQnA for GenAIExample(21b7d1)
- Add new example of Translation for GenAIExample(d0b028)
Other examples
- Update reranking microservice dockerfile path (d7a5b7)
- Update tgi-gaudi version(3505bd)
- Refine README of Examples(f73267)
- Update READMEs(8ad7f3)
- [CodeGen] Add codegen flowchart(377dd2)
- Update audioqna image name(615f0d)
- Add auto-truncate to gaudi tei (8d4209)
- Update VisualQnA Chinese version(497895)
- Fix Typo for Translation Example(95c13d)
- FAQGen Megaservice(8c4a25)
- Code-gen-react-ui(1b48e5)
- Added doc sum react-ui(edf0d1)
CI/UT
- Frontend failed with unknown timeout issue (7ebe78)
- Adding Chatqna Benchmark Test(11a56e)
- Expand tgi connect timeout(ee0dcb)
- Optimize gmc manifest e2e tests(15fc6f)
- Add docker compose yaml print for test(bb4230)
- Refactor translation ci test (b7975e)
- Refactor searchqna ci test(ecf333)
- Translate UT for UI(284d85)
- Enhance the codetrans e2e test(450efc)
- Allow gmc e2e workflow to get secrets(f45f50)
- Add checkout ref in gmc e2e workflow(62ae64)
- SearchQnA UT(268d58)

GenAIComps

Cores
- Support https for microservice(2d6772)
- Enlarge megaservice request timeout for supporting high concurrency(876ca5)
- Add dynamic DAG(f2995a)
LLM
- Optional vllm microservice container build(963755)
- Refine vllm instruction(6e2c28)
- Introduce 'entrypoint.sh' for some Containers(9ecc5c)
- Support llamaindex for retrieval microservice and remove langchain(61795f)
- Update tgi with text-generation-inference:2.1.0(f23694)
- Fix requirements(f4b029)
- Add vLLM on Ray microservice(ec3b2e)
- Update code/readme/UT for Ray Serve and vLLM(dd939c)
- Allow the Ollama microservice to be configurable with different models(2458e2)
- LLM performance optimization and code refine(6e31df)
DataPrep
- Support get/delete file in Dataprep Microservice(5d0842)
- Dataprep | PGVector : Added support for new changes in utils.py(54eb7a)
- Enhance the dataprep microservice by adding separators(ef97c2)
- Freeze python-bidi==0.4.2 for dataprep/redis(b4012f)
- Support delete data for Redis vector db(967fdd)
Other Components
- Remove ingest in Retriever MS(d25d2c)
- Qdrant retriever microservice(9b658f)
- Update milvus service for dataprep and retriever(d7cdab)
- Architecture specific args for a few containers(1dd7d4)
- Update driver compatible image(1d4664)
- Fix Llama-Guard-2 issue(6b091c)
- Embeddings: adaptively detect embedding model arguments in mosec(f164f0)
- Architecture specific args for langchain guardrails(5e232a)
- Fix requirements install issue for reranks/fastrag(94e807)
- Update to remove warnings when building Dockerfiles(3e5dd0)
- Initiate Agent component(c3f6b2)
- Add FAQGen gateway in core to support FAQGen Example(9c90eb)
- Prompt registry(f5a548)
- Chat History microservice for chat data persistence(30d95b)
- Align asr output and llm input without using orchestrator(64e042)
- Doc: add missing in README.md codeblock(2792e2)
CI/UT
- Fix duplicate ci test(33f37c)
- Build and push new docker images into registry(80da5a)
- Update image build for gaudi(fe3d22)
- Add guardrails ut(556030)

GenAIEvals

- Update lm-eval to 0.4.3(89c825)
- Add toxicity/bias/hallucination metrics(48015a)
- Support stress benchmark test(59cb27)
- Add RAG-related metrics(83ad9c)
- Added CRUD Chinese benchmark example(9cc6ca)
- Add MultiHop English benchmark accuracy(8aa1e6)

GenAIInfra

GMC
- Enable image build on push for gmc(f8a295)
- Revise workflow to support gmc running in kind(a2dc96)
- Enable GMC system installation on push(af2d0f)
- Enhance the switch mode for GMC router service required(f96b0e)
- Optimize GMC e2e scripts(27a062)
- Optimize app namespaces and fix some typos in gmc e2e test(9c97fa)
- Add GMC into README(b25c0b)
- GMC: add authN & authZ support on fake JWT token(3756cf)
- GMC: adopt new common/manifests(b18531)
- Add new example of searchQnA on both xeon and gaudi(883c8d)
- Support switch mode in GMC for MI6 team(d11aeb)
- Add translation example into GMC(6235a9)
- GMC: add authN & authZ support on keycloak(3d139b)
- GMC: Support new component(4c5a51)
- GMC: update README(d57b94)
HelmChart
- Helm chart: change default global.modelUseHostPath value(8ffc3b)
- Helm chart: Add readOnlyRootFilesystem to securityContext(9367a9)
- Update chatqna with additional dependencies(009c96)
- Update codegen with additional dependencies(d41dd2)
- Make endpoints configurable by user(486023)
- Add data prep component(384931)
- The microservice port number is not configurable(fbaa6a)
- Add MAX_INPUT_TOKENS to tgi(2fcbb0)
- Add script to generate yaml files from helm-charts(6bfe31)
- Helm: support adding extra env from external configmap(7dabdf)
- Helm: expose dataprep configurable items into value file(83fc1a)
- Helm: upgrade version to 0.8.0(b3cbde)
- Add whisper and asr components(9def61)
- Add tts and speecht5 components helm chart(9d1465)
- Update the script to generate comp manifest(ab53e9)
- Helm: remove unused Probes(c1cff5)
- Helm: Add tei-gaudi support(a456bf)
- Helm redis-vector-db: Add missings in value file(9e15ef)
- Helm: Use empty string instead of null in value files(6151ac)
- Add component k8s manifest files(68483c)
- Add helm test for chart redis-vector-db(236381)
- Add helm test for chart tgi(9b5def)
- Add helm test for chart tei(f5c7fa)
- Add helm test for chart teirerank(00532a)
- Helm test: Make curl fail if http_status > 400 returned(92c4b5)
- Add helm test for chart embedding-usvc(a98561)
- Add helm test for chart llm-uservice(f4f3ea)
- Add helm test for chart reranking-usvc(397208)
- Add helm test for chart retriever-usvc(6db408)
- Helm: Support automatic installation of dependency charts(dc90a5)
- Helm: support removing helm dependency(fbdb1d)
- Helm: upgrade tgi chart(c3a1c1)
- Helm/manifest: update tei config for tei-gaudi(88b3c1)
- Add CodeTrans helm chart(5b05f9)
- Helm: Update chatqna to latest(7ff03b)
- Add DocSum helm chart(b56116)
- Add docsum support for helm test(f6354b)
- Helm: Update codegen to latest(419e5b)
- Fix codegen helm chart readme(b4b28e)
- Disable runAsRoot for speecht5 and whisper(aeef78)
- Use upstream tei-gaudi image(e4d3ff)
Others
- Enhance the e2e test for GenAIInfra to fix some bugs(602af5)
- Fix bugs for router on handling response from pipeline microservices(ef47f9)
- Improve the examples of codegen and codetrans e2e test(07494c)
- Remove the dependencies of common microservices(f6dd87)
- Add scripts for KubeRay and Ray Cluster(7d3d13)
- Enable CI for common components(9e27a0)
- Disable common component test(e1cd50)
- CI for common: avoid false error in helm test result(876b7a)
- Add the init input for pipeline to keep the parameter information(e25a1f)
- Adjust CI gaudi version(d75d8f)
- Fix CHART_MOUNT and HFTOKEN for CI(10b908)
- Change tgi tag because gaudi driver is upgraded to 1.16.1 (6796ef)
- Update README for new manifests(ec32bf)
- Support multiple router service in one namespace(0ac732)
- Improve workflow trigger conditions to be more precise(ab5c8d)
- Remove unnecessary component DocSumGaudi which would cause error(9b973a)
- Remove chart_test scripts and add script to dump pod status(88caf0)
