Generative AI Infrastructure v0.8 Release Notes

@kevinintel kevinintel released this 29 Jul 02:21
· 147 commits to main since this release
39c7f46

OPEA Release Notes v0.8

What’s New in OPEA v0.8

  • Broaden functionality

    • Add a GenAI example for frequently asked questions (FAQ) generation
    • Expand LLM support to models such as Llama3.1 and Qwen2, and add support for LVMs such as LLaVA
    • Enable end-to-end performance and accuracy benchmarking
    • Support the experimental Agent microservice
    • Support LLM serving on Ray
  • Multi-platform support

    • Release the Docker images of GenAI components on the OPEA Docker Hub and support deployment with Docker
    • Support cloud-native deployment through Kubernetes manifests and GenAI Microservices Connector (GMC)
    • Enable experimental authentication and authorization support using JWT tokens
    • Validate ChatQnA on multiple platforms such as Xeon, Gaudi, AIPC, Nvidia, and AWS
  • OPEA Docker Hub: https://hub.docker.com/u/opea

Details

GenAIExamples
  • ChatQnA

    • Add ChatQnA instructions for AIPC(26d4ff)
    • Adapt vLLM response format (034541)
    • Update tgi version(5f52a1)
    • Update README.md(f9312b)
    • Update ChatQnA docker compose for Dataprep Update(335362)
    • [Doc] Add valid micro-service details(e878dc)
    • Updates for running ChatQnA + Conversational UI on Gaudi(89ddec)
    • Fix win PC issues(ba6541)
    • [Doc]Add ChatQnA Flow Chart(97da49)
    • Add guardrails in the ChatQnA pipeline(955159)
    • Fix a minor bug for chatqna in docker-compose(b46ae8)
    • Support vLLM/vLLM-on-Ray/Ray Serve for ChatQnA(631d84)
    • Added ChatQnA example using Qdrant retriever(c74564)
    • Update TEI version v1.5 for better performance(f4b4ac)
    • Update ChatQnA upload feature(598484)
    • Add auto truncate for embedding and rerank(8b6094)
  • Deployment

    • Add Kubernetes manifest files for deploying DocSum(831463)
    • Update Kubernetes manifest files for CodeGen(2f9397)
    • Add Kubernetes manifest files for deploying CodeTrans(c9548d)
    • Updated READMEs for Kubernetes example pipelines(c37d9c)
    • Update all examples yaml files of GMC in GenAIExample(290a74)
    • Doc: fix minor issue in GMC doc(d99461)
    • README for installing 4 workloads using helm chart(6e797f)
    • Update Kubernetes manifest files for deploying ChatQnA(665c46)
    • Add new example of SearchQnA for GenAIExample(21b7d1)
    • Add new example of Translation for GenAIExample(d0b028)
  • Other examples

    • Update reranking microservice dockerfile path (d7a5b7)
    • Update tgi-gaudi version(3505bd)
    • Refine README of Examples(f73267)
    • Update READMEs(8ad7f3)
    • [CodeGen] Add codegen flowchart(377dd2)
    • Update audioqna image name(615f0d)
    • Add auto-truncate to gaudi tei (8d4209)
    • Update visualQnA chinese version(497895)
    • Fix Typo for Translation Example(95c13d)
    • FAQGen Megaservice(8c4a25)
    • Code-gen-react-ui(1b48e5)
    • Added doc sum react-ui(edf0d1)
  • CI/UT

    • Frontend failed with unknown timeout issue (7ebe78)
    • Adding Chatqna Benchmark Test(11a56e)
    • Expand tgi connect timeout(ee0dcb)
    • Optimize gmc manifest e2e tests(15fc6f)
    • Add docker compose yaml print for test(bb4230)
    • Refactor translation ci test (b7975e)
    • Refactor searchqna ci test(ecf333)
    • Translate UT for UI(284d85)
    • Enhance the codetrans e2e test(450efc)
    • Allow gmc e2e workflow to get secrets(f45f50)
    • Add checkout ref in gmc e2e workflow(62ae64)
    • SearchQnA UT(268d58)
GenAIComps
  • Cores

    • Support https for microservice(2d6772)
    • Enlarge megaservice request timeout for supporting high concurrency(876ca5)
    • Add dynamic DAG(f2995a)
  • LLM

    • Optional vllm microservice container build(963755)
    • Refine vllm instruction(6e2c28)
    • Introduce 'entrypoint.sh' for some Containers(9ecc5c)
    • Support llamaindex for retrieval microservice and remove langchain(61795f)
    • Update tgi with text-generation-inference:2.1.0(f23694)
    • Fix requirements(f4b029)
    • Add vLLM on Ray microservice(ec3b2e)
    • Update code/readme/UT for Ray Serve and VLLM(dd939c)
    • Allow the Ollama microservice to be configurable with different models(2458e2)
    • LLM performance optimization and code refine(6e31df)
  • DataPrep

    • Support get/delete file in Dataprep Microservice(5d0842)
    • Dataprep | PGVector : Added support for new changes in utils.py(54eb7a)
    • Enhance the dataprep microservice by adding separators(ef97c2)
    • Freeze python-bidi==0.4.2 for dataprep/redis(b4012f)
    • Support delete data for Redis vector db(967fdd)
  • Other Components

    • Remove ingest in Retriever MS(d25d2c)
    • Qdrant retriever microservice(9b658f)
    • Update milvus service for dataprep and retriever(d7cdab)
    • Architecture specific args for a few containers(1dd7d4)
    • Update driver compatible image(1d4664)
    • Fix Llama-Guard-2 issue(6b091c)
    • Embeddings: adaptively detect embedding model arguments in mosec(f164f0)
    • Architecture specific args for langchain guardrails(5e232a)
    • Fix requirements install issue for reranks/fastrag(94e807)
    • Update to remove warnings when building Dockerfiles(3e5dd0)
    • Initiate Agent component(c3f6b2)
    • Add FAQGen gateway in core to support FAQGen Example(9c90eb)
    • Prompt registry(f5a548)
    • Chat History microservice for chat data persistence(30d95b)
    • Align asr output and llm input without using orchestrator(64e042)
    • Doc: add missing in README.md codeblock(2792e2)
  • CI/UT

    • Fix duplicate ci test(33f37c)
    • Build and push new docker images into registry(80da5a)
    • Update image build for gaudi(fe3d22)
    • Add guardrails ut(556030)
GenAIEvals
  • Update lm-eval to 0.4.3(89c825)
  • Add toxicity/bias/hallucination metrics(48015a)
  • Support stress benchmark test(59cb27)
  • Add rag related metrics(83ad9c)
  • Added CRUD Chinese benchmark example(9cc6ca)
  • Add MultiHop English benchmark accuracy(8aa1e6)
GenAIInfra
  • GMC

    • Enable image build on push for gmc(f8a295)
    • Revise workflow to support gmc running in kind(a2dc96)
    • Enable GMC system installation on push(af2d0f)
    • Enhance the switch mode for GMC router service required(f96b0e)
    • Optimize GMC e2e scripts(27a062)
    • Optimize app namespaces and fix some typos in gmc e2e test(9c97fa)
    • Add GMC into README(b25c0b)
    • Gmc: add authN & authZ support on fake JWT token(3756cf)
    • GMC: adopt new common/manifests(b18531)
    • Add new example of searchQnA on both xeon and gaudi(883c8d)
    • Support switch mode in GMC for MI6 team(d11aeb)
    • Add translation example into GMC(6235a9)
    • Gmc: add authN & authZ support on keycloak(3d139b)
    • GMC: Support new component(4c5a51)
    • GMC: update README(d57b94)
  • HelmChart

    • Helm chart: change default global.modelUseHostPath value(8ffc3b)
    • Helm chart: Add readOnlyRootFilesystem to securityContext(9367a9)
    • Update chatqna with additional dependencies(009c96)
    • Update codegen with additional dependencies(d41dd2)
    • Make endpoints configurable by user(486023)
    • Add data prep component(384931)
    • The microservice port number is not configurable(fbaa6a)
    • Add MAX_INPUT_TOKENS to tgi(2fcbb0)
    • Add script to generate yaml files from helm-charts(6bfe31)
    • Helm: support adding extra env from external configmap(7dabdf)
    • Helm: expose dataprep configurable items into value file(83fc1a)
    • Helm: upgrade version to 0.8.0(b3cbde)
    • Add whisper and asr components(9def61)
    • Add tts and speecht5 components helm chart(9d1465)
    • Update the script to generate comp manifest(ab53e9)
    • Helm: remove unused Probes(c1cff5)
    • Helm: Add tei-gaudi support(a456bf)
    • Helm redis-vector-db: Add missings in value file(9e15ef)
    • Helm: Use empty string instead of null in value files(6151ac)
    • Add component k8s manifest files(68483c)
    • Add helm test for chart redis-vector-db(236381)
    • Add helm test for chart tgi(9b5def)
    • Add helm test for chart tei(f5c7fa)
    • Add helm test for chart teirerank(00532a)
    • Helm test: Make curl fail if http_status > 400 returned(92c4b5)
    • Add helm test for chart embedding-usvc(a98561)
    • Add helm test for chart llm-uservice(f4f3ea)
    • Add helm test for chart reranking-usvc(397208)
    • Add helm test for chart retriever-usvc(6db408)
    • Helm: Support automatically installing dependency charts(dc90a5)
    • Helm: Support removing helm dependencies(fbdb1d)
    • Helm: upgrade tgi chart(c3a1c1)
    • Helm/manifest: update tei config for tei-gaudi(88b3c1)
    • Add CodeTrans helm chart(5b05f9)
    • Helm: Update chatqna to latest(7ff03b)
    • Add DocSum helm chart(b56116)
    • Add docsum support for helm test(f6354b)
    • Helm: Update codegen to latest(419e5b)
    • Fix codegen helm chart readme(b4b28e)
    • Disable runAsRoot for speecht5 and whisper(aeef78)
    • Use upstream tei-gaudi image(e4d3ff)
  • Others

    • Enhance the e2e test for GenAIInfra to fix some bugs(602af5)
    • Fix bugs for router on handling response from pipeline microservices(ef47f9)
    • Improve the examples of codegen and codetrans e2e test(07494c)
    • Remove the dependencies of common microservices(f6dd87)
    • Add scripts for KubeRay and Ray Cluster(7d3d13)
    • Enable CI for common components(9e27a0)
    • Disable common component test(e1cd50)
    • CI for common: avoid false error in helm test result(876b7a)
    • Add the init input for pipeline to keep the parameter information(e25a1f)
    • Adjust CI gaudi version(d75d8f)
    • Fix CHART_MOUNT and HFTOKEN for CI(10b908)
    • Change tgi tag because gaudi driver is upgraded to 1.16.1 (6796ef)
    • Update README for new manifests(ec32bf)
    • Support multiple router services in one namespace(0ac732)
    • Improve workflow trigger conditions to be more precise(ab5c8d)
    • Remove unnecessary component DocSumGaudi which would cause error(9b973a)
    • Remove chart_test scripts and add script to dump pod status(88caf0)