Terraform template for different GenAI Examples #427
Comments
What is the end goal? Are we attempting to make it easy to create a demo, or do we want to make reusable modules that allow others to launch production environments? The reason I ask is that the approach will be very different depending on the goal. I think ultimately we want to build reusable modules, but there is a lot that will go into that, so I also understand why pushing out simple examples first would make sense.
The answer is "and" instead of "or". I'd like to have a simple demo that shows a quick run of OPEA on these CSPs, and then eventually modules that others can build on.
We have a template for AWS, but we are looking to cover other CSPs and more of the GenAIExamples. https://github.com/intel/optimized-cloud-recipes/tree/main/recipes/ai-opea-chatqna-xeon
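For orientation, here is a minimal sketch of how a consumer might reuse the existing AWS ChatQnA example as a module from their own root configuration. The `//examples/...` source path follows the repository linked later in this issue; the commented inputs (`aws_region`, `instance_type`) are hypothetical and would need to match the variables the example actually declares.

```hcl
# Minimal root module that pins and reuses the existing AWS ChatQnA example.
# The inputs shown are illustrative placeholders, not the example's real variables.
module "opea_chatqna_aws" {
  source = "github.com/intel/terraform-intel-aws-vm//examples/gen-ai-xeon-opea-chatqna"

  # Hypothetical inputs -- adjust to the example's variables.tf.
  # aws_region    = "us-east-1"
  # instance_type = "m7i.4xlarge"
}
```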
Our Intel team has developed and maintains 31 Terraform modules and some Ansible modules. TF modules: https://github.com/orgs/intel/repositories?q=terraform-&type=all&language=&sort= As pointed out, we are now showcasing OPEA examples end-to-end; @wsfowler kicked us off with the first one for AWS, linked above. The current focus is Azure/AWS/GCP, with no plans yet to support other CSPs. I'll work with the team to focus on two new ones:
Do we know which OPEA use cases (page views?) are getting the most attention so we can prioritize? Does this help? Open to discussing if needed (MS Teams).
By the way, we keep them modular for many reasons. They all work independently, but better together.
ChatQnA on Amazon, Azure, and GCP is a great start. Can the Terraform templates show how to deploy this sample on each CSP's Kubernetes service? Are the Terraform templates modular so that other samples can be accommodated when ready? Are there any discussions about an Ansible script to deploy on OpenShift?
Cool, thanks for the feedback.
We have modules to deploy EKS, GKE, and AKS. In short, yes: we would depend on and use the OPEA Helm charts as the integration point. If those are available, it should not be too hard.
For example, the same OPEA ChatQnA Ansible module will be used in the Azure/GCP Terraform examples.
We have not looked into OpenShift at all; due to priorities, the focus so far has been on first-party cloud services.
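As a rough illustration of what "Helm charts as the integration point" could look like from Terraform once an EKS/GKE/AKS cluster exists, here is a minimal sketch. The provider wiring assumes a kubeconfig for the provisioned cluster, and the chart repository URL is a placeholder assumption; the real location should come from the published OPEA Helm charts.

```hcl
# Sketch: install the ChatQnA Helm chart onto an existing managed Kubernetes
# cluster via the Terraform helm provider. Chart repository is a placeholder.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config" # assumes the cluster was provisioned earlier in the plan
  }
}

resource "helm_release" "chatqna" {
  name             = "chatqna"
  namespace        = "opea"
  create_namespace = true

  # Hypothetical chart source -- replace with the actual OPEA chart location.
  repository = "https://example.github.io/opea-helm-charts"
  chart      = "chatqna"
}
```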
I would suggest the following priority: ChatQnA on Amazon and Microsoft Azure.
Thanks for the feedback; let me bring this up with the team.
So @arun-gupta is suggesting first single-node deployments on AWS and Microsoft Azure, then Red Hat OpenShift, followed by Kubernetes variants on popular CSPs: Amazon, Google, and Microsoft Azure.
@mkbhanda just to be clear: single-node deployments on AWS, Microsoft Azure, and Google using Docker Compose, then Red Hat OpenShift, and then the K8s distros on the hyperscalers in the order mentioned above.
Yes, for single nodes we will leverage the OPEA Docker Compose files.
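A single-node template along these lines might simply boot a VM and bring up the ChatQnA compose stack via user data. The sketch below is one way to do that on AWS; the AMI ID, instance type, and compose file path are assumptions and should be checked against the current GenAIExamples layout.

```hcl
# Sketch: single-node ChatQnA via Docker Compose in EC2 user data.
# AMI, instance type, and compose path are illustrative placeholders.
resource "aws_instance" "chatqna_single_node" {
  ami           = "ami-0123456789abcdef0" # placeholder Ubuntu AMI
  instance_type = "m7i.4xlarge"           # sized for Xeon-based ChatQnA; adjust as needed

  user_data = <<-EOF
    #!/bin/bash
    apt-get update && apt-get install -y docker.io docker-compose-v2 git
    git clone https://github.com/opea-project/GenAIExamples.git /opt/GenAIExamples
    # Path is an assumption -- point at the ChatQnA compose file for Xeon.
    cd /opt/GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon
    docker compose up -d
  EOF

  tags = {
    Name = "opea-chatqna-demo"
  }
}
```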
This is an issue for the OPEA Hackathon. @lucasmelogithub, if you are going to do this during October, great; if not, can you unassign yourself so we can have someone else work on it?
PR for ChatQnA for AWS and GCP created |
PR created for ChatQnA on AWS EKS, including persistent volume support for model data and a LoadBalancer service type for external consumption.
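For readers unfamiliar with how those two EKS concerns map to Terraform, here is a hedged sketch: a persistent volume claim for model data via the kubernetes provider, and a Helm release overriding the service type. The Helm value keys (`service.type`, `persistence.existingClaim`) are assumptions and must be taken from the actual ChatQnA chart and PR.

```hcl
# Sketch: persistent model storage plus external LoadBalancer exposure on EKS.
# Storage size, storage class, and chart value keys are illustrative.
resource "kubernetes_persistent_volume_claim" "model_data" {
  metadata {
    name      = "chatqna-model-data"
    namespace = "opea"
  }
  spec {
    access_modes       = ["ReadWriteOnce"]
    storage_class_name = "gp3" # assumes the EBS CSI driver is installed
    resources {
      requests = {
        storage = "100Gi" # assumed size for downloaded model weights
      }
    }
  }
}

resource "helm_release" "chatqna_eks" {
  name      = "chatqna"
  namespace = "opea"
  chart     = "chatqna" # chart source omitted; see the Helm sketch above

  # Hypothetical value keys -- take the real ones from the chart/PR.
  values = [yamlencode({
    service = {
      type = "LoadBalancer"
    }
    persistence = {
      existingClaim = kubernetes_persistent_volume_claim.model_data.metadata[0].name
    }
  })]
}
```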
There is a Terraform template for the ChatQnA sample at https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-xeon-opea-chatqna. This only targets AWS.
We need a Terraform template for all GenAI examples that can be used to deploy on different cloud providers. Here are the target CSPs:
Each template should require the least amount of configuration, assume reasonable defaults while remaining configurable, and be able to deploy the entire sample with a public IP address for running the sample.
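The "reasonable defaults, still configurable, ends with a public IP" requirement can be expressed directly in each template's interface. A minimal sketch follows; the variable names and the `aws_instance` reference are illustrative (the instance corresponds to the single-node sketch above), not a fixed contract.

```hcl
# Sketch: every input carries a default, and the template's only essential
# output is the public endpoint of the deployed sample.
variable "region" {
  description = "Cloud region to deploy into"
  type        = string
  default     = "us-east-1"
}

variable "instance_type" {
  description = "VM size for the GenAI example"
  type        = string
  default     = "m7i.4xlarge"
}

output "chatqna_url" {
  description = "Public endpoint for the deployed sample"
  value       = "http://${aws_instance.chatqna_single_node.public_ip}"
}
```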