Instance Scheduler should allow for instance type flexibility #501

Open
playphil opened this issue Dec 13, 2023 · 10 comments

@playphil

Is your feature request related to a problem? Please describe.
Sometimes there is not enough capacity in the AZ for a single specified instance type, which results in an Insufficient Capacity Error (ICE). This occurs more frequently for uncommon instance types, and may also occur during periods of high demand. If another AZ or region is impaired, a thundering herd of requests to launch instances can push demand beyond available capacity. To avoid ICEs, it is advisable for automation to be flexible about both which AZ and which instance type are acceptable for launch.

Launch Templates can specify multiple possible instance types
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/create-launch-template.html#lt-instance-type

EC2 Fleets add further flexibility through features such as "attribute-based instance type selection": https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-fleet-attribute-based-instance-type-selection.html
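
To make the fleet idea concrete, here is a minimal sketch of how a request with several acceptable instance types and subnets (and therefore AZs) could be assembled. The function name `build_fleet_request` and its arguments are hypothetical and not part of Instance Scheduler; the dictionary keys follow the EC2 CreateFleet request shape as I understand it, so please check them against the API reference before relying on this.

```python
from __future__ import annotations


def build_fleet_request(
    launch_template_id: str,
    instance_types: list[str],
    subnet_ids: list[str],
) -> dict:
    """Build keyword arguments for an EC2 CreateFleet call (illustrative only).

    One override is emitted per (instance type, subnet) pair, so the fleet
    may pick any listed type in any listed subnet. A lower Priority value
    means "more preferred"; every subnet of a type shares that type's rank.
    """
    overrides = [
        {"InstanceType": itype, "SubnetId": subnet, "Priority": float(rank)}
        for rank, itype in enumerate(instance_types)
        for subnet in subnet_ids
    ]
    return {
        "Type": "instant",  # one-shot synchronous request
        "LaunchTemplateConfigs": [
            {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateId": launch_template_id,
                    "Version": "$Default",
                },
                "Overrides": overrides,
            }
        ],
        "TargetCapacitySpecification": {
            "TotalTargetCapacity": 1,
            "DefaultTargetCapacityType": "on-demand",
        },
        # Ask the fleet to honor the Priority order given in the overrides.
        "OnDemandOptions": {"AllocationStrategy": "prioritized"},
    }
```

With boto3 the result would be passed as `ec2.create_fleet(**build_fleet_request(...))`. One caveat: a fleet launches new instances rather than restarting an existing stopped one, so this would only fit workloads that can be recreated from a launch template.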

Describe the feature you'd like
The Instance Scheduler solution should allow optional use of EC2 Fleets and Launch Templates, as these are standard mechanisms that allow instance type flexibility and AZ flexibility. This will help everyone get back to work by launching their next-preferred instance type on days when their first choice is not available.

Additional context

@CrypticCabub
Member

Hi @playphil

Thanks for submitting this FR! The correct handling of ICE errors is an ongoing discussion, and you present some interesting ideas that I will bring to the rest of the team for consideration.

@playphil
Author

Great, yes. Another way might be to allow each user to configure a small list of additional instance types, sizes, or qualities to try when an ICE is received while attempting to re-launch an existing instance that is not already part of a fleet or launch template.

@ashraf133

Hello,
I have the same issue. Maybe you could add a new tag that contains a catalog of EC2 types: if the first type encounters an error, the second one is tried.

@playphil
Author

playphil commented Apr 4, 2024

Hello i have the same issue, you can maybe add a new tag that contain ec2 types catalog if the first encounter an error then it takes the second

Awesome idea, it gives a simple way that everyone could adopt incrementally. The optional additional tag values on some instances could then be attempted for launch any time an ICE is found in CloudTrail. Example:
preferred-instance-types = m7i.xlarge,m6i.xlarge,m5a.xlarge,c7i.xlarge,r7i.xlarge
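
As a rough illustration of how such a tag could be read, the sketch below turns the comma-separated value into an ordered list of candidates. Both function names are hypothetical; the tag key is the one proposed above, and the `[{"Key": ..., "Value": ...}]` shape is assumed to match what EC2 DescribeInstances returns for tags.

```python
from __future__ import annotations


def parse_preferred_instance_types(tag_value: str) -> list[str]:
    """Split a comma-separated tag value into an ordered list.

    Whitespace and empty entries are dropped, and duplicates are removed
    while keeping the first occurrence, so the order still expresses preference.
    """
    seen: set[str] = set()
    ordered: list[str] = []
    for raw in tag_value.split(","):
        instance_type = raw.strip().lower()
        if instance_type and instance_type not in seen:
            seen.add(instance_type)
            ordered.append(instance_type)
    return ordered


def preferred_types_from_tags(
    tags: list[dict], key: str = "preferred-instance-types"
) -> list[str]:
    """Return the parsed preference list, or [] when the tag is absent."""
    for tag in tags:
        if tag.get("Key") == key:
            return parse_preferred_instance_types(tag.get("Value", ""))
    return []
```

Since EC2 tag values are limited to 256 characters, a list in this format has room for roughly twenty instance types, which seems enough for a fallback chain.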

@ashraf133

Hello, Any update?

@shujacks
Member

Hi @playphil and @ashraf133, thanks for reaching out. To help us prioritize items in our backlog, can you please let us know which company you are representing and what your specific use cases are?

@playphil
Author

@shujacks the use case is explained in the issue. Please review the full contents here with a tech lead who understands the nature of EC2 physical capacity constraints and ICE. This is important for all large customers with thousands of instances or more. Instance type flexibility becomes even more important during world events, natural disasters, and with instance types that are in short supply in a given AZ. Instance type flexibility is a core principle of scalable, reliable use of EC2 instances. The companies we currently represent are of no relevance to this issue.

@ashraf133

@shujacks, we encounter EC2 capacity problems every day when starting instances:
An error occurred (InsufficientInstanceCapacity) when calling the StartInstances operation

The solution we suggest is adding a new tag that contains a list of EC2 instance types. Try to start the instance with the first type; if that succeeds, everything is good, otherwise try the second type, and so on.
As mentioned by @playphil: preferred-instance-types = m7i.xlarge,m6i.xlarge,m5a.xlarge,c7i.xlarge,r7i.xlarge
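
The retry loop described here could look roughly like the sketch below. It is not how Instance Scheduler works today; `start_with_fallback` is a hypothetical helper, and `ec2` is assumed to be a boto3 EC2 client (or any object exposing the same `start_instances` and `modify_instance_attribute` methods). The error code is read from the `response` attribute that botocore's `ClientError` carries.

```python
from __future__ import annotations

ICE_CODE = "InsufficientInstanceCapacity"


def _error_code(exc: Exception) -> str:
    """Read the AWS error code from a botocore-style ClientError, if present."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "")


def start_with_fallback(ec2, instance_id: str, fallback_types: list[str]) -> str | None:
    """Start a stopped instance, falling back to other types on ICE.

    The first attempt keeps the instance's current type. Each later attempt
    switches the (still stopped) instance to the next fallback type and tries
    again. Returns the fallback type that worked, or None if the current type
    started without any change.
    """
    last_error: Exception | None = None
    for instance_type in [None, *fallback_types]:
        if instance_type is not None:
            # The type can only be changed while the instance is stopped,
            # which is the case here because the previous start failed.
            ec2.modify_instance_attribute(
                InstanceId=instance_id, InstanceType={"Value": instance_type}
            )
        try:
            ec2.start_instances(InstanceIds=[instance_id])
            return instance_type
        except Exception as exc:
            if _error_code(exc) != ICE_CODE:
                raise  # only capacity errors trigger a fallback
            last_error = exc
    raise RuntimeError(f"no capacity for {instance_id} on any candidate type") from last_error
```

A few things this sketch leaves open: the fallback types must be compatible with the instance (architecture, AMI, networking), the instance stays on whichever type was tried last, and nothing here moves it back to the first choice on the next scheduled start.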

@shujacks
Member

Hi @playphil @ashraf133, thanks for the response. We certainly understand the EC2 capacity issue. However, since we have a long backlog to evaluate, the product team has requested the following information to prioritize this feature request: 1. the size of the deployment, 2. how long you have been using the solution, and 3. the use case (e.g. a dev account, or testing purposes).

The answers to these questions will help us prioritize all customer asks, thank you.

@ashraf133

I have used it for around 2000 EC2 and 300 RDS instances since 2022, for all environments.
