aws-eks: manage nodegroups GPU instance types not up to date #31347

AlexKaracaoglu opened this issue Sep 6, 2024 · 3 comments · May be fixed by #31445
Labels
@aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service bug This issue is a bug. effort/small Small work item – less than a day of effort p2

Comments

@AlexKaracaoglu

Describe the bug

As an AWS CDK user for EKS, I want to build a managed node group with a mix of G5/G6/G6e instance types.

This is not possible with the current isGpuInstanceType check for managed node groups (here | architecture mapping here). When I specify both G5 and G6 instance types, I get the error "instanceTypes of different architectures is not allowed", because G6 and G6e are missing from knownGpuInstanceTypes.
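To make the failure mode concrete, here is a minimal, self-contained sketch of how a check like this can reject a G5/G6 mix. It is a simplified model, not the aws-cdk-lib source: instance types are plain strings, the contents of the lists are assumptions, and `instanceClassOf`, `typeToArch`, `checkArchitectures`, and `arm64Classes` are hypothetical names used only for illustration.

```typescript
// Simplified model of the check described above. NOT the aws-cdk-lib source.
// Only knownGpuInstanceTypes, isGpuInstanceType, and the error message come
// from the issue; every other name here is hypothetical.

type Arch = 'GPU' | 'arm64' | 'x86_64';

// Assumed list contents: G5 is present, G6 and G6e are missing.
const knownGpuInstanceTypes: string[] = ['p2', 'p3', 'p4d', 'g4dn', 'g5'];

// Hypothetical, deliberately incomplete list of Arm-based classes.
const arm64Classes: string[] = ['m6g', 'c6g', 't4g'];

// The instance class is the part before the dot, e.g. "g5" in "g5.xlarge".
function instanceClassOf(instanceType: string): string {
  return instanceType.split('.')[0];
}

function isGpuInstanceType(instanceType: string, known: string[]): boolean {
  return known.indexOf(instanceClassOf(instanceType)) !== -1;
}

// Anything not recognized as GPU falls through to its CPU architecture.
function typeToArch(instanceType: string, known: string[]): Arch {
  if (isGpuInstanceType(instanceType, known)) {
    return 'GPU';
  }
  if (arm64Classes.indexOf(instanceClassOf(instanceType)) !== -1) {
    return 'arm64';
  }
  return 'x86_64';
}

// Throws when the instance types do not all map to the same architecture.
function checkArchitectures(
  instanceTypes: string[],
  known: string[] = knownGpuInstanceTypes,
): Arch {
  const seen: Arch[] = [];
  for (const t of instanceTypes) {
    const arch = typeToArch(t, known);
    if (seen.indexOf(arch) === -1) {
      seen.push(arch);
    }
  }
  if (seen.length === 0) {
    throw new Error('at least one instance type is required');
  }
  if (seen.length > 1) {
    throw new Error('instanceTypes of different architectures is not allowed');
  }
  return seen[0];
}
```

In this model, `checkArchitectures(['g5.xlarge', 'g6.xlarge'])` throws because `g6` is not in the list and falls through to `x86_64`, while passing a list extended with `'g6'` and `'g6e'` as the second argument lets the same mix through as `GPU`.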

Additionally, there is currently no instance class for G6e instances in the EC2 package: https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-ec2/lib/instance-types.ts

Regression Issue

  • Select this option if this issue appears to be a regression.

Last Known Working CDK Version

No response

Expected Behavior

I expect to be able to build a managed node group that mixes G5, G6, and G6e instance classes, because their architectures are compatible.

Current Behavior

The error 'instanceTypes of different architectures is not allowed' is thrown.

I cannot reference a G6e instance class.

Reproduction Steps

Create a managed node group with instance types from a mix of the G5 and G6 instance families; the error above is thrown.
Try to reference the G6e instance class; it does not exist.

Possible Solution

To fix the G5/G6 issue:


To enable G6e instance types and get those compatible as recognized GPU instance types:

Additional Information/Context

Happy to open a PR to handle this - just wanted to get the discussion going

CDK CLI Version

2.156.0

Framework Version

No response

Node.js Version

20.17.0

OS

macOS 14.6.1

Language

TypeScript

Language Version

Version 5.5.4

Other information

No response

@AlexKaracaoglu AlexKaracaoglu added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 6, 2024
@github-actions github-actions bot added the @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service label Sep 6, 2024
@khushail khushail added investigating This issue is being investigated and/or work is in progress to resolve the issue. p2 and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. needs-triage This issue or PR still needs to be triaged. labels Sep 6, 2024
@khushail khushail self-assigned this Sep 6, 2024
@pahud
Contributor

pahud commented Sep 9, 2024

Your proposed solutions make good sense to me. Are you interested in submitting a PR for that?

@khushail khushail removed the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Sep 9, 2024
@khushail khushail removed their assignment Sep 9, 2024
@ashishdhingra ashishdhingra added the effort/small Small work item – less than a day of effort label Sep 10, 2024
@AlexKaracaoglu
Author

> Your proposed solutions make good sense to me. Are you interested in submitting a PR for that?

@pahud - definitely interested, I'll put something together over the upcoming days

@AlexKaracaoglu
Author

@pahud - PR here: #31445
