Karpenter is recreating the same type of node over a period of time and deleting it in parallel #1851
Comments
We were able to reproduce the issue. It is not a bug in Karpenter itself but rather an inefficiency in how Karpenter's scheduling interacts with the kube-scheduler. The kube-scheduler has full control of pod placement, which conflicts with Karpenter's consolidation calculation during scheduling.
How to Reproduce:
Deployments Configuration:
Node Pool Configuration:
Scale the Deployment:
Setup Complete:
What will happen next?
Expected Scheduling by Karpenter:
Actual Scheduling by Kube-Scheduler:
Observed Issue:
Root Cause:
The deployment, PDB, and NodePool manifests are attached below.
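To make the mismatch described above concrete, here is a toy sketch in Python. It is not Karpenter's or the kube-scheduler's actual algorithm: `simulate_packed`, `schedule_spread`, the node names, and the capacity numbers are all hypothetical, and the "most free capacity first" rule is only an assumed stand-in for a spreading placement policy. It shows how a consolidation simulation can conclude that evicted pods fit on the remaining nodes, while a separate scheduler placing them one at a time in eviction order leaves one pod pending.

```python
# Toy model only: capacities are abstract units, not real CPU/memory requests.

def simulate_packed(free, pods):
    """Hypothetical consolidation check: first-fit, largest pod first."""
    free = dict(free)  # work on a copy so the caller's state is untouched
    unplaced = []
    for size in sorted(pods, reverse=True):
        target = next((name for name, cap in free.items() if cap >= size), None)
        if target is None:
            unplaced.append(size)
        else:
            free[target] -= size
    return unplaced


def schedule_spread(free, pods):
    """Hypothetical scheduler: place pods in eviction order on the emptiest node."""
    free = dict(free)
    unplaced = []
    for size in pods:
        target = max(free, key=free.get)  # node with the most free capacity
        if free[target] < size:
            unplaced.append(size)  # pod stays pending, so a new node is needed
        else:
            free[target] -= size
    return unplaced


remaining = {"node-a": 3, "node-b": 2}  # free capacity on nodes that are kept
evicted = [2, 3]                        # pod sizes, in the order they are drained

predicted = simulate_packed(remaining, evicted)
actual = schedule_spread(remaining, evicted)
print("simulation leaves unplaced:", predicted)
print("scheduler leaves unplaced:", actual)
```

In this toy case the simulation finds room for both pods, but the spreading scheduler puts the small pod on the larger gap first, so the large pod no longer fits anywhere. A pending pod like that is what would prompt a replacement node of the same type, which matches the create-and-delete loop reported in this issue.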
Agree with @sekar-saravanan that this is an unfortunate interaction between Karpenter and the kube-scheduler being different entities, with drain ordering playing some part in which pods end up where. Beyond Karpenter being the scheduler itself, we've talked about some mitigations for this issue. Most notably: the use of
/triage accepted
Increasing the
Description
Observed Behavior:
Karpenter is recreating the same type of node over a period of time and deleting it in parallel when replicas are increased.
Expected Behavior:
Karpenter should not consolidate this often when replicas are increased. Currently, new nodes are created and deleted over a period of time, and pods are marked to be moved to the new nodes.
Reproduction Steps (Please include YAML):
Versions:
Kubernetes Version (kubectl version): v1.31
Attached logs:
karpenter-dec07.log
Questions: