Karpenter does not drain the node before sending shutdown signal #1894

Open
ShubhamKr11 opened this issue Dec 23, 2024 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@ShubhamKr11

Description

Observed Behavior:

  • Karpenter does not drain the node before the node shutdown signal reaches the kubelet.
  • Kubelet logs for one node and the Karpenter logs for the same node are attached below. Please note the timeline across both sets of logs.
  • Related kubelet logs:
2024-12-03 14:31:59.602	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.602338    2220 nodeshutdown_manager_linux.go:265] \"Shutdown manager detected new shutdown event, isNodeShuttingDownNow\" event=true"}
2024-12-03 14:31:59.602	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.602393    2220 nodeshutdown_manager_linux.go:322] \"Shutdown manager processing shutdown event\""}
2024-12-03 14:31:59.604	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.604475    2220 kubelet_node_status.go:669] \"Recording event message for node\" node=\"i-xxx\" event=\"NodeNotReady\""}
2024-12-03 14:31:59.604	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.604510    2220 setters.go:552] \"Node became not ready\" node=\"i-xxx\" condition={\"type\":\"Ready\",\"status\":\"False\",\"lastHeartbeatTime\":\"2024-12-03T09:01:59Z\",\"lastTransitionTime\":\"2024-12-03T09:01:59Z\",\"reason\":\"KubeletNotReady\",\"message\":\"node is shutting down\"}"}
2024-12-03 14:31:59.605	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.605119    2220 nodeshutdown_manager_linux.go:375] \"Shutdown manager killing pod with gracePeriod\" pod=\"kube-system/kube-proxy-i-xxx\" gracePeriod=20"}
2024-12-03 14:31:59.605	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.605294    2220 kuberuntime_container.go:745] \"Killing container with a grace period\" pod=\"kube-system/kube-proxy-i-xxx\" podUID=\"e39ab0aac325868d61054ba7f351a6fe\" containerName=\"kube-proxy\" containerID=\"containerd://3e44355e38045e3c954ec8b4f38d022c65c17ed5a21f181330b6ac6b55cc199f\" gracePeriod=20"}
2024-12-03 14:31:59.605	{"stime":"Dec  3 09:01:59","pid":"1933","message":"time=\"2024-12-03T09:01:59.605566402Z\" level=info msg=\"StopContainer for \\\"3e44355e38045e3c954ec8b4f38d022c65c17ed5a21f181330b6ac6b55cc199f\\\" with timeout 20 (s)\""}
2024-12-03 14:31:59.606	{"stime":"Dec  3 09:01:59","pid":"1933","message":"time=\"2024-12-03T09:01:59.605917479Z\" level=info msg=\"Stop container \\\"3e44355e38045e3c954ec8b4f38d022c65c17ed5a21f181330b6ac6b55cc199f\\\" with signal terminated\""}
2024-12-03 14:31:59.719	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.719789    2220 nodeshutdown_manager_linux.go:395] \"Shutdown manager finished killing pod\" pod=\"kube-system/kube-proxy-i-xxx\""}
------ similar logs for other pods ------ 
2024-12-03 14:31:59.720	{"stime":"Dec  3 09:01:59","pid":"2220","message":"I1203 09:01:59.719828    2220 nodeshutdown_manager_linux.go:375] \"Shutdown manager killing pod with gracePeriod\" pod=\"logging/fluent-bit-sgkn4\" gracePeriod=10"}
...
  • Related Karpenter logs:
2024-12-03 14:29:40.306	{"host":"*.*.*.*","log":"{\"level\":\"DEBUG\",\"time\":\"2024-12-03T08:59:40.305Z\",\"logger\":\"controller\",\"caller\":\"disruption/controller.go:91\",\"message\":\"marking consolidatable\",\"commit\":\"6174c75\",\"controller\":\"nodeclaim.disruption\",\"controllerGroup\":\"karpenter.sh\",\"controllerKind\":\"NodeClaim\",\"NodeClaim\":{\"name\":\"karpenter-worker-nodes-1-xxx\"},\"namespace\":\"\",\"name\":\"karpenter-worker-nodes-1-xxx\",\"reconcileID\":\"3dc66943-d53f-431b-910d-28b3cdb48b46\"}","stime":"2024-12-03T08:59:40.305913753Z"}
2024-12-03 14:30:27.577	{"host":"*.*.*.*","log":"{\"level\":\"INFO\",\"time\":\"2024-12-03T09:00:27.577Z\",\"logger\":\"controller\",\"caller\":\"disruption/controller.go:183\",\"message\":\"disrupting nodeclaim(s) via delete, terminating 1 nodes (3 pods) i-xxx/c6a.4xlarge/on-demand\",\"commit\":\"6174c75\",\"controller\":\"disruption\",\"namespace\":\"\",\"name\":\"\",\"reconcileID\":\"fe254e07-a4da-49a3-b57b-38ffa4f46f05\",\"command-id\":\"cdbcfbba-f6ba-46f4-b53a-baa5b9bbbb29\",\"reason\":\"underutilized\"}","stime":"2024-12-03T09:00:27.577206208Z"}
2024-12-03 14:30:27.699	{"host":"*.*.*.*","log":"{\"level\":\"DEBUG\",\"time\":\"2024-12-03T09:00:27.698Z\",\"logger\":\"controller\",\"caller\":\"singleton/controller.go:26\",\"message\":\"command succeeded\",\"commit\":\"6174c75\",\"controller\":\"disruption.queue\",\"namespace\":\"\",\"name\":\"\",\"reconcileID\":\"991b7173-caf1-447f-87bc-8ead2cb33fc4\",\"command-id\":\"cdbcfbba-f6ba-46f4-b53a-baa5b9bbbb29\"}","stime":"2024-12-03T09:00:27.699034416Z"}
2024-12-03 14:30:27.721	{"host":"*.*.*.*","log":"{\"level\":\"INFO\",\"time\":\"2024-12-03T09:00:27.721Z\",\"logger\":\"controller\",\"caller\":\"termination/controller.go:105\",\"message\":\"tainted node\",\"commit\":\"6174c75\",\"controller\":\"node.termination\",\"controllerGroup\":\"\",\"controllerKind\":\"Node\",\"Node\":{\"name\":\"i-05aca8638c296692b\"},\"namespace\":\"\",\"name\":\"i-xxx\",\"reconcileID\":\"e1161f8b-9d53-4a72-ae5e-93ba019ce257\",\"taint.Key\":\"karpenter.sh/disrupted\",\"taint.Value\":\"\",\"taint.Effect\":\"NoSchedule\"}","stime":"2024-12-03T09:00:27.721565969Z"}
2024-12-03 14:32:40.974	{"host":"*.*.*.*","log":"{\"level\":\"INFO\",\"time\":\"2024-12-03T09:02:40.974Z\",\"logger\":\"controller\",\"caller\":\"termination/controller.go:165\",\"message\":\"deleted node\",\"commit\":\"6174c75\",\"controller\":\"node.termination\",\"controllerGroup\":\"\",\"controllerKind\":\"Node\",\"Node\":{\"name\":\"i-05aca8638c296692b\"},\"namespace\":\"\",\"name\":\"i-xxx\",\"reconcileID\":\"4f9983ef-e372-4702-835e-fac3da09baff\"}","stime":"2024-12-03T09:02:40.974320106Z"}
2024-12-03 14:32:41.313	{"host":"*.*.*.*","log":"{\"level\":\"INFO\",\"time\":\"2024-12-03T09:02:41.312Z\",\"logger\":\"controller\",\"caller\":\"termination/controller.go:79\",\"message\":\"deleted nodeclaim\",\"commit\":\"6174c75\",\"controller\":\"nodeclaim.termination\",\"controllerGroup\":\"karpenter.sh\",\"controllerKind\":\"NodeClaim\",\"NodeClaim\":{\"name\":\"karpenter-worker-nodes-1-xxx\"},\"namespace\":\"\",\"name\":\"karpenter-worker-nodes-1-xxx\",\"reconcileID\":\"085b45ab-5d86-4150-b046-2526aaf9f5ab\",\"Node\":{\"name\":\"i-xxx\"},\"provider-id\":\"aws:///ap-south-1c/i-xxx\"}","stime":"2024-12-03T09:02:41.313115195Z"}
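
To make the two log excerpts above easier to compare, here is a minimal Python sketch that computes the gaps between the key events. The timestamps are copied from the UTC fields of the kubelet and Karpenter log lines above; the event labels and the `seconds_between` helper are illustrative names only and are not part of Karpenter or the kubelet.

```python
from datetime import datetime

# UTC timestamps copied from the kubelet and Karpenter log lines above.
# The dictionary keys are illustrative labels, not real log fields.
EVENTS = {
    "karpenter_disrupting": "2024-12-03T09:00:27.577",
    "karpenter_tainted_node": "2024-12-03T09:00:27.721",
    "kubelet_shutdown_detected": "2024-12-03T09:01:59.602",
    "karpenter_deleted_node": "2024-12-03T09:02:40.974",
}

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two labelled events."""
    started = datetime.strptime(EVENTS[start], TIME_FORMAT)
    ended = datetime.strptime(EVENTS[end], TIME_FORMAT)
    return round((ended - started).total_seconds(), 3)


taint_to_shutdown = seconds_between("karpenter_tainted_node", "kubelet_shutdown_detected")
shutdown_to_delete = seconds_between("kubelet_shutdown_detected", "karpenter_deleted_node")

print(f"taint -> kubelet shutdown event: {taint_to_shutdown}s")
print(f"kubelet shutdown event -> node deleted: {shutdown_to_delete}s")
```

With these timestamps, the kubelet detected the shutdown event roughly 92 seconds after Karpenter applied the `karpenter.sh/disrupted` taint, and the Node object was deleted about 41 seconds after that.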

Expected Behavior:

  • As described in the Karpenter documentation, Karpenter should first cordon and drain the node, and only then trigger node termination.

Versions:

  • Chart Version: v1.0.6
  • Kubernetes Version (kubectl version): 1.28.12
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@ShubhamKr11 ShubhamKr11 added the kind/bug Categorizes issue or PR as related to a bug. label Dec 23, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Dec 23, 2024
@jmdeal
Member

jmdeal commented Dec 23, 2024

Can you share the spec of the pod in question? If the pod tolerates Karpenter's termination taint, Karpenter will be unable to drain it.

/triage needs-information

@k8s-ci-robot k8s-ci-robot added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 23, 2024
@ShubhamKr11
Author

Are you asking for the Karpenter pod's YAML? Please clarify, @jmdeal.

@jmdeal
Member

jmdeal commented Dec 23, 2024

Yes

@ShubhamKr11
Author

apiVersion: v1
kind: Pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: karpenter.sh/nodepool
            operator: DoesNotExist
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/instance: karpenter
            app.kubernetes.io/name: karpenter
        topologyKey: kubernetes.io/hostname
  containers:
  - env:
    - name: KUBERNETES_MIN_VERSION
      value: 1.19.0-0
    - name: KARPENTER_SERVICE
      value: karpenter
    - name: WEBHOOK_PORT
      value: "8443"
    - name: WEBHOOK_METRICS_PORT
      value: "8001"
    - name: DISABLE_WEBHOOK
      value: "false"
    - name: LOG_LEVEL
      value: debug
    - name: LOG_OUTPUT_PATHS
      value: stdout
    - name: LOG_ERROR_OUTPUT_PATHS
      value: stderr
    - name: METRICS_PORT
      value: "8080"
    - name: HEALTH_PROBE_PORT
      value: "8081"
    - name: SYSTEM_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: MEMORY_LIMIT
      valueFrom:
        resourceFieldRef:
          containerName: controller
          divisor: "0"
          resource: limits.memory
    - name: FEATURE_GATES
      value: SpotToSpotConsolidation=false
    - name: BATCH_MAX_DURATION
      value: 10s
    - name: BATCH_IDLE_DURATION
      value: 1s
    - name: CLUSTER_NAME
      value: ***
    - name: CLUSTER_ENDPOINT
      value: ***
    - name: VM_MEMORY_OVERHEAD_PERCENT
      value: "0.075"
    - name: INTERRUPTION_QUEUE
      value: Karpenter-***
    - name: RESERVED_ENIS
      value: "0"
    - name: AWS_SHARED_CREDENTIALS_FILE
      value: /meta/aws-iam/credentials.process
    - name: AWS_CREDENTIAL_PROFILES_FILE
      value: /meta/aws-iam/credentials
    image: karpenter-controller:alpine-1.0.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /healthz
        port: http
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    name: controller
    ports:
    - containerPort: 8080
      name: http-metrics
      protocol: TCP
    - containerPort: 8001
      name: webhook-metrics
      protocol: TCP
    - containerPort: 8443
      name: https-webhook
      protocol: TCP
    - containerPort: 8081
      name: http
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /readyz
        port: http
        scheme: HTTP
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 30
    resources:
      limits:
        memory: 1300Mi
      requests:
        cpu: "1"
        memory: 1300Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: true
      runAsGroup: 65532
      runAsNonRoot: true
      runAsUser: 65532
      seccompProfile:
        type: RuntimeDefault
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /meta/aws-iam
      name: aws-iam-credentials
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-kmhlg
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: i-xxxxxx
  nodeSelector:
    kops.k8s.io/instancegroup: monitoring-nodes-1
    kubernetes.io/os: linux
  preemptionPolicy: PreemptLowerPriority
  priority: 2000000000
  priorityClassName: system-cluster-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 65532
  serviceAccount: karpenter
  serviceAccountName: karpenter
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: node
    operator: Equal
    value: monitoring
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  topologySpreadConstraints:
  - labelSelector:
      matchLabels:
        app.kubernetes.io/instance: karpenter
        app.kubernetes.io/name: karpenter
    maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
  volumes:
  - name: aws-iam-credentials
    secret:
      defaultMode: 420
      secretName: karpenter
  - name: kube-api-access-kmhlg
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace

@jmdeal
Member

jmdeal commented Dec 23, 2024

Sorry, was reading from mobile and missed "Karpenter's" in your message. I meant the pod which was evicted, not the Karpenter controller pod.

@ShubhamKr11
Author

Apologies for the confusion. The issue is not about a single pod being evicted; the problem lies in Karpenter's design. Ideally, Karpenter should drain the node before the shutdown signal is sent to the kubelet. However, from what I can observe, the draining is carried out entirely by the kubelet after Karpenter disrupts the node. Please refer to the logs above.

@jmdeal
Member

jmdeal commented Dec 26, 2024

As long as the pod does not tolerate Karpenter's disruption taint, Karpenter will drain the pod before terminating the instance. If you can share the pod spec that you believe should have been drained, we can determine if Karpenter should have acted upon it.

@ShubhamKr11
Author

Here is the YAML of a pod that wasn't drained before the node shutdown signal. I have also attached some logs below that might be helpful.

apiVersion: v1
kind: Pod
metadata:
  annotations:
    sidecar.istio.io/inject: "false"
    vpaObservedContainers: guardzilla-***, push-heap-dump
    vpaUpdates: 'Pod resources updated by guardzilla-***: container 0: cpu
      request, memory request, memory limit; container 1: '
  generateName: guardzilla***
  labels:
    Environment: prod
  name: guardzilla-***
  namespace: prod
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: ***
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - ap-south-1a
            - ap-south-1b
            - ap-south-1c
  containers:
  - env:
    - name: secretMd5
      value: ***
    - name: CONTAINER_IMAGE
      value: ***
    - name: APP_NAME
      value: ***
    - name: NAMESPACE
      value: prod
    envFrom:
    - secretRef:
        name: ***
    image: ***
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - sleep
          - "72"
    livenessProbe:
      failureThreshold: 10
      httpGet:
        path: ***
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 30
      successThreshold: 1
      timeoutSeconds: 1
    name: guardzilla-***
    ports:
    - containerPort: ***
      protocol: TCP
    - containerPort: ***
      protocol: TCP
    readinessProbe:
      failureThreshold: 2
      httpGet:
        path: ***
        port: ***
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 30
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        memory: 2560Mi
      requests:
        cpu: "2"
        memory: 2560Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /dumps
      name: heap-dumps
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: ***
      readOnly: true
  - env:
    - name: AWS_DEFAULT_REGION
      value: ***
    - name: SERVICE_NAME
      value: ***
    - name: S3_BUCKET
      value: ***
    - name: ENVIRONMENT
      value: ***
    - name: AWS_SHARED_CREDENTIALS_FILE
      value: ***
    - name: AWS_CREDENTIAL_PROFILES_FILE
      value: ***
    image: ***
    imagePullPolicy: IfNotPresent
    name: push-heap-dump
    resources:
      limits:
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /meta/aws-iam
      name: aws-iam-credentials-heap-dump
      readOnly: true
    - mountPath: /dumps
      name: heap-dumps
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: ***
      readOnly: true
  dnsConfig:
    options:
    - name: ***
      value: "2"
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: i-***
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 90
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  topologySpreadConstraints:
  - labelSelector:
      matchLabels:
        app: ***
        release: ***
    maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
  volumes:
  - emptyDir: {}
    name: heap-dumps
  - name: aws-iam-credentials-heap-dump
    secret:
      defaultMode: 420
      secretName: ***
  - name: kube-api-access-pkjpm
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: ***
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
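
Since the discussion above hinges on whether a pod tolerates Karpenter's disruption taint, the following Python sketch checks the tolerations from the pod spec above against the taint reported in the earlier "tainted node" log line (`karpenter.sh/disrupted`, effect `NoSchedule`). It follows the standard Kubernetes toleration matching rules as an assumption; it does not reproduce Karpenter's own drain logic, which may apply further criteria. The function names `tolerates` and `pod_tolerates` are hypothetical helpers.

```python
# Taint key and effect as reported in the Karpenter "tainted node" log line.
DISRUPTION_TAINT = {"key": "karpenter.sh/disrupted", "value": "", "effect": "NoSchedule"}

# Tolerations copied from the pod spec above.
GUARDZILLA_TOLERATIONS = [
    {"effect": "NoExecute", "key": "node.kubernetes.io/not-ready",
     "operator": "Exists", "tolerationSeconds": 300},
    {"effect": "NoExecute", "key": "node.kubernetes.io/unreachable",
     "operator": "Exists", "tolerationSeconds": 300},
]


def tolerates(toleration: dict, taint: dict) -> bool:
    """Sketch of the standard Kubernetes rules for one toleration and one taint."""
    # An empty effect on the toleration matches every taint effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    # An empty key matches every taint key (only valid with operator Exists).
    key = toleration.get("key", "")
    if key and key != taint["key"]:
        return False
    # The operator defaults to Equal when it is omitted.
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def pod_tolerates(tolerations: list, taint: dict) -> bool:
    """A pod tolerates a taint if any one of its tolerations matches it."""
    return any(tolerates(t, taint) for t in tolerations)


print(pod_tolerates(GUARDZILLA_TOLERATIONS, DISRUPTION_TAINT))
# Hypothetical tolerate-everything toleration, shown only for contrast.
print(tolerates({"operator": "Exists"}, DISRUPTION_TAINT))
```

Under these rules, neither toleration in the spec above matches the disruption taint, because both are scoped to other keys and to the `NoExecute` effect.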

The kubelet logs for the node on which the above pod was scheduled show errors while updating the node status after the node shutdown signal was received. The logs are attached for reference.

2024-11-30 20:42:02.782	{"stime":"Nov 30 15:12:02","pid":"2227","message":"I1130 15:12:02.781961    2227 nodeshutdown_manager_linux.go:265] \"Shutdown manager detected new shutdown event, isNodeShuttingDownNow\" event=true"}
2024-11-30 20:42:02.782	{"stime":"Nov 30 15:12:02","pid":"2227","message":"I1130 15:12:02.782023    2227 nodeshutdown_manager_linux.go:322] \"Shutdown manager processing shutdown event\""}
2024-11-30 20:42:02.783	{"stime":"Nov 30 15:12:02","pid":"2227","message":"E1130 15:12:02.783425    2227 kubelet_node_status.go:540] \"Error updating node status, will retry\" err=\"error getting node \\\"i-***\\\": nodes \\\"i-***\\\" not found\""}

2024-11-30 20:42:02.785 {"stime":"Nov 30 15:12:02","pid":"2227","message":"I1130 15:12:02.784613    2227 nodeshutdown_manager_linux.go:375] \"Shutdown manager killing pod with gracePeriod\" pod=\"kube-system/kube-proxy-i-***\" gracePeriod=20"}

2024-11-30 20:42:02.787	{"stime":"Nov 30 15:12:02","pid":"2227","message":"I1130 15:12:02.786271    2227 kuberuntime_container.go:745] \"Killing container with a grace period\" pod=\"prod/guardzilla-***\" podUID=\"49425c6f-***-***\" containerName=\"push-heap-dump\" containerID=\"containerd://f8a47472***\" gracePeriod=20"}

2024-11-30 20:42:09.490	{"message":"E1130 15:12:09.490504    2227 nodelease.go:49] \"Failed to get node when trying to set owner ref to the node lease\" err=\"nodes \\\"i-***\\\" not found\" node=\"i-***\"","pid":"2227","stime":"Nov 30 15:12:09"}

2024-11-30 20:42:22.786 {"pid":"2227","stime":"Nov 30 15:12:22","message":"I1130 15:12:22.786365    2227 kuberuntime_container.go:644] \"PreStop hook not completed in grace period\" pod=\"prod/guardzilla-***\" podUID=\"49425c6f-****-***\" containerName=\"guardzilla-**\" containerID=\"containerd://7b33fc***\" gracePeriod=20"}

2024-11-30 20:42:22.787	{"pid":"2227","stime":"Nov 30 15:12:22","message":"I1130 15:12:22.786393    2227 kuberuntime_container.go:745] \"Killing container with a grace period\" pod=\"prod/guardzilla-***\" podUID=\"49425c6f-****-***\" containerName=\"guardzilla-***\" containerID=\"containerd://7b33fc69c3c15***\" gracePeriod=2"}
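
The "PreStop hook not completed in grace period" line above can be read alongside the pod spec, which sets a `preStop` sleep of 72 seconds and `terminationGracePeriodSeconds: 90`. The sketch below assumes that, during a kubelet-driven node shutdown, each pod gets the smaller of its own grace period and the shutdown grace period allotted by the kubelet. The value 20 is taken from the `gracePeriod=20` field in the logs, not from a known kubelet configuration, and `effective_grace_period` is a hypothetical helper name.

```python
def effective_grace_period(pod_grace_seconds: int, shutdown_grace_seconds: int) -> int:
    """Assumed rule: the pod gets the smaller of the two grace periods."""
    return min(pod_grace_seconds, shutdown_grace_seconds)


POD_TERMINATION_GRACE = 90    # terminationGracePeriodSeconds from the pod spec
PRESTOP_SLEEP = 72            # preStop "sleep 72" from the pod spec
OBSERVED_SHUTDOWN_GRACE = 20  # gracePeriod=20 seen in the kubelet logs (assumed cap)

effective = effective_grace_period(POD_TERMINATION_GRACE, OBSERVED_SHUTDOWN_GRACE)
prestop_completes = PRESTOP_SLEEP <= effective

print(f"effective grace period: {effective}s")
print(f"preStop hook can finish: {prestop_completes}")
```

If that assumption holds, the 72-second hook cannot finish inside a 20-second window, which is consistent with the hook being cut off 20 seconds after the shutdown event in the logs above.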

Karpenter also has some different logs related to this node:

2024-11-30 20:42:02.014	{"host":"*.*.*.*","log":"{\"level\":\"DEBUG\",\"time\":\"2024-11-30T15:12:02.014Z\",\"logger\":\"controller\",\"caller\":\"lifecycle/controller.go:111\",\"message\":\"terminating due to registration ttl\",\"commit\":\"6174c75\",\"controller\":\"nodeclaim.lifecycle\",\"controllerGroup\":\"karpenter.sh\",\"controllerKind\":\"NodeClaim\",\"NodeClaim\":{\"name\":\"karpenter-worker-nodes-1-ldch4\"},\"namespace\":\"\",\"name\":\"karpenter-worker-nodes-1-ldch4\",\"reconcileID\":\"292cb2bb-716b-4674-8eb1-4ca28086e1be\",\"ttl\":\"15m0s\"}","stime":"2024-11-30T15:12:02.014325152Z"}

2024-11-30 20:42:09.510	{"host":"*.*.*.*","log":"{\"level\":\"DEBUG\",\"time\":\"2024-11-30T15:12:09.510Z\",\"logger\":\"controller\",\"caller\":\"reconcile/reconcile.go:142\",\"message\":\"found and delete leaked lease\",\"commit\":\"6174c75\",\"controller\":\"lease.garbagecollection\",\"controllerGroup\":\"coordination.k8s.io\",\"controllerKind\":\"Lease\",\"Lease\":{\"name\":\"i-***\",\"namespace\":\"kube-node-lease\"},\"namespace\":\"kube-node-lease\",\"name\":\"i-***\",\"reconcileID\":\"10b38274-675c-4c34-8708-4e86a7f46c00\"}","stime":"2024-11-30T15:12:09.510701374Z"}

2024-11-30 20:42:19.547	{"host":"*.*.*.*","log":"{\"level\":\"DEBUG\",\"time\":\"2024-11-30T15:12:19.547Z\",\"logger\":\"controller\",\"caller\":\"reconcile/reconcile.go:142\",\"message\":\"found and delete leaked lease\",\"commit\":\"6174c75\",\"controller\":\"lease.garbagecollection\",\"controllerGroup\":\"coordination.k8s.io\",\"controllerKind\":\"Lease\",\"Lease\":{\"name\":\"i-***\",\"namespace\":\"kube-node-lease\"},\"namespace\":\"kube-node-lease\",\"name\":\"i-***\",\"reconcileID\":\"52bd4f42-d80b-4829-a923-ad33513f1cc0\"}","stime":"2024-11-30T15:12:19.547542209Z"}

2024-11-30 20:42:29.639	{"host":"*.*.*.*","log":"{\"level\":\"DEBUG\",\"time\":\"2024-11-30T15:12:29.639Z\",\"logger\":\"controller\",\"caller\":\"reconcile/reconcile.go:142\",\"message\":\"found and delete leaked lease\",\"commit\":\"6174c75\",\"controller\":\"lease.garbagecollection\",\"controllerGroup\":\"coordination.k8s.io\",\"controllerKind\":\"Lease\",\"Lease\":{\"name\":\"i-***\",\"namespace\":\"kube-node-lease\"},\"namespace\":\"kube-node-lease\",\"name\":\"i-***\",\"reconcileID\":\"64b3a224-9b5c-4af3-b9b1-5344c7a653d3\"}","stime":"2024-11-30T15:12:29.639290258Z"}

2024-11-30 20:43:41.297	{"host":"*.*.*.*","log":"{\"level\":\"INFO\",\"time\":\"2024-11-30T15:13:41.297Z\",\"logger\":\"controller\",\"caller\":\"termination/controller.go:79\",\"message\":\"deleted nodeclaim\",\"commit\":\"6174c75\",\"controller\":\"nodeclaim.termination\",\"controllerGroup\":\"karpenter.sh\",\"controllerKind\":\"NodeClaim\",\"NodeClaim\":{\"name\":\"karpenter-worker-nodes-1-ldch4\"},\"namespace\":\"\",\"name\":\"karpenter-worker-nodes-1-ldch4\",\"reconcileID\":\"327920a8-0d80-484e-b25a-5303d5b35d5e\",\"Node\":{\"name\":\"\"},\"provider-id\":\"aws:///ap-south-1a/i-***\"}","stime":"2024-11-30T15:13:41.297421633Z"}


This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 11, 2025
@ShubhamKr11
Author

Could you please provide an update here?

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 12, 2025

3 participants