terminationMessagePath does not work with peer-pods #2220

Open
ldoktor opened this issue Dec 20, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@ldoktor (Contributor) commented Dec 20, 2024

Describe the bug

Setting terminationMessagePath should result in the content of that file being stored as the Message in the pod status. With peer-pods it does nothing, while it works with plain Kubernetes as well as with kata-qemu.

How to reproduce

cat <<\EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  terminationGracePeriodSeconds: 0
  restartPolicy: Never
  containers:
  - name: test
    image: busybox
    command:
    - sh
    - -cx
    - echo FOO > /tmp/foo
    terminationMessagePath: /tmp/foo
    terminationMessagePolicy: File
  runtimeClassName: kata-remote
EOF

# Wait for the pod to complete
kubectl describe pods/test
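
The message (if any) can also be read straight from the pod status rather than scanning the describe output; this jsonpath queries the standard containerStatuses field:

kubectl get pod test -o jsonpath='{.status.containerStatuses[0].state.terminated.message}'

On plain Kubernetes or kata-qemu this prints FOO; with kata-remote it comes back empty.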

CoCo version information

cloud-api-adaptor-464f734ef28dd8f5f83fb5ef644dd73de3e409d7

What TEE are you seeing the problem on

None

Failing command and relevant log output

# Peer-pods
Containers:
  test:
    Container ID:  containerd://d03c5b1c1d20ec1ab41d8cf55fee29c52b58ac376a305b4567171057e22171ae
    Image:         busybox
    Image ID:      docker.io/library/busybox@sha256:2919d0172f7524b2d8df9e50066a682669e6d170ac0f6a49676d54358fe970b5
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -cx
      echo FOO > /tmp/foo
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 20 Dec 2024 10:00:13 +0100
      Finished:     Fri, 20 Dec 2024 10:00:14 +0100
    Ready:          False


# Default pod
Containers:
  test:
    Container ID:  containerd://244019d356d7a2d421b71c3591704db7ec49d6078a8d114d8583f707863cf8d4
    Image:         busybox
    Image ID:      docker.io/library/busybox@sha256:2919d0172f7524b2d8df9e50066a682669e6d170ac0f6a49676d54358fe970b5
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -cx
      echo FOO > /tmp/foo
    State:      Terminated
      Reason:   Completed
      Message:  FOO

      Exit Code:    0
      Started:      Fri, 20 Dec 2024 10:26:42 +0100
      Finished:     Fri, 20 Dec 2024 10:26:42 +0100
ldoktor added the bug label on Dec 20, 2024
@ldoktor (Contributor, Author) commented Dec 20, 2024

Forgot to mention: this is most likely the reason for some of the test failures in #833.

@stevenhorsman (Member) commented

Maybe (as a guess) this could be shared_fs=none related, so it would be interesting to know if this works on the coco-dev runtimeclass for bare-metal?

@wainersm (Member) commented

> Maybe (as a guess) this could be shared_fs=none related, so it would be interesting to know if this works on the coco-dev runtimeclass for bare-metal?

I just gave it a try with kata-qemu-coco-dev and it works (using CoCo 0.11.0):

$ kubectl describe pod test
Name:                test
Namespace:           default
Priority:            0
Runtime Class Name:  kata-qemu-coco-dev
Service Account:     default
Node:                coco-play-control-plane/172.18.0.2
Start Time:          Wed, 15 Jan 2025 14:33:50 -0300
Labels:              <none>
Annotations:         <none>
Status:              Succeeded
IP:                  10.244.0.20
IPs:
  IP:  10.244.0.20
Containers:
  test:
    Container ID:  containerd://6720c4eae47d22a363c507d81e15bf639e5a51d547a5b6d20987240857807c3c
    Image:         quay.io/prometheus/busybox:latest
    Image ID:      quay.io/prometheus/busybox@sha256:dfa54ef35e438b9e71ac5549159074576b6382f95ce1a434088e05fd6b730bc4
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -cx
      echo FOO > /tmp/foo
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 15 Jan 2025 14:33:59 -0300
      Finished:     Wed, 15 Jan 2025 14:33:59 -0300
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cmtsc (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  kube-api-access-cmtsc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              katacontainers.io/kata-runtime=true
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  3m12s  default-scheduler  Successfully assigned default/test to coco-play-control-plane
  Normal  Pulling    3m9s   kubelet            Pulling image "quay.io/prometheus/busybox:latest"
  Normal  Pulled     3m8s   kubelet            Successfully pulled image "quay.io/prometheus/busybox:latest" in 947ms (947ms including waiting). Image size: 1267792 bytes.
  Normal  Created    3m8s   kubelet            Created container test
  Normal  Started    3m3s   kubelet            Started container test

@wainersm (Member) commented

Actually, it seems it didn't work... should I see the Message: FOO in the output above? ^^ @ldoktor

@ldoktor (Contributor, Author) commented Jan 17, 2025

Yep, it didn't work. The message is a nice way to pass arbitrary data (like JSON ;-)) to follow-up processes without the need for PVCs. As PVCs are quite complicated with peer-pods (so far I haven't found a direct analogy to how they are handled outside peer-pods), it'd be nice to have this working.
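
As an illustration of that pattern, a minimal sketch (pod name and JSON payload invented for the example; run on a regular node, where the feature works): the container writes JSON to the default terminationMessagePath, /dev/termination-log, and a follow-up step reads it from the pod status instead of from a PVC:

cat <<\EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: json-result
spec:
  restartPolicy: Never
  containers:
  - name: work
    image: busybox
    command:
    - sh
    - -c
    # write the result as JSON to the default terminationMessagePath
    - echo '{"status":"ok","items":3}' > /dev/termination-log
EOF

# once the pod has completed, the follow-up step reads the result
kubectl get pod json-result -o jsonpath='{.status.containerStatuses[0].state.terminated.message}'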

@wainersm (Member) commented

Hi @ldoktor, I just reported this problem on the kata-containers side too, as it will also need to be fixed for the CoCo bare-metal case.

On kata bare-metal, when shared_fs=none, there is a thread watching for changes under /var/lib/kubelet/pods/$ID/volumes/ and copying them over to the guest; that's how ConfigMap, Secret, etc. still work. How does it work with peer pods? I'm not saying the termination message file is shared as a volume (indeed, it seems it isn't), but the solution will probably be similar: copying the file from guest to host.
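
To make the direction of the gap concrete, a rough sketch of where the file lives (the paths follow the usual kubelet layout; the pod UID and hash suffix are placeholders):

# On a regular node, kubelet bind-mounts a host file into the container at
# terminationMessagePath, so after the reproducer's container exits:
cat /var/lib/kubelet/pods/<pod-uid>/containers/test/<hash>   # prints FOO
# With peer pods the container writes /tmp/foo inside the remote pod VM, so
# this host-side file stays empty unless the runtime copies it back
# (guest -> host), i.e. the opposite direction of the shared_fs=none volume watcher.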
