Hi all,
I just installed Kubeflow and tried to test a pipeline, but I ran into a problem when uploading a new pipeline from the Kubeflow UI.
The upload returns a 504 error, while creating a new experiment works fine.
So I started troubleshooting the ml-pipeline related pods.
Below are the logs I checked.
My guess is that the request from outside the cluster does reach ml-pipeline, but the upload is never completed.
Could you help me identify the root cause and how to solve it?
Case 1: uploading a new pipeline (fails with 504)
ml-pipeline-ui logs:
POST /pipeline/apis/v2beta1/pipelines/upload?name=MNIST%20test-06&description=
Proxied request: /apis/v2beta1/pipelines/upload?name=MNIST%20test-06&description=
ml-pipeline (api-server) logs:
I0120 06:32:08.743223 7 pipeline_upload_server.go:94] Upload pipeline called
Case 2: creating a new experiment (works)
ml-pipeline-ui logs:
POST /pipeline/apis/v2beta1/experiments
Proxied request: /apis/v2beta1/experiments
GET /pipeline/apis/v2beta1/experiments/b5693384-68e7-461b-bc9e-7534107e77a6
Proxied request: /apis/v2beta1/experiments/b5693384-68e7-461b-bc9e-7534107e77a6
ml-pipeline (api-server) logs:
I0120 06:34:36.732813 7 interceptor.go:29] /kubeflow.pipelines.backend.api.v2beta1.ExperimentService/ListExperiments handler starting
I0120 06:34:36.745914 7 interceptor.go:37] /kubeflow.pipelines.backend.api.v2beta1.ExperimentService/ListExperiments handler finished
I0120 06:34:37.343723 7 interceptor.go:29] /kubeflow.pipelines.backend.api.v2beta1.RunService/ListRuns handler starting
I0120 06:34:37.343753 7 interceptor.go:29] /kubeflow.pipelines.backend.api.v2beta1.RunService/ListRuns handler starting
I0120 06:34:37.351711 7 interceptor.go:37] /kubeflow.pipelines.backend.api.v2beta1.RunService/ListRuns handler finished
I0120 06:34:37.358307 7 interceptor.go:37] /kubeflow.pipelines.backend.api.v2beta1.RunService/ListRuns handler finished
I0120 06:34:57.043117 7 interceptor.go:29] /kubeflow.pipelines.backend.api.v2beta1.ExperimentService/CreateExperiment handler starting
I0120 06:34:57.067850 7 interceptor.go:37] /kubeflow.pipelines.backend.api.v2beta1.ExperimentService/CreateExperiment handler finished
I0120 06:34:57.174117 7 interceptor.go:29] /kubeflow.pipelines.backend.api.v2beta1.ExperimentService/GetExperiment handler starting
I0120 06:34:57.175887 7 interceptor.go:37] /kubeflow.pipelines.backend.api.v2beta1.ExperimentService/GetExperiment handler finished
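To narrow down whether the 504 comes from the backend itself or from the proxy path in front of it, I can try calling the API server directly, bypassing ml-pipeline-ui and the ingress. This is only a rough sketch: it assumes the default ml-pipeline service on port 8888, a local pipeline file named pipeline.yaml, and that the upload endpoint takes the file in a multipart field named uploadfile (this may differ between KFP versions, and strict Istio mTLS may block the direct call).

# Bypass the UI proxy and hit the api-server's upload endpoint directly.
kubectl -n kubeflow port-forward svc/ml-pipeline 8888:8888 &
curl -v -F "uploadfile=@pipeline.yaml" \
  "http://localhost:8888/apis/v2beta1/pipelines/upload?name=MNIST%20test-06"

# Then check what the api-server logs after "Upload pipeline called", and whether
# the Istio sidecar reports an upstream timeout around the same timestamp.
kubectl -n kubeflow logs deploy/ml-pipeline -c ml-pipeline-api-server --since=10m
kubectl -n kubeflow logs deploy/ml-pipeline -c istio-proxy --since=10m | grep -i timeout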
Below are the pods in the kubeflow namespace:
kubectl get pod -n kubeflow
NAME READY STATUS RESTARTS AGE
admission-webhook-deployment-5644dcc957-r5zjh 1/1 Running 0 4d1h
cache-deployer-deployment-5574f9d494-s4cp4 1/1 Running 0 4d1h
cache-server-864bfdbcfd-6zggq 1/1 Running 0 4d
centraldashboard-74fc94fcf4-7jhsh 2/2 Running 0 4d1h
controller-manager-57488687d6-zfw7m 1/1 Running 0 4d1h
jupyter-web-app-deployment-7dbcd448fb-9888p 2/2 Running 0 4d1h
katib-controller-7d6984668d-r949j 1/1 Running 0 4d1h
katib-db-manager-676776f9c-qxjwj 1/1 Running 1 (4d1h ago) 4d1h
katib-mysql-5c9cd9b95f-6msmm 1/1 Running 0 4d1h
katib-ui-6c6fc87849-s92cj 2/2 Running 0 4d1h
kserve-controller-manager-5f8c474f97-w6l99 2/2 Running 0 4d1h
kserve-localmodel-controller-manager-6f978d76bc-s2jvd 2/2 Running 0 4d1h
kserve-models-web-app-67f4b9dcfd-mx69m 2/2 Running 0 4d1h
kubeflow-pipelines-profile-controller-7b7b8f44f7-tjw9r 1/1 Running 0 4d1h
metacontroller-0 1/1 Running 0 3d23h
metadata-envoy-deployment-74dbc5bdcc-5d2z5 1/1 Running 0 3d23h
metadata-grpc-deployment-8496ffb98b-9qtzd 2/2 Running 2 (4d1h ago) 4d1h
metadata-writer-6864676f6b-5mzhp 2/2 Running 0 3d23h
minio-7c77bc59b8-tgjqr 2/2 Running 0 4d1h
ml-pipeline-65d6758bd9-dmk5t 2/2 Running 0 29m
ml-pipeline-persistenceagent-65c5c6dfc5-pj8mb 2/2 Running 0 3d23h
ml-pipeline-scheduledworkflow-759ff87cb9-hgbcz 2/2 Running 0 3d23h
ml-pipeline-ui-84558b44ff-4jvbr 2/2 Running 4 (3h26m ago) 3d23h
ml-pipeline-viewer-crd-7c74889db4-zcgjk 2/2 Running 1 (3d23h ago) 3d23h
ml-pipeline-visualizationserver-68b76fd5b6-j5dhw 2/2 Running 0 3d23h
mysql-758cd66576-mwqwn 2/2 Running 0 4d1h
notebook-controller-deployment-6545dbccf4-2n2xh 2/2 Running 2 (4d1h ago) 4d1h
profiles-deployment-5f46f7c9bb-xqbpz 3/3 Running 1 (4d1h ago) 4d1h
proxy-agent-64fbbbf96f-g6h4n 0/1 CrashLoopBackOff 1137 (93s ago) 4d
pvcviewer-controller-manager-55f545dfc4-vw2gb 3/3 Running 0 4d1h
tensorboard-controller-deployment-546b5886c5-lsg7l 3/3 Running 1 (4d1h ago) 4d1h
tensorboards-web-app-deployment-5bd559766d-mwm7l 2/2 Running 0 4d1h
training-operator-7f8bfd56f-ntlwn 1/1 Running 0 4d1h
volumes-web-app-deployment-5b558895d6-dh7sq 2/2 Running 0 4d1h
workflow-controller-7fcd696bb4-5hscm 1/1 Running 0 4d
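The proxy-agent pod is in CrashLoopBackOff; I am not sure whether that is related to the 504, but for reference these are the checks I plan to run next. Pod names are taken from the listing above, and the container name ml-pipeline-ui is the default from the Kubeflow manifests.

# UI-side logs around the failing upload request.
kubectl -n kubeflow logs deploy/ml-pipeline-ui -c ml-pipeline-ui --since=30m
# Previous logs and events of the crash-looping proxy-agent pod.
kubectl -n kubeflow logs proxy-agent-64fbbbf96f-g6h4n --previous
kubectl -n kubeflow describe pod proxy-agent-64fbbbf96f-g6h4n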
Below is the ml-pipeline pod information:
kubectl get pods ml-pipeline-65d6758bd9-dmk5t -n kubeflow -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
cni.projectcalico.org/containerID: 06a3bfe0813286d1099c64f0d8bc3cbe5ecc849b2c8a6031ee33d01adbb267c2
cni.projectcalico.org/podIP: 10.233.119.31/32
cni.projectcalico.org/podIPs: 10.233.119.31/32
istio.io/rev: default
kubectl.kubernetes.io/default-container: ml-pipeline-api-server
kubectl.kubernetes.io/default-logs-container: ml-pipeline-api-server
prometheus.io/path: /stats/prometheus
prometheus.io/port: "15020"
prometheus.io/scrape: "true"
sidecar.istio.io/status: '{"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["workload-socket","credential-socket","workload-certs","istio-envoy","istio-data","istio-podinfo","istio-token","istiod-ca-cert"],"imagePullSecrets":null,"revision":"default"}'
creationTimestamp: "2025-01-20T06:07:02Z"
generateName: ml-pipeline-65d6758bd9-
labels:
app: ml-pipeline
application-crd-id: kubeflow-pipelines
pod-template-hash: 65d6758bd9
security.istio.io/tlsMode: istio
service.istio.io/canonical-name: ml-pipeline
service.istio.io/canonical-revision: latest
name: ml-pipeline-65d6758bd9-dmk5t
namespace: kubeflow
ownerReferences:
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: ml-pipeline-65d6758bd9
uid: 7a2d352b-2daa-489e-9af0-5f89437a0df2
resourceVersion: "18476063"
uid: 9d67c738-73e7-44c1-a2b5-f5f02970324c
spec:
containers:
value: debug
valueFrom:
configMapKeyRef:
key: autoUpdatePipelineDefaultVersion
name: pipeline-install-config
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
value: "false"
valueFrom:
configMapKeyRef:
key: bucketName
name: pipeline-install-config
valueFrom:
secretKeyRef:
key: username
name: mysql-secret
valueFrom:
secretKeyRef:
key: password
name: mysql-secret
valueFrom:
configMapKeyRef:
key: pipelineDb
name: pipeline-install-config
valueFrom:
configMapKeyRef:
key: dbHost
name: pipeline-install-config
valueFrom:
configMapKeyRef:
key: dbPort
name: pipeline-install-config
valueFrom:
configMapKeyRef:
key: ConMaxLifeTime
name: pipeline-install-config
valueFrom:
configMapKeyRef:
key: dbType
name: pipeline-install-config
valueFrom:
secretKeyRef:
key: username
name: mysql-secret
valueFrom:
secretKeyRef:
key: password
name: mysql-secret
valueFrom:
configMapKeyRef:
key: pipelineDb
name: pipeline-install-config
valueFrom:
configMapKeyRef:
key: mysqlHost
name: pipeline-install-config
valueFrom:
configMapKeyRef:
key: mysqlPort
name: pipeline-install-config
valueFrom:
secretKeyRef:
key: accesskey
name: mlpipeline-minio-artifact
valueFrom:
secretKeyRef:
key: secretkey
name: mlpipeline-minio-artifact
image: gcr.io/ml-pipeline/api-server:2.3.0
imagePullPolicy: IfNotPresent
livenessProbe:
exec:
command:
failureThreshold: 3
initialDelaySeconds: 3
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 2
name: ml-pipeline-api-server
ports:
name: http
protocol: TCP
name: grpc
protocol: TCP
readinessProbe:
exec:
command:
failureThreshold: 3
initialDelaySeconds: 3
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 2
resources:
requests:
cpu: 250m
memory: 500Mi
startupProbe:
exec:
command:
failureThreshold: 12
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 2
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
name: kube-api-access-bk5hk
readOnly: true
env:
value: istiod
value: istiod.istio-system.svc:15012
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.serviceAccountName
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.hostIP
valueFrom:
resourceFieldRef:
divisor: "0"
resource: limits.cpu
value: |
{"tracing":{}}
value: |-
[
{"name":"http","containerPort":8888,"protocol":"TCP"}
,{"name":"grpc","containerPort":8887,"protocol":"TCP"}
]
value: ml-pipeline-api-server
valueFrom:
resourceFieldRef:
divisor: "0"
resource: limits.memory
valueFrom:
resourceFieldRef:
divisor: "0"
resource: limits.cpu
value: Kubernetes
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
value: REDIRECT
value: ml-pipeline
value: kubernetes://apis/apps/v1/namespaces/kubeflow/deployments/ml-pipeline
value: cluster.local
value: cluster.local
image: docker.io/istio/proxyv2:1.23.2
imagePullPolicy: IfNotPresent
name: istio-proxy
ports:
name: http-envoy-prom
protocol: TCP
readinessProbe:
failureThreshold: 4
httpGet:
path: /healthz/ready
port: 15021
scheme: HTTP
periodSeconds: 15
successThreshold: 1
timeoutSeconds: 3
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: 100m
memory: 128Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
privileged: false
readOnlyRootFilesystem: true
runAsGroup: 1337
runAsNonRoot: true
runAsUser: 1337
startupProbe:
failureThreshold: 600
httpGet:
path: /healthz/ready
port: 15021
scheme: HTTP
periodSeconds: 1
successThreshold: 1
timeoutSeconds: 3
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
name: workload-socket
name: credential-socket
name: workload-certs
name: istiod-ca-cert
name: istio-data
name: istio-envoy
name: istio-token
name: istio-podinfo
name: kube-api-access-bk5hk
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
initContainers:
image: docker.io/istio/proxyv2:1.23.2
imagePullPolicy: IfNotPresent
name: istio-init
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: 100m
memory: 128Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
drop:
privileged: false
readOnlyRootFilesystem: false
runAsGroup: 0
runAsNonRoot: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
name: kube-api-access-bk5hk
readOnly: true
nodeName: atc06
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: ml-pipeline
serviceAccountName: ml-pipeline
terminationGracePeriodSeconds: 30
tolerations:
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
name: workload-socket
name: credential-socket
name: workload-certs
medium: Memory
name: istio-envoy
name: istio-data
defaultMode: 420
items:
apiVersion: v1
fieldPath: metadata.labels
path: labels
apiVersion: v1
fieldPath: metadata.annotations
path: annotations
name: istio-podinfo
projected:
defaultMode: 420
sources:
audience: istio-ca
expirationSeconds: 43200
path: istio-token
defaultMode: 420
name: istio-ca-root-cert
name: istiod-ca-cert
projected:
defaultMode: 420
sources:
expirationSeconds: 3607
path: token
items:
path: ca.crt
name: kube-root-ca.crt
items:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
lastTransitionTime: "2025-01-20T06:07:03Z"
status: "True"
type: PodReadyToStartContainers
lastTransitionTime: "2025-01-20T06:07:03Z"
status: "True"
type: Initialized
lastTransitionTime: "2025-01-20T06:07:33Z"
status: "True"
type: Ready
lastTransitionTime: "2025-01-20T06:07:33Z"
status: "True"
type: ContainersReady
lastTransitionTime: "2025-01-20T06:07:02Z"
status: "True"
type: PodScheduled
containerStatuses:
image: docker.io/istio/proxyv2:1.23.2
imageID: docker.io/istio/proxyv2@sha256:2876cfc2fdf47e4b9665390ccc9ccf2bf913b71379325b8438135c9f35578e1a
lastState: {}
name: istio-proxy
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2025-01-20T06:07:30Z"
volumeMounts:
name: workload-socket
name: credential-socket
name: workload-certs
name: istiod-ca-cert
name: istio-data
name: istio-envoy
name: istio-token
name: istio-podinfo
name: kube-api-access-bk5hk
readOnly: true
recursiveReadOnly: Disabled
image: gcr.io/ml-pipeline/api-server:2.3.0
imageID: gcr.io/ml-pipeline/api-server@sha256:39661bd823e8ef93000082c8b9c4977705f05a6b42bc36d7a02d394d9dd82a78
lastState: {}
name: ml-pipeline-api-server
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2025-01-20T06:07:30Z"
volumeMounts:
name: kube-api-access-bk5hk
readOnly: true
recursiveReadOnly: Disabled
hostIP: 192.168.10.206
hostIPs:
initContainerStatuses:
image: docker.io/istio/proxyv2:1.23.2
imageID: docker.io/istio/proxyv2@sha256:2876cfc2fdf47e4b9665390ccc9ccf2bf913b71379325b8438135c9f35578e1a
lastState: {}
name: istio-init
ready: true
restartCount: 0
started: false
state:
terminated:
containerID: containerd://5c73b9a12e59e921336baebc8b8d92f258e03629ea9ef0d5eefe8c6200330b6e
exitCode: 0
finishedAt: "2025-01-20T06:07:03Z"
reason: Completed
startedAt: "2025-01-20T06:07:03Z"
volumeMounts:
name: kube-api-access-bk5hk
readOnly: true
recursiveReadOnly: Disabled
phase: Running
podIP: 10.233.119.31
podIPs:
qosClass: Burstable
startTime: "2025-01-20T06:07:02Z"
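Since the error is a 504 (gateway timeout), I also want to check the Istio routing in front of the pipeline UI for any explicit timeout. A sketch, assuming the default VirtualService name ml-pipeline-ui from the Kubeflow manifests (it may differ in other distributions):

# Look for timeout settings on the route that fronts /pipeline.
kubectl -n kubeflow get virtualservice
kubectl -n kubeflow get virtualservice ml-pipeline-ui -o yaml | grep -i -A2 timeout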