We are using Triton to serve a BLS model. Inside this BLS model's model.py, one function uses the Triton gRPC client to query another model hosted on the same server. That call works correctly; the problem appears later in the execute function. After the final output tensor is extracted as a NumPy array, I construct a pb_utils.Tensor from it and append it to an InferenceResponse, as documented. During the pb_utils.Tensor construction, a segmentation fault occurs.
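For context, here is a minimal sketch of the pattern described above (the helper name `prepare_output` and the output name "OUTPUT" are my assumptions, not taken from the issue). One commonly suggested trigger for crashes at the pb_utils.Tensor boundary is handing it a non-contiguous view or an unexpected dtype, so the pure-NumPy helper normalizes the array to a C-contiguous uint8 buffer first:

```python
import numpy as np


def prepare_output(arr: np.ndarray) -> np.ndarray:
    """Return a C-contiguous uint8 copy of `arr`, suitable for wrapping
    in a pb_utils.Tensor. (Helper name is hypothetical.)"""
    return np.ascontiguousarray(arr, dtype=np.uint8)


# Inside the BLS model's execute() this would be used roughly as:
#
#   out = prepare_output(final_output)             # (1536, 1536) uint8
#   out_tensor = pb_utils.Tensor("OUTPUT", out)    # "OUTPUT" is assumed
#   responses.append(
#       pb_utils.InferenceResponse(output_tensors=[out_tensor]))

if __name__ == "__main__":
    # Simulate a non-contiguous slice like one taken from a larger result.
    big = np.zeros((1536, 3072), dtype=np.float32)
    view = big[:, ::2]  # non-contiguous float32 view
    out = prepare_output(view)
    print(out.flags["C_CONTIGUOUS"], out.dtype, out.shape)
```

This is a sketch of a defensive check, not a claim about what the original execute function does.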
My Triton Inference Server Docker image is 24.07-py3, with CUDA 12.5.
Error stack:
Output tensor has been extracted (1536, 1536)
Final output type: <class 'numpy.ndarray'>
Final output shape: (1536, 1536), dtype: uint8
Signal (11) received.
0# 0x00005C1039DE580D in tritonserver
1# 0x0000758CBE932520 in /usr/lib/x86_64-linux-gnu/libc.so.6
2# 0x0000758CB535CBD5 in /opt/tritonserver/backends/python/libtriton_python.so
3# 0x0000758CB53604F2 in /opt/tritonserver/backends/python/libtriton_python.so
4# 0x0000758CB5360943 in /opt/tritonserver/backends/python/libtriton_python.so
5# 0x0000758CB533DFF7 in /opt/tritonserver/backends/python/libtriton_python.so
6# TRITONBACKEND_ModelInstanceExecute in /opt/tritonserver/backends/python/libtriton_python.so
7# 0x0000758CBD311944 in /opt/tritonserver/bin/../lib/libtritonserver.so
8# 0x0000758CBD311CBB in /opt/tritonserver/bin/../lib/libtritonserver.so
9# 0x0000758CBD42D23D in /opt/tritonserver/bin/../lib/libtritonserver.so
10# 0x0000758CBD3160F4 in /opt/tritonserver/bin/../lib/libtritonserver.so
11# 0x0000758CBF01A253 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
12# 0x0000758CBE984AC3 in /usr/lib/x86_64-linux-gnu/libc.so.6
13# clone in /usr/lib/x86_64-linux-gnu/libc.so.6
I've confirmed that the dtype and tensor shape declared in the BLS model's config.pbtxt align with what is being sent from the execute function.
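One way to make that "config matches the payload" check explicit in code rather than by eye is a small assertion helper (a sketch; the expected dtype and shape below are taken from the log output above, and the helper name is mine):

```python
import numpy as np


def check_output(arr: np.ndarray, expected_dtype, expected_shape) -> np.ndarray:
    """Raise with a readable message if `arr` does not match what
    config.pbtxt declares, instead of letting the backend crash."""
    if arr.dtype != np.dtype(expected_dtype):
        raise TypeError(f"dtype {arr.dtype} != expected {np.dtype(expected_dtype)}")
    if tuple(arr.shape) != tuple(expected_shape):
        raise ValueError(f"shape {arr.shape} != expected {tuple(expected_shape)}")
    return arr


# Matches the logged output: shape (1536, 1536), dtype uint8.
final_output = np.zeros((1536, 1536), dtype=np.uint8)
check_output(final_output, np.uint8, (1536, 1536))
```

Calling this right before the pb_utils.Tensor construction turns a silent mismatch into a Python exception with a message, which narrows down whether the crash is data-related at all.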
I reference my custom execution environment in config.pbtxt (the tarball), and I have a custom triton_python_backend_stub. Could you assist me in finding the source of the error?
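Since a custom triton_python_backend_stub plus a packed execution environment is in play, a frequent cause of segfaults in this setup is a Python minor-version mismatch between the stub build, the packed environment, and the container. A rough way to compare the three, with hypothetical paths (substitute your model repository layout):

```shell
# 1) Python inside the 24.07 container:
python3 --version

# 2) Python the custom stub was linked against (look for libpython3.X),
#    path is an example:
# ldd models/<model>/1/triton_python_backend_stub | grep -i python

# 3) Python inside the packed environment tarball referenced by
#    config.pbtxt (path is an example):
# tar -tzf models/<model>/env.tar.gz | grep 'bin/python3\.'

# All three minor versions should match.
```

If the stub was built against a different Python minor version than the one in the tarball, crashes inside libtriton_python.so like the stack above are plausible; rebuilding the stub against the matching version is worth trying. This is a diagnostic suggestion, not a confirmed root cause.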