SP1_PROVER=cuda fibonacci example not working on A100, T4 or RTX3090 #2021

Open
2 tasks done
eduadiez opened this issue Feb 3, 2025 · 1 comment

Comments

eduadiez commented Feb 3, 2025

Component

cargo prove CLI/sp1up

Have you ensured that all of these are up to date?

  • SP1 SDK
  • cargo prove CLI/sp1up

What version of SP1 SDK are you on?

No response

What version of the cargo prove CLI are you on?

No response

Operating System

Linux (Ubuntu)

Describe the bug

I ran the default fibonacci example, unchanged, on several GPUs (A100, T4, and RTX 3090), but I have only been able to get it to work on L4 cards.

$ SP1_PROVER=cuda RUST_LOG=info cargo run --release -- --prove
warning: [email protected]: rustc +succinct --version: "rustc 1.82.0-dev\n"
warning: [email protected]: fibonacci-program built at 2025-02-03 15:35:20
    Finished `release` profile [optimized] target(s) in 0.37s
     Running `/home/edu/fibonacci/target/release/fibonacci --prove`
2025-02-03T15:39:26.305181Z  INFO vk verification: true
2025-02-03T15:39:35.263222Z  INFO vk verification: true
n: 20
fatal runtime error: Rust cannot catch foreign exceptions
2025-02-03T15:39:39.390994Z  INFO setup: close time.busy=16.3ms time.idle=11.1µs
thread 'main' panicked at /home/edu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sp1-cuda-4.0.0/src/lib.rs:253:77:
called `Result::unwrap()` on an `Err` value: ReqwestError(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("localhost")), port: Some(3000), path: "/twirp/api.ProverService/Setup", query: None, fragment: None }, source: hyper_util::client::legacy::Error(SendRequest, hyper::Error(IncompleteMessage)) })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
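
The `IncompleteMessage` error above means the connection to `localhost:3000` closed before a full response arrived. That is consistent with the process serving that port exiting during the `Setup` call, although the log alone does not confirm it. A minimal sketch for checking whether anything is still listening on that port after the panic follows; it uses only the Rust standard library, and the helper name `port_is_open` is hypothetical, not part of the SP1 SDK.

```rust
use std::net::{SocketAddr, TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// Hypothetical diagnostic helper; not part of sp1-cuda or the SP1 SDK.
fn port_is_open(addr: &str, timeout: Duration) -> bool {
    // Resolve the address first; "localhost" may map to IPv4 and IPv6.
    let Ok(addrs) = addr.to_socket_addrs() else {
        return false;
    };
    // Succeed if any resolved address accepts the connection.
    addrs
        .into_iter()
        .any(|a: SocketAddr| TcpStream::connect_timeout(&a, timeout).is_ok())
}

fn main() {
    // Host and port are taken from the panic message in the log above.
    let addr = "localhost:3000";
    if port_is_open(addr, Duration::from_secs(2)) {
        println!("{addr} is accepting connections");
    } else {
        println!("{addr} is not accepting connections; the prover server may have exited");
    }
}
```

If the port is closed right after the panic, the failure is likely on the server side of the request rather than in the client's `unwrap()`, so the server's own logs would be the next place to look.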

Are there any limitations on which GPU architectures are supported? I couldn't find anything about this in the documentation.

I'm using an a2-highgpu-4g instance:

$ nvidia-smi 
Mon Feb  3 15:41:03 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:00:04.0 Off |                    0 |
| N/A   31C    P0             53W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          Off |   00000000:00:05.0 Off |                    0 |
| N/A   30C    P0             54W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          Off |   00000000:00:06.0 Off |                    0 |
| N/A   30C    P0             54W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          Off |   00000000:00:07.0 Off |                    0 |
| N/A   32C    P0             53W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

JuArce commented Feb 7, 2025

Hi! I am having the same problem with an A6000
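
To make the architecture question concrete, the sketch below tabulates the GPUs mentioned in this thread against their CUDA compute capabilities as published by NVIDIA. It assumes "A6000" means the Ampere-based RTX A6000. The `works` flags come only from the reports in this thread. The table shows a correlation at most; it does not establish that compute capability is the cause of the failure.

```rust
/// One GPU report from this thread.
struct Gpu {
    name: &'static str,
    /// CUDA compute capability as (major, minor), per NVIDIA's published table.
    compute_capability: (u32, u32),
    /// Whether the fibonacci example was reported as working in this thread.
    works: bool,
}

const REPORTED: [Gpu; 5] = [
    Gpu { name: "T4", compute_capability: (7, 5), works: false },
    Gpu { name: "A100", compute_capability: (8, 0), works: false },
    Gpu { name: "RTX 3090", compute_capability: (8, 6), works: false },
    // Assumption: "A6000" refers to the Ampere RTX A6000.
    Gpu { name: "RTX A6000", compute_capability: (8, 6), works: false },
    Gpu { name: "L4", compute_capability: (8, 9), works: true },
];

/// Lowest compute capability among GPUs reported as working.
fn min_working_capability(gpus: &[Gpu]) -> Option<(u32, u32)> {
    gpus.iter().filter(|g| g.works).map(|g| g.compute_capability).min()
}

/// Highest compute capability among GPUs reported as failing.
fn max_failing_capability(gpus: &[Gpu]) -> Option<(u32, u32)> {
    gpus.iter().filter(|g| !g.works).map(|g| g.compute_capability).max()
}

fn main() {
    for g in &REPORTED {
        let (major, minor) = g.compute_capability;
        let status = if g.works { "works" } else { "fails" };
        println!("{:<10} sm_{}{}  {}", g.name, major, minor, status);
    }
    println!("lowest working:  {:?}", min_working_capability(&REPORTED));
    println!("highest failing: {:?}", max_failing_capability(&REPORTED));
}
```

In these reports, every failing card has a lower compute capability than the one working card. A report from another GPU at 8.9 or above, or a statement from the maintainers, would be needed to confirm or rule out an architecture requirement.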
