quinn stream is 10x slower than TCP/UDP, what could be wrong? #2153

Open
tubzby opened this issue Feb 12, 2025 · 11 comments
Comments

@tubzby commented Feb 12, 2025

I have a 5G network with a download cap of about 100 Mb/s and a server with a public IP. I noticed that the download speed over a QUIC connection was much lower than I expected, only around 4-5 Mb/s, so I ran a test:

  1. Create a 100MB file on the server.
  2. Compare the download speed using tokio TcpStream, tokio UdpSocket, and quinn SendStream/RecvStream.

The results:

  1. Tokio TcpStream
     read: 104788840 bytes in 15 seconds, speed: 6985922 bytes/s
  2. Tokio UdpSocket
     read: 104096000 bytes in 13 seconds, speed: 8007384 bytes/s
  3. quinn
     read: 103257760 bytes in 186 seconds, speed: 555149 bytes/s

Part of the program:

struct RawQuicServer {
    arg: ServerArg,
}

impl RawQuicServer {
    async fn run(&mut self) -> anyhow::Result<()> {
        println!("listening at: {}", self.arg.port);

        let addr = format!("0.0.0.0:{}", self.arg.port);
        let mut acceptor = MyQAcceptor::new(&addr)?;
        loop {
            let conn = acceptor.accept().await?;
            println!("got connection from: {}", conn.remote_address());
            self.serve(conn).await?;
        }
    }

    async fn serve(&mut self, conn: quinn::Connection) -> anyhow::Result<()> {
        // Send the file in 1400-byte chunks (roughly one packet's worth of payload).
        let mut buf = vec![0u8; 1400];

        let (mut tx, _rx) = conn.accept_bi().await?;
        let mut file = File::open(&self.arg.file).await?;
        while let Ok(n) = file.read(&mut buf).await {
            if n == 0 {
                break;
            }
            tx.write(&buf[..n]).await.context("write failed")?;
        }
        println!("finished");
        Ok(())
    }
}

struct RawQuicClient {
    arg: ClientArg,
}

impl RawQuicClient {
    async fn run(&mut self) -> anyhow::Result<()> {
        let server = format!("{}:{}", self.arg.server, self.arg.port);

        let client_cfg = configure_client()?;
        let mut endpoint = quinn::Endpoint::client("0.0.0.0:0".parse()?)?;
        endpoint.set_default_client_config(client_cfg);

        let addr = server.parse()?;
        let conn = endpoint.connect(addr, "localhost")?.await?;

        println!("connected");

        let (mut tx, mut rx) = conn.open_bi().await?;
        tx.write("knock".as_bytes()).await?;

        let mut buf = vec![0u8; 1400];
        let mut read = 0;
        let start = Instant::now();
        let mut last = Instant::now();

        let mut tick = interval(Duration::from_millis(10));
        loop {
            tokio::select! {
                result = rx.read(&mut buf) => {
                    let result = match result {
                        Ok(x) => x,
                        Err(_) => break,
                    };
                    let n = match result {
                        Some(x) => x,
                        None => break,
                    };
                    if n == 0 {
                        println!("read finished");
                        break;
                    }
                    read += n;
                },
                _ = tick.tick() => {
                    if last.elapsed() >= Duration::from_secs(2) {
                        println!("read: {read} bytes");
                        last = Instant::now();
                    }
                }
            }
        }

        let elapsed = start.elapsed().as_secs();
        if elapsed == 0 {
            panic!("fail too soon");
        }
        println!(
            "read: {} bytes in {} seconds, speed: {} bytes/s",
            read,
            elapsed,
            read as u64 / elapsed
        );
        Ok(())
    }
}
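One detail worth double-checking in the server loop above, independent of the throughput question: quinn's SendStream::write may accept only a prefix of the buffer under flow control and returns the number of bytes it actually took, so write_all is the safer call when streaming a whole file, e.g.:

            // write_all keeps writing until the entire chunk has been accepted by the stream.
            tx.write_all(&buf[..n]).await.context("write failed")?;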

The Tokio TCP/UDP code is more or less the same.

A few notes:

  1. I added some flow control to the UDP sender so that it matches the available bandwidth.
  2. The test program is built in release mode.
  3. All of the TCP/UDP/QUIC code is in a single .rs file.
  4. iperf3 on the client:
iperf3 -c SERVER_ADDRESS -i 1 -t 10 -b 90m -u -R
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  10.7 MBytes  90.1 Mbits/sec  0.152 ms  4/8120 (0.049%)
[  5]   1.00-2.00   sec  8.51 MBytes  71.4 Mbits/sec  0.117 ms  0/6431 (0%)
[  5]   2.00-3.00   sec  12.9 MBytes   108 Mbits/sec  0.147 ms  0/9758 (0%)
[  5]   3.00-4.00   sec  10.8 MBytes  90.3 Mbits/sec  0.124 ms  0/8129 (0%)
[  5]   4.00-5.00   sec  10.7 MBytes  89.9 Mbits/sec  0.118 ms  13/8105 (0.16%)
[  5]   5.00-6.00   sec  10.7 MBytes  89.5 Mbits/sec  0.123 ms  0/8062 (0%)
[  5]   6.00-7.00   sec  10.8 MBytes  90.5 Mbits/sec  0.114 ms  0/8149 (0%)
[  5]   7.00-8.00   sec  10.6 MBytes  89.2 Mbits/sec  0.138 ms  60/8097 (0.74%)
[  5]   8.00-9.00   sec  10.7 MBytes  90.0 Mbits/sec  0.134 ms  0/8102 (0%)
[  5]   9.00-10.00  sec  10.7 MBytes  90.0 Mbits/sec  0.118 ms  8/8116 (0.099%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.04  sec   108 MBytes  90.0 Mbits/sec  0.000 ms  0/81410 (0%)  sender
[  5]   0.00-10.00  sec   107 MBytes  89.9 Mbits/sec  0.118 ms  85/81069 (0.1%)  receiver
@djc (Member) commented Feb 12, 2025

Quinn includes encryption via TLS, while TCP/UDP don't.

@tubzby (Author) commented Feb 13, 2025

I tested on another client (same hardware) over a wired connection; it reaches 6462908 bytes/s, a bit lower than TCP, which is reasonable.

Can a high RTT be the cause? On 5G it is about 100 ms, compared to 18 ms on the wired connection. If so, can I increase quinn's snd_buf/recv_buf?
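As far as I know, quinn has no dedicated snd_buf/recv_buf knob of its own; the closest equivalents are the OS buffers on the UDP socket you hand to the Endpoint, plus quinn's own flow-control windows in TransportConfig (which come up below). A rough sketch of the former, assuming quinn 0.11 and the socket2 crate; the sizes are illustrative:

use std::sync::Arc;
use socket2::{Domain, Socket, Type};

// Enlarge the kernel's UDP buffers before handing the socket to quinn.
// (On Linux the effective size is capped by net.core.rmem_max / wmem_max.)
let sock = Socket::new(Domain::IPV4, Type::DGRAM, None)?;
sock.set_recv_buffer_size(4 * 1024 * 1024)?;
sock.set_send_buffer_size(4 * 1024 * 1024)?;
sock.bind(&"0.0.0.0:0".parse::<std::net::SocketAddr>()?.into())?;

// quinn 0.11-style constructor that accepts a pre-configured std UdpSocket.
let endpoint = quinn::Endpoint::new(
    quinn::EndpointConfig::default(),
    None, // no server config on the client side
    sock.into(),
    Arc::new(quinn::TokioRuntime),
)?;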

@thomaseizinger (Contributor) commented
> Can a high RTT be the cause?

In theory, yes. See https://en.wikipedia.org/wiki/Bandwidth-delay_product.
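For concrete numbers here: 100 Mbit/s × 0.1 s RTT = 10 Mbit ≈ 1.25 MB has to be in flight at all times to keep the link full, and throughput is capped at roughly window / RTT whenever the flow-control or congestion window is smaller than that.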

@tubzby (Author) commented Feb 13, 2025

> Can a high RTT be the cause?
>
> In theory, yes. See https://en.wikipedia.org/wiki/Bandwidth-delay_product.

Thanks for the link. It seems TCP can scale up its window size; can QUIC do the same?

@Ralith (Collaborator) commented Feb 13, 2025

The various window and buffer_size parameters to TransportConfig may be of interest. The default settings are specifically designed to be adequate for 100Mbps at 100ms, so if you do find some settings that work better, please let us know!

Another possibility to explore is packet loss.

@tubzby (Author) commented Feb 13, 2025

How can I observe packet loss through quinn?

----EDIT 1----

There's ConnectionStats.
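For reference, a minimal sketch of sampling it, assuming quinn 0.11's field layout (names may differ slightly between versions):

// Connection::stats() returns a snapshot of path- and frame-level counters.
let stats = conn.stats();
println!(
    "rtt: {:?}, cwnd: {}, lost_packets: {}, congestion_events: {}",
    stats.path.rtt, stats.path.cwnd, stats.path.lost_packets, stats.path.congestion_events,
);

Watching path.rtt and path.cwnd alongside lost_packets is usually the quickest way to see whether congestion control or flow control is the limiter.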

@tubzby (Author) commented Feb 14, 2025

I have increased the window size and buffer size (on both the server and the client), but it didn't help:

fn set_transport_config(config: &mut TransportConfig) {
    // STREAM_RWND: 12500 * 100
    // stream_receive_window: STREAM_RWND
    // send_window: 8 * STREAM_RWND
    // crypto_buffer_size: 16 * 1024

    let stream_rwnd = 12500 * 100;
    // increase wnd *2
    let stream_rwnd = stream_rwnd * 2;
    let crypto_buffer_size = 16 * 1024;
    // increase *2
    let crypto_buffer_size = crypto_buffer_size * 2;

    config.stream_receive_window((stream_rwnd as u32).into());
    config.send_window(8 * stream_rwnd);
    config.crypto_buffer_size(crypto_buffer_size);
}

Lost packets on the client stay quite low (the print interval is 2 seconds):

read: 37927414 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 38407326 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 38779318 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 39090255 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 39523313 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 39900975 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 40196299 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 40476012 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 40768502 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 41289600 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 41898718 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 42361589 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 42570305 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 42922423 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 43363415 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 43806485 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 44256564 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 44702395 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 45058779 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 45454929 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 46045595 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 46507047 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 47082105 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 47803392 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 48538892 bytes, lost_packets: 3, lost_plpmtud_probes: 0
read: 49142337 bytes, lost_packets: 3, lost_plpmtud_probes: 0

The test process consumes about 15% of CPU.

@Ralith (Collaborator) commented Feb 14, 2025

You need to configure connection-level window sizes, not just stream-level. The smaller of the two limits applies.
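A sketch of setting both levels, assuming the quinn 0.11 TransportConfig API; the multiples of the ~1.25 MB bandwidth-delay product are illustrative:

let mut transport = quinn::TransportConfig::default();
// Per-stream receive window: how much a single stream may have in flight.
transport.stream_receive_window((5_000_000u32).into());
// Connection-wide receive window: applies across all streams; the smaller of
// the two limits is what actually gates the sender.
transport.receive_window((10_000_000u32).into());
// How much unacknowledged data the sender itself is willing to buffer.
transport.send_window(10_000_000);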

@tubzby (Author) commented Feb 14, 2025

> You need to configure connection-level window sizes, not just stream-level. The smaller of the two limits applies.

What do you mean by stream-level? I don't see any window/buffer-related API on SendStream or RecvStream.

I set the TransportConfig on both ServerConfig and ClientConfig before the connection was made.
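For completeness, installing the config looks roughly like this (a sketch; set_transport_config and configure_client are the functions from the snippets above, and certs/key stand in for whatever the server already loads):

let mut transport = quinn::TransportConfig::default();
set_transport_config(&mut transport);
let transport = Arc::new(transport);

// Client side: attach before connect().
let mut client_cfg = configure_client()?;
client_cfg.transport_config(transport.clone());
endpoint.set_default_client_config(client_cfg);

// Server side: attach before the endpoint starts accepting connections.
let mut server_cfg = quinn::ServerConfig::with_single_cert(certs, key)?;
server_cfg.transport_config(transport.clone());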

@tubzby (Author) commented Feb 14, 2025

Sorry, I think you mean TransportConfig::receive_window.

It is initialized to VarInt::MAX, so I had ignored that part.

@Ralith (Collaborator) commented Feb 14, 2025

Ah, right, that should be fine then, assuming you're installing the config correctly. Next step would be to investigate what precisely it's waiting for. I don't have the bandwidth to do this personally right now, but you could approach the problem by investigating a decrypted packet capture, and/or digging into quinn_proto::Connection::poll_transmit and poll_timeout to see what's preventing data from being transmitted more often.
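For the decrypted capture, one low-effort option (a sketch, assuming configure_client() builds the rustls::ClientConfig by hand) is to enable TLS key logging and point Wireshark at the file named by the SSLKEYLOGFILE environment variable:

let mut tls = rustls::ClientConfig::builder()
    .with_root_certificates(roots) // `roots` = whatever trust store configure_client already uses
    .with_no_client_auth();
// Write per-connection TLS secrets to the file named by SSLKEYLOGFILE,
// which Wireshark can then use to decrypt the QUIC capture.
tls.key_log = Arc::new(rustls::KeyLogFile::new());
let client_cfg = quinn::ClientConfig::new(Arc::new(
    quinn::crypto::rustls::QuicClientConfig::try_from(tls)?,
));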
