fetch fail: cause: AggregateError [ETIMEDOUT]: #2777

Open
guotie opened this issue Feb 18, 2024 · 27 comments · May be fixed by #3738
Labels
bug Something isn't working

Comments

@guotie

guotie commented Feb 18, 2024

Bug Description

My code is TypeScript. I run it in two ways:

  1. with Bun
  2. compiled to JS and run with Node v21

When I run it with Bun, no error occurs;
when I run it with Node, fetch failed occurs extremely frequently.

The request is an Apollo GraphQL request.
The proxy is a local HTTP proxy:

https_proxy=http://127.0.0.1:7890 http_proxy=http://127.0.0.1:7890 all_proxy=socks5://127.0.0.1:7890

Logs & Screenshots

ApolloError: fetch failed
    at new ApolloError (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/errors/errors.cjs:33:28)
    at /Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/core/core.cjs:2017:78
    at both (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1347:31)
    at /Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1338:72
    at new Promise (<anonymous>)
    at Object.then (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1338:24)
    at Object.error (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/@apollo/client/utilities/utilities.cjs:1349:49)
    at notifySubscription (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/zen-observable/lib/Observable.js:140:18)
    at onNotify (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/zen-observable/lib/Observable.js:179:3)
    at SubscriptionObserver.error (/Users/guotie/guotie/product/ZUGGER/poolapi/node_modules/zen-observable/lib/Observable.js:240:7) {
  graphQLErrors: [],
  protocolErrors: [],
  clientErrors: [],
  networkError: TypeError: fetch failed
      at node:internal/deps/undici/undici:12442:11
      at processTicksAndRejections (node:internal/process/task_queues:95:5)
      at runNextTicks (node:internal/process/task_queues:64:3)
      at listOnTimeout (node:internal/timers:540:9)
      at process.processTimers (node:internal/timers:514:7) {
    cause: AggregateError [ETIMEDOUT]: 
        at internalConnectMultiple (node:net:1116:18)
        at internalConnectMultiple (node:net:1184:5)
        at Timeout.internalConnectMultipleTimeout (node:net:1707:5)
        at listOnTimeout (node:internal/timers:575:11)
        at process.processTimers (node:internal/timers:514:7) {
      code: 'ETIMEDOUT',
      [errors]: [Array]
    }
  },
  extraInfo: undefined
}

Environment

Mac M1 Sonoma 14.2

Nodejs v21

Additional context

@guotie guotie added the bug Something isn't working label Feb 18, 2024
@mcollina
Member

Thanks for reporting!

Can you provide steps to reproduce? We often need a reproducible example, e.g. some code that allows someone else to recreate your problem by just copying and pasting it. If it involves more than a couple of different files, create a new repository on GitHub and add a link to that.

@guotie
Author

guotie commented Feb 18, 2024

repo is here: https://github.com/guotie/fetch-failed

I think the problem is a timeout.

When the network is OK, it rarely throws fetch failed; when the network is busy, the error occurs more frequently.

@smisra3

smisra3 commented Mar 14, 2024

Facing a similar issue

@ahmadxgani

Same issue. Does anyone know the solution?

@ahmadxgani

If internet connectivity were the issue, why does curl always work and never time out?

@realyukii

It seems this error only appears in certain environments or on certain devices? It's hard to reproduce, but I think the bug really exists.

Someone already described the same issue here too: #2990

@ahmadxgani

Or with certain hosts? I faced this issue when trying to fetch the Telegram API.

@ahmadxgani

Here's the endpoint that I'm trying to fetch:

curl "https://api.telegram.org/bot7003873933:AAFKl0LwWViMJIA34-qjbTh7nZwcNQr2hFs/getFile?file_id=CAACAgEAAxUAAWYT6IXJGTzY4S96PCbyqyO7fBXXAAIJEgACkweVC56njKMcTovTNAQ"

@metcoder95
Member

Can you provide a Minimum Reproducible Example so we can support you better?

@realyukii

I've provided the Minimum Reproducible Example on my Gist, as you suggested.

If you have a strong or reliable internet connection, consider simulating slow connectivity to see if the error replicates. After all, ETIMEDOUT errors are more likely to occur under limited bandwidth conditions.

Interestingly, while fetch sometimes throws this error, curl seems to be able to avoid it in this scenario.

@metcoder95
Member

Hmm, this does not seem like an undici error per se but rather a different way of handling connect timeouts.

The errors shown by the example and the roots of the issue mostly point to the initial TCP connection (including TLS), meaning that undici timed out before the server could finish the initial connect operation.

You can attempt to extend the overall timeout while creating a custom Agent (See https://undici.nodejs.org/#/docs/api/Client?id=parameter-clientoptions) and test if that solves the timeout issue, which seems related directly to the network conditions.

You can also wrap it with the RetryAgent to automatically retry on these errors.
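
For callers who stay on Node's built-in fetch, a hand-rolled retry loop is one possible stand-in. This is not undici's RetryAgent, only a sketch of the same idea; the helper name, the retry count, and the delay are assumptions made for illustration:

```javascript
// Retry an async operation when its failure looks like a connect timeout.
// `retryOnConnectTimeout` and its options are hypothetical names for this sketch.
async function retryOnConnectTimeout(operation, { retries = 3, delayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // fetch wraps the network error in `cause`; undici reports its own code directly.
      const code = err?.cause?.code ?? err?.code;
      const retryable = code === "ETIMEDOUT" || code === "UND_ERR_CONNECT_TIMEOUT";
      if (!retryable || attempt >= retries) throw err;
      // Linear backoff: wait a little longer after each failed attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
    }
  }
}

// Usage inside an async function (placeholder URL):
// const res = await retryOnConnectTimeout(() => fetch("https://example.com"));
```

Errors that are not connect timeouts are rethrown immediately, so genuine failures are not hidden behind retries.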

@realyukii

The docs say:

bodyTimeout -  Defaults to 300 seconds.
headersTimeout -  Defaults to 300 seconds.

Why does it close the connection so early if the default is 300 seconds? Does Node's fetch use undici differently internally?

By the way, an unrelated question: why can't I access undici directly (require('undici')) if it is used in Node's fetch implementation? Is there a way to expose it, so I can use the RetryAgent class without installing the undici dependency?

@metcoder95
Member

The timeouts you mention are applied at the HTTP level, whereas the timeout I'm referring to is tied to the TCP handshake, which defaults to 10 s (lower than the body and headers timeouts).

Sadly no, you'll need to install undici to make use of the RetryAgent.

@realyukii

Thanks for your assistance! It's confirmed that increasing the connection timeout resolved the issue in my scenario ^^

@KhafraDev
Member

I am able to reliably repro it when fetching too many urls at once. nodejs/node-core-utils#810

@su-angel

su-angel commented Jul 25, 2024

I've been working on the same problem for 1 day.
Some comments talk about connection, but I don't think that's the case, because I have a very good connection, and it's not a device problem.

Here's the solution:

Check your DNS configuration or clear your DNS cache; make sure your router points to a real DNS server, or change your DNS settings to target one.

The observation is that on my online server the code doesn't throw this AggregateError [ETIMEDOUT] error, but on my local machine it always throws this error only when I'm at home on my local network.

Looking further, the main errors thrown are ETIMEDOUT and ENETUNREACH. This usually occurs when the DNS module is unable to resolve the IP address.

node:internal/deps/undici/undici:11754
    Error.captureStackTrace(err, this);
          ^

TypeError: fetch failed
    at node:internal/deps/undici/undici:11754:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: AggregateError [ETIMEDOUT]: 
      at internalConnectMultiple (node:net:1114:18)
      at internalConnectMultiple (node:net:1177:5)
      at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
      at listOnTimeout (node:internal/timers:575:11)
      at process.processTimers (node:internal/timers:514:7) {
    code: 'ETIMEDOUT',
    [errors]: [
      Error: connect ETIMEDOUT <IP:PORT>
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '3.125.177.232',
        port: 443
      },
      Error: connect ENETUNREACH <IP6:PORT> - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '64:ff9b::37d:b1e8',
        port: 443
      }
    ]
  }
}

If the request succeeded once, the IP addresses in the error log may be ones previously cached on your machine.

My hypothesis is that on a network or router without a working DNS server, Node's dns module can no longer resolve the IP address, doesn't use the cached one, and throws errors.

I'll look into this further, but that's the solution.
I changed the DNS server in /etc/resolv.conf to 8.8.8.8 or 1.1.1.1, and everything's back to normal!
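
The per-address errors in the log above are nested under the cause of the thrown TypeError, and they are easy to miss when the console collapses them to [Array]. A small sketch for flattening them into readable lines; `describeFetchFailure` is a hypothetical helper name, not part of Node.js or undici:

```javascript
// Flatten the errors behind a "fetch failed" TypeError into one line per attempt.
// `describeFetchFailure` is a hypothetical helper, not part of Node.js or undici.
function describeFetchFailure(err) {
  const cause = err && err.cause;
  if (!cause) return [String(err && err.message)];
  // With several candidate addresses, the cause is an AggregateError
  // that carries one error per connection attempt in `errors`.
  const attempts = Array.isArray(cause.errors) ? cause.errors : [cause];
  return attempts.map((e) =>
    `${e.code ?? "UNKNOWN"} ${e.syscall ?? ""} ${e.address ?? ""}:${e.port ?? ""}`.trim()
  );
}

// Usage inside an async function (placeholder URL):
// try { await fetch("https://example.com"); }
// catch (err) { console.error(err.message, describeFetchFailure(err)); }
```

Seeing one line per address makes it clearer whether the IPv4 attempt, the IPv6 attempt, or both failed.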

@SukkaW
Contributor

SukkaW commented Oct 10, 2024

I have encountered the same error when experimental HTTP/2 support is enabled with massive parallel requests (more than 32 at once, toward about 20 different domains). With HTTP/2 support disabled the error is gone.
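
If the failures only show up under heavy parallelism, capping the number of requests in flight may be worth trying. This mitigation is an assumption rather than something confirmed in this thread, and `mapWithConcurrency` is a hypothetical helper written for illustration:

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// `mapWithConcurrency` is a hypothetical helper, not part of Node.js or undici.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  async function runner() {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so no two runners share an index
      results[index] = await worker(items[index], index);
    }
  }

  // Start up to `limit` runners; each pulls the next item when it finishes one.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Usage inside an async function (`urls` is a placeholder array of URLs):
// const responses = await mapWithConcurrency(urls, 8, (url) => fetch(url));
```

Results come back in input order, and the first rejected call rejects the whole batch, which mirrors the behavior of Promise.all.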

@Uzlopak
Contributor

Uzlopak commented Oct 10, 2024

Maybe this will also be solved by #3707 by @metcoder95.

@gregonarash

@guotie @Uzlopak I ran into the same issue, only to recall that I had already resolved it once but did not remember the code.

#2990

On some networks (e.g. mine today, an LTE-tethered connection in a country far from my database provider), the default network-family-autoselection-attempt-timeout of 250 ms (the time Node allows each connection attempt before trying the next resolved address) might not be enough, which results in ETIMEDOUT.

Doubling that time solves all issues for me:
export NODE_OPTIONS="--network-family-autoselection-attempt-timeout=500"
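
The same timeout can also be set from code instead of through NODE_OPTIONS. A minimal sketch, assuming a Node.js release whose node:net module provides these functions; the 500 ms value is simply the one quoted above, and the call only affects connections opened after it runs:

```javascript
const net = require("node:net");

// Raise the per-attempt timeout used by network family autoselection.
// 500 ms mirrors the NODE_OPTIONS value above; tune it for your network.
net.setDefaultAutoSelectFamilyAttemptTimeout(500);

// Read the value back to confirm what new connections will use.
console.log(net.getDefaultAutoSelectFamilyAttemptTimeout());
```

Placing this at the very top of the entry file keeps it ahead of any fetch call.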

@metcoder95
Member

Is this still reproducible, or can it be closed now that #3707 has landed?

@gregonarash

gregonarash commented Jan 14, 2025

@metcoder95 yes it is:

Image

Tested on "undici": "^7.2.1"

import { request } from "undici";

const { statusCode, headers, trailers, body } = await request("https://airtable.com");

console.log("response received", statusCode);
console.log("headers", headers);

for await (const data of body) {
  console.log("data", data);
}

console.log("trailers", trailers);

To be able to reproduce you would need some combination of:

  • have a really high latency network
  • be physically far away from the US (or from the DNS server?)
  • be trying to fetch URL that resolves to both IPv4 and IPv6

OR ...
cut down the default network family selection time to 25 ms

export NODE_OPTIONS='--network-family-autoselection-attempt-timeout=25'

I believe this is purely related to Node's default of 250 ms, which is too short on some networks. There was some debate about whether it is too short, but it was closed without changes.
nodejs/node#54359

@metcoder95
Member

I'd rather say this is not really an undici or Node.js problem, since it can be sorted out by extending the autoselection attempt timeout; it might be worth documenting, though.

@gregonarash

@metcoder95 True. It's debatable whether this is the right default in Node.js (under the same conditions curl works and fetch does not), but it's not really a bug.

Either way, I expect issues will continue to be opened in upstream repositories like undici and nextjs for fetch fail - AggregateError [ETIMEDOUT]. I replied to a couple of threads like that to boost visibility. I have also opened a pull request to add a comment about this to the docs, but it seems to have been removed recently.

@metcoder95
Member

Do you have the PR at hand? Why was it removed?

If seeking to extend the timeout, I'd suggest opening an issue in Node.js with the reference to the other issues you've opened and the feedback you've got to see where it lands.

For this issue, having the comment added into the documentation about high-latency networks should be ok.

@gregonarash

@metcoder95
I thought it was merged, but I'm not sure what happened:
#3738

@metcoder95
Member

Can you just address the recommendation there?

@gregonarash

It is addressed, the PR was also approved by you. Not sure how your merging process works.
