Don't resend envelopes that get rejected by the server #3938

Merged: jamescrosswell merged 5 commits into main from broken-pipe on Feb 9, 2025

Conversation

@jamescrosswell (Collaborator) commented on Feb 5, 2025

Resolves #3804:

Windows

Interestingly, on Windows the connection closes gracefully and we get a nice orderly response code back from the server that we can process:

Error: HttpTransport: Sentry rejected the envelope '1c7a93c84b124a8395e982d97670e6a3'. Status code: RequestEntityTooLarge. Error detail: 413 Payload Too Large .

Given that, this PR may actually be working around an issue in the .NET Runtime on macOS.
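
For context, here is a minimal sketch of what a "rejected by server" check along the lines of the IsRejectedByServer guard used below could look like. The exact exception shapes (an HttpRequestException carrying a 413 status on Windows, a broken-pipe style IOException/SocketException on macOS) are assumptions inferred from the description above and the branch name, not the merged implementation:

    using System;
    using System.IO;
    using System.Net;
    using System.Net.Http;
    using System.Net.Sockets;

    internal static class EnvelopeRejectionHeuristics
    {
        // Hypothetical helper (not the SDK's actual code): returns true when an exception
        // looks like the server actively rejected the envelope rather than a transient
        // network failure, so the cached file should be discarded instead of retried.
        public static bool IsRejectedByServer(Exception ex)
        {
            // Windows: the connection closes gracefully and a 413 comes back.
            if (ex is HttpRequestException { StatusCode: HttpStatusCode.RequestEntityTooLarge })
            {
                return true;
            }

            // macOS (assumed): the runtime surfaces a broken pipe instead of a status code,
            // typically as an IOException wrapping a SocketException.
            if (ex is IOException { InnerException: SocketException socketError } &&
                socketError.SocketErrorCode is SocketError.ConnectionReset or SocketError.Shutdown)
            {
                return true;
            }

            return false;
        }
    }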

@jamescrosswell changed the title from "Don't resent envelopes that get rejected by the server" to "Don't resend envelopes that get rejected by the server" on Feb 5, 2025.
@jamescrosswell marked this pull request as ready for review on February 6, 2025 at 00:12.
Review comments on src/Sentry/Internal/Http/CachingTransport.cs:
catch (Exception ex) when (IsRejectedByServer(ex))
{
    _options.ClientReportRecorder.RecordDiscardedEvents(DiscardReason.SendError, envelope);
    _options.LogError(ex, "Failed to send cached envelope: {0}. The envelope is likely too big and will be discarded.", file);
}
Member commented:
Don't we have a specific "too big" client report? It's possible we'll get that from the server though, in which case this is fine.

But when debugging data loss on behalf of customers, seeing "send error" often makes it hard to know what to do next.

Member commented:

Just heard that Relay doesn't report anything, btw. So we must record it ourselves.

@jamescrosswell (Collaborator, Author) commented:

We don't, no. According to the docs:

The following discard reasons are currently defined for discarded_events:

  • queue_overflow: a SDK internal queue (eg: transport queue) overflowed
  • cache_overflow: an SDK internal cache (eg: offline event cache) overflowed
  • buffer_overflow: an SDK internal buffer (eg. breadcrumbs buffer) overflowed
  • ratelimit_backoff: the SDK dropped events because an earlier rate limit instructed the SDK to back off.
  • network_error: events were dropped because of network errors and were not retried.
  • sample_rate: an event was dropped because of the configured sample rate.
  • before_send: an event was dropped in before_send
  • event_processor: an event was dropped by an event processor; may also be used for ignored exceptions / errors
  • send_error: an event was dropped because of an error when sending it (eg: 400 response)
  • internal_sdk_error: an event was dropped due to an internal SDK error (eg: web worker crash)
  • insufficient_data: an event was dropped due to a lack of data in the event (eg: not enough samples in a profile)
  • backpressure: an event was dropped due to downsampling caused by the system being under load

I couldn't see anything that matched up neatly with the situation we're experiencing, so I used send_error. Possibly buffer_overflow would be better here though?

@jamescrosswell (Collaborator, Author) commented:

Since there wasn't anything better, I went with buffer_overflow.
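
In code terms the change is just which reason gets recorded. A hedged before/after sketch, assuming the .NET SDK's DiscardReason type exposes a BufferOverflow member mirroring the buffer_overflow reason in the docs quoted above:

    // Before: the drop was recorded under send_error.
    _options.ClientReportRecorder.RecordDiscardedEvents(DiscardReason.SendError, envelope);

    // After: recorded under buffer_overflow (assuming DiscardReason.BufferOverflow exists),
    // since Relay won't emit a client report for an envelope it never accepted.
    _options.ClientReportRecorder.RecordDiscardedEvents(DiscardReason.BufferOverflow, envelope);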

Another review thread, on this hunk of the diff:

{
    // TODO: Envelopes could end up in an infinite loop here. We should consider implementing some
A Collaborator commented:

You could end up with a lot of items in the queue, preventing new data from getting a chance to be pushed on mobile platforms. Could the queue be sorted to send new data first until a retry/circuit breaker can be added?

@jamescrosswell (Collaborator, Author) commented on Feb 8, 2025:

Priority queues (e.g. prioritise error reports over traces) are one for the todo list.
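
Not part of this PR, but for illustration, a minimal sketch of how cached envelopes could be prioritised, assuming each cached file's item category and creation time are known; the category names and the cache shape here are assumptions, not the SDK's actual API:

    using System;
    using System.Collections.Generic;

    internal static class EnvelopePrioritiser
    {
        // Lower rank = sent first: errors before sessions, sessions before traces.
        private static int Rank(string category) => category switch
        {
            "error" => 0,
            "session" => 1,
            "transaction" => 2,
            _ => 3,
        };

        // Orders cached envelope file paths so higher-priority and newer items are sent first.
        public static IEnumerable<string> Order(
            IEnumerable<(string FilePath, string Category, DateTimeOffset CreatedAt)> cached)
        {
            var queue = new PriorityQueue<string, (int Rank, long Age)>();
            foreach (var item in cached)
            {
                // Negating the timestamp makes newer items dequeue first within the same rank.
                queue.Enqueue(item.FilePath, (Rank(item.Category), -item.CreatedAt.ToUnixTimeMilliseconds()));
            }

            while (queue.TryDequeue(out var filePath, out _))
            {
                yield return filePath;
            }
        }
    }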

@jamescrosswell merged commit 337faa7 into main on Feb 9, 2025. 23 checks passed.
@jamescrosswell deleted the broken-pipe branch on February 9, 2025 at 20:36.
Development
Successfully merging this pull request may close these issues:

  • Envelopes with oversized attachments getting stuck in __processing (Possible regression)

3 participants