Problem Statement
If a cluster is under high query load, users can run into CircuitBreakingException. If the total circuit breaker (indices.breaker.total.limit) trips, users will see certain queries fail, but those are not necessarily the most expensive ones: other long-running, heavy queries can occupy large portions of memory while still staying below the total circuit breaker limit, so relatively cheap queries start failing instead.
As a user, I want an easy way to diagnose which queries are currently occupying (or have recently occupied) what amount of memory accountable against the total circuit breaker.
Besides troubleshooting circuit breaker exceptions, such a metric would also help to proactively review expensive queries before the total circuit breaker trips at all.
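For context, the limit in question is a cluster-wide setting. As a sketch (assuming, as for most CrateDB breaker settings, that it is exposed in sys.cluster and changeable at runtime), it could be inspected and adjusted like this:

```sql
-- Sketch: inspect the configured total breaker limit
-- (assumes the setting is exposed under sys.cluster.settings).
SELECT settings['indices']['breaker']['total']['limit'] AS total_limit
FROM sys.cluster;

-- Sketch: raise the limit at runtime; '90%' is an arbitrary example value
-- (assumes the setting is runtime-changeable via SET GLOBAL).
SET GLOBAL TRANSIENT "indices.breaker.total.limit" = '90%';
```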
Possible Solutions
There are several possible approaches.
Use existing sys.operations(_log) tables
There already is a used_bytes column in sys.operations(_log). Would SELECT job_id, SUM(used_bytes) FROM sys.operations GROUP BY job_id be an accurate representation of the peak total accountable memory usage per query? If yes, it could be documented in Diagnostics with System Tables.
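As a minimal sketch of what such a diagnostic query could look like (joining sys.jobs for the statement text; whether used_bytes is fully accountable against the total breaker is exactly the open question above):

```sql
-- Sketch: currently accounted memory per running query, highest first.
-- Assumes used_bytes in sys.operations reflects breaker-accounted memory.
SELECT o.job_id,
       j.stmt,
       SUM(o.used_bytes) AS total_used_bytes
FROM sys.operations o
JOIN sys.jobs j ON o.job_id = j.id
GROUP BY o.job_id, j.stmt
ORDER BY total_used_bytes DESC
LIMIT 10;
```

Note that sys.operations only covers operations that are still running; recently finished ones would have to come from sys.operations_log, and it is unclear whether either table captures the peak (rather than the current or final) allocation, which is the crux of the question.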
Add a new field to sys.jobs(_log)
If there is no metric exposing the peak accountable memory usage of a query, a new field could be added to sys.jobs(_log).
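For illustration, with a hypothetical peak_used_bytes column (the name is made up here), the diagnosis would reduce to a single query:

```sql
-- Sketch only: peak_used_bytes is a hypothetical new column,
-- not an existing field of sys.jobs_log.
SELECT id, stmt, started, ended, peak_used_bytes
FROM sys.jobs_log
ORDER BY peak_used_bytes DESC
LIMIT 10;
```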
Considered Alternatives
No response