feat(relay): expose internal relay metrics #4497
base: master
Conversation
# Conflicts:
#   Cargo.lock
#   CHANGELOG.md
Maybe, as suggested, we work on this in multiple steps:
- Expose the endpoint with basic/mock internal machinery (see the sketch after this list)
- Expose the easier metrics first (e.g. prevent scale down)
- Tackle the harder metrics, like utilization of services, generically
- Then, possibly afterwards, instrument the special services we have, like the processor and store
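A minimal sketch of what that first step could look like, assuming the axum-style routing already used in relay-server; the handler and its hard-coded body are hypothetical placeholders for the real internal machinery:

```rust
use axum::http::StatusCode;

/// Mock handler: returns a static payload until real metric collection is wired up.
pub async fn handle() -> (StatusCode, String) {
    // Hypothetical placeholder body; real values would be gathered from services.
    (StatusCode::OK, "up 1\n".to_owned())
}
```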
relay-server/src/endpoints/mod.rs
.route("/api/relay/events/{event_id}/", get(events::handle)); | ||
let internal_routes = internal_routes | ||
.route("/api/relay/events/{event_id}/", get(events::handle)) | ||
.route("/api/relay/internal-metrics", get(internal_metrics::handle)) |
Not sure what we want to call the route; maybe we want to make it auto-scaling specific? "Metrics" is already so overloaded.
Fair point, maybe just /keda?
I think we should stick to the more generic autoscaling or similar, since this isn't directly consumed by Keda, but the purpose is auto scaling.
Sounds good, I'll rename it.
@Dav1dde I went with the current approach because it seemed easier than expected. I will take a step back and change it to the easier stuff first, as you suggested.
```rust
    }
};

match serde_prometheus::to_string(&data, None, HashMap::new()) {
```
This adds a bunch of dependencies; maybe we can just serialize it ourselves. The format is quite simple, and all of our data is pretty much static.
We can also do that in a follow-up and then parse the output in a test with a proper Prometheus parser.
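For illustration, a minimal sketch of what hand-rolled serialization could look like, assuming a small fixed set of gauge-style values; the RelayMetrics struct and the metric names are hypothetical:

```rust
use std::fmt::Write;

/// Hypothetical container for the handful of static values we expose.
struct RelayMetrics {
    up: u64,
    memory_used_bytes: u64,
}

/// Serializes the metrics into the Prometheus text exposition format,
/// one `<name> <value>` line per metric; labels are omitted since the data is static.
fn to_prometheus_text(metrics: &RelayMetrics) -> String {
    let mut out = String::new();
    // Writing to a String cannot fail, so the Results can be ignored.
    let _ = writeln!(out, "up {}", metrics.up);
    let _ = writeln!(out, "memory_used_bytes {}", metrics.memory_used_bytes);
    out
}
```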
```python
relay = relay(mini_sentry)
response = relay.get("/api/relay/keda/")
assert response.status_code == 200
assert "up 1" in response.text
```
Unrelated note: Wish we had snapshot tests in Python.
This PR adds an endpoint that can be used to query internal Relay metrics. In its current state it exposes internally collected memory metrics and a simple "up" metric so that we can count the number of instances running.
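Assuming the standard Prometheus text exposition format, the response body would look something like the following; the "up 1" line matches the integration test above, while the memory metric name is a hypothetical example:

```
up 1
memory_used_bytes 123456789
```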