Perform a research and implement metric collection POC #14915

Closed
alexandr-shegeda opened this issue Jul 21, 2022 · 2 comments · May be fixed by airbytehq/airbyte-e2e-testing-tool#41

@alexandr-shegeda
Contributor

Tell us about the problem you're trying to solve

We are going to implement some metrics and collect statistics about sync runs.

Describe the solution you’d like

We should analyze and compare existing frameworks in order to decide which one to implement.

Describe the alternative you’ve considered or used

There are a number of existing tools that allow collecting and displaying benchmarks, such as Datadog, New Relic, Dynatrace, etc.

Additional context

Within the scope of this ticket, we expect to do research and possibly implement a high-level POC.

@etsybaev
Contributor

Greg Solovyev (Airbyte), 7:40 PM
Here’s what I think we need wrt metrics collection (see the sketch after this list for how a few of these could be computed):

Throughput metrics (measured separately for full refresh syncs and incremental syncs, also separately for CDC and non-CDC configurations):
- MB/second read for source connectors
- MB/second written for destination connectors
- Records/second read for source connectors
- Records/second written for destination connectors
- Records/second processed during the normalization phase
- MB/second processed during the normalization phase (if this is possible to measure)

Scalability metrics:
- Minimum memory required to read an XX MB row
- Distributions of the throughput metrics listed above, measured over different numbers of streams (example: measure MB/second read by the MySQL Source Connector with 1 vCPU/500 MB memory when the connector has 1 stream, 2 streams … 1K streams; measure the same with 2 vCPUs/500 MB memory; measure the same with 1 vCPU/1 GB memory). The goal is to understand if, when, and to what extent adding resources to connector containers improves throughput.

Performance metrics:
- Time between source connector startup and first record read
- Time between destination connector startup and first record written
- Time elapsed between the moment when the source connector sends a record/message to the platform and the destination connector receives it
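
Below is a minimal, hypothetical sketch of how the throughput and time-to-first-record counters listed above could be accumulated during a sync. The class and method names are illustrative only and are not existing Airbyte or e2e-testing-tool APIs.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical accumulator for throughput and time-to-first-record metrics;
// names are illustrative, not part of any existing Airbyte API.
public class SyncThroughputMeter {

    private final Instant startedAt = Instant.now();
    private Instant firstRecordAt;
    private long records;
    private long bytes;

    // Called once per record emitted by a source (or accepted by a destination).
    public void recordEmitted(long sizeInBytes) {
        if (firstRecordAt == null) {
            firstRecordAt = Instant.now();
        }
        records++;
        bytes += sizeInBytes;
    }

    public double recordsPerSecond() {
        return records / elapsedSeconds();
    }

    public double megabytesPerSecond() {
        return (bytes / (1024.0 * 1024.0)) / elapsedSeconds();
    }

    // "Time between connector startup and first record" metric; null if nothing was read yet.
    public Duration timeToFirstRecord() {
        return firstRecordAt == null ? null : Duration.between(startedAt, firstRecordAt);
    }

    private double elapsedSeconds() {
        double seconds = Duration.between(startedAt, Instant.now()).toMillis() / 1000.0;
        return Math.max(seconds, 0.001); // avoid division by zero for very short syncs
    }
}
```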

Greg Solovyev (Airbyte), 7:56 PM
You may want to use the TPC-DI industry-standard benchmark for all test sets and as guidance for metrics: https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-di_v1.1.0.pdf
https://www.tpc.org/tpcdi/default5.asp

Greg Solovyev (Airbyte), 8:27 PM
Regarding integrating metrics from the Airbyte core: yes, we can use metrics from Airbyte Core via its API if that API exposes the metrics we need (I don’t know what metrics are available there).
Regarding DataDog: I don’t think we need to use DataDog. The scope of this task is a benchmarking tool, not the entire Airbyte Cloud. We want to run this tool once every few weeks and have it generate a report (a spreadsheet or a data set in our internal data warehouse would be acceptable as output from this tool). Rather than taking these metrics from real customers in the cloud or real users of the OSS platform, we want to use this tool in an isolated environment with pre-defined data sets. The reason for this is that we need to be able to push the platform to its limits and measure the limits of performance and scale, and we need to be able to repeat the same benchmark run (with the same data set) on different configurations and different versions of our software.
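
As a rough illustration of the "metrics from Airbyte Core via API" idea, the sketch below pulls a finished job's details over HTTP so they can be dumped into the benchmark report. The endpoint path, request body, and the assumption that the response carries per-attempt record/byte counts are unverified assumptions about the Airbyte API, not a confirmed contract.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical report step: fetch per-job stats from the Airbyte Core API and dump them
// for later aggregation into a spreadsheet. Endpoint and payload are assumptions.
public class JobStatsFetcher {

    public static void main(String[] args) throws Exception {
        String airbyteUrl = "http://localhost:8000"; // assumed local Airbyte deployment
        long jobId = 1L;                             // id of a finished sync job

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(airbyteUrl + "/api/v1/jobs/get")) // assumed endpoint path
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"id\": " + jobId + "}"))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The response JSON is expected to contain per-attempt stats such as records and
        // bytes synced; here we only print it, leaving aggregation to the report step.
        System.out.println(response.body());
    }
}
```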

@etsybaev removed their assignment Aug 29, 2022
@kimerinn
Contributor

kimerinn commented Sep 6, 2022

As a conclusion from the conversation with @alexandr-shegeda, I see the following subtasks:

  1. Storing simple metrics (sync time) in an in-memory store on the server side (a sketch follows this list)
  2. Obtaining sync time metrics on the client side (e2e-testing-tool)
  3. Storing metrics on the client side
  4. Enriching server-side metrics (throughput metrics, scalability metrics, performance metrics)
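
For subtask 1, a minimal sketch of a server-side in-memory metric store might look like the following. The class name and methods are hypothetical; a real implementation would likely also need eviction and persistence.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory store for subtask 1: keep sync durations per connection so the
// e2e-testing-tool (subtask 2) can read them back. Names are illustrative only.
public class InMemorySyncMetricsStore {

    private final Map<String, List<Duration>> syncDurationsByConnection = new ConcurrentHashMap<>();

    // Record the duration of one completed sync for the given connection id.
    public void recordSyncDuration(String connectionId, Duration duration) {
        syncDurationsByConnection
                .computeIfAbsent(connectionId, id -> new CopyOnWriteArrayList<>())
                .add(duration);
    }

    // Return all recorded durations for a connection (empty list if none).
    public List<Duration> getSyncDurations(String connectionId) {
        return List.copyOf(syncDurationsByConnection.getOrDefault(connectionId, List.of()));
    }
}
```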
