Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement OpenTelemetry infrastructure #11255

Open
wants to merge 59 commits into
base: main
Choose a base branch
from

Conversation

JanProvaznik
Copy link
Member

@JanProvaznik JanProvaznik commented Jan 10, 2025

Fixes #10945 #10948

Context

#10560

  • infrastructure part of this

Changes Made

add singleton OpenTelemetryManager

executions in Framework (standalone and managed) trace and collect basic build events with a defined sampling rate.
Sending end of build telemetry comparable to the one sent in sdk.

Testing

Unit tests of the behavior,
monitoring if we get data to the backend

Notes

@dotnet dotnet deleted a comment from azure-pipelines bot Jan 10, 2025
@JanProvaznik JanProvaznik marked this pull request as ready for review January 10, 2025 16:28
@JanProvaznik JanProvaznik requested a review from a team as a code owner January 10, 2025 16:28
/// Create a list of properties sent to VS telemetry with the information whether they should be hashed.
/// </summary>
/// <returns></returns>
public IList<TelemetryItem> GetActivityProperties()
Copy link
Member

@YuliiaKovalova YuliiaKovalova Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A suggestion, feel free to ignore:
can we logically split the types of collected data? e.g. AddFeatureFlags, AddBuildProperties, AddTimeMetrics, to improve readability and simplify adding new lines?

It looks like this method has a huge potential to grow :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea is that this class is holding basic datapoints that are shared between sdk and VS telemetry which won't change soon, so likely YAGNI.
For more telemetry events with multiple data points there will be different classes holding that data implementing IActivityTelemetryDataHolder. I am open to the opinions that this scheme is wrong in it's entirety, but if not I think refactoring this specific class is overengineering.

src/Framework/Telemetry/BuildTelemetry.cs Outdated Show resolved Hide resolved
telemetryItems.Add(new TelemetryItem("BuildTarget", Target, true));
}

if (Version != null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can Version or other strings be empty?

Copy link
Member Author

@JanProvaznik JanProvaznik Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of these strings are sometimes empty, this is copied logic from sdk telemetry, I'm leaning towards that it's out of scope to refactor here

src/Framework/Telemetry/ActivityExtensions.cs Show resolved Hide resolved
src/Framework/Telemetry/ActivityExtensions.cs Outdated Show resolved Hide resolved
src/Framework/Telemetry/ActivityExtensions.cs Outdated Show resolved Hide resolved
@@ -13,6 +13,14 @@
<add key="dotnet8" value="https://dnceng.pkgs.visualstudio.com/public/_packaging/dotnet8/nuget/v3/index.json" />
<add key="dotnet8-transport" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet8-transport/nuget/v3/index.json" />
<add key="dotnet9" value="https://dnceng.pkgs.visualstudio.com/public/_packaging/dotnet9/nuget/v3/index.json" />
<add key="vs-impl" value="https://pkgs.dev.azure.com/azure-public/vside/_packaging/vs-impl/nuget/v3/index.json" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it a temporary thing?
I recall we had some limitations for using this feed in msbuild repo

Copy link
Member Author

@JanProvaznik JanProvaznik Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's intended to be permanent. Under it, there is a restriction that this feed can pull only VS OpenTelemetry packages. (and it won't pull them outside Framework because they're conditionally included in Microsoft.Build.Framework.csproj

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rainersigwald to draw your attention here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PSM configuration doesn't exactly say that: it says that Microsoft.VisualStudio.OpenTelemetry* must come from vs-impl, not that vs-impl can only be used for that package.

We need to be careful here--have you talked to the VMR/sourcebuild folks about the dependency?

Copy link
Member Author

@JanProvaznik JanProvaznik Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PSM configuration doesn't exactly say that: it says that Microsoft.VisualStudio.OpenTelemetry* must come from vs-impl, not that vs-impl can only be used for that package.

Thanks, I misunderstood that. Is it possible to do? From a quick look at docs I guess no ☹️

We're in contact and it does not impact sourcebuild, because it's references are gated only for .NET Framework in Microsoft.Build.Framework.csproj, and .NET Framework dependencies are not a part of sourcebuild.

There is no way to condition on that in NuGet.config though.

Copy link
Member Author

@JanProvaznik JanProvaznik Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in dotnet-tools feed, removed vs-impl. cache trolled me

Copy link
Member

@JanKrivanek JanKrivanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

I'm not yet reay to sign off - I need a second detailed look, plus there are some notes I put where I'd want your answers first.

src/Framework/Telemetry/ActivityExtensions.cs Outdated Show resolved Hide resolved
src/Framework/Telemetry/OpenTelemetryManager.cs Outdated Show resolved Hide resolved
src/MSBuild/app.config Outdated Show resolved Hide resolved
@@ -296,6 +298,9 @@ string[] args
DumpCounters(false /* log to console */);
}

// Send OpenTelemetry before exiting
OpenTelemetryManager.Instance.Shutdown();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't seem to call this for the API usage case.
Let's try to see if we can have better place to control lifetime of the telemetry manager (probably BuildManager?)

Copy link
Member Author

@JanProvaznik JanProvaznik Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's an oversight given we don't have the collector in VS yet. Now that I look at it, it makes more sense to do everything in EndBuild.

The point is that somewhere the collector has to be shut down for standalone, to reliably send the build data before the app exits. But if someone uses the API and keeps reusing one BuildManager (I think this happens in VS), it would be OK to let the collector and telemetry manager live, it sends data on a timer and it can die when the devenv process dies and we're not really concerned about reliability but about responsiveness.

src/Framework/Telemetry/OpenTelemetryManager.cs Outdated Show resolved Hide resolved
Copy link
Member

@JanKrivanek JanKrivanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!!!

I added couple smaller things for consideration - but none are really blocking

src/Framework/Telemetry/ActivityExtensions.cs Outdated Show resolved Hide resolved
src/Framework/Telemetry/OpenTelemetryManager.cs Outdated Show resolved Hide resolved
documentation/release.md Outdated Show resolved Hide resolved
/// 1:25000 gives us sample size of sufficient confidence with the assumption we collect the order of 1e7 - 1e8 events per day.
/// </summary>
// public const double DefaultSampleRate = 4e-5;
public const double DefaultSampleRate = 1.0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very opinionted suggestion :-) - let's keep this as it was. During testing with A/B experiments can set env vars for both - opt-in and the override rate. Once we exit the VS experimental tasting we won't have risk of forgetting this in and storming tel backend

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of the changed PR is that opt in = Override sample rate env var. So this is now a dead variable waiting to be rolled back to when "Default" means something.
I did not want to split these concepts because override without opt in is meaningless and it's not transparent what the sample rate should be when you opt in without specifying an override (I would personally expect it to be 1, not 1:25000).

I'm not concerned it would spam the backend because for enabling the behavior by default we'd have to change MSBuild, not the VS A/B code.

src/Framework/Telemetry/TelemetryItem.cs Outdated Show resolved Hide resolved
src/Framework/Traits.cs Show resolved Hide resolved
… early return in initialize, exception pattern match
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prototype sending data via VS OTel collector
5 participants