-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement OpenTelemetry infrastructure #11255
base: main
Are you sure you want to change the base?
Conversation
/// Create a list of properties sent to VS telemetry with the information whether they should be hashed. | ||
/// </summary> | ||
/// <returns></returns> | ||
public IList<TelemetryItem> GetActivityProperties() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A suggestion, feel free to ignore:
can we logically split the types of collected data? e.g. AddFeatureFlags
, AddBuildProperties
, AddTimeMetrics
, to improve readability and simplify adding new lines?
It looks like this method has a huge potential to grow :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea is that this class is holding basic datapoints that are shared between sdk and VS telemetry which won't change soon, so likely YAGNI.
For more telemetry events with multiple data points there will be different classes holding that data implementing IActivityTelemetryDataHolder. I am open to the opinions that this scheme is wrong in it's entirety, but if not I think refactoring this specific class is overengineering.
7c1bae8
to
49467bb
Compare
telemetryItems.Add(new TelemetryItem("BuildTarget", Target, true)); | ||
} | ||
|
||
if (Version != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can Version or other strings be empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of these strings are sometimes empty, this is copied logic from sdk telemetry, I'm leaning towards that it's out of scope to refactor here
@@ -13,6 +13,14 @@ | |||
<add key="dotnet8" value="https://dnceng.pkgs.visualstudio.com/public/_packaging/dotnet8/nuget/v3/index.json" /> | |||
<add key="dotnet8-transport" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet8-transport/nuget/v3/index.json" /> | |||
<add key="dotnet9" value="https://dnceng.pkgs.visualstudio.com/public/_packaging/dotnet9/nuget/v3/index.json" /> | |||
<add key="vs-impl" value="https://pkgs.dev.azure.com/azure-public/vside/_packaging/vs-impl/nuget/v3/index.json" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it a temporary thing?
I recall we had some limitations for using this feed in msbuild repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's intended to be permanent. Under it, there is a restriction that this feed can pull only VS OpenTelemetry packages. (and it won't pull them outside Framework because they're conditionally included in Microsoft.Build.Framework.csproj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rainersigwald to draw your attention here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PSM configuration doesn't exactly say that: it says that Microsoft.VisualStudio.OpenTelemetry*
must come from vs-impl
, not that vs-impl
can only be used for that package.
We need to be careful here--have you talked to the VMR/sourcebuild folks about the dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PSM configuration doesn't exactly say that: it says that Microsoft.VisualStudio.OpenTelemetry* must come from vs-impl, not that vs-impl can only be used for that package.
Thanks, I misunderstood that. Is it possible to do? From a quick look at docs I guess no
We're in contact and it does not impact sourcebuild, because it's references are gated only for .NET Framework in Microsoft.Build.Framework.csproj, and .NET Framework dependencies are not a part of sourcebuild.
There is no way to condition on that in NuGet.config though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's in dotnet-tools feed, removed vs-impl. cache trolled me
62af4f0
to
cdaa52a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
I'm not yet reay to sign off - I need a second detailed look, plus there are some notes I put where I'd want your answers first.
@@ -296,6 +298,9 @@ string[] args | |||
DumpCounters(false /* log to console */); | |||
} | |||
|
|||
// Send OpenTelemetry before exiting | |||
OpenTelemetryManager.Instance.Shutdown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't seem to call this for the API usage case.
Let's try to see if we can have better place to control lifetime of the telemetry manager (probably BuildManager?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's an oversight given we don't have the collector in VS yet. Now that I look at it, it makes more sense to do everything in EndBuild.
The point is that somewhere the collector has to be shut down for standalone, to reliably send the build data before the app exits. But if someone uses the API and keeps reusing one BuildManager (I think this happens in VS), it would be OK to let the collector and telemetry manager live, it sends data on a timer and it can die when the devenv process dies and we're not really concerned about reliability but about responsiveness.
This reverts commit 91643e3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!!!
I added couple smaller things for consideration - but none are really blocking
/// 1:25000 gives us sample size of sufficient confidence with the assumption we collect the order of 1e7 - 1e8 events per day. | ||
/// </summary> | ||
// public const double DefaultSampleRate = 4e-5; | ||
public const double DefaultSampleRate = 1.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very opinionted suggestion :-) - let's keep this as it was. During testing with A/B experiments can set env vars for both - opt-in and the override rate. Once we exit the VS experimental tasting we won't have risk of forgetting this in and storming tel backend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of the changed PR is that opt in = Override sample rate env var. So this is now a dead variable waiting to be rolled back to when "Default" means something.
I did not want to split these concepts because override without opt in is meaningless and it's not transparent what the sample rate should be when you opt in without specifying an override (I would personally expect it to be 1, not 1:25000).
I'm not concerned it would spam the backend because for enabling the behavior by default we'd have to change MSBuild, not the VS A/B code.
… early return in initialize, exception pattern match
Fixes #10945 #10948
Context
#10560
Changes Made
add singleton OpenTelemetryManager
executions in Framework (standalone and managed) trace and collect basic build events with a defined sampling rate.
Sending end of build telemetry comparable to the one sent in sdk.
Testing
Unit tests of the behavior,
monitoring if we get data to the backend
Notes