ci: self-hosted runners for benchmarks #29955
base: main
Conversation
CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.
Builds ready [9c55124]
Page Load Metrics (1730 ± 64 ms)
Bundle size diffs
Force-pushed from 9c55124 to ce07be7.
with:
  name: ${{ inputs.name }}
  path: test-artifacts/chrome/benchmark/
  retention-days: 5
The number of retention-days was chosen fairly arbitrarily; maybe someone has an opinion.
We pay a very small amount per gigabyte-hour for artifact storage. I anticipate 5 days will be insignificant even over a large number of artifacts. However, if you have a sense of the artifact's size and how often it will be generated, we can spot-check the cost for each retention period.
These particular benchmark artifacts are very small: just some JSON data that gets zipped. On one run, they were 823 bytes and 196 bytes. But we may also want to look back at these numbers much later. GitHub probably isn't the best platform for that, though; something like Sentry or Segment would likely be much better.
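For a rough sense of scale, here is a sketch of the cost spot-check mentioned above. The per-GB-day rate and the run frequency are assumptions for illustration, not quoted figures from this thread; only the artifact sizes come from the comment above.

```javascript
// Assumed rate: GitHub Actions storage overage has been priced around
// $0.008 per GB per day. Treat this as a placeholder, not a quote.
const RATE_PER_GB_DAY = 0.008;

function monthlyStorageCostUsd(bytesPerRun, runsPerDay, retentionDays) {
  // At steady state, retentionDays worth of artifacts exist at once.
  const storedGb = (bytesPerRun * runsPerDay * retentionDays) / 1024 ** 3;
  return storedGb * RATE_PER_GB_DAY * 30; // roughly 30 billing days/month
}

// Sizes from the run above (823 + 196 bytes); 50 runs/day is hypothetical.
const cost = monthlyStorageCostUsd(823 + 196, 50, 5);
console.log(cost); // a tiny fraction of a cent per month
```

Even with generous assumptions about run frequency, 5-day retention for artifacts this small is effectively free.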
@@ -182,4 +182,4 @@ jobs:
   env:
     GITHUB_TOKEN: ${{ secrets.LAVAMOAT_UPDATE_TOKEN }}
     PR_NUMBER: ${{ github.event.issue.number }}
-    ACTION_RUN_URL: "${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
+    ACTION_RUN_URL: '${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}'
This is just Prettier running on a file
Force-pushed from ce07be7 to e5a4333.
Builds ready [e5a4333]
Page Load Metrics (2151 ± 76 ms)
Bundle size diffs
Force-pushed from e5a4333 to a6f5acb.
Builds ready [a6f5acb]
Page Load Metrics (1708 ± 72 ms)
Bundle size diffs
Could you split this into multiple PRs? A lot of these changes seem distinct from each other.
Force-pushed from a6f5acb to c9f1e1e.
Force-pushed from c9f1e1e to 6266945.
Builds ready [6266945]
Page Load Metrics (1958 ± 284 ms)
Bundle size diffs
The migration to GitHub Actions is a necessary step, but before committing to self-hosted runners, we should first validate the stability of measurements across different environments.
The suggested approach is to conduct benchmark runs on GitHub-hosted runners first (ubuntu-latest) and compare them against CircleCI and self-hosted runner results. If fluctuations remain high across all environments, this method may not be suitable for a quality gate. However, the migration to GitHub Actions remains beneficial regardless.
Next steps: it would be great to see some comparison data in the PR before merging.
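As an aside, one rough way to quantify "fluctuations remain high" across environments is the coefficient of variation (stddev / mean) of repeated page-load measurements. A sketch, where the sample numbers are the page-load means reported by the build bot in this thread and any threshold would be a judgment call, not something this PR defines:

```javascript
// Coefficient of variation: dimensionless noise measure, comparable
// across runners even when absolute load times differ.
function coefficientOfVariation(samples) {
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  const variance =
    samples.reduce((acc, x) => acc + (x - mean) ** 2, 0) / samples.length;
  return Math.sqrt(variance) / mean;
}

// Page-load means (ms) from the bot comments above, treated as samples.
const runs = [1730, 1708, 1958, 2151];
const cv = coefficientOfVariation(runs);
console.log(cv.toFixed(3)); // fraction of the mean
```

A lower CV on one runner type than another would be concrete evidence for (or against) using it as a quality gate.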
As requested by @itsyoboieltr, here are benchmarks running in 4 different configurations:
Probably the most important stat here is uiStartup (btw our old system highlights load). Two additional things I see:
I'm going to run this again to see if it varies over time, and also with some other options.
Second run of this; the numbers are pretty close to the first run.
- name: Download artifact prep-build-test-webpack
  uses: actions/download-artifact@v4
  with:
    path: ./dist/
    pattern: prep-build-test-webpack
    merge-multiple: true
Nitpick/optional: requiring this artifact means this job depends on the prep-build-test-webpack step in main.yml successfully creating its artifact before this job can work correctly.
I wonder if we could make this dependency clearer in the file structure, e.g.:
.
├── main.yml
└── main
└── benchmarks.yml
That would be cool, but I think GitHub Actions doesn't allow this: workflow files must sit directly in .github/workflows, with no subdirectories. If you find documentation that says otherwise, let's go for it.
Another option would be a naming scheme, perhaps main-benchmarks.yml.
@@ -61,6 +61,11 @@ class ChromeDriver {
     args.push('--disable-gpu');
   }

+  // It will crash if you don't do this, but there might be another way around it
+  if (process.env.GITHUB_ACTION) {
+    args.push('--no-sandbox');
Do you have access to the logs that showed up when the crash occurred?
This might be a sign that there is a shift in the permissions our runner is executing with. The Chrome web driver will not run in sandbox mode if it's being executed with root permissions, as a security measure. It's likely that our old runner was not running as root, but now that we are using our self-hosted runners (or a custom image), this may have changed. Can you validate this?
I've seen a handful of remote code execution vulnerabilities with webdrivers in the past, so a malicious pull request that triggers this may be able to take advantage of it in the future if we disable the sandbox.
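One way to act on this concern would be a narrower guard that only disables the sandbox when the process really is running as root, rather than on every GitHub Actions run. This is an illustrative sketch, not the PR's actual code; the function name and policy are hypothetical.

```javascript
// Sketch: disable the Chrome sandbox only in CI *and* only when running
// as root (UID 0), the one case where Chrome's SUID/user-namespace
// sandbox refuses to start. Policy and names are illustrative.
function chromeSandboxArgs(env, isRoot) {
  if (env.GITHUB_ACTION && isRoot) {
    return ['--no-sandbox'];
  }
  return [];
}

// In the driver, isRoot could be derived on POSIX platforms via:
// const isRoot = typeof process.getuid === 'function' && process.getuid() === 0;
console.log(chromeSandboxArgs({ GITHUB_ACTION: 'benchmark' }, true));
```

If the runner turns out not to be root, this guard would surface the real misconfiguration instead of silently weakening the sandbox everywhere.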
I ran it again for you, here's the full run log: https://github.com/MetaMask/metamask-extension/actions/runs/13192830006/job/36828840527
Important snippet:
[763:763:0207/035354.645008:FATAL:zygote_host_impl_linux.cc(126)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/main/docs/linux/suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
[0207/035354.652173:ERROR:file_io_posix.cc(145)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0207/035354.652233:ERROR:file_io_posix.cc(145)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
/__w/metamask-extension/metamask-extension/node_modules/selenium-webdriver/lib/error.js:524
let err = new ctor(data.message)
^
SessionNotCreatedError: session not created: Chrome failed to start: exited normally.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location /github/home/.cache/selenium/chrome/linux64/126.0.6478.182/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Description
Major changes:
- runs-on: gha-mm-scale-set-ubuntu-22.04-amd64-med
- container image: cimg/node:22.13-browsers
- Xvfb
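The pieces listed above could fit together in a single workflow job roughly like this. This is an assumed layout, not the actual job from the PR, and `yarn benchmark` is a hypothetical command name:

```yaml
# Sketch only: combines the quoted runs-on / container / Xvfb settings.
jobs:
  benchmarks:
    runs-on: gha-mm-scale-set-ubuntu-22.04-amd64-med
    container:
      image: cimg/node:22.13-browsers
    steps:
      - uses: actions/checkout@v4
      - name: Run benchmarks under a virtual display
        # xvfb-run supplies the X server that headful Chrome needs in CI
        run: xvfb-run --auto-servernum yarn benchmark # hypothetical command
```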
Prerequisites to merging this PR:
- Merge the benchmarks branch of github-tools: https://github.com/MetaMask/github-tools/blob/benchmarks/.github/actions/setup-environment/action.yml
- Change setup-environment@benchmarks back to setup-environment@main
This is just Part 1 of a larger 4-part task to make startup time a quality gate, but I think it's the hardest part.
Related issues
Progresses: MetaMask/MetaMask-planning#3679