Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python virtualenv installs no longer reproducible #16012

Open
3 tasks done
Bo98 opened this issue Sep 18, 2023 · 17 comments
Open
3 tasks done

Python virtualenv installs no longer reproducible #16012

Bo98 opened this issue Sep 18, 2023 · 17 comments
Labels
bug Reproducible Homebrew/brew bug help wanted We want help addressing this python Homebrew/brew's python support

Comments

@Bo98
Copy link
Member

Bo98 commented Sep 18, 2023

brew doctor output

Your system is ready to brew.

Verification

  • My "brew doctor output" above says Your system is ready to brew. and am still able to reproduce my issue.
  • I ran brew update twice and am still able to reproduce my issue.
  • This issue's title and/or description do not reference a single formula e.g. brew install wget. If they do, open an issue at https://github.com/Homebrew/homebrew-core/issues/new/choose instead.

brew config output

(from CI that reproduced the issue)

HOMEBREW_VERSION: 4.1.11-25-gc953076
ORIGIN: https://github.com/Homebrew/brew
HEAD: c9530766096e692d5031750a14b4d1d42f1dc655
Last commit: 7 hours ago
Core tap origin: https://github.com/Homebrew/homebrew-core
Core tap HEAD: e2981ddc6460ff77f8543099ac3046c836ef9828
Core tap last commit: 18 minutes ago
Core tap branch: master
Core tap JSON: 13 Sep 20:15 UTC
HOMEBREW_PREFIX: /usr/local
HOMEBREW_BOOTSNAP: set
HOMEBREW_CACHE: /Users/brew/Library/Caches/Homebrew
HOMEBREW_CASK_OPTS: []
HOMEBREW_COLOR: set
HOMEBREW_CURL_PATH: /usr/bin/curl
HOMEBREW_DEVELOPER: set
HOMEBREW_FAIL_LOG_LINES: 150
HOMEBREW_GIT_EMAIL: [email protected]
HOMEBREW_GIT_NAME: BrewTestBot
HOMEBREW_GIT_PATH: /usr/bin/git
HOMEBREW_GITHUB_API_TOKEN: set
HOMEBREW_LOGS: /Users/brew/actions-runner/_work/homebrew-core/homebrew-core/logs
HOMEBREW_MAKE_JOBS: 6
HOMEBREW_NO_AUTO_UPDATE: set
HOMEBREW_NO_COLOR: set
HOMEBREW_NO_EMOJI: set
HOMEBREW_NO_ENV_HINTS: set
HOMEBREW_NO_INSTALL_FROM_API: set
HOMEBREW_NO_INSTALLED_DEPENDENTS_CHECK: set
HOMEBREW_SORBET_RUNTIME: set
Homebrew Ruby: 2.6.10 => /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/bin/ruby
CPU: hexa-core 64-bit penryn
Clang: 15.0.0 build 1500
Git: 2.39.3 => /usr/bin/git
Curl: 8.1.2 => /usr/bin/curl
macOS: 14.0-x86_64
CLT: 15.0.0.0.1.1694021235
Xcode: 15.0

What were you trying to do (and why)?

Produce a reproducible bottle for a package containing a Python virtualenv.

What happened (include all command output)?

The package is no longer reproducible after a recent change:

$ cat $(brew --prefix asciidoc)/libexec/lib/python3.11/site-packages/asciidoc-10.2.0.dist-info/direct_url.json
{"dir_info": {}, "url": "file:///private/tmp/asciidoc-20230913-6340-1d0mrm3/asciidoc-10.2.0"}

What did you expect to happen?

References to the temp directory are not included.

Deleting or modifying this file is not sufficient on its own as it is hashed in RECORD. There's a few alternative solutions discussed in pypa/pip#11424.

It's possible this issue also extends to regular non-venv std_pip_args installs but I have not tested this.

Step-by-step reproduction instructions (by running brew commands)

(Example)
brew install -s asciidoc
@Bo98 Bo98 added the bug Reproducible Homebrew/brew bug label Sep 18, 2023
@MikeMcQuaid
Copy link
Member

Good catch, thanks @Bo98!

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale No recent activity label Oct 10, 2023
@apainintheneck apainintheneck added help wanted We want help addressing this python Homebrew/brew's python support and removed stale No recent activity labels Oct 10, 2023
@carlocab
Copy link
Member

It's possible this issue also extends to regular non-venv std_pip_args installs but I have not tested this.

I believe this does. CC @branchvincent since I think you may have had a look at this.

@cho-m
Copy link
Member

cho-m commented Jan 12, 2024

Deleting or modifying this file is not sufficient on its own as it is hashed in RECORD. There's a few alternative solutions discussed in pypa/pip#11424.

From quick glance, it looks like most official options for local directory would require splitting up wheel build and wheel install, e.g.

  • python3 -m pip wheel <args> . + python3 -m pip install <args> --no-index --find-links=./some/path <name>
  • python3 -m build --wheel ... + python3 -m installer <wheel>

EDIT: Though I don't know if any quirks with build isolation given most other repository usage is done without isolation using their own packages.

Alternative official option would require removing resources and letting pip fetch tarballs, e.g.

  • python3 -m pip install six==1.16.0

The hack would be to patching up direct_url.json and overwrite hash in RECORD perhaps via a pip shim.

@MikeMcQuaid
Copy link
Member

The hack would be to patching up direct_url.json and overwrite hash in RECORD perhaps via a pip shim.

Something like this would work for me 👍🏻

@cho-m
Copy link
Member

cho-m commented Mar 8, 2024

Trying on asciidoc (Homebrew/homebrew-core#165497), it looks like per platform (e.g. Apple Silicon) the bottle is identical so probably reproducible, but still missing all bottle as it has some recorded sha difference on the bin script:

--- /Users/cho-m/Library/Caches/Homebrew/downloads/0dc87f8a307fb752c8a06de20d603793dcd2d4a2aab76dcbf6085012327a2507--asciidoc--10.2.0.arm64_sonoma.bottle.6.tar.gz
+++ /Users/cho-m/Library/Caches/Homebrew/downloads/581f6cea5fa1a51516e60c91c78ee4cc1545d2e1a8dbd6a6c607cb0bb91cb542--asciidoc--10.2.0.sonoma.bottle.6.tar.gz
│   --- 0dc87f8a307fb752c8a06de20d603793dcd2d4a2aab76dcbf6085012327a2507--asciidoc--10.2.0.arm64_sonoma.bottle.6.tar
├── +++ 581f6cea5fa1a51516e60c91c78ee4cc1545d2e1a8dbd6a6c607cb0bb91cb542--asciidoc--10.2.0.sonoma.bottle.6.tar
│ ├── asciidoc/10.2.0/libexec/lib/python3.12/site-packages/asciidoc-10.2.0.dist-info/RECORD
│ │ @@ -1,9 +1,9 @@
│ │ -../../../bin/a2x,sha256=fUHfht_0WrUAbZS6OYXl2eFJJyRIB7ifIma9OIdKKHQ,248
│ │ -../../../bin/asciidoc,sha256=jM0HujAKip0JkiUIYj8afLlQeIJCUYVUbSS6KFdnU4g,253
│ │ +../../../bin/a2x,sha256=ufhgOjlHOfDxSfLCKkoyteRH4Zz6IXUAyFKDyjSwbBw,245
│ │ +../../../bin/asciidoc,sha256=1ras2JOqlg1UEVWBFctQLiuhwbeLWSLYYdxx1yvzw8A,250

Could be that it is using absolute (i.e. HOMEBREW_PREFIX) path during this metadata generation? EDIT: Probably from shebang.


Also, not a regression, but when Python package builds shared library it stores path into binary. Our old setup_install_args behaves the same way.

Only way around this would be using a stable directory path. I'm guessing we don't do this for /tmp to avoid potential conflict. One option could be source unpack within prefix.

@Bo98
Copy link
Member Author

Bo98 commented Mar 8, 2024

Do we do a @@HOMEBREW_PREFIX@@ in those files? That probably explains it as the sha256 will be before that happens.

Also, not a regression, but when Python package builds shared library it stores path into binary

Will check and see where this happens

@cho-m
Copy link
Member

cho-m commented Mar 8, 2024

Do we do a @@HOMEBREW_PREFIX@@ in those files? That probably explains it as the sha256 will be before that happens.

Yes. Easiest way is probably purging RECORD entries if we want to get an all bottle. Maybe something like:

s.gsub! %r{^(#{Regexp.escape("../../../bin/")}[^/,\n]*),sha256=.*,.*$}, "\\1,,", false

Also, not a regression, but when Python package builds shared library it stores path into binary

Will check and see where this happens

At least I was getting difference locally on back-to-back builds of cffi and running brew bottle --verbose --json --only-json-tab cffi.

I wanted to check if CI behaves the same but our bottle cache skips the 2nd build.

│ ├── cffi/1.16.0_1/lib/python3.12/site-packages/_cffi_backend.cpython-312-darwin.so
│ │ ├── arm64
│ │ │┄ Format-specific differences are supported for this file format but no file-specific differences were detected; falling back to a binary diff. file(1) reports: Mach-O 64-bit arm64 bundle, flags:<NOUNDEFS|DYLDLINK|TWOLEVEL|HAS_TLV_DESCRIPTORS>
│ │ │ @@ -13162,21 +13162,21 @@
│ │ │  00033690: 6765 745f 696e 7465 7270 7374 6174 655f  get_interpstate_
│ │ │  000336a0: 6469 6374 2e61 7474 725f 6e61 6d65 005f  dict.attr_name._
│ │ │  000336b0: 6666 695f 696e 7465 726e 616c 5f6e 6577  ffi_internal_new
│ │ │  000336c0: 2e69 6e74 6572 6e61 6c5f 6f75 7470 7574  .internal_output
│ │ │  000336d0: 005f 5f74 6573 7466 756e 6336 2e79 005f  .__testfunc6.y._
│ │ │  000336e0: 5f4d 6572 6765 6447 6c6f 6261 6c73 2e36  _MergedGlobals.6
│ │ │  000336f0: 3737 002f 7072 6976 6174 652f 746d 702f  77./private/tmp/
│ │ │ -00033700: 6366 6669 2d32 3032 3430 3330 382d 3331  cffi-20240308-31
│ │ │ -00033710: 3933 342d 3937 7668 3475 2f63 6666 692d  934-97vh4u/cffi-
│ │ │ +00033700: 6366 6669 2d32 3032 3430 3330 382d 3333  cffi-20240308-33
│ │ │ +00033710: 3133 392d 776c 3473 3665 2f63 6666 692d  139-wl4s6e/cffi-
...
│ ├── cffi/1.16.0_1/lib/python3.12/site-packages/cffi-1.16.0.dist-info/RECORD
│ │ @@ -1,8 +1,8 @@
│ │ -_cffi_backend.cpython-312-darwin.so,sha256=cY4QovyRC4Z0fRE8WkxBI9Ldf8iWUxXCP4GdsJxZ2Sc,212688
│ │ +_cffi_backend.cpython-312-darwin.so,sha256=GfMRY0xdUvDfgzqy_kNQvayCwPdXXlN3pXGhSyuJjts,212688

I get the same bottle when using libexec, e.g.

@@ -31,9 +31,11 @@ class Cffi < Formula
   end

   def install
+    libexec.install buildpath.children
     pythons.each do |python|
-      system python, "-m", "pip", "install", *std_pip_args, "."
+      system python, "-m", "pip", "install", *std_pip_args, libexec
     end
+    libexec.rmtree
   end

   test do

@Bo98
Copy link
Member Author

Bo98 commented Mar 8, 2024

Ok that .c reference will be a compiler thing. I know how to fix that: it's -ffile-prefix-map=#{buildpath}=/tmp/homebrew-build/cffi. I'll look into integrating this into brew somehow.

It only works on GCC >= 8, LLVM Clang >= 10 and Apple Clang >= 12, though -fdebug-prefix-map will partially cover older versions.

@Bo98
Copy link
Member Author

Bo98 commented Mar 8, 2024

Oh right didn't realise there was two paths. What I said covers the .c one.

@cho-m
Copy link
Member

cho-m commented Mar 8, 2024

Oh right didn't realise there was two paths. What I said covers the .c one.

The other path is just for checksum of first path. Diff should go away once .c path is fixed.

EDIT: Or are you referring to another path embedded in .so itself? Didn't really check full diff there.

@cho-m
Copy link
Member

cho-m commented Mar 8, 2024

I see what you mean. The /private/tmp/cffi-*/cffi-1.16.0/build/temp.macosx-14.0-arm64-cpython-312/src/c/_cffi_backend.o

@Bo98
Copy link
Member Author

Bo98 commented Mar 8, 2024

Please try with #16860.

@Bo98
Copy link
Member Author

Bo98 commented Mar 8, 2024

I don't necessarily expect same checksums across different compiler/linker versions so we won't get an all bottle there, but I would at least expect a repeated build with the same environment to produce the same result.

@cho-m
Copy link
Member

cho-m commented Mar 8, 2024

Please try with #16860.

Works for cffi example.

Do we know if there are any potential scenarios this may need to be disabled? At least Debian has enabled -ffile-prefix-map as default so there is good chance of stability for -fdebug-prefix-map

Ref: https://git.dpkg.org/git/dpkg/dpkg.git/commit/?id=b60c243ba99b8483202a6f6a814476275204fdff


I don't necessarily expect same checksums across different compiler/linker versions so we won't get an all bottle there, but I would at least expect a repeated build with the same environment to produce the same result.

That's what I expect. The all bottle discussion was for pure python code, which should be possible like asciidoc. Though larger virtualenvs probably will have some C/C++ code making all bottles unlikely.

@Bo98
Copy link
Member Author

Bo98 commented Mar 9, 2024

Do we know if there are any potential scenarios this may need to be disabled? At least Debian has enabled -ffile-prefix-map as default so there is good chance of stability for -fdebug-prefix-map

It should be at least safe enough to not worry about it until there's evidence that something breaks.

I don't envision any particular issue

@cho-m
Copy link
Member

cho-m commented Aug 25, 2024

Is there anything left here?

Only thing left I can think of is doing version detection to enable -ffile-prefix-map. There are probably some formulae this will help (e.g. I think csound is impacted, though it is also impacted by a CMake JAR reproducibility issue).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Reproducible Homebrew/brew bug help wanted We want help addressing this python Homebrew/brew's python support
Projects
None yet
Development

No branches or pull requests

5 participants