Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][perf-training] Support excluding LLVM build from PGO training #126876

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

petrhosek
Copy link
Member

Using LLVM build itself for PGO training is convenient and a great starting point but it also has several issues:

  • LLVM build implicitly depends on tools other than CMake and C/C++ compiler and if those tools aren't available in PATH, the build will fail.
  • LLVM build also requires standard headers and libraries which may not always be available in the default location requiring an explicit sysroot.
  • Building a single configuration (-DCMAKE_BUILD_TYPE=Release) only exercises the -O3 pipeline and can pesimize other configurations.
  • Building for the host target doesn't exercise all other targets.
  • Since LLVMSupport is a static library, this doesn't exercise the linker (beyond what the CMake itself does).

Rather than using LLVM build, ideally we would provide a more minimal, purpose built corpus. While we're working on building such a corpus, provide a CMake option that lets vendors disable the use LLVM build for PGO training.

Using LLVM build itself for PGO training is convenient and a great
starting point but it also has several issues:

* LLVM build implicitly depends on tools other than CMake and C/C++
  compiler and if those tools aren't available in PATH, the build
  will fail.
* LLVM build also requires standard headers and libraries which
  may not always be available in the default location requiring an
  explicit sysroot.
* Building a single configuration (-DCMAKE_BUILD_TYPE=Release) only
  exercises the -O3 pipeline and can pesimize other configurations.
* Building for the host target doesn't exercise all other targets.
* Since LLVMSupport is a static library, this doesn't exercise the
  linker (beyond what the CMake itself does).

Rather than using LLVM build, ideally we would provide a more minimal,
purpose built corpus. While we're working on building such a corpus,
provide a CMake option that lets vendors disable the use LLVM build for
PGO training.
@petrhosek petrhosek added the cmake Build system in general and CMake in particular label Feb 12, 2025
@petrhosek petrhosek requested a review from tstellar February 12, 2025 09:05
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Feb 12, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 12, 2025

@llvm/pr-subscribers-clang

Author: Petr Hosek (petrhosek)

Changes

Using LLVM build itself for PGO training is convenient and a great starting point but it also has several issues:

  • LLVM build implicitly depends on tools other than CMake and C/C++ compiler and if those tools aren't available in PATH, the build will fail.
  • LLVM build also requires standard headers and libraries which may not always be available in the default location requiring an explicit sysroot.
  • Building a single configuration (-DCMAKE_BUILD_TYPE=Release) only exercises the -O3 pipeline and can pesimize other configurations.
  • Building for the host target doesn't exercise all other targets.
  • Since LLVMSupport is a static library, this doesn't exercise the linker (beyond what the CMake itself does).

Rather than using LLVM build, ideally we would provide a more minimal, purpose built corpus. While we're working on building such a corpus, provide a CMake option that lets vendors disable the use LLVM build for PGO training.


Full diff: https://github.com/llvm/llvm-project/pull/126876.diff

3 Files Affected:

  • (modified) clang/utils/perf-training/CMakeLists.txt (+6)
  • (modified) clang/utils/perf-training/lit.cfg (+3)
  • (modified) clang/utils/perf-training/lit.site.cfg.in (+1)
diff --git a/clang/utils/perf-training/CMakeLists.txt b/clang/utils/perf-training/CMakeLists.txt
index 4aed086563ee9..0c1cdd9a1fb60 100644
--- a/clang/utils/perf-training/CMakeLists.txt
+++ b/clang/utils/perf-training/CMakeLists.txt
@@ -6,6 +6,12 @@ set(CLANG_PGO_TRAINING_DATA "${CMAKE_CURRENT_SOURCE_DIR}" CACHE PATH
 set(CLANG_PGO_TRAINING_DATA_SOURCE_DIR OFF CACHE STRING "Path to source directory containing cmake project with source files to use for generating pgo data")
 set(CLANG_PGO_TRAINING_DEPS "" CACHE STRING "Extra dependencies needed to build the PGO training data.")
 
+option(CLANG_PGO_TRAINING_USE_LLVM_BUILD "Use LLVM build for generating PGO data" ON)
+
+llvm_canonicalize_cmake_booleans(
+  CLANG_PGO_TRAINING_USE_LLVM
+)
+
 if(LLVM_BUILD_INSTRUMENTED)
   configure_lit_site_cfg(
     ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.in
diff --git a/clang/utils/perf-training/lit.cfg b/clang/utils/perf-training/lit.cfg
index adefc7893ac44..3f6089b7139a7 100644
--- a/clang/utils/perf-training/lit.cfg
+++ b/clang/utils/perf-training/lit.cfg
@@ -27,6 +27,9 @@ config.clang = lit.util.which('clang', config.clang_tools_dir).replace('\\', '/'
 config.name = 'Clang Perf Training'
 config.suffixes = ['.c', '.cc', '.cpp', '.m', '.mm', '.cu', '.ll', '.cl', '.s', '.S', '.modulemap', '.test']
 
+if not config.use_llvm_build:
+    config.excludes = ['llvm-support']
+
 cc1_wrapper = '%s %s/perf-helper.py cc1' % (config.python_exe, config.perf_helper_dir)
 
 use_lit_shell = os.environ.get("LIT_USE_INTERNAL_SHELL")
diff --git a/clang/utils/perf-training/lit.site.cfg.in b/clang/utils/perf-training/lit.site.cfg.in
index 9d279d552919a..da81ec21a28a6 100644
--- a/clang/utils/perf-training/lit.site.cfg.in
+++ b/clang/utils/perf-training/lit.site.cfg.in
@@ -11,6 +11,7 @@ config.python_exe = "@Python3_EXECUTABLE@"
 config.cmake_exe = "@CMAKE_COMMAND@"
 config.llvm_src_dir ="@CMAKE_SOURCE_DIR@"
 config.cmake_generator ="@CMAKE_GENERATOR@"
+config.use_llvm_build = @CLANG_PGO_TRAINING_USE_LLVM_BUILD@
 
 # Let the main config do the real work.
 lit_config.load_config(config, "@CLANG_SOURCE_DIR@/utils/perf-training/lit.cfg")

@petrhosek
Copy link
Member Author

@tstellar We're trying to build a more minimal corpus that would address the issues I listed with @ilovepi. In the meantime, it'd be helpful to have this option since the current implementation actually breaks on our builders (which don't have host toolchain and sysroot installed).

Copy link
Collaborator

@tstellar tstellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@tstellar
Copy link
Collaborator

@petrhosek I also wanted to mention, if you are interested in upstreaming this minimal corpus, I would support making it the default instead of the current llvm-support build.

@ilovepi
Copy link
Contributor

ilovepi commented Feb 14, 2025

@tstellar We're still developing the corpus ATM, but contributing it upstream is something we've discussed and would like to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category cmake Build system in general and CMake in particular
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants