linux/Documentation
Rong Xu 315ad8780a kbuild: Add AutoFDO support for Clang build
Add the build support for using Clang's AutoFDO. Building the kernel
with AutoFDO does not reduce the optimization level from the
compiler. AutoFDO uses hardware sampling to gather information about
the frequency of execution of different code paths within a binary.
This information is then used to guide the compiler's optimization
decisions, resulting in a more efficient binary. Experiments
showed that the kernel can improve up to 10% in latency.

The support requires a Clang compiler after LLVM 17. This submission
is limited to x86 platforms that support PMU features like LBR on
Intel machines and AMD Zen3 BRS. Support for SPE on ARM 1,
 and BRBE on ARM 1 is part of planned future work.

Here is an example workflow for AutoFDO kernel:

1) Build the kernel on the host machine with LLVM enabled, for example,
       $ make menuconfig LLVM=1
    Turn on AutoFDO build config:
      CONFIG_AUTOFDO_CLANG=y
    With a configuration that has LLVM enabled, use the following
    command:
       scripts/config -e AUTOFDO_CLANG
    After getting the config, build with
      $ make LLVM=1

2) Install the kernel on the test machine.

3) Run the load tests. The '-c' option in perf specifies the sample
   event period. We suggest     using a suitable prime number,
   like 500009, for this purpose.
   For Intel platforms:
      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
        -o <perf_file> -- <loadtest>
   For AMD platforms:
      The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2
     For Zen3:
      $ cat proc/cpuinfo | grep " brs"
      For Zen4:
      $ cat proc/cpuinfo | grep amd_lbr_v2
      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
        -N -b -c <count> -o <perf_file> -- <loadtest>

4) (Optional) Download the raw perf file to the host machine.

5) To generate an AutoFDO profile, two offline tools are available:
   create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part
   of the AutoFDO project and can be found on GitHub
   (https://github.com/google/autofdo), version v0.30.1 or later. The
   llvm_profgen tool is included in the LLVM compiler itself. It's
   important to note that the version of llvm_profgen doesn't need to
   match the version of Clang. It needs to be the LLVM 19 release or
   later, or from the LLVM trunk.
      $ llvm-profgen --kernel --binary=<vmlinux> --perfdata=<perf_file> \
        -o <profile_file>
   or
      $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
        --format=extbinary --out=<profile_file>

   Note that multiple AutoFDO profile files can be merged into one via:
      $ llvm-profdata merge -o <profile_file>  <profile_1> ... <profile_n>

6) Rebuild the kernel using the AutoFDO profile file with the same config
   as step 1, (Note CONFIG_AUTOFDO_CLANG needs to be enabled):
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file>

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Tested-by: Peter Jung <ptr1337@cachyos.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 22:41:09 +09:00
..
ABI Char/Misc and other driver changes for 6.12-rc1 2024-09-26 10:13:08 -07:00
accel drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
accounting
admin-guide cpufreq: docs: Reflect latency changes in docs 2024-10-21 13:20:03 +02:00
arch arm64 fixes for 6.12-rc2: 2024-10-04 12:20:09 -07:00
block
bpf docs/bpf: Add missing BPF program types to docs 2024-09-12 10:56:41 -07:00
cdrom
core-api arm64 fixes for -rc4 2024-10-17 09:51:03 -07:00
cpu-freq
crypto
dev-tools kbuild: Add AutoFDO support for Clang build 2024-11-06 22:41:09 +09:00
devicetree phy fixes for 6.12 2024-11-03 10:19:34 -10:00
doc-guide
driver-api platform/x86: wmi: Update WMI driver API documentation 2024-10-06 12:48:52 +02:00
fault-injection
fb
features
filesystems vfs-6.12-rc6.fixes 2024-11-01 07:37:10 -10:00
firmware-guide
firmware_class
fpga
gpu Short summary of fixes pull: 2024-10-01 08:15:55 +10:00
hid
hwmon hwmon: Remove devm_hwmon_device_unregister() API function 2024-09-13 07:27:36 -07:00
i2c
iio docs: iio: ad7380: fix supply for ad7380-4 2024-10-24 18:30:47 +01:00
images
infiniband
input
isdn
kbuild kconfig: remove support for "bool" prompt for choice entries 2024-11-04 17:53:09 +09:00
kernel-hacking
leds - Limited LED current based on thermal conditions in the QCOM flash LED driver. 2024-09-23 14:20:11 -07:00
litmus-tests
livepatch
locking
maintainer
mhi
misc-devices
mm Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs 2024-10-17 00:28:09 -07:00
netlabel
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-09-12 17:11:24 -07:00
networking docs: networking: packet_mmap: replace dead links with archive.org links 2024-10-28 15:47:10 -07:00
nvdimm
nvme Remove duplicate "and" in 'Linux NVMe docs. 2024-09-10 15:44:20 -06:00
PCI Documentation: PCI: fix typo in pci.rst 2024-09-10 15:30:42 -06:00
pcmcia
peci
power
process soc: fixes for 6.12 2024-10-17 09:43:36 -07:00
RCU
rust RISC-V: disallow gcc + rust builds 2024-10-25 06:18:37 -07:00
scheduler sched_ext: Documentation: Update instructions for running example schedulers 2024-10-08 08:49:18 -10:00
scsi
security
sound
sphinx
sphinx-static
spi
staging
target
tee
timers
tools
trace
translations move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
usb
userspace-api mseal: update mseal.rst 2024-10-28 21:40:41 -07:00
virt KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL 2024-10-20 07:31:05 -04:00
w1
watchdog [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
wmi platform/x86: dell-ddv: Fix typo in documentation 2024-10-06 12:47:40 +02:00
.gitignore
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py
docutils.conf
dontdiff Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
index.rst
Kconfig
Makefile
memory-barriers.txt docs/memory-barriers.txt: Remove left-over references to "CACHE COHERENCY" 2024-09-13 23:56:44 -07:00
SubmittingPatches
subsystem-apis.rst