Example of minimal hermetic gcc toolchain #407

Draft
wants to merge 8 commits into
base: main
67 changes: 67 additions & 0 deletions cc-toolchain/01-minimal-gcc-toolchain/README.md
@@ -0,0 +1,67 @@
This is an example of how to set up a hermetic single-platform gcc toolchain using the `cc_toolchain_config` rule provided by [`@bazel_tools//tools/cpp:unix_cc_toolchain_config.bzl`](https://github.com/bazelbuild/bazel/blob/master/tools/cpp/unix_cc_toolchain_config.bzl).

The example contains several pieces:

1. Download the glibc-based toolchain from [toolchains.bootlin.com](https://toolchains.bootlin.com), which includes gcc and the C standard library (aka `libc`).

2. Write the BUILD file for the downloaded toolchain repository. The BUILD file contains:

a. A `cc_toolchain_config` target to specify compiler flags, linker flags, a map of gcc tools, and the sysroot. To ensure `gcc` uses the downloaded headers instead of system headers, the compiler flags include `-isystem` entries for the toolchain's include directories along with `-fno-canonical-system-headers`.

b. A `cc_toolchain` target to specify all the inputs that cc actions need in addition to the toolchain config.

c. A `toolchain` target to specify the exec and target configuration, and the corresponding cc toolchain.

## Testing
To test the toolchain, run:

```
$ bazel build //main:hello-world
INFO: Analyzed target //main:hello-world (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //main:hello-world up-to-date:
bazel-bin/main/hello-world

$ bazel-bin/main/hello-world
Hello, World!
```

## Debugging

To confirm the build is hermetic, add `-v` to the compiler flags and linker flags, then check that the header search paths, library search paths, and tools all come from the toolchain (i.e. begin with `external/gcc_toolchain`) rather than from the host system.
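
Reading the `-v` output by eye gets tedious, so the check can be scripted. The following is a minimal sketch, not part of this example: the helper names `include_search_dirs` and `non_hermetic_dirs` are hypothetical, and it assumes the usual gcc `-v` layout where the include search list sits between the `search starts here:` lines and `End of search list.`. It only covers header search paths.

```python
def include_search_dirs(verbose_output):
    """Collect the include directories listed in gcc -v output.

    Assumes the usual layout: directories appear after the
    '#include ... search starts here:' lines and before
    'End of search list.'.
    """
    dirs = []
    in_list = False
    for line in verbose_output.splitlines():
        if line.startswith("#include") and line.rstrip().endswith("search starts here:"):
            in_list = True
        elif line.startswith("End of search list."):
            in_list = False
        elif in_list and line.strip():
            dirs.append(line.strip())
    return dirs


def non_hermetic_dirs(verbose_output, prefix="external/gcc_toolchain"):
    """Return the search directories that do not start with the toolchain prefix."""
    return [d for d in include_search_dirs(verbose_output) if not d.startswith(prefix)]
```

An empty result from `non_hermetic_dirs` on the captured compiler output suggests that all header search paths come from the toolchain. Any entry such as `/usr/include` points at a leak from the host system.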

## Limitation

In the Bazel sandbox, the linker fails to find `libc.so.6` because `$SYSROOT/usr/lib/libc.so` is a linker script that refers to `libc.so.6` by absolute path:

```
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )
```

The failure occurs because the linker does not resolve `$SYSROOT/lib64/libc.so.6` correctly in the sandbox. See https://stackoverflow.com/questions/52386530/linker-fails-in-sandbox-when-running-through-bazel-but-works-when-sandboxed-comm for more context. We can confirm that this is a sandbox issue in two ways:

1. Running the build with `--spawn_strategy=standalone`
2. Running the link command directly outside of Bazel
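
Before trying either check, it can help to see which files the linker script names and whether they exist under the sysroot at all. The sketch below is an illustration under stated assumptions: the helper names `linker_script_paths` and `missing_under_sysroot` are hypothetical, and the script only tests file existence. It does not reproduce how `ld` itself applies the sysroot to absolute paths.

```python
import os
import re


def linker_script_paths(script_text):
    """Return the absolute paths named in a GNU ld script such as libc.so."""
    # Drop the /* ... */ comment so its text is not mistaken for paths.
    body = re.sub(r"/\*.*?\*/", "", script_text, flags=re.DOTALL)
    # A path is a run of characters starting with '/' up to whitespace or a parenthesis.
    return re.findall(r"/[^\s()]+", body)


def missing_under_sysroot(script_text, sysroot):
    """Return the script paths that do not exist when rooted at sysroot."""
    missing = []
    for path in linker_script_paths(script_text):
        candidate = os.path.join(sysroot, path.lstrip("/"))
        if not os.path.exists(candidate):
            missing.append(path)
    return missing
```

Calling `missing_under_sysroot` with the contents of `libc.so` and the sysroot directory lists the entries that cannot be found relative to the sysroot. If nothing is missing and the link still fails only in the sandbox, that is consistent with the sandbox explanation above.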

There are several ways to fix this:

1. If you have control over `libc.so`, change it to
```
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib/x86_64-linux-gnu/libc.so.6 /lib/x86_64-linux-gnu/libc_nonshared.a AS_NEEDED ( /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ) )
```

where the paths exist relative to the sysroot.

2. Create symbolic links:
```
$ ln -s /lib/x86_64-linux-gnu/libc_nonshared.a /usr/lib64/libc_nonshared.a
$ ln -s /lib/x86_64-linux-gnu/libc.so.6 /lib64/libc.so.6
```
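
The first fix above amounts to rewriting the paths inside the `GROUP` statement. As a rough sketch of how that edit could be scripted (the function name `rewrite_linker_script` and the mapping argument are hypothetical, and the replacement paths are the ones from the example above, which may not match every sysroot layout):

```python
import re


def rewrite_linker_script(script_text, replacements):
    """Replace whole path tokens in a GNU ld script using a path mapping.

    Tokens that are not keys of `replacements` are left untouched.
    """
    def swap(match):
        token = match.group(0)
        return replacements.get(token, token)

    # Match a whole path token so one path is never rewritten
    # as a substring of another.
    return re.sub(r"/[^\s()]+", swap, script_text)
```

For example, mapping `/lib64/libc.so.6` to `/lib/x86_64-linux-gnu/libc.so.6` (and likewise for the other two entries) turns the original `GROUP` line into the rewritten one shown in the first fix.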
169 changes: 169 additions & 0 deletions cc-toolchain/01-minimal-gcc-toolchain/cc_toolchain.BUILD
@@ -0,0 +1,169 @@
load("@bazel_tools//tools/cpp:unix_cc_toolchain_config.bzl", "cc_toolchain_config")

cc_toolchain_config(
    name = "cc_toolchain_config",
    abi_libc_version = "unknown",
    abi_version = "unknown",
    builtin_sysroot = "external/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot",
    compile_flags = [
        "-no-canonical-prefixes",
        "-fno-canonical-system-headers",
        "-isystem",
        "external/gcc_toolchain/x86_64-buildroot-linux-gnu/include/c++/12.3.0",
        "-isystem",
        "external/gcc_toolchain/x86_64-buildroot-linux-gnu/include/c++/12.3.0/x86_64-buildroot-linux-gnu",
        "-isystem",
        "external/gcc_toolchain/x86_64-buildroot-linux-gnu/include",
        "-isystem",
        "external/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot/usr/include",
        "-isystem",
        "external/gcc_toolchain/lib/gcc/x86_64-buildroot-linux-gnu/12.3.0/include",
    ],
    compiler = "gcc",
    cpu = "x86_64",
    cxx_builtin_include_directories = [
        "%sysroot%/include/c++/12.3.0",
        "%sysroot%/include/c++/12.3.0/x86_64-linux",
        "%sysroot%/lib/gcc/x86_64-linux/12.3.0/include-fixed",
        "%sysroot%/lib/gcc/x86_64-linux/12.3.0/include",
        "%sysroot%/usr/include",
    ],
    host_system_name = "local",
    link_flags = [
        "-Bexternal/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot/bin",
        "-Bexternal/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot/usr/lib64",
        "-Bexternal/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot/lib64",
        "-Lexternal/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot/bin",
        "-Lexternal/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot/usr/lib64",
        "-Lexternal/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot/lib64",
    ],
    target_libc = "unknown",
    target_system_name = "local",
    tool_paths = {
        "gcc": "bin/x86_64-buildroot-linux-gnu-gcc",
        "cpp": "bin/x86_64-buildroot-linux-gnu-cpp",
        "ar": "bin/x86_64-buildroot-linux-gnu-ar",
        "nm": "bin/x86_64-buildroot-linux-gnu-nm",
        "ld": "bin/x86_64-buildroot-linux-gnu-ld",
        "as": "bin/x86_64-buildroot-linux-gnu-as",
        "objcopy": "bin/x86_64-buildroot-linux-gnu-objcopy",
        "objdump": "bin/x86_64-buildroot-linux-gnu-objdump",
        "gcov": "bin/x86_64-buildroot-linux-gnu-gcov",
        "strip": "bin/x86_64-buildroot-linux-gnu-strip",
        "llvm-cov": "/bin/false",
    },
    toolchain_identifier = "arm_gcc",
)

toolchain(
    name = "toolchain",
    exec_compatible_with = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
    target_compatible_with = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
    toolchain = ":cc_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)

filegroup(
    name = "include",
    srcs = glob([
        "lib/gcc/x86_64-buildroot-linux-gnu/*/include/**",
        "lib/gcc/x86_64-buildroot-linux-gnu/*/include-fixed/**",
        "x86_64-buildroot-linux-gnu/include/**",
        "x86_64-buildroot-linux-gnu/sysroot/usr/include/**",
        "x86_64-buildroot-linux-gnu/include/c++/*/**",
        "x86_64-buildroot-linux-gnu/include/c++/*/x86_64-buildroot-linux-gnu/**",
        "x86_64-buildroot-linux-gnu/include/c++/*/backward/**",
    ]),
    visibility = ["//visibility:public"],
)

filegroup(
    name = "lib",
    srcs = glob(
        include = [
            "lib64/**",
            # FIXME: Even though the link action sets
            # --sysroot=external/gcc_toolchain/x86_64-buildroot-linux-gnu/sysroot,
            # libc.so.6 is not found.
            # See https://stackoverflow.com/questions/52386530/linker-fails-in-sandbox-when-running-through-bazel-but-works-when-sandboxed-comm
            # Bazel's symlinked sandbox breaks sysroot resolution, so libc.so.6 and
            # libc_nonshared.a can't be found even though running the link action
            # outside of Bazel works.
            # A workaround is to run
            #   ln -s /lib/x86_64-linux-gnu/libc_nonshared.a /usr/lib64/libc_nonshared.a
            #   ln -s /lib/x86_64-linux-gnu/libc.so.6 /lib64/libc.so.6
            # because libc.so expects
            #   GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )
            "x86_64-buildroot-linux-gnu/sysroot/lib64/libc.so.6",
            "x86_64-buildroot-linux-gnu/sysroot/usr/lib64/**",
            "lib64/gcc/x86_64-buildroot-linux-gnu/12.3.0/**",
        ],
        exclude = [
            "lib*/**/*python*/**",
            "lib/gawk/**",
        ],
    ),
)

filegroup(
    name = "compiler_files",
    srcs = [
        ":gcc",
        ":include",
    ],
)

filegroup(
    name = "linker_files",
    srcs = [
        ":gcc",
        ":lib",
    ],
)

filegroup(
    name = "all_files",
    srcs = [
        ":compiler_files",
        ":include",
        ":linker_files",
    ],
)

filegroup(
    name = "gcc",
    srcs = [
        "bin/x86_64-buildroot-linux-gnu-cpp",
        "bin/x86_64-buildroot-linux-gnu-cpp.br_real",
        "bin/x86_64-buildroot-linux-gnu-g++",
        "bin/x86_64-buildroot-linux-gnu-g++.br_real",
        "bin/x86_64-buildroot-linux-gnu-gcc",
        "bin/x86_64-buildroot-linux-gnu-gcc.br_real",
    ] + glob([
        "**/cc1plus",
        "**/cc1",
        "lib64/libgmp.so*",
        "lib64/libmpc.so*",
        "lib64/libmpfr.so*",
    ]),
    visibility = ["//visibility:public"],
)

cc_toolchain(
    name = "cc_toolchain",
    all_files = ":all_files",
    ar_files = ":all_files",
    as_files = ":all_files",
    compiler_files = ":compiler_files",
    dwp_files = ":all_files",
    dynamic_runtime_lib = ":all_files",
    linker_files = ":linker_files",
    objcopy_files = ":all_files",
    static_runtime_lib = ":all_files",
    strip_files = ":all_files",
    toolchain_config = ":cc_toolchain_config",
)
14 changes: 14 additions & 0 deletions cc-toolchain/01-minimal-gcc-toolchain/defs.bzl
@@ -0,0 +1,14 @@
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def instal_minimal_gcc_toolchain():
    http_archive(
        name = "gcc_toolchain",
        urls = ["https://toolchains.bootlin.com/downloads/releases/toolchains/x86-64/tarballs/x86-64--glibc--stable-2023.11-1.tar.bz2"],
        strip_prefix = "x86-64--glibc--stable-2023.11-1",
        sha256 = "e3c0ef1618df3a3100a8a167066e7b19fdd25ee2c4285cf2cfe3ef34f0456867",
        build_file = "//01-minimal-gcc-toolchain:cc_toolchain.BUILD",
    )

    native.register_toolchains(
        "@gcc_toolchain//:toolchain",
    )
6 changes: 6 additions & 0 deletions cc-toolchain/MODULE.bazel
@@ -0,0 +1,6 @@
###############################################################################
# Bazel now uses Bzlmod by default to manage external dependencies.
# Please consider migrating your external dependencies from WORKSPACE to MODULE.bazel.
#
# For more details, please check https://github.com/bazelbuild/bazel/issues/18958
###############################################################################