Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wip: new(ci): use zig compiler instead of relying on centos7. #3307

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

FedeDP
Copy link
Contributor

@FedeDP FedeDP commented Aug 28, 2024

What type of PR is this?

/kind cleanup

Any specific area of the project related to this PR?

/area CI

What this PR does / why we need it:

This PR drops centos7 from our CI by instead relying on zig compiler to provide glibc 2.17 compatible builds for us.

Which issue(s) this PR fixes:

Fixes #3270

Special notes for your reviewer:

Linked libs PRs:

TODO:

Tue Sep  3 10:46:35 2024: Loaded event sources: syscall
Tue Sep  3 10:46:35 2024: Enabled event sources: syscall
Tue Sep  3 10:46:35 2024: Opening 'syscall' source with Kernel module
Trace/breakpoint trap
  • fix test-dev-packages-arm64 -> it seems like a Trace/breakpoint trap is caught related to stats_writer line: output_fields["falco.host_boot_ts"] = machine_info->boot_ts_epoch; -> bca6f31
  • fix test-dev-packages-arm64 -> it seems like a Trace/breakpoint trap is caught related to k8saudit opening
  • libelf is built bundled

Does this PR introduce a user-facing change?:

new(ci): use `zig` compiler instead of relying on centos7.

@poiana
Copy link

poiana commented Aug 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@FedeDP
Copy link
Contributor Author

FedeDP commented Aug 28, 2024

/milestone TBD

@poiana poiana added this to the TBD milestone Aug 28, 2024
@FedeDP
Copy link
Contributor Author

FedeDP commented Aug 28, 2024

So, we need to disable the build with shared libelf, because zig statically links all specified libraries; see: ziglang/zig#7094
Indeed the CI complains it cannot find system libelf.
Since this was an 🌶️ topic for the CNCF, i am going to hold this PR; at the same time, i think we should actually find a way to drop centos:7 while maintaining same glibc requirements.

Anyway, even trying with BUNDLED_LIBELF, gives a build error...

/hold

cmake -B build -S . \
-DCMAKE_BUILD_TYPE=${{ inputs.build_type }} \
-DUSE_BUNDLED_DEPS=On \
-DUSE_BUNDLED_LIBELF=On \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabled bundled libelf to build with zig. Of course this is not desiderable.

@FedeDP FedeDP force-pushed the new/use_zig_drop_centos7 branch 2 times, most recently from e6e593c to 8925c1f Compare August 28, 2024 14:38
@FedeDP
Copy link
Contributor Author

FedeDP commented Aug 28, 2024

Now failing the build on

ld.lld: error: undefined symbol: arc4random_buf

referenced by ares_rand.c:270

That's probably because c-ares does some magic to detect that symbol: https://github.com/c-ares/c-ares/blob/main/CMakeLists.txt#L456; that symbol was added in glibc2.36; most probably it is detecting the symbol even if zig is not actually supporting it since we are targeting a build with glibc-2.17.

@FedeDP
Copy link
Contributor Author

FedeDP commented Aug 29, 2024

I tried to build c-ares from scratch (ie: without Falco involved at all) with zig and it worked fine with the same exports we are doing in the CI.

@FedeDP
Copy link
Contributor Author

FedeDP commented Aug 29, 2024

Update: i saw that updating c-ares to 1.33.1 (latest release) fixed the issue. I am going to bump it in libs (if possible, ie: if grpc does not complain :/ )

Bump PR in libs: falcosecurity/libs#2034

I now bumped this one to use head of the libs PR with the updated c-ares. 🤞

@FedeDP
Copy link
Contributor Author

FedeDP commented Aug 29, 2024

So, i now bumped this branch to use libs chore/zig_build branch HEAD; that is a branch i pushed upstream with all zig related fixes on it:

On that branch, i was able to build a full BUNDLED_DEPS version of sinsp-example!

Moreover, i found out that CMAKE_{C,CXX}_COMPILER truncates CC and CXX strings at first whitespace, leading with cmake using zig as compiler for both, instead of zig cc and zig c++.
I created 2 small wrapper scripts in CI to workaround this issue:

cat > zig-linux-${{ inputs.arch }}-${ZIG_VERSION}/zig-cc << EOF
#!/bin/bash
zig cc -target ${{ inputs.arch }}-linux-gnu.2.17 "$@"
EOF
chmod +x zig-linux-${{ inputs.arch }}-${ZIG_VERSION}/zig-cc
echo "CC=zig-cc" >> $GITHUB_ENV

// same for zig-c++

@FedeDP
Copy link
Contributor Author

FedeDP commented Aug 30, 2024

To be able to build Falco, for now, i had to disable _FORTIFY_SOURCE, see my commit message: 0bb6f49

I opened an upstream issue to track the problem: ziglang/zig#21252 and a PR: ziglang/zig#21253

@FedeDP FedeDP force-pushed the new/use_zig_drop_centos7 branch 4 times, most recently from 903ec35 to 1bf6742 Compare August 30, 2024 14:26
output_fields["falco.host_boot_ts"] = machine_info->boot_ts_epoch;
// This line generates a SIGTRAP in zig debug builds if the casting is removed.
// It seems caused by the pragma pack for the scap_machine_info structure.
output_fields["falco.host_boot_ts"] = (uint64_t)machine_info->boot_ts_epoch;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As weird as it seems, this stupid cast fixed a SIGTRAP generated on debug builds by zig.
If i removed the pragma pack from the scap_machine_info struct, the issue went away. Most probably this is just a bug somewhere in how zig/clang compile the sources.

file(GLOB patches "*.patch")
file(MAKE_DIRECTORY ${PROJECT_BINARY_DIR}/libs-patches/)
foreach(p ${patches})
configure_file("${p}" "${PROJECT_BINARY_DIR}/libs-patches/" @ONLY)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libelf patch uses @CMAKE_SYSTEM_PROCESSOR@ thus it needs to be configured.
@ONLY to avoid touching also normal cmake variables like ${FOO}.

@FedeDP
Copy link
Contributor Author

FedeDP commented Sep 9, 2024

The PR is ready; idea is to merge this right after the Falco release, after falcosecurity/libs#2034, falcosecurity/libs#2037 and falcosecurity/libs#2049 are merged in Falco libs (i'll then take care of update this PR to use libs master of course).

Then, we will have 4 months to discover any new issue and fix it.
Also, as soon as i'll have some spare time, i'd like to benchmark the new binary (eg: by locally building sinsp with zig and running make bench on it, since we now have a bunch of google benchmarks in libs, and doing the same from Falco master).

Finally, leveraging the newly introduced composite action to install zig, i will also add a CI job on libs to test build against zig so that we make sure we will never break the zig build ;)

Improve CI workflow by adding wrapper scripts zig-cc and zig-c++ to be used
as environment vars `CC` and `CXX`, since cmake var `CMAKE_{C,CXX}_COMPILER`
does truncate those variables to their first part (ie: before the first whitespace),
otherwise leading to issues, since only `zig` would be used as compiler by cmake.

Signed-off-by: Federico Di Pierro <[email protected]>
…d bump actions checkout.

Signed-off-by: Federico Di Pierro <[email protected]>
It causes weird issues with zig finding the prototypes for `strlcat` and `strlcpy` in
`lib/libc/include/generic-glibc/bits/string_fortified.h`,
but their implementation is not present, leading to link time build failure.
This means:
* check_symbol_exists(strlcpy "string.h" HAVE_STRLCPY) fails
* `HAVE_STRLCPY` and `HAVE_STRLCAT` are not defined
* `<libscap/strl.h>` defines both `strlcat` and `strlcpy`
* but `string.h` included at the start of `strl.h` will find the prototypes, thus failing because of symbol redefinition

Signed-off-by: Federico Di Pierro <[email protected]>
…g bug is now solved.

Also, switch to a development version of zig with the patch merged.

Signed-off-by: Federico Di Pierro <[email protected]>
…en running against zig.

Signed-off-by: Federico Di Pierro <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

CI pipeline uses CentOS 7 which is EOL
2 participants