Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ThreadSanitizer support #2406

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

devDDen
Copy link

@devDDen devDDen commented Aug 26, 2024

ThreadSanitizer helps to detect data races. The PR adds TSan support to seastar and a build mode where it is enabled. It allows to run seastar applications with ThreadSanitizer.

During testing, the sanitizer reported data races in the following places:

  • already fixed 48c6888
  • exception_hacks.cc in _Unwind_RaiseException (fixed here)
  • in tests/unit/* in BOOST_REQUIRE (fixed here) due to Boost.Test macros are not thread safe. To get an accurate stack trace, the Boost.Test library must be built with ThreadSanitizer

- restore SEASTAR_NO_EXCEPTION_HACK behavior
- detect TSan and its fiber support
- do not handle SIGSEGV, use increased stack size for TSan as for ASan
- use TSan fiber api in context switch code
- add sanitize-thread build mode with TSan and SEASTAR_NO_EXCEPTION_HACK
BOOST_REQUIRE macros are not thread safe, so replace them with
asserts in multithreaded environment
@avikivity
Copy link
Member

Looks good, @tchaikov please review, especially the cmake parts.

Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments.

CMAKE_CXX_FLAGS_SANITIZETHREAD
CMAKE_C_FLAGS_SANITIZETHREAD
CMAKE_EXE_LINKER_FLAGS_SANITIZETHREAD
CMAKE_SHARED_LINKER_FLAGS_SANITIZETHREAD)
Copy link
Contributor

@tchaikov tchaikov Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer keeping using the "sanitize" mode and "Sanitize" config. instead of adding yet another one dedicating for building with TSan enabled, can we just add an option named Seastar_SANITIZERS as a list of sanitizers for configuring the build?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tchaikov if you're thinking of enabling this sanitizer by default when sanitizers are enabled - please try to measure the performance impact of this sanitizer on some "typical" (however you define that) Seastar test code. The majority of the Seastar code (and code using Seastar) does not use share data between threads, so although catching every bug is great, I doubt this sanitizer will help catch many bugs so we need some justification to turn it on by

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tchaikov if you're thinking of enabling this sanitizer by default when sanitizers are enabled

that's not my plan. my plan is to keep the existing behavior.

  • please try to measure the performance impact of this sanitizer on some "typical" (however you define that) Seastar test code. The majority of the Seastar code (and code using Seastar) does not use share data between threads, so although catching every bug is great, I doubt this sanitizer will help catch many bugs so we need some justification to turn it on by


namespace {

#ifdef SEASTAR_HAVE_ASAN_FIBER_SUPPORT
#if defined(SEASTAR_ASAN_ENABLED) && defined(SEASTAR_HAVE_ASAN_FIBER_SUPPORT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because #if at higher level has changed

- #ifdef SEASTAR_ASAN_ENABLED
+ #if defined(SEASTAR_ASAN_ENABLED) || defined(SEASTAR_TSAN_ENABLED)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the comment below about it

SEASTAR_NO_EXCEPTION_HACK)
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
# Disabled because tsan doesn't support std::atomic_thread_fence
list (APPEND Seastar_PRIVATE_CXX_FLAGS -Wno-error=tsan)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you quote the link to the related document and/or the bug report? and why is it a private compiler flag?

$<${condition}:Sanitizers::thread>)
target_compile_definitions(seastar
PUBLIC
SEASTAR_NO_EXCEPTION_HACK)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why make it public? AFAICT, it's not part of the public interface, as this macro definition is only checked by reactor.cc.

{
// Do not destroy main context
if (destroy_tsan_fiber)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please attach the { at end of previous line.

Comment on lines +66 to +68
find_package_handle_standard_args (ThreadSanitizer
REQUIRED_VARS
ThreadSanitizer_COMPILER_OPTIONS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general, FindSanitizers.cmake should only check only for a package named "Sanitizers", not packages with other name. so i think a better alternative is to support the "COMPONENT" parameter. #2408 was created to address this.

@@ -169,7 +169,7 @@ def identify_best_standard(cpp_standards, compiler):
# For convenience.
tr = seastar_cmake.translate_arg

MODE_TO_CMAKE_BUILD_TYPE = {'release': 'RelWithDebInfo', 'debug': 'Debug', 'dev': 'Dev', 'sanitize': 'Sanitize' }
MODE_TO_CMAKE_BUILD_TYPE = {'release': 'RelWithDebInfo', 'debug': 'Debug', 'dev': 'Dev', 'sanitize': 'Sanitize', 'sanitize-thread' : 'SanitizeThread'}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd suggest to add an option to configure.py and cmake to list the enabled sanitizers. by default, if "Sanitize" is specified, both UBSan and ASan are enabled, but one can override the option by specifying it explicitly to use TSan, like

./configure.py --mode sanitize --sanitizer UndefinedBehavior --sanitizer Thread
./configure.py --mode sanitize --sanitizer Address
# -DCMAKE_BUILD_TYPE is but a convenient way to set a bunch of 
# preconfigured settings, but one can fine-tune the build with 
# these lower-level options
# just enable these two sanitizers without any predefined configurations
cmake -DCMAKE_BUILD_TYPE=None -DSeastar_SANITIZERS=Address;UndefinedBehavior
# inherit the settings of Sanitize, but customize the enabled Sanitizers
cmake -DCMAKE_BUILD_TYPE=Sanitize -DSeastar_SANITIZERS=Address;UndefinedBehavior

@@ -71,6 +71,8 @@ module seastar;
#endif

namespace seastar {

#ifndef SEASTAR_NO_EXCEPTION_HACK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restore SEASTAR_NO_EXCEPTION_HACK behavior

i'd suggest extract this into its own commit. and explain why we want to bring it back. and how this change helps.

@@ -963,6 +1023,10 @@ if (Sanitizers_FIBER_SUPPORT)
list (APPEND Seastar_PRIVATE_COMPILE_DEFINITIONS SEASTAR_HAVE_ASAN_FIBER_SUPPORT)
endif ()

if (ThreadSanitizer_FIBER_SUPPORT)
list (APPEND Seastar_PRIVATE_COMPILE_DEFINITIONS SEASTAR_HAVE_TSAN_FIBER_SUPPORT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this definition only applies to ThreadSanitizer, i'd suggest add this definition only if ThreadSanitizer is enabled. so we don't need to check both of them in the C++ code.

@@ -45,13 +52,21 @@ file (READ ${CMAKE_CURRENT_LIST_DIR}/code_tests/Sanitizers_fiber_test.cc _saniti
set (CMAKE_REQUIRED_FLAGS ${Sanitizers_COMPILER_OPTIONS})
check_cxx_source_compiles ("${_sanitizers_fiber_test_code}" Sanitizers_FIBER_SUPPORT)

file (READ ${CMAKE_CURRENT_LIST_DIR}/code_tests/ThreadSanitizer_fiber_test.cc _thread_sanitizer_fiber_test_code)
set (CMAKE_REQUIRED_FLAGS ${ThreadSanitizer_COMPILER_OPTIONS})
check_cxx_source_compiles ("${_thread_sanitizer_fiber_test_code}" ThreadSanitizer_FIBER_SUPPORT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as explained, i'd suggest define this variable only if ThreadSanitizer is listed in the COMPONENT parameter.

@@ -142,7 +142,7 @@ SEASTAR_TEST_CASE(test_map_reduce_lifetime) {
}
auto operator()(const X& x) {
return yield().then([this, &x] {
BOOST_REQUIRE(!destroyed);
assert(!destroyed);
Copy link
Contributor

@tchaikov tchaikov Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BOOST_REQUIRE macros are not thread safe, so replace them with
asserts in multithreaded environment

could you share the impact of "not thread safe" ? did you find the issue by inspecting the code, or by enabling ThreadSanitizer, and it warns at seeing this usage?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. ThreadSanitizer warns
  2. Boost.Test docs say about it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normal Seastar tests run on a single reactor thread. But this specific test deliberately starts the same code on multiple threads. And then, supposedly, you're not supposed to use BOOST_REQUIRE from multiple threads.

However, I don't think replacing BOOST_REQUIRE by assert() is the right thing to do - it will replace a clean test failure of a single test function, by a crash of the entire executable... :-(

I think it more sense to have an (atomic-variable) counter of errors, and then BOOST_REQUIRE that the counter be 0 when the do_with_distributed ends (on a single thread).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I find it really shocking - that in 2024 Boost.Test doesn't support multiple threads out of the box. It's trivial to support (especially when performance is not even an issue). It reduces whatever little confidence I had in this test framework :-(

@tchaikov
Copy link
Contributor

@devDDen #2408 has been merged. do you plan to continue working on this change?

@devDDen
Copy link
Author

devDDen commented Sep 15, 2024

@devDDen #2408 has been merged. do you plan to continue working on this change?

Yes, at first I plan to do #2406 (comment) soon in a separate PR

@tchaikov
Copy link
Contributor

@devDDen #2408 has been merged. do you plan to continue working on this change?

Yes, at first I plan to do #2406 (comment) soon in a separate PR

thanks. please let me know if i can help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants