google/sanitizers


ThreadSanitizerAboutRaces

Ragh Srinivasan edited this page Apr 28, 2023 · 6 revisions

Introduction

Most programmers know that races are harmful.
For some races it could be quite easy to predict what may go wrong.
For example, the variable var in the following code may not be equal to 2 at the end of the program execution (for more examples refer to ThreadSanitizerPopularDataRaces):

int var;
void Thread1() {  // Runs in one thread.
  var++;
}
void Thread2() {  // Runs in another thread.
  var++;
}
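
For comparison, here is a minimal sketch of a race-free version of the same counter, assuming C++11 std::atomic and std::thread are available. The RunBothThreads() driver is a hypothetical helper added only for illustration; it is not part of the original example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> var{0};

void Thread1() {  // Runs in one thread.
  var.fetch_add(1, std::memory_order_relaxed);  // Atomic read-modify-write.
}
void Thread2() {  // Runs in another thread.
  var.fetch_add(1, std::memory_order_relaxed);
}

// Hypothetical driver: runs both functions concurrently and returns the
// final value of var.
int RunBothThreads() {
  var.store(0);
  std::thread t1(Thread1);
  std::thread t2(Thread2);
  t1.join();
  t2.join();
  const int result = var.load();
  assert(result == 2);  // Neither increment can be lost.
  return result;
}
```

Relaxed ordering is enough here because only the counter itself is shared; no other data is published through it.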

However, there are other, much more subtle races which are much less obviously harmful. This article is about such races. Most of the content is specific to C/C++.

TODO(timurrrr) Reference the C++11 standard mentioning data races lead to UB?

TODO(kcc) This article is not finished. Expect more content later.

Racy publication

The race which is most frequently (and mistakenly) perceived as harmless is unsafe publication:

struct Foo {
  int a;
  Foo() { a = 42; }
};

static Foo *foo = NULL;

void Thread1() {  // Create foo.
  foo = new Foo();
}

void Thread2() {  // Consume foo.
  if (foo) {
     assert(foo->a == 42);
  }
}

ThreadSanitizer and most other race detectors will report this as a race on foo.

Objection 1

But hey, why is this race harmful? Follow my reasoning:

Thread1 calls new Foo(), which allocates memory and calls Foo::Foo().
Thread1 assigns the new value to foo and foo becomes non-NULL.
Thread2 checks if foo != NULL and only then reads foo->a.
The code is safe!

Clarification 1

On machines with a weak memory model, such as IBM Power or ARM,
foo != NULL does not guarantee that the contents of *foo are ready for reading in Thread2.
For example, the following code does not work on machines with a weak memory model.

int A = 0, B = 0;

void Thread1() {
  A = 1;
  B = 1;
}

void Thread2() {
  if (B == 1) {
     assert(A == 1);
  }
}

Objection 2

Ok, ok. But I don't care about weak memory model machines. x86 has a strong memory model!

Clarification 2

If you are writing in assembly, yes, this is safe.
But with C/C++ you cannot rely on the machine's strong memory model because the compiler may (and will!) rearrange the code for you.
For example, it may easily replace A=1;B=1; with B=1;A=1;.

Objection 3

Ok, I believe that compilers may do simple reordering, but that doesn't apply to my code!
For example, Foo::Foo() is defined in one .cc file and foo = new Foo() resides in another.
And even if these were in a single file, what kind of reordering could harm me?

Clarification 3

First, it's a bad idea to rely on the compiler's inability to do some legal transformation.
Second, modern compilers are much more sophisticated than most programmers think.
For example, many compilers can do cross-file inlining or even inline virtual functions (!).
So, in the example above a compiler may (and often will) change the code to:

  foo = (Foo*)malloc(sizeof(Foo));
  new (foo) Foo();

Objection 4

Ok. So I will do something to avoid any bad compiler effect in the thread which creates foo.
But obviously, the consumer thread's code is safe. What can possibly go wrong with this code on x86?

void Thread2() {  // Consume foo.
  if (foo) {
     // foo is properly published by Thread1().
     assert(foo->a == 42);
  }
}

Clarification 4

Again, you are underestimating the cleverness of modern compilers. How about this (it really happens!)?

void Thread2() {  // Consume foo.
  Foo *t1 = foo;  // reads NULL
  Foo *t2 = foo;  // reads the new value
  if (t2) {
     assert(t1->a == 42);
  }
}
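
One way to address all four objections at once is to make the published pointer atomic. The following is a minimal sketch, assuming C++11 std::atomic with release/acquire ordering. Thread2() is changed to return a value, and PublishAndConsume() is a hypothetical driver, both only so that the example can be exercised.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Foo {
  int a;
  Foo() { a = 42; }
};

static std::atomic<Foo*> foo{nullptr};

void Thread1() {  // Create foo.
  // Release store: the constructor's writes become visible to any thread
  // that observes this pointer value through an acquire load.
  foo.store(new Foo(), std::memory_order_release);
}

// Consume foo. Returns foo->a if foo was already published, -1 otherwise.
int Thread2() {
  Foo* p = foo.load(std::memory_order_acquire);  // Read the pointer exactly once.
  if (p) {
    assert(p->a == 42);
    return p->a;
  }
  return -1;
}

// Hypothetical driver: runs producer and consumer concurrently.
int PublishAndConsume() {
  int seen = 0;
  std::thread producer(Thread1);
  std::thread consumer([&seen] { seen = Thread2(); });
  producer.join();
  consumer.join();
  delete foo.exchange(nullptr);  // Clean up so the driver can be called again.
  return seen;
}
```

The consumer copies the pointer into a local variable once, so the compiler cannot substitute two separate reads of foo as in Clarification 4.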

Double-Checked Locking

Double-Checked Locking written with plain (non-atomic) variables is broken in C++ because of all the reasons discussed above.
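
For reference, here is a minimal sketch of how the pattern can be written with C++11 atomics so that it is free of data races. The names instance, instance_mutex and GetInstance() are hypothetical, and the object is intentionally never deleted to keep the sketch short.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Foo {
  int a;
  Foo() { a = 42; }
};

static std::atomic<Foo*> instance{nullptr};
static std::mutex instance_mutex;

Foo* GetInstance() {
  // First check, without the lock. The acquire load pairs with the
  // release store below.
  Foo* p = instance.load(std::memory_order_acquire);
  if (p == nullptr) {
    std::lock_guard<std::mutex> lock(instance_mutex);
    // Second check, under the lock. The mutex already orders this load
    // against a store made by a previous lock holder.
    p = instance.load(std::memory_order_relaxed);
    if (p == nullptr) {
      p = new Foo();
      // Publish only after the object is fully constructed.
      instance.store(p, std::memory_order_release);
    }
  }
  assert(p != nullptr);
  return p;
}
```

In many cases a function-local static or std::call_once gives the same effect with less room for error.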

Racy lazy initialization

One more frequent bug:

// lazy init for Pi.
float GetPi() {
  static float pi;
  static bool have_pi = false;
  if (!have_pi) {
     pi = ComputePi(); // atomic assignment, no cache coherency issues.
     have_pi = true;
  }
  return pi;
}

Even experienced programmers may assume that this code is correct on a CPU with a strong memory model.
But remember that a compiler may rearrange have_pi = true; and pi = ComputePi(); and then one of the threads will return an uninitialized value of pi.
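
A minimal sketch of a race-free alternative, assuming a C++11 (or later) compiler: since C++11, the initialization of a function-local static is guaranteed to run exactly once even when several threads reach it concurrently. The ComputePi() body and the GetPiFromTwoThreads() driver below are hypothetical stand-ins added for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <thread>

// Hypothetical stand-in for the ComputePi() used above.
float ComputePi() { return 4.0f * std::atan(1.0f); }

// Lazy init for Pi. The language runtime synchronizes the one-time
// initialization of the local static.
float GetPi() {
  static const float pi = ComputePi();
  return pi;
}

// Hypothetical driver: calls GetPi() from two threads at once.
float GetPiFromTwoThreads() {
  float a = 0.0f;
  float b = 0.0f;
  std::thread t1([&a] { a = GetPi(); });
  std::thread t2([&b] { b = GetPi(); });
  t1.join();
  t2.join();
  assert(a == b);  // Both threads observe the same initialized value.
  return a;
}
```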

Volatile

DON'T use volatile for synchronization in C/C++.
The C and C++ standards don't say anything about volatile and threads, so don't assume anything.

Also, the C++11 standard, section [intro.multithread], paragraph 21 says:
"The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior."

You don't believe us? Ok, here is the simplest proof that C++ compilers don't treat volatile as synchronization.

% cat volatile.cc
typedef long long T;
volatile T vlt64;
void volatile_store_64(T x) {
        vlt64 = x;
}
% g++ --version
g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3
...
% g++ -O2 -S volatile.cc -m32 && cat volatile.s
...
_Z17volatile_store_64x:
.LFB0:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        movl    12(%ebp), %edx
        popl    %ebp
        movl    %eax, vlt64
        movl    %edx, vlt64+4
        ret
...

You can clearly see that a 64-bit volatile variable is written in two pieces without any synchronization.
Do you still use C++ volatile as synchronization?

You may object that we are cheating (with 64-bit ints on a 32-bit architecture).
Ok, we indeed cheated, but just a bit. How about this case?

% cat volatile2.cc
volatile bool volatile_bool_done;
double regular_double;
void volatile_example_2() {
        regular_double = 1.23;
        volatile_bool_done = true;
}
% g++ -O2 volatile2.cc -S
% cat volatile2.s
...
_Z18volatile_example_2v:
.LFB0:
..
        movabsq $4608218246714312622, %rax
        movb    $1, volatile_bool_done(%rip)
        movq    %rax, regular_double(%rip)
        ret
...

Here you can see that the compiler reordered a non-volatile store and a volatile store.

Are you convinced now? If not, please let us know why!
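
For contrast, here is a minimal sketch of the second example with the volatile flag replaced by a C++11 std::atomic<bool>. The names atomic_bool_done, Producer(), Consumer() and RunExample() are hypothetical and were chosen to mirror volatile2.cc.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> atomic_bool_done{false};
double regular_double;

void Producer() {
  regular_double = 1.23;
  // Release store: the write to regular_double may not be moved after it.
  atomic_bool_done.store(true, std::memory_order_release);
}

double Consumer() {
  // Acquire load: once it reads true, the write to regular_double is visible.
  while (!atomic_bool_done.load(std::memory_order_acquire)) {
    std::this_thread::yield();
  }
  return regular_double;
}

// Hypothetical driver: runs consumer and producer concurrently.
double RunExample() {
  double seen = 0.0;
  std::thread consumer([&seen] { seen = Consumer(); });
  std::thread producer(Producer);
  producer.join();
  consumer.join();
  assert(seen == 1.23);
  return seen;
}
```

Unlike the volatile version, the ordering between the two stores is part of the language contract here, so neither the compiler nor the CPU may break it.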

How to debug and fix

If you get a data race report:

analyze the code locations and the stack traces of the racing accesses
understand the object/field that is subject to the race (using the top frame)
understand the context in which the object is accessed (using the whole stack trace)

Note: Location description (heap allocation stack, global variable name, or thread stack) may help to understand the object that is subject to the race.

Note: If the race happens on a standard container (e.g. std::string), then you may see a number of stack frames inside of the standard library. In such cases find the first frame that calls into the standard library.

Once you understand that, there are three main scenarios:

The object is supposed to be accessed concurrently, but lacks proper internal synchronization. In this case, add internal synchronization to the object. For example, add a std::mutex field to the class and lock it inside of the methods.
The object is not supposed to be accessed concurrently (for example, this is true for all standard containers like std::list, std::vector, std::map, etc.) and the calling code (further up the stack) incorrectly accesses the object concurrently. In this case, change the calling code to not access the object concurrently. This may involve adding a std::mutex in the calling code, waiting for completion of something before accessing the objects, queuing some callbacks sequentially, etc. Generally, this is highly dependent on the concrete system and cannot be answered without understanding how the system is supposed to work and how it's actually working.
If you believe the report is a false positive and the accesses do not happen concurrently, find the exact synchronization that is supposed to make them non-concurrent and ensure that it's indeed in place and is correct, e.g. placed on the right side (before/after) of the exact racing accesses.
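
The first scenario can be sketched as follows. Counter and CountFromManyThreads() are hypothetical names used only for illustration; the point is that the std::mutex lives inside the class and every method that touches the shared field locks it.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class that is meant to be used from several threads.
class Counter {
 public:
  void Add(int delta) {
    std::lock_guard<std::mutex> lock(mu_);
    value_ += delta;
  }
  int Get() const {
    std::lock_guard<std::mutex> lock(mu_);
    return value_;
  }

 private:
  mutable std::mutex mu_;  // Protects value_.
  int value_ = 0;
};

// Hypothetical driver: hammers one Counter from several threads.
int CountFromManyThreads(int num_threads, int adds_per_thread) {
  Counter counter;
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&counter, adds_per_thread] {
      for (int j = 0; j < adds_per_thread; ++j) counter.Add(1);
    });
  }
  for (std::thread& t : threads) t.join();
  assert(counter.Get() == num_threads * adds_per_thread);
  return counter.Get();
}
```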

Note: try to avoid using raw atomic primitives and memory fences when fixing data races. These primitives are notoriously hard to use correctly.
