Multi-threading is an extensive and complex subject, and many good reference texts on the subject exist. The C++ multi-threading implementation is built upon the facilities offered by the pthreads library (cf. Nichols, B., et al.'s Pthreads Programming, O'Reilly). However, in line with C++'s current-day philosophy the multi-threading implementation offered by the language provides a high-level interface to multi-threading, and using the raw pthread building blocks is hardly ever necessary (cf. Williams, A. (2019): C++ Concurrency in Action).
This chapter covers the facilities for multi-threading as supported by C++. Although the coverage aims at providing the tools and examples allowing you to create your own multi-threaded programs, coverage necessarily is far from complete. The topic of multi-threading is too extensive for that. The mentioned reference texts provide a good starting point for any further study of multi-threading.
A thread of execution (commonly abbreviated to a thread) is a single flow of control within a program. It differs from a separately executed program, as created by the fork(1) system call, in the sense that threads all run inside one program, while fork(1) creates independent copies of a running program. Multi-threading means that multiple tasks are being executed in parallel inside one program, and no assumptions can be made as to which thread is running first or last, or at what moment in time. Especially when the number of threads does not exceed the number of cores, each thread may be active at the same time. If the number of threads exceeds the number of cores, the operating system resorts to task switching, offering each thread time slices in which it can perform its tasks. Task switching takes time, and the law of diminishing returns applies here as well: if the number of threads greatly exceeds the number of available cores (also called overpopulation), then the overhead incurred may exceed the benefit of being able to run multiple tasks in parallel.
Since all threads are running inside one single program, all threads share the program's data and code. When the same data are accessed by multiple threads, and at least one of the threads is modifying these data, access must be synchronized: threads must be prevented from reading data while these data are being modified by other threads, and multiple threads must be prevented from modifying the same data at the same time.
So how do we run a multi-threaded program in C++? Let's look at hello world, the multi-threaded way:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: void hello()
     5: {
     6:     std::cout << "hello world!\n";
     7: }
     8: 
     9: int main()
    10: {
    11:     std::thread hi(hello);
    12:     hi.join();
    13: }

- in line 2 the header file thread is included, informing the compiler about the existence of the class std::thread (cf. section 20.1.2);
- in line 11 a std::thread hi object is created. It is provided with the name of a function (hello) which will be called in a separate thread. Actually, the second thread, running hello, is immediately started when a std::thread is defined this way;
- the main function itself also represents a thread: the program's first thread. It should wait until the second thread has finished. This is realized in line 12, where hi.join() waits until the thread hi has finished its job. Since there are no further statements in main, the program itself ends immediately thereafter;
- the function hello itself, defined in lines 4 through 7, is trivial: it simply inserts the text `hello world' into cout, and terminates, thus ending the second thread.

C++'s main tool for creating multi-threaded programs is the class std::thread, and some examples of its use have already been shown at the beginning of this chapter.
Characteristics of individual threads can be queried from the std::this_thread namespace. Also, std::this_thread offers some control over the behavior of an individual thread.
To synchronize access to shared data C++ offers mutexes (implemented by the class std::mutex) and condition variables (implemented by the class std::condition_variable).
Members of these classes may throw system_error objects (cf. section 10.9) when encountering a low-level error condition.
The namespace std::this_thread contains functions that are uniquely associated with the currently running thread. Before using the namespace this_thread the <thread> header file must be included.
Inside the std::this_thread namespace several free functions are defined, providing information about the current thread or that can be used to control its behavior:
- thread::id this_thread::get_id() noexcept:
  returns an object of the type thread::id that identifies the currently active thread of execution. For an active thread the returned id is unique in the sense that it maps 1:1 to the currently active thread, and is not returned by any other thread. If the thread is currently not running then the default thread::id object is returned by the std::thread object's get_id member.

- void yield() noexcept:
  when a thread calls this_thread::yield() the current thread is briefly suspended, allowing other (waiting) threads to start.

- void sleep_for(chrono::duration<Rep, Period> const &relTime) noexcept:
  when a thread calls this_thread::sleep_for(...) it is suspended for the amount of time that's specified in its argument. E.g.,

      std::this_thread::sleep_for(std::chrono::seconds(5));
- void sleep_until(chrono::time_point<Clock, Duration> const &absTime) noexcept:
  suspends the current thread until the specified point in time absTime has been reached, returning immediately if absTime is in the past. The next example has the same effect as the previous example:

      // assume using namespace std
      this_thread::sleep_until(chrono::system_clock().now() + chrono::seconds(5));
Conversely, the sleep_until call in the next example immediately returns:
    this_thread::sleep_until(chrono::system_clock().now() - chrono::seconds(5));
Threads are implemented by the class std::thread. Each object of this class handles a separate thread. Before using thread objects the <thread> header file must be included.
Thread objects can be constructed in various ways:
- thread() noexcept:
  the default constructor constructs a thread object. As it receives no function to execute, it does not start a separate thread of execution. It is used, e.g., as a data member of a class, allowing class objects to start a separate thread at some later point in time;

- thread(thread &&tmp) noexcept:
  the move constructor takes over the thread controlled by tmp, while tmp, if it runs a thread, loses control over its thread. Following this, tmp is in its default state, and the newly created thread is responsible for calling, e.g., join.

- explicit thread(Fun &&fun, Args &&...args):
  this constructor starts a separate thread executing fun, which is called with the arguments that were specified as the thread's constructor's arguments immediately following its first (function) argument. Additional arguments are passed with their proper types and values to fun. Following the thread object's construction, a separately running thread of execution is started.
  The notation Args &&...args indicates that any additional arguments are passed as is to the function. The types of the arguments that are passed to the thread constructor and that are expected by the called function must match: values must be values, references must be references, r-value references must be r-value references (or move construction must be supported). The following example illustrates this requirement:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: using namespace std;
     5: 
     6: struct NoMove
     7: {
     8:     NoMove() = default;
     9:     NoMove(NoMove &&tmp) = delete;
    10: };
    11: 
    12: struct MoveOK
    13: {
    14:     int d_value = 10;
    15: 
    16:     MoveOK() = default;
    17:     MoveOK(MoveOK const &) = default;
    18: 
    19:     MoveOK(MoveOK &&tmp)
    20:     {
    21:         d_value = 0;
    22:         cout << "MoveOK move cons.\n";
    23:     }
    24: };
    25: 
    26: void valueArg(int value)
    27: {}
    28: void refArg(int &ref)
    29: {}
    30: void r_refArg(int &&tmp)
    31: {
    32:     tmp = 100;
    33: }
    34: void r_refNoMove(NoMove &&tmp)
    35: {}
    36: void r_refMoveOK(MoveOK &&tmp)
    37: {}
    38: 
    39: int main()
    40: {
    41:     int value = 0;
    42: 
    43:     std::thread(valueArg, value).join();
    44:     std::thread(refArg, ref(value)).join();
    45:     std::thread(r_refArg, move(value)).join();
    46: 
    47:     // std::thread(refArg, value);
    48: 
    49:     std::thread(r_refArg, value).join();
    50:     cout << "value after r_refArg: " << value << '\n';
    51: 
    52:     // std::thread(r_refNoMove, NoMove());
    53: 
    54:     NoMove noMove;
    55:     // std::thread(r_refNoMove, noMove).join();
    56: 
    57:     MoveOK moveOK;
    58:     std::thread(r_refMoveOK, moveOK).join();
    59:     cout << moveOK.d_value << '\n';
    60: }

- lines 43 through 45 show how arguments can be passed to std::thread: with the functions running the threads expecting matching argument types.
- line 47 won't compile, as the argument cannot be bound to the int reference expected by refArg. Note that this problem was solved in line 44 by using the std::ref function.
- lines 49 and 58 show that int values and class-types supporting move operations can be passed as values to functions expecting r-value references.
In this case notice that the functions expecting the r-value references do not access the provided arguments (except for the actions performed by their move constructors), but use move construction to create temporary values or objects on which the functions operate.

Lines 52 and 55 won't compile, as the NoMove struct doesn't offer a move constructor.

Member functions can also be executed in separate threads. In that case the address of the member function is passed to the thread constructor, followed by an object (or a reference or pointer to an object) of the member function's class, and optionally followed by the member function's arguments:

    struct Demo
    {
        int d_value = 0;

        void fun(int value)
        {
            d_value = value;
            cout << "fun sets value to " << value << "\n";
        }
    };

    int main()
    {
        Demo demo;
        thread thr{ &Demo::fun, ref(demo), 12 };
        thr.join();
        cout << "demo's value: " << demo.d_value << '\n';   // 12

        thr = thread{ &Demo::fun, &demo, 42 };
        thr.join();
        cout << "demo's value: " << demo.d_value << '\n';   // 42

        thr = thread{ &Demo::fun, demo, 77 };
        thr.join();
        cout << "demo's value: " << demo.d_value << '\n';   // 42: the thread
                                                            // copied demo
    }

Be careful when passing local variables as arguments to thread objects: if the thread continues to run when the function whose local variables are used terminates, then the thread suddenly uses wild pointers or wild references, as the local variables no longer exist. To prevent this from happening (illustrated by the next example) do as follows:
- pass the local variables by value to the thread constructor, or
- call join on the thread object to ensure that the thread has finished within the local variable's lifetime.

     1: #include <iostream>
     2: #include <thread>
     3: #include <string>
     4: #include <chrono>
     5: 
     6: void threadFun(std::string const &text)
     7: {
     8:     for (size_t iter = 1; iter != 6; ++iter)
     9:     {
    10:         std::cout << text << '\n';
    11:         std::this_thread::sleep_for(std::chrono::seconds(1));
    12:     }
    13: }
    14: 
    15: std::thread safeLocal()
    16: {
    17:     std::string text = "hello world";
    18:     return std::thread(threadFun, std::string{ text });
    19: }
    20: 
    21: int main()
    22: {
    23:     std::thread local(safeLocal());
    24:     local.join();
    25:     std::cout << "safeLocal has ended\n";
    26: }

In line 18 be sure not to use std::ref(text) instead of std::string{ text }. If the thread cannot be created a std::system_error exception is thrown.
Since this constructor not only accepts functions but also function objects as its first argument, a local context may be passed to the function object's constructor. Here is an example of a thread receiving a function object using a local context:
    #include <iostream>
    #include <thread>
    #include <array>

    using namespace std;

    class Functor
    {
        array<int, 30> &d_data;
        int d_value;

        public:
            Functor(array<int, 30> &data, int value)
            :
                d_data(data),
                d_value(value)
            {}

            void operator()(ostream &out)
            {
                for (auto &value: d_data)
                {
                    value = d_value++;
                    out << value << ' ';
                }
                out << '\n';
            }
    };

    int main()
    {
        array<int, 30> data;
        Functor functor{ data, 5 };

        thread funThread{ functor, ref(cout) };
        funThread.join();
    }

The class std::thread does not provide a copy constructor.

The following members are available:
- thread &operator=(thread &&tmp) noexcept:
  if the operator's left-hand side operand (lhs) is a joinable thread, then terminate is called. Otherwise, tmp is assigned to the operator's lhs and tmp's state is changed to the thread's default state (i.e., thread()).
- void detach():
  this member can only be called for threads for which joinable (see below) returns true. The thread for which detach is called continues to run. The (e.g., parent) thread calling detach continues immediately beyond the detach-call. After calling object.detach(), `object' no longer represents the (possibly still continuing but now detached) thread of execution. It is the detached thread's implementation's responsibility to release its resources when its execution ends.
  Since detach disconnects a thread from the running program, e.g., main no longer can wait for the thread's completion. As a program ends when main ends, its still running detached threads also stop, and a program may not properly finish all its threads, as demonstrated by the following example:
    #include <thread>
    #include <iostream>
    #include <chrono>

    void fun(size_t count, char const *txt)
    {
        for (; count--; )
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
            std::cout << count << ": " << txt << std::endl;
        }
    }

    int main()
    {
        std::thread first(fun, 5, "hello world");
        first.detach();

        std::thread second(fun, 5, "a second thread");
        second.detach();

        std::this_thread::sleep_for(std::chrono::milliseconds(400));
        std::cout << "leaving" << std::endl;
    }

A detached thread may very well continue to run after the function that launched it has finished. Here, too, you should be very careful not to pass local variables to the detached thread, as their references or pointers will be undefined once the function defining the local variables terminates:
    #include <iostream>
    #include <thread>
    #include <chrono>

    using namespace std;
    using namespace chrono;

    void add(int const &p1, int const &p2)
    {
        this_thread::sleep_for(milliseconds(200));
        cerr << p1 << " + " << p2 << " = " << (p1 + p2) << '\n';
    }

    void run()
    {
        int v1 = 10;
        int v2 = 20;
    //  thread(add, ref(v1), ref(v2)).detach();     // DON'T DO THIS
        thread(add, int(v1), int(v2)).detach();     // this is OK: own copies
    }

    int main()
    {
        run();
        this_thread::sleep_for(seconds(1));
    }

- id get_id() const noexcept:
  if the thread object does not represent a running thread then the default thread::id() is returned. Otherwise, the thread's unique ID (also obtainable from within the thread via this_thread::get_id()) is returned.

- unsigned thread::hardware_concurrency() noexcept:
  this static member returns the number of threads that can run concurrently on the current hardware (usually the number of cores), or 0 if that number cannot be determined.

- void join():
  this member can only be called for threads for which joinable (see below) returns true. If the thread for which join is called hasn't finished yet then the thread calling join will be suspended (also called blocked) until the thread for which join is called has completed. Following its completion the object whose join member was called no longer represents a running thread, and its get_id member will return std::thread::id(). This member was used in several examples shown so far. As noted: when main ends while a joinable thread is still running, terminate is called, aborting the program.
- bool joinable() const noexcept:
  returns the value of the expression object.get_id() != id(), where object is the thread object for which joinable was called.

- native_handle_type native_handle():
  returns the thread's native handle, which can be used with functions like pthread_getschedparam and pthread_setschedparam to get/set the thread's scheduling policy and parameters.

- void swap(thread &other) noexcept:
  the states of the thread object for which swap was called and other are swapped. Note that threads may always be swapped, even when their thread functions are currently being executed.

Things to note:
What looks like the definition of an anonymous thread object, immediately followed by a statement defining and joining another anonymous thread object, doesn't behave as expected. E.g.,

    void doSomething();

    int main()
    {
        thread(doSomething);            // nothing happens??
        thread(doSomething).join();     // doSomething is executed??
    }

This is similar to the situation we encountered in section 7.5: the first statement doesn't define an anonymous thread object at all. It simply defines the thread object doSomething. Consequently, compilation of the second statement fails, as there is no thread(thread &) constructor. When the first statement is omitted, the doSomething function is executed by the second statement. If the second statement is omitted, a default constructed thread object by the name of doSomething is defined.
In a statement like

    thread object(thread(doSomething));
the move constructor is used to transfer control from an anonymous thread executing doSomething to the thread object. Only after object's construction has completed is doSomething started in the separate thread.
A value returned by a function running as a separate thread can be obtained by using a packaged_task and a future (cf., respectively, sections 20.11 and 20.8).

A thread ends when the function executing a thread finishes. When a thread object is destroyed while its thread function is still running, terminate is called, aborting the program. Bad news: the destructors of existing objects aren't called and exceptions that are thrown are left uncaught. This happens in the following program as the thread is still active when main ends:
    #include <iostream>
    #include <thread>

    void hello()
    {
        while (true)
            std::cout << "hello world!\n";
    }

    int main()
    {
        std::thread hi(hello);
    }

There are several ways to solve this problem. One of them is discussed in the next section.

Threads may also use data that are global within each individual thread, but that are not shared between threads. The thread_local keyword provides this intermediate data level. Global variables declared as thread_local are global within each individual thread. Each thread owns a copy of the thread_local variables, and may modify them at will. A thread_local variable in one thread is completely separated from that variable in another thread. Here is an example:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: using namespace std;
     5: 
     6: thread_local int t_value = 100;
     7: 
     8: void modify(char const *label, int newValue)
     9: {
    10:     cout << label << " before: " << t_value << ". Address: " <<
    11:                                                 &t_value << '\n';
    12:     t_value = newValue;
    13:     cout << label << " after: " << t_value << '\n';
    14: }
    15: 
    16: int main()
    17: {
    18:     thread(modify, "first", 50).join();
    19:     thread(modify, "second", 20).join();
    20:     modify("main", 0);
    21: }

- in line 6 the thread_local variable t_value is defined. It is initialized to 100, and that becomes the initial value for each separately running thread;
- in lines 8 through 14 the function modify is defined. It assigns a new value to t_value;
- in lines 18 through 20 three threads (the last one being the program's own main thread) call modify. Each thread first shows its own t_value being 100, and then modifies it without affecting the values of t_value used by other threads.

Note that, although the t_value variables are unique to each thread, identical addresses may be shown for them. Since each thread uses its own stack, these variables may occupy the same relative locations within their respective stacks, giving the illusion that their physical addresses are identical.
Consider a function parent starting a child thread, doing some work itself, and then joining the child thread:

    void childActions();
    void doSomeWork();

    void parent()
    {
        thread child(childActions);
        doSomeWork();
        child.join();
    }

However, maybe doSomeWork can't complete its work, and throws an exception, to be caught outside of parent. This, unfortunately, ends parent, and child.join() is missed. Consequently, the program aborts because of a thread that hasn't been joined.
Clearly, all exceptions must be caught, join must be called, and the exception must be rethrown. But parent cannot use a function try-block, as the thread object is already out of scope once execution reaches the matching catch-clause. So we get:
    void childActions();
    void doSomeWork();

    void parent()
    {
        thread child(childActions);
        try
        {
            doSomeWork();
            child.join();
        }
        catch (...)
        {
            child.join();
            throw;
        }
    }

This is ugly: suddenly the function's code is clobbered with a try-catch clause, as well as some unwelcome code-duplication.
This situation can be avoided using object based programming. Like, e.g., unique pointers, which use their destructors to encapsulate the destruction of dynamically allocated memory, we can use a comparable technique to encapsulate thread joining in an object's destructor.
By defining the thread object inside a class we're sure that by the time our object goes out of scope, even if the childActions function throws an exception, the thread's join member is called. Here are the bare essentials of our JoinGuard class, providing the join-guarantee (using in-line member implementations for brevity):
     1: #include <thread>
     2: 
     3: class JoinGuard
     4: {
     5:     std::thread d_thread;
     6: 
     7:     public:
     8:         JoinGuard(std::thread &&threadObj)
     9:         :
    10:             d_thread(std::move(threadObj))
    11:         {}
    12:         ~JoinGuard()
    13:         {
    14:             if (d_thread.joinable())
    15:                 d_thread.join();
    16:         }
    17: };

- JoinGuard's constructor (line 8) expects a thread object, which is moved, in line 10, to JoinGuard's d_thread data member.
- When the JoinGuard object ceases to exist, its destructor (line 12) makes sure the thread is joined if it's still joinable (lines 14 and 15).

Here is an example showing how JoinGuard could be used:

     1: #include <iostream>
     2: #include "joinguard.h"
     3: 
     4: void childActions();
     5: 
     6: void doSomeWork()
     7: {
     8:     throw std::runtime_error("doSomeWork throws");
     9: }
    10: 
    11: void parent()
    12: {
    13:     JoinGuard{std::thread{childActions}};
    14:     doSomeWork();
    15: }
    16: 
    17: int main()
    18: try
    19: {
    20:     parent();
    21: }
    22: catch (std::exception const &exc)
    23: {
    24:     std::cout << exc.what() << '\n';
    25: }

- in line 4 childActions is declared. Its implementation (not provided here) defines the child thread's actions.
- the main function (lines 17 through 25) provides the function try-block to catch the exception thrown by parent;
- the parent function defines (line 13) an anonymous JoinGuard, receiving an anonymous thread object. Anonymous objects are used, as the parent function doesn't need to access them anymore.
- in line 14 doSomeWork is called, which throws an exception. This ends parent, but just before that JoinGuard's destructor makes sure that the child-thread has been joined.

As an alternative to std::thread the class std::jthread can be used. Before using jthread objects the <thread> header file must be included.
Objects of the class jthread act like thread objects, but when a jthread object is destroyed it automatically joins its thread. Moreover, in some situations jthread threads can directly be ended.
Once a jthread object receiving a function defining the thread's actions has been constructed that function immediately starts as a separate thread. If that function ends by returning a value then that value is ignored. If the function throws an exception the program ends by calling std::terminate. Alternatively, if the function should communicate a return value or an exception to, e.g., the function starting the jthread, a std::promise (cf. section 20.12) can be used, or it can modify variables which are shared with other threads (see also sections 20.2 and 20.5).
The class jthread offers these constructors:
- jthread() noexcept:
  the default constructor constructs a jthread object that doesn't start a thread. It could be used as a data member of a class, allowing class objects to start the jthread at some later point in time;

- explicit jthread(Function &&function, Args &&...args):
  starts a thread executing function. The function receives as its first argument the return value of jthread's member get_stop_token (see below), followed by the args parameters (if present). If function's first argument is not a std::stop_token then function is called merely receiving the args parameter values as its arguments. Arguments are passed to function with their proper types and values (see the example shown below at the description of the jthread member request_stop).

The class jthread supports move construction and move assignment, but does not offer copy construction and copy assignment.

The following members are available and operate like the identically named std::thread members. Refer to section 20.1.2 for their descriptions:
- void detach();
- id get_id() const noexcept;
- unsigned hardware_concurrency() noexcept;
- void join();
- bool joinable() const noexcept;
- native_handle_type native_handle();
- void swap(jthread &other) noexcept.

The following members are specific to jthread, allowing other threads to end the thread started by jthread:
- std::stop_source get_stop_source() noexcept:
  returns the jthread's std::stop_source;

- std::stop_token get_stop_token() const noexcept:
  returns the jthread's std::stop_token;

- bool request_stop() noexcept:
  requests the end of the thread started by the jthread object. The function operates atomically: it can be called from multiple threads without causing race conditions. It returns true if the stop request was successfully issued. It returns false if a stop request has already been issued, which may also happen if request_stop was issued by different threads, and another thread is still in the process of ending jthread's thread.
  When a stop request is issued by request_stop then std::stop_callback functions (see the next section) that were registered for the thread's stop state are synchronously called. If those callback functions throw exceptions then std::terminate is called. Also, any waiting condition variables that are associated with the jthread's stop state end their waiting states.

Here is a short program illustrating request_stop:
     1: #include <iostream>
     2: #include <thread>
     3: #include <chrono>
     4: using namespace std;
     5: 
     6: void fun(std::stop_token stop)
     7: {
     8:     while (not stop.stop_requested())
     9:     {
    10:         cout << "next\n";
    11:         this_thread::sleep_for(1s);
    12:     }
    13: }
    14: 
    15: int main()
    16: {
    17:     jthread thr(fun);
    18: 
    19:     this_thread::sleep_for(3s);
    20: 
    21:     thr.request_stop();
    22: 
    23:     // thr.join() not required.
    24: }

- in line 17 the jthread thread starts, receiving function fun as its argument;
- since fun defines a std::stop_token parameter, jthread passes it the thread's stop token when starting the function. Fun performs (line 8) a while loop that continues until stop's stop_requested returns true. The loop itself shows a brief output line (line 10) followed by a one-second sleep (line 11);
- the main function, having started the thread, sleeps for three seconds (line 19), and then (line 21) issues a stop-request, ending the thread.

When running the program three lines containing next are displayed.
Before using std::stop_callback objects the <stop_token> header file must be included.

In addition to merely ending thread functions via jthread's request_stop member function it's also possible to associate callback functions with request_stop, which are executed when request_stop is called. In situations where callback functions are registered when the thread function has already been stopped the callback functions are immediately called when they are being registered (registering callback functions is covered below).
Note that multiple callback functions can be registered. However, the order in which these callback functions are run once the thread is stopped is not defined. Moreover, exceptions must not leave callback functions, or the program ends by calling std::terminate.
Callback functions are registered by objects of the class std::stop_callback. The class stop_callback offers the following constructors:
- explicit stop_callback(std::stop_token const &st, Function &&cb) noexcept;
- explicit stop_callback(std::stop_token &&st, Function &&cb) noexcept;

Notes:
- Function can be the name of a (void) function without parameters or it can be an (anonymous or existing) object offering a parameter-less (void) function call operator. The functions do not necessarily have to be void functions, but their return values are ignored;
- noexcept is only used if Function is also declared as noexcept (if Function is the name of a functor-class then noexcept is used if its constructor is declared with noexcept);
- the class stop_callback does not offer copy/move construction and assignment.

Here is the example used in the previous section, this time defining a callback function. When running this program its output is
    next
    next
    next
    stopFun called via stop_callback
     1: void fun(std::stop_token stop)
     2: {
     3:     while (not stop.stop_requested())
     4:     {
     5:         cout << "next\n";
     6:         this_thread::sleep_for(1s);
     7:     }
     8: }
     9: 
    10: void stopFun()
    11: {
    12:     cout << "stopFun called via stop_callback\n";
    13: }
    14: 
    15: int main()
    16: {
    17:     jthread thr(fun);
    18: 
    19:     stop_callback sc{ thr.get_stop_token(), stopFun };
    20: 
    21:     this_thread::sleep_for(3s);
    22: 
    23:     thr.request_stop();
    24:     thr.join();
    25: }

The function fun is identical to the one shown in the previous section, but main defines (line 19) the stop_callback object sc, passing it thr's get_stop_token's return value and the address of the function stopFun, defined in lines 10 through 13. In this case once request_stop is called (line 23) the callback function stopFun is called as well.
Before using mutexes the <mutex> header file must be included.
One of the key characteristics of multi-threaded programs is that threads mayshare data. Functions running as separate threads have access to all globaldata, and may also share the local data of their parent threads. However,unless proper measures are taken, this may easily result in data corruption,as illustrated by the following simulation of some steps that could beencountered in a multi-threaded program:
    ---------------------------------------------------------------------------
    Time step:  Thread 1:       var     Thread 2:       description
    ---------------------------------------------------------------------------
        0                        5
        1       starts                                  T1 active
        2       writes var                              T1 commences writing
        3       stopped                                 Context switch
        4                                starts         T2 active
        5                                writes var     T2 commences writing
        6                       10       assigns 10     T2 writes 10
        7                                stopped        Context switch
        8       assigns 12                              T1 writes 12
        9                       12
    ---------------------------------------------------------------------------
In this example, threads 1 and 2 share variable var, initially having the value 5. At step 1 thread 1 starts, and commences writing a value into var. However, it is interrupted by a context switch, and thread 2 is started (step 4). Thread 2 also wants to write a value into var, and succeeds until time step 7, when another context switch takes place. By now var is 10. However, thread 1 was also in the process of writing a value into var, and it is given a chance to complete its work: it assigns 12 to var in time step 8. Once time step 9 is reached, thread 2 proceeds on the (erroneous) assumption that var must be equal to 10. Clearly, from the point of view of thread 2 its data have been corrupted.
In this case data corruption was caused by multiple threads accessing the samedata in an uncontrolled way. To prevent this from happening, access to shareddata should be protected in such a way that only one thread at a time mayaccess the shared data.
Mutexes are used to prevent the abovementioned kinds of problems byoffering a guarantee that data are only accessed by the thread that could lockthe mutex that is used to synchronize access to those data.
Exclusive data access completely depends on cooperation between thethreads. If thread 1 uses mutexes, but thread 2 doesn't, then thread 2 mayfreely access the common data. Of course that's bad practice, which should beavoided.
It is stressed that although using mutexes is the programmer's responsibility, their implementation isn't: mutexes offer the necessary atomic calls. When requesting a mutex-lock the thread is blocked (i.e., the mutex statement does not return) until the lock has been obtained by the requesting thread.
Apart from the class std::mutex the class std::recursive_mutex is available. When a recursive_mutex is locked multiple times by the same thread it increases its lock-count. Before other threads may access the protected data the recursive mutex must be unlocked again that number of times. Moreover, the classes std::timed_mutex and std::recursive_timed_mutex are available. Their locks expire when released, but also after a certain amount of time.
The members of the mutex classes perform atomic actions: no context switch occurs while they are active. So when two threads are trying to lock a mutex only one can succeed. In the above example: if both threads had used a mutex to control access to var, thread 1 would not have been able to assign 12 to var while thread 2 was assuming that its value was 10. We could even have two threads running purely parallel (e.g., on two separate cores). E.g.:
    -------------------------------------------------------------------------
    Time step:  Thread 1:       Thread 2:       description
    -------------------------------------------------------------------------
        1       starts          starts          T1 and T2 active
        2       locks           locks           Both threads try to lock the
                                                mutex
        3       blocks...       obtains lock    T2 obtains the lock, and T1
                                                must wait
        4       (blocked)       processes var   T2 processes var, T1 still
                                                blocked
        5       obtains lock    releases lock   T2 releases the lock, and T1
                                                immediately obtains the lock
        6       processes var                   now T1 processes var
        7       releases lock                   T1 also releases the lock
    -------------------------------------------------------------------------
Although mutexes can directly be used in programs, this rarely happens. It is more common to embed mutex handling in locking classes that make sure that the mutex is automatically unlocked again when the mutex lock is no longer needed. Therefore, this section merely offers an overview of the interfaces of the mutex classes. Examples of their use will be given in the upcoming sections (e.g., section 20.3).
All mutex classes offer the following constructors and members:
- mutex() constexpr:
  the default constructor. This constexpr constructor is the only available constructor;

- ~mutex():
  the destructor. Before destruction a mutex that is still locked must first be unlocked by its unlock member;

- void lock():
  the calling thread blocks until it owns the mutex. Unless lock is called for a recursive mutex a system_error is thrown if the thread already owns the lock. Recursive mutexes increment their internal lock count;

- bool try_lock() noexcept:
  tries to obtain ownership of the mutex. If ownership is obtained true is returned, otherwise false. If the calling thread already owns the lock true is also returned, and in this case a recursive mutex also increments its internal lock count;

- void unlock() noexcept:
  releases ownership of the mutex. A system_error is thrown if the thread does not own the lock. A recursive mutex decrements its internal lock count, releasing ownership of the mutex once the lock count has decayed to zero.

The timed-mutex classes (timed_mutex, recursive_timed_mutex) also offer these members:
- bool try_lock_for(chrono::duration<Rep, Period> const &relTime) noexcept: tries to obtain ownership of the mutex within the relative time interval relTime. If ownership was obtained true is returned, otherwise false. If the calling thread already owns the lock true is also returned, and in this case a recursive timed mutex also increments its internal lock count. The Rep and Period types are inferred from the actual relTime argument. E.g.,

        std::timed_mutex timedMutex;
        timedMutex.try_lock_for(chrono::seconds(5));
- bool try_lock_until(chrono::time_point<Clock, Duration> const &absTime) noexcept: tries to obtain ownership of the mutex until the absolute point in time absTime has passed. If ownership is obtained true is returned, otherwise false. If the calling thread already owns the lock true is also returned, and in this case a recursive timed mutex also increments its internal lock count. The Clock and Duration types are inferred from the actual absTime argument. E.g.,

        std::timed_mutex timedMutex;
        timedMutex.try_lock_until(chrono::system_clock::now() +
                                  chrono::seconds(5));
Before using std::once_flag and the std::call_once function, introduced in this section, the <mutex> header file must be included. In single threaded programs the initialization of global data does not necessarily occur at one point in the code. An example is the initialization of the object of a singleton class (cf. Gamma et al. (1995): Design Patterns, Addison-Wesley). Singleton classes may define a single static pointer data member Singleton *s_object, pointing to the singleton's object, and may offer a static member instance, implemented something like this:
    Singleton &Singleton::instance()
    {
        return s_object ? *s_object : *(s_object = new Singleton);
    }

With multi-threaded programs this approach immediately gets complex. For example, if two threads call instance at the same time, while s_object still equals 0, then both may call new Singleton, resulting in one dynamically allocated Singleton object becoming unreachable. Other threads, calling instance after s_object was initialized for the first time, may either return a reference to that object, or may return a reference to the object initialized by the second thread. Not exactly the expected behavior of a singleton.
Mutexes (cf. section 20.2) can be used to solve these kinds of problems, but they result in some overhead and inefficiency, as the mutex must be inspected at each call of Singleton::instance.
When variables must dynamically be initialized, and the initialization should take place only once, the std::once_flag type and the std::call_once function should be used.
Thecall_once function expects two or three arguments:
- its first argument is a once_flag variable, keeping track of the actual initialization status. The call_once function simply returns if the once_flag indicates that initialization already took place;
- its second argument is a callable object (e.g., a function or a lambda expression) performing the initialization;
- an argument needed by the callable object may be passed as call_once's third argument.

Using call_once the singleton's instance function can now easily be designed (using in-class implementations for brevity):

    class Singleton
    {
        static std::once_flag s_once;
        static Singleton *s_singleton;
        ...
        public:
            static Singleton *instance()
            {
                std::call_once(s_once, []{ s_singleton = new Singleton; });
                return s_singleton;
            }
        ...
    };

However, there are additional ways to initialize data, even for multi-threaded programs:
- a constructor may be declared using the constexpr keyword (cf. section 8.1.4.1), satisfying the requirements for constant initialization. In this case a static object, initialized using that constructor, is guaranteed to be initialized before any code is run, as part of the static initialization phase. This is used by std::mutex, as it eliminates the possibility of race conditions when global mutexes are initialized;
- a static local variable is initialized when its definition is first reached, as illustrated by the following program:

        #include <iostream>

        struct Cons
        {
            Cons()
            {
                std::cout << "Cons called\n";
            }
        };

        void called(char const *time)
        {
            std::cout << time << " time called() activated\n";
            static Cons cons;
        }

        int main()
        {
            std::cout << "Pre-1\n";
            called("first");
            called("second");
            std::cout << "Pre-2\n";
            Cons cons;
        }
        /*
            Displays:
                Pre-1
                first time called() activated
                Cons called
                second time called() activated
                Pre-2
                Cons called
        */

  This feature causes a thread to wait automatically if another thread is still initializing the static data (note that non-static data never cause problems, as non-static local variables only exist within their own thread of execution).

Shared mutexes (std::shared_mutex) are available after including the <shared_mutex> header file. Shared mutex types behave like timed_mutex types and additionally have the characteristics described below. The class shared_mutex provides a non-recursive mutex with shared ownership semantics, comparable to, e.g., the shared_ptr type. The behavior of a program using shared_mutexes is undefined if:
- a thread destroys a shared_mutex that is still owned by any thread;
- a thread terminates while owning a shared_mutex.

Shared mutex types provide a shared lock ownership mode. Multiple threads can simultaneously hold a shared lock on a shared_mutex type of object. But no thread can hold a shared lock while another thread holds an exclusive lock on the same shared_mutex object, and vice versa.
Shared mutexes are useful in situations where multiple threads (consumers)want to access information for reading: the consumers don't want to change thedata, but merely want to retrieve them. At some point another thread (theproducer) wants to modify the data. At that point the producer requestsexclusive access to the data, and is forced to wait until all consumers havereleased their locks. While the producer waits for the exclusive lock, newconsumers' requests for shared locks remain pending until the producer hasreleased the exclusive lock. Thus, reading is possible for many threads, butfor writing the exclusive lock guarantees that no other threads can access thedata.
The typeshared_mutex offers the following members providing shared lockownership. To obtain exclusive ownership omit the_shared from thefollowing member functions:
- void lock_shared(): blocks until shared ownership of the mutex has been obtained;
- void unlock_shared(): releases the calling thread's shared ownership of the mutex;
- bool try_lock_shared(): tries to obtain shared ownership without blocking: try_lock_shared immediately returns. Returns true if the shared ownership lock was acquired, false otherwise. An implementation may fail to obtain the lock even if it is not held by any other thread. Initially the calling thread may not yet own the mutex;
- bool try_lock_shared_for(rel_time): tries to obtain shared ownership within the relative time interval rel_time. If the time specified by rel_time is less than or equal to rel_time.zero(), the member attempts to obtain ownership without blocking (as if by calling try_lock_shared()). The member shall return within the time interval specified by rel_time only if it has obtained shared ownership of the mutex object. Returns true if the shared ownership lock was acquired, false otherwise. Initially the calling thread may not yet own the mutex;
- bool try_lock_shared_until(abs_time): tries to obtain shared ownership until the absolute point in time abs_time has passed. If the time specified by abs_time has already passed then the member attempts to obtain ownership without blocking (as if by calling try_lock_shared()). Returns true if the shared ownership lock was acquired, false otherwise. Initially the calling thread may not yet own the mutex.

Before the classes described in this section can be used the <mutex> header file must be included. Whenever threads share data, and at least one of the threads may change common data, mutexes should be used to prevent threads from simultaneously accessing the same data.
Usually locks are released at the end of action blocks. This requires explicit calls to the mutexes' unlock function, which introduces problems comparable to those we've seen with the thread's join member.
To simplify locking and unlocking two mutex wrapper classes are available:
- std::lock_guard: objects of this class template own a mutex during their lifetimes, calling unlock of the mutexes they control when they go out of scope;
- std::unique_lock: a class template offering a more elaborate interface than lock_guard.

The class lock_guard offers a limited, but useful interface:
- lock_guard<Mutex>(Mutex &mutex): when defining a lock_guard object the mutex type (e.g., std::mutex, std::timed_mutex, std::shared_mutex) is specified, and a mutex of the indicated type is provided as its argument. The construction blocks until the lock_guard object owns the lock. The lock_guard's destructor automatically releases the mutex lock;
- lock_guard<Mutex>(Mutex &mutex, std::adopt_lock_t): this constructor immediately transfers control over the mutex's lock to the lock_guard. The mutex lock is released again by the lock_guard's destructor. At construction time the mutex must already be owned by the calling thread. Here is an illustration of how it can be used:

        void threadAction(std::mutex &mut, int &sharedInt)
        {
            std::lock_guard<std::mutex> lg{ mut, std::adopt_lock_t() };
            // do something with sharedInt
        }

  The function threadAction receives a reference to a mutex. Assume the calling thread already owns the mutex's lock; control over the lock is then transferred to the lock_guard. Even though we don't explicitly use the lock_guard object, an object should be defined to prevent the compiler from destroying an anonymous object before the function ends; the lock is released by the lock_guard's destructor;
- mutex_type: lock_guard<Mutex> types also define the type mutex_type: it is a synonym of the Mutex type that is passed to the lock_guard's constructor.

Here is a simple example of a multi-threaded program using lock_guards to prevent information inserted into cout from getting mixed:
    bool oneLine(istream &in, mutex &mut, int nr)
    {
        lock_guard<mutex> lg(mut);

        string line;
        if (not getline(in, line))
            return false;

        cout << nr << ": " << line << endl;
        return true;
    }

    void io(istream &in, mutex &mut, int nr)
    {
        while (oneLine(in, mut, nr))
            this_thread::yield();
    }

    int main(int argc, char **argv)
    {
        ifstream in(argv[1]);
        mutex ioMutex;

        thread t1(io, ref(in), ref(ioMutex), 1);
        thread t2(io, ref(in), ref(ioMutex), 2);
        thread t3(io, ref(in), ref(ioMutex), 3);

        t1.join();
        t2.join();
        t3.join();
    }

As with lock_guard, a mutex-type must be specified when defining objects of the class std::unique_lock. The class unique_lock is much more elaborate than the basic lock_guard class template. Its interface does not define a copy constructor or overloaded assignment operator, but it does define a move constructor and a move assignment operator. In the following overview of unique_lock's interface Mutex refers to the mutex-type that is specified when defining a unique_lock:
- unique_lock() noexcept: the default constructor does not refer to a mutex object. It must be assigned a mutex (e.g., using move-assignment) before it can do anything useful;
- explicit unique_lock(Mutex &mutex): this constructor associates the unique_lock with an existing Mutex object, and calls mutex.lock();
- unique_lock(Mutex &mutex, defer_lock_t) noexcept: this constructor associates the unique_lock with an existing Mutex object, but does not call mutex.lock(). Call it by passing a defer_lock_t object as the constructor's second argument, e.g.,

        unique_lock<mutex> ul(mutexObj, defer_lock_t());
- unique_lock(Mutex &mutex, try_to_lock_t) noexcept: this constructor associates the unique_lock with an existing Mutex object, and calls mutex.try_lock(): the constructor won't block if the mutex cannot be locked;
- unique_lock(Mutex &mutex, adopt_lock_t) noexcept: this constructor associates the unique_lock with an existing Mutex object, and assumes that the current thread has already locked the mutex;
- unique_lock(Mutex &mutex, chrono::duration<Rep, Period> const &relTime) noexcept: this constructor tries to lock the Mutex object by calling mutex.try_lock_for(relTime). The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex). It could be called like this:

        std::unique_lock<std::timed_mutex> ulock(timedMutex,
                                                 std::chrono::seconds(5));
- unique_lock(Mutex &mutex, chrono::time_point<Clock, Duration> const &absTime) noexcept: this constructor tries to lock the Mutex object by calling mutex.try_lock_until(absTime). The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex). This constructor could be called like this:

        std::unique_lock<std::timed_mutex> ulock(
                timedMutex,
                std::chrono::system_clock::now() + std::chrono::seconds(5)
        );
- void lock(): blocks until the lock of the mutex managed by the unique_lock is obtained. If no mutex is currently managed, then a system_error exception is thrown;
- Mutex *mutex() const noexcept: returns a pointer to the mutex object associated with the unique_lock (a nullptr is returned if no mutex object is currently associated with the unique_lock object);
- explicit operator bool() const noexcept: returns true if the unique_lock owns a locked mutex, otherwise false is returned;
- unique_lock &operator=(unique_lock &&tmp) noexcept: the move assignment operator calls the unlock member, whereafter tmp's state is transferred to the left-hand operand;
- bool owns_lock() const noexcept: returns true if the unique_lock owns the mutex, otherwise false is returned;
- Mutex *release() noexcept: returns a pointer to the mutex object associated with the unique_lock object, discarding that association;
- void swap(unique_lock &other) noexcept: swaps the states of the current unique_lock and other;
- bool try_lock(): tries to lock the mutex associated with the unique_lock, returning true if this succeeds, and false otherwise. If no mutex is currently associated with the unique_lock object, then a system_error exception is thrown;
- bool try_lock_for(chrono::duration<Rep, Period> const &relTime): tries to lock the Mutex object managed by the unique_lock object by calling the mutex's try_lock_for(relTime) member. The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex);
- bool try_lock_until(chrono::time_point<Clock, Duration> const &absTime): tries to lock the Mutex object managed by the unique_lock object by calling the mutex's try_lock_until(absTime) member. The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex);
- void unlock(): unlocks the mutex; a system_error exception is thrown if the unique_lock object does not own the mutex.

In addition to the members of the classes std::lock_guard and std::unique_lock the functions std::lock and std::try_lock are available. These functions can be used to prevent deadlocks, the topic of the next section.
A pitfall when using lock_guards is defining one as an anonymous object:

    void Class::notLocked()
    {
        lock_guard<mutex>{ d_mutex };
        // using data available to multiple threads
    }

In cases like these, since the lock_guard is defined as an anonymous object, it's immediately destroyed after its construction, offering no guard against multiple threads using the shared data. Traditionally this situation is solved by explicitly defining a named object. The object's name is irrelevant, because it's used nowhere else, resulting in constructions like
    void Class::lockedOK()
    {
        lock_guard<mutex> guard{ d_mutex };
        // using data available to multiple threads
    }

But in this context the name is irrelevant and not used anywhere else in the function. Since the C++26 standard, however, a generalized alternative approach is available. It's called a name-independent declaration: very simple (and broadly applicable), requiring --std=c++26 or beyond, which is supported since g++-14.
The 'name' _ (a single underscore) results in a name-independent declaration. It's definitely not a name we would use for 'common' variables, but starting with the C++26 standard a variable named `_' implies a name-independent declaration. So in lockedOK we can now write:
    void Class::lockedOK()
    {
        lock_guard<mutex> _{ d_mutex };
        // using data available to multiple threads
    }

and we never have to think again about how to name a required (but not otherwise used) variable or object. As long as there's no ambiguity it's even possible to define multiple `_' variables or objects. As long as they're not used by name it's possible to define, e.g.,

    void neverUsed()
    {
        int _{ 12 };
        int _{ 43 };            // a different int because of the initialization
        auto _ = 42;            // 'auto' also works fine
        auto _("hello world");  // a string...
    }

But in practice, use name-independent declarations as illustrated in the above lockedOK function.

C++ offers the std::lock and std::try_lock functions that can be used to help prevent deadlock situations. Before these functions can be used the <mutex> header file must be included.
In the following overviewL1 &l1, ... represents one or morereferences to objects of lockable types:
- void std::lock(L1 &l1, ...): locks all li objects. If a lock could not be obtained for at least one of the objects, then all locks obtained so far are released, even if the object for which no lock could be obtained threw an exception;
- int std::try_lock(L1 &l1, ...): tries to lock all li objects by calling their try_lock members. If all locks could be obtained, then -1 is returned. Otherwise the (0-based) index of the first argument which could not be locked is returned, releasing all previously obtained locks.

As an example consider the following little multi-threaded program: the threads use mutexes to obtain unique access to cout and to an int value. However, fun1 first locks cout (line 7), and then value (line 10); fun2 first locks value (line 16) and then cout (line 19). Clearly, if fun1 has locked cout, fun2 can't obtain that lock until fun1 has released it. Unfortunately, fun2 has locked value, and the functions only release their locks when returning. But in order to access the information in value fun1 must obtain a lock on value, which it can't, as fun2 has already locked value: the threads are waiting for each other, and neither thread gives in.
     1: int value;
     2: mutex valueMutex;
     3: mutex coutMutex;
     4: 
     5: void fun1()
     6: {
     7:     lock_guard<mutex> lg1(coutMutex);
     8:     cout << "fun 1 locks cout\n";
     9: 
    10:     lock_guard<mutex> lg2(valueMutex);
    11:     cout << "fun 1 locks value\n";
    12: }
    13: 
    14: void fun2()
    15: {
    16:     lock_guard<mutex> lg1(valueMutex);
    17:     cerr << "fun 2 locks value\n";
    18: 
    19:     lock_guard<mutex> lg2(coutMutex);
    20:     cout << "fun 2 locks cout\n";
    21: }
    22: 
    23: int main()
    24: {
    25:     thread t1(fun1);
    26:     fun2();
    27:     t1.join();
    28: }

A good recipe for avoiding deadlocks is to prevent nested (or multiple) mutex lock calls. But if multiple mutexes must be used, always obtain the locks in the same order. Rather than doing this yourself, std::lock and std::try_lock should be used whenever possible to obtain multiple mutex locks. These functions accept multiple arguments, which must be lockable types like unique_lock, or even a plain mutex. The previous deadlocking program can be modified to call std::lock to lock both mutexes. In this example using one single mutex would also work, but the modified program now looks as similar as possible to the previous program. Note how in lines 10 and 21 a different ordering of the unique_locks' arguments is used: it is not necessary to use an identical argument order when calling std::lock or std::try_lock.
     1: int value;
     2: mutex valueMutex;
     3: mutex coutMutex;
     4: 
     5: void fun1()
     6: {
     7:     unique_lock<mutex> lg1(coutMutex, defer_lock);
     8:     unique_lock<mutex> lg2(valueMutex, defer_lock);
     9: 
    10:     lock(lg1, lg2);
    11: 
    12:     cout << "fun 1 locks cout\n";
    13:     cout << "fun 1 locks value\n";
    14: }
    15: 
    16: void fun2()
    17: {
    18:     unique_lock<mutex> lg1(coutMutex, defer_lock);
    19:     unique_lock<mutex> lg2(valueMutex, defer_lock);
    20: 
    21:     lock(lg2, lg1);
    22: 
    23:     cout << "fun 2 locks cout\n";
    24:     cout << "fun 2 locks value\n";
    25: }
    26: 
    27: int main()
    28: {
    29:     thread t1(fun1);
    30:     thread t2(fun2);
    31:     t1.join();
    32:     t2.join();
    33: }

Shared locks are available through the class std::shared_lock, after including the <shared_mutex> header file. An object of the type std::shared_lock controls the shared ownership of a lockable object within a scope. Shared ownership of the lockable object may be acquired at construction time or thereafter, and once acquired, it may be transferred to another shared_lock object. Objects of type shared_lock cannot be copied, but move construction and assignment is supported.
The behavior of a program is undefined if the contained pointer to a mutex(pm) has a non-zero value and the lockable object pointed to bypm doesnot exist for the entire remaining lifetime of theshared_lockobject. The supplied mutex type must be ashared_mutex or a type havingthe same characteristics.
The typeshared_lock offers the following constructors, destructor andoperators:
- shared_lock() noexcept: the default constructor constructs a shared_lock which is not owned by a thread and for which pm == 0;
- explicit shared_lock(mutex_type &mut): this constructor calls mut.lock_shared(). The calling thread may not already own the lock. Following the construction pm == &mut, and the lock is owned by the current thread;
- shared_lock(mutex_type &mut, defer_lock_t) noexcept: sets pm to &mut, but the calling thread does not own the lock;
- shared_lock(mutex_type &mut, try_to_lock_t): calls mut.try_lock_shared(). The calling thread may not already own the lock. Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared;
- shared_lock(mutex_type &mut, adopt_lock_t): following the construction pm == &mut, and the lock is owned by the current thread;
- shared_lock(mutex_type &mut, chrono::time_point<Clock, Duration> const &abs_time): here Clock and Duration are types specifying a clock and absolute time (cf. section 4.2). It can be called if the calling thread does not already own the mutex. It calls mut.try_lock_shared_until(abs_time). Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_until;
- shared_lock(mutex_type &mut, chrono::duration<Rep, Period> const &rel_time): here Rep and Period are types specifying a relative time (cf. section 4.2). It can be called if the calling thread does not already own the mutex. It calls mut.try_lock_shared_for(rel_time). Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_for;
- shared_lock(shared_lock &&tmp) noexcept: the move constructor transfers the state of tmp to the newly constructed shared_lock. Following the construction tmp.pm == 0 and tmp no longer owns the lock;
- ~shared_lock(): if the current object owns the lock then pm->unlock_shared() is called;
- shared_lock &operator=(shared_lock &&tmp) noexcept: the move assignment operator calls pm->unlock_shared and then transfers the information in tmp to the current shared_lock object.
  Following this, tmp.pm == 0 and tmp no longer owns the lock;
- explicit operator bool() const noexcept: returns whether the shared_lock object owns the lock.

The following members are provided:
- void lock(): calls pm->lock_shared(), after which the current thread owns the shared lock. Exceptions may be thrown from lock_shared, and otherwise if pm == 0 or if the current thread already owns the lock;
- mutex_type *mutex() const noexcept: returns pm;
- mutex_type *release() noexcept: returns the previous value of pm, which equals zero after calling this member. Also, the current object no longer owns the lock;
- void swap(shared_lock &other) noexcept: swaps the states of the current and other shared_lock objects. There is also a free function swap, a function template, swapping two shared_lock<Mutex> objects, where Mutex represents the mutex type for which the shared lock objects were instantiated: void swap(shared_lock<Mutex> &one, shared_lock<Mutex> &two) noexcept;
- bool try_lock(): calls pm->try_lock_shared(), returning this call's return value. Exceptions may be thrown from try_lock_shared, and otherwise if pm == 0 or if the current thread already owns the lock;
- bool try_lock_for(chrono::duration<Rep, Period> const &rel_time): here Rep and Period are types specifying a relative time (cf. section 4.2). It calls pm->try_lock_shared_for(rel_time). Following the call the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_for. Exceptions may be thrown from try_lock_shared_for, and otherwise if pm == 0 or if the current thread already owns the lock;
- bool try_lock_until(chrono::time_point<Clock, Duration> const &abs_time): here Clock and Duration are types specifying a clock and absolute time (cf. section 4.2). It calls pm->try_lock_shared_until(abs_time), returning its return value. Following the call the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_until.
  Exceptions may be thrown from try_lock_shared_until, and otherwise if pm == 0 or if the current thread already owns the lock;
- void unlock(): calls pm->unlock_shared(), releasing the current object's shared ownership of the mutex.

The class scoped_lock can be used to lock multiple mutexes at once, where the scoped_lock ensures that deadlocks are avoided. The scoped_lock also has a default constructor, performing no actions, so it's up to the software engineer to define scoped_lock objects with at least one mutex. Before using scoped_lock objects the <mutex> header file must be included. Adapting the example from section 20.3.2: both functions define a scoped_lock (note that the order in which the mutexes are specified isn't relevant), and deadlocks do not occur:
    int value;
    mutex valueMutex;
    mutex coutMutex;

    void fun1()
    {
        scoped_lock sl{ coutMutex, valueMutex };
        cout << "fun 1 locks cout\n";
        sleep(1);
        cout << "fun 1 locks value\n";
    }

    void fun2()
    {
        scoped_lock sl{ valueMutex, coutMutex };
        cout << "fun 2 locks value\n";
        sleep(1);
        cout << "fun 2 locks cout\n";
    }

    int main()
    {
        thread t1(fun1);
        fun2();
        t1.join();
    }
    // Displays:
    //    fun 2 locks value
    //    fun 2 locks cout
    //    fun 1 locks cout
    //    fun 1 locks value

Thus, instead of using lock_guard objects, scoped_lock objects can be used. It's a matter of taste whether lock_guards or scoped_locks should be preferred when only one mutex is used. Maybe scoped_lock should be preferred, since it always works.
Before condition variables can be used the<condition_variable> headerfile must be included.
To start our discussion, consider a classic producer-consumer scenario: theproducer generates items which are consumed by a consumer. The producer canonly produce a certain number of items before its storage capacity has filledup and the client cannot consume more items than the producer has produced.
At some point the producer's storage capacity has filled to the brim, and theproducer has to wait until the client has at least consumed some items,thereby creating space in the producer's storage. Similarly, the consumercannot start consuming until the producer has at least produced some items.
Implementing this scenario only using mutexes (data locking) is not anattractive option, as merely using mutexes forces a program to implement thescenario usingpolling: processes must continuously (re)acquire themutex's lock, determine whether they can perform some action, followed by therelease of the lock. Often there's no action to perform, and the process isbusy acquiring and releasing the mutex's lock. Polling forces threads to waituntil they can lock the mutex, even though continuation might already bepossible. The polling interval could be reduced, but that too isn't anattractive option, as that increases the overhead associated with handling themutexes (also called `busy waiting').
Condition variables can be used to prevent polling. Threads can use conditionvariables tonotify waiting threads that there is something for them todo. This way threads can synchronize on data values (states).
As data values may be modified by multiple threads, threads still need to usemutexes, but only for controlling access to the data. In addition, conditionvariables allow threads torelease ownership of mutexes until a certainvalue has been obtained, until a preset amount of time has been passed, oruntil a preset point in time has been reached.
The prototypical setup of threads using condition variables looks like this:
The consumer:

    lock the mutex
    while the required condition has not yet been attained (i.e., is false):
        wait until being notified (this automatically releases the mutex's
        lock)
    once the mutex's lock has been reacquired, and the required condition
    has been attained:
        process the data
    release the mutex's lock
The producer:

    lock the mutex
    while the required condition has not yet been attained:
        do something to attain the required condition
    notify waiting threads (that the required condition has been attained)
    release the mutex's lock
This protocol hides a subtle initial synchronization requirement. The consumer will miss the producer's notification if it (i.e., the consumer) hasn't yet entered its waiting state. So waiting (consumer) threads should start before notifying (producer) threads. Once threads have started, no assumptions can be made anymore about the order in which any of the condition variable's members (notify_one, notify_all, wait, wait_for, and wait_until) are called.
Condition variables come in two flavors: objects of the class std::condition_variable are used in combination with objects of type unique_lock<mutex>. Because of optimizations which are available for this specific combination, using condition_variables is somewhat more efficient than using the more generally applicable class std::condition_variable_any, which may be used with any (e.g., user supplied) lock type.
Condition variable classes (covered in detail in the next two sections) offermembers likewait, wait_for, wait_until, notify_one andnotify_allthat may concurrently be called. The notifying members are always atomicallyexecuted. Execution of thewait members consists of three atomic parts:
- the mutex's lock is released and the thread blocks (this ends the first atomic part of the wait call);
- the thread's blocking state ends (e.g., because it was notified or because a timeout occurred);
- the mutex's lock is reacquired: when returning from the wait-members the previously waiting thread has reacquired the mutex's lock.

In addition to the condition variable classes the following free function and enum type are provided:
- void std::notify_all_at_thread_exit(condition_variable &cond, unique_lock<mutex> lockObject): once the calling thread has ended, all threads waiting on cond are notified. It is good practice to exit the thread as soon as possible after calling notify_all_at_thread_exit. Waiting threads must verify that the thread they were waiting for has indeed ended. This is usually realized by first obtaining the lock on lockObject, followed by verifying that the condition they were waiting for is true and that the lock was not reacquired before notify_all_at_thread_exit was called.
- std::cv_status: the cv_status enum is used by several member functions of the condition variable classes (cf. sections 20.4.1 and 20.4.2):

        namespace std
        {
            enum class cv_status
            {
                no_timeout,
                timeout
            };
        }

The class std::condition_variable merely offers a default constructor. No copy constructor or overloaded assignment operator is provided. Before using the class condition_variable the <condition_variable> header file must be included.
The class's destructor requires that no thread is blocked by the threaddestroying thecondition_variable. So all threads waiting on acondition_variable must be notified before acondition_variableobject's lifetime ends. Callingnotify_all (see below) before acondition_variable's lifetime ends takes care of that, as thecondition_variable's thread releases its lock of themutex variable,allowing one of the notified threads to lock the mutex.
In the following member-descriptions a type Predicate indicates that a provided Predicate argument can be called as a function without arguments, returning a bool. Also, other member functions are frequently referred to. It is tacitly assumed that all members referred to below are called using the same condition variable object.
The class condition_variable supports several wait members, which block the thread until notified by another thread (or after a configurable waiting time). However, wait members may also spuriously unblock, without having been notified. Therefore, after returning from wait members threads should verify that the required condition is actually true. If not, calling wait again may be appropriate. The next piece of pseudo code illustrates this scheme:
    while (conditionNotTrue())
        condVariable.wait(uniqueLock);
The classcondition_variable's members are:
- void notify_one() noexcept: one of the threads blocked in a wait member called by other threads returns. Which one actually returns cannot be predicted;
- void notify_all() noexcept: all threads blocked in wait members called by other threads unblock their wait states. Of course, only one of them will subsequently succeed in reacquiring the condition variable's lock object;
- void wait(unique_lock<mutex> &uniqueLock): before calling wait the current thread must have acquired the lock of uniqueLock. Calling wait releases the lock, and the current thread is blocked until it has received a notification from another thread, and has reacquired the lock;
- void wait(unique_lock<mutex> &uniqueLock, Predicate pred): this member is defined as template <typename Predicate>. The template's type is automatically derived from the function's argument type and does not have to be specified explicitly. Before calling wait the current thread must have acquired the lock of uniqueLock. As long as pred returns false, wait(uniqueLock) is called.
- cv_status wait_for(unique_lock<mutex> &uniqueLock, std::chrono::duration<Rep, Period> const &relTime): this member is defined as template <typename Rep, typename Period>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. E.g., to wait for at most 5 seconds wait_for can be called like this:

        cond.wait_for(uniqueLock, std::chrono::seconds(5));
This member returns when being notified or when the time interval specified byrelTime has passed.
When returning due to a timeout,std::cv_status::timeout is returned, otherwisestd::cv_status::no_timeout is returned.
Threads should verify that the required data condition has been met afterwait_for has returned.
- bool wait_for(unique_lock<mutex> &uniqueLock, chrono::duration<Rep, Period> const &relTime, Predicate pred): this member is defined as template <typename Rep, typename Period, typename Predicate>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. As long as pred returns false, the previous wait_for member is called. If that member returns cv_status::timeout, then the value returned by pred is returned, otherwise true.
cv_status wait_until(unique_lock<mutex> &uniqueLock, chrono::time_point<Clock, Duration> const &absTime):
this member is defined as a member template: template <typename Clock, typename Duration>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. E.g., to wait until 5 minutes after the current time wait_until can be called like this:

    cond.wait_until(uniqueLock, chrono::system_clock::now() + std::chrono::minutes(5));
This function acts identically to the wait_for(unique_lock<mutex> &uniqueLock, chrono::duration<Rep, Period> const &relTime) member described earlier, but uses an absolute point in time rather than a relative time specification.

This member returns when being notified or when the point in time specified by absTime has passed. When returning due to a timeout, std::cv_status::timeout is returned, otherwise std::cv_status::no_timeout is returned.
bool wait_until(unique_lock<mutex> &lock, chrono::time_point<Clock, Duration> const &absTime, Predicate pred):
this member is defined as a member template: template <typename Clock, typename Duration, typename Predicate>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. As long as pred returns false, the previous wait_until member is called. If that member returns cv_status::timeout, then pred's return value is returned, otherwise true.
Unlike condition_variable, the class std::condition_variable_any can be used with any (e.g., user supplied) lock type, and not just with the stl-provided unique_lock<mutex>. Before using the class condition_variable_any the <condition_variable> header file must be included.
The functionality that is offered by condition_variable_any is identical to the functionality offered by the class condition_variable, albeit that the lock-type that is used by condition_variable_any is not predefined. The class condition_variable_any therefore requires the specification of the lock-type that must be used by its objects.

In the interface shown below this lock-type is referred to as Lock. Most of condition_variable_any's members are defined as member templates, defining a Lock type as one of their parameters. The requirements of these lock-types are identical to those of the stl-provided unique_lock, and user-defined lock-type implementations should provide at least the interface and semantics that are also provided by unique_lock.

This section merely presents the interface of the class condition_variable_any. As its interface offers the same members as condition_variable (allowing, where applicable, passing any lock-type instead of just unique_lock to corresponding members), the reader is referred to the previous section for a description of the semantics of the class members.
Like condition_variable, the class condition_variable_any only offers a default constructor. No copy constructor or overloaded assignment operator is provided.

Also, like condition_variable, the class's destructor requires that no thread is blocked by the current thread. This implies that all other (waiting) threads must have been notified; those threads may, however, subsequently block on the lock specified in their wait calls.

Note that, in addition to Lock, the types Clock, Duration, Period, Predicate, and Rep are template types, defined just like the identically named types mentioned in the previous section.
Assuming that MyMutex is a user defined mutex type, and that MyLock is a user defined lock-type (cf. section 20.3 for details about lock-types), then a condition_variable_any object can be defined and used like this:

    MyMutex mut;
    MyLock<MyMutex> ul(mut);
    condition_variable_any cva;

    cva.wait(ul);
These are the class condition_variable_any's members:

    void notify_one() noexcept;
    void notify_all() noexcept;
    void wait(Lock &lock);
    void wait(Lock &lock, Predicate pred);
    cv_status wait_until(Lock &lock,
                    const chrono::time_point<Clock, Duration> &absTime);
    bool wait_until(Lock &lock,
                    const chrono::time_point<Clock, Duration> &absTime,
                    Predicate pred);
    cv_status wait_for(Lock &lock,
                    const chrono::duration<Rep, Period> &relTime);
    bool wait_for(Lock &lock,
                    const chrono::duration<Rep, Period> &relTime,
                    Predicate pred);

In a classic producer-consumer design the two threads perform the following loops:

consumer loop:
- wait until there's an item in store, then reduce the number of stored items
- remove the item from the store
- increment the number of available storage locations
- do something with the retrieved item

producer loop:
- produce the next item
- wait until there's room to store the item, then reduce the number of available storage locations
- store the item
- increment the number of stored items
It is important that the two storage administrative tasks (registering the number of available items and available storage locations) are either performed by the consumer or by the producer. The `waiting' of both threads can be implemented using a class
Semaphore, offering the members wait and notify_all. For a more extensive discussion of semaphores see Tanenbaum, A.S. (2016) Structured Computer Organization, Pearson Prentice-Hall. As a brief summary: semaphores restrict the number of threads that can access a resource of limited size. They ensure that the number of threads adding items to the resource (the producers) can never exceed the resource's maximum size, and that the number of threads retrieving items from the resource (the consumers) can never exceed the resource's current size. Thus, in a producer/consumer design two semaphores are used: one controlling access to the resource by the producers, and one controlling access to the resource by the consumers.
For example, say we have ten producing threads, as well as ten consumers, and a lockable queue that must not grow bigger than 1000 items. Producers try to push one item at a time; consumers try to pop one.
The data member containing the actual count is called d_available. It is protected by mutex d_mutex. In addition a condition_variable d_condition is defined:

    mutable std::mutex d_mutex;     // mutable because of its use in
                                    // 'size_t size() const'
    std::condition_variable d_condition;
    size_t d_available;
The waiting process is implemented through its member function wait:

     1: void Semaphore::wait()
     2: {
     3:     std::unique_lock<std::mutex> lk(d_mutex);   // get the lock
     4:     while (d_available == 0)
     5:         d_condition.wait(lk);   // internally releases the lock
     6:                                 // and waits, on exit
     7:                                 // acquires the lock again
     8:     --d_available;              // dec. available
     9: }                               // the lock is released

In line 5 d_condition.wait releases the lock. It waits until receiving a notification, and re-acquires the lock just before returning. Consequently, wait's code always has complete and unique control over d_available.

What about notifying a waiting thread? This is handled in lines 4 and 5 of the member function notify_all:
     1: void Semaphore::notify_all()
     2: {
     3:     std::lock_guard<std::mutex> lk(d_mutex);    // get the lock
     4:     if (d_available++ == 0)
     5:         d_condition.notify_all();   // use notify_one to notify one
     6:                                     // other thread
     7: }                                   // the lock is released

At line 4 d_available is always incremented; by using a postfix increment it can simultaneously be tested for being zero. If it was initially zero then d_available is now one. A thread waiting until d_available exceeds zero may now continue. The waiting threads are notified by calling d_condition.notify_all; in situations where at most one other thread can be waiting, notify_one can be used instead.

Using the facilities of the class Semaphore, whose constructor expects an initial value of its d_available data member, the classic consumer-producer paradigm can now be implemented using multi-threading (a more elaborate example of the producer-consumer program is found in the yo/threading/examples/events.cc file in the C++ Annotations's source archive):
    Semaphore available(10);
    Semaphore filled(0);
    std::queue<size_t> itemQueue;
    std::mutex qMutex;

    void consumer()
    {
        while (true)
        {
            filled.wait();
            size_t item;
            {                       // mutex lock the queue
                std::lock_guard lg(qMutex);
                item = itemQueue.front();
                itemQueue.pop();
            }
            available.notify_all();
            process(item);          // not implemented here
        }
    }

    void producer()
    {
        size_t item = 0;
        while (true)
        {
            ++item;
            available.wait();
            {                       // mutex lock the queue with
                                    // multiple consumers
                std::lock_guard lg(qMutex);
                itemQueue.push(item);
            }
            filled.notify_all();
        }
    }

    int main()
    {
        thread consume(consumer);
        thread produce(producer);
        consume.join();
        produce.join();
    }

Note that a mutex is used to avoid simultaneous access to the queue by multiple threads. Consider the situation where the queue contains 5 items: in that situation the semaphores allow the consumer and the producer to access the queue, but to avoid corrupting the queue only one of them may modify the queue at a time. This is realized by both threads obtaining the std::mutex qMutex lock before modifying the queue.

Before using atomic data types the <atomic> header file must be included. When data are shared among multiple threads, data corruption is usually prevented using mutexes. To increment a simple int using this strategy, code as shown below is commonly used:
    {
        lock_guard<mutex> lk{ intVarMutex };
        ++intVar;
    }

The compound statement is used to limit the lock_guard's lifetime, so that intVar's mutex is only locked for a short while.
This scheme is not complex, but having to define a lock_guard for every single use of a simple variable, and having to define a matching mutex for each such variable, is annoying and cumbersome.
C++ offers a way out through the use of atomic data types. Atomic data types are available for all basic types, and also for (trivial) user defined types. Trivial types are (see also section 23.6.2) all scalar types, arrays of elements of a trivial type, and classes whose constructors, copy constructors, and destructors all have default implementations, and whose non-static data members are themselves of trivial types.

The class template std::atomic<Type> is available for all built-in types, including pointer types. E.g., std::atomic<bool> defines an atomic bool type. For many types alternative, somewhat shorter type names are available. E.g., instead of std::atomic<unsigned short> the type std::atomic_ushort can be used. Refer to the atomic header file for a complete list of alternate names.

If Trivial is a user-defined trivial type then std::atomic<Trivial> defines an atomic variant of Trivial: such a type does not require a separate mutex to synchronize access by multiple threads.
Objects of the class template std::atomic<Type> cannot directly be copied or assigned to each other. However, they can be initialized by values of type Type, and values of type Type can also directly be assigned to std::atomic<Type> objects. Moreover, since atomic<Type> types offer conversion operators returning their Type values, an atomic<Type> object can also be assigned to or initialized by another atomic<Type> object using a static_cast:

    atomic<int> a1 = 5;
    atomic<int> a2{ static_cast<int>(a1) };

The class std::atomic<Type> provides several public members, shown below. Non-member (free) functions operating on atomic<Type> objects are also available.
The std::memory_order enumeration defines the following symbolic constants, which are used to specify ordering constraints of atomic operations:

memory_order_acq_rel:
the operation must be a read-modify-write operation, combining memory_order_acquire and memory_order_release;

memory_order_acquire:
the operation is an acquire operation. It synchronizes with a release operation that wrote the same memory location;

memory_order_consume:
the operation is a consume operation on the involved memory location;

memory_order_relaxed:
no ordering constraints are provided by the operation;

memory_order_release:
the operation is a release operation. It synchronizes with acquire operations on the same location;

memory_order_seq_cst:
the default memory order specification for all operations. Memory storing operations use memory_order_release, memory load operations use memory_order_acquire, and read-modify-write operations use memory_order_acq_rel.

The memory order cannot be specified for the overloaded operators provided by atomic<Type>. Otherwise, most atomic member functions may also be given a final memory_order argument. Where this is not available it is explicitly mentioned at the function's description.
Here are the standard std::atomic<Type> member functions:

bool compare_exchange_strong(Type &currentValue, Type newValue) noexcept:
the object's value is compared to currentValue using byte-wise comparisons. If equal (and true is returned) then newValue is stored in the atomic object; if unequal (and false is returned) the object's current value is stored in currentValue;

bool compare_exchange_weak(Type &currentValue, Type newValue) noexcept:
the object's value is compared to currentValue using byte-wise comparisons. If equal (and true is returned), then newValue is stored in the atomic object; if unequal, or if newValue cannot be atomically assigned to the current object, false is returned and the object's current value is stored in currentValue;

Type exchange(Type newValue) noexcept:
the object's current value is returned, after which newValue is assigned to the current object;

bool is_lock_free() const noexcept:
if the object's operations are lock-free true is returned, otherwise false. This member has no memory_order parameter;

Type load() const noexcept:
the object's value is returned;

operator Type() const noexcept:
the object's value is returned;

void store(Type newValue) noexcept:
newValue is assigned to the current object. Note that the standard assignment operator can also be used.

In addition to the above members, integral atomic types `Integral' (essentially the atomic variants of all built-in integral types) also offer the following member functions:
Integral fetch_add(Integral value) noexcept:
value is added to the object's value, and the object's value at the time of the call is returned;

Integral fetch_sub(Integral value) noexcept:
value is subtracted from the object's value, and the object's value at the time of the call is returned;

Integral fetch_and(Integral mask) noexcept:
the bit-and operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;

Integral fetch_or(Integral mask) noexcept:
the bit-or operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;

Integral fetch_xor(Integral mask) noexcept:
the bit-xor operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;

Integral operator++() noexcept:
the prefix increment operator, returning the object's new value;

Integral operator++(int) noexcept:
the postfix increment operator, returning the object's value before it was incremented;

Integral operator--() noexcept:
the prefix decrement operator, returning the object's new value;

Integral operator--(int) noexcept:
the postfix decrement operator, returning the object's value before it was decremented;

Integral operator+=(Integral value) noexcept:
value is added to the object's current value and the object's new value is returned;

Integral operator-=(Integral value) noexcept:
value is subtracted from the object's current value and the object's new value is returned;

Integral operator&=(Integral mask) noexcept:
the bit-and operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned;

Integral operator|=(Integral mask) noexcept:
the bit-or operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned;

Integral operator^=(Integral mask) noexcept:
the bit-xor operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned.

Some of the free functions have names ending in _explicit. The _explicit functions define an additional parameter `memory_order order', which is not available for the non-_explicit functions (e.g., atomic_load(atomic<Type> *ptr) and atomic_load_explicit(atomic<Type> *ptr, memory_order order)).
Here are the free functions that are available for all atomic types:
bool std::atomic_compare_exchange_strong(_explicit)(std::atomic<Type> *ptr, Type *oldValue, Type newValue) noexcept:
returns ptr->compare_exchange_strong(*oldValue, newValue);

bool std::atomic_compare_exchange_weak(_explicit)(std::atomic<Type> *ptr, Type *oldValue, Type newValue) noexcept:
returns ptr->compare_exchange_weak(*oldValue, newValue);

Type std::atomic_exchange(_explicit)(std::atomic<Type> *ptr, Type newValue) noexcept:
returns ptr->exchange(newValue);

void std::atomic_init(std::atomic<Type> *ptr, Type init) noexcept:
stores init non-atomically in *ptr. The object pointed to by ptr must have been default constructed, and as yet no member functions must have been called for it. This function has no memory_order parameter;

bool std::atomic_is_lock_free(std::atomic<Type> const *ptr) noexcept:
returns ptr->is_lock_free(). This function has no memory_order parameter;

Type std::atomic_load(_explicit)(std::atomic<Type> *ptr) noexcept:
returns ptr->load();

void std::atomic_store(_explicit)(std::atomic<Type> *ptr, Type value) noexcept:
calls ptr->store(value).

In addition to the above-mentioned free functions, atomic<Integral> types also offer the following free functions:
Integral std::atomic_fetch_add(_explicit)(std::atomic<Integral> *ptr, Integral value) noexcept:
returns ptr->fetch_add(value);

Integral std::atomic_fetch_sub(_explicit)(std::atomic<Integral> *ptr, Integral value) noexcept:
returns ptr->fetch_sub(value);

Integral std::atomic_fetch_and(_explicit)(std::atomic<Integral> *ptr, Integral mask) noexcept:
returns ptr->fetch_and(mask);

Integral std::atomic_fetch_or(_explicit)(std::atomic<Integral> *ptr, Integral mask) noexcept:
returns ptr->fetch_or(mask);

Integral std::atomic_fetch_xor(_explicit)(std::atomic<Integral> *ptr, Integral mask) noexcept:
returns ptr->fetch_xor(mask).

The well-known quicksort algorithm sorts an array of n elements. Briefly, it works like this: one of the array's elements is selected (the pivot), and the remaining elements are partitioned into a group of elements smaller than the pivot and a group of elements at least as large as the pivot (a generic algorithm partition performing the partition is available). This leaves us with two (possibly empty) sub-arrays: one to the left of the pivot element, and one to the right of the pivot element; quicksort is then recursively applied to both sub-arrays. Converting this algorithm to a multi-threaded algorithm appears to be a simple task:
    void quicksort(Iterator begin, Iterator end)
    {
        if (end - begin < 2)        // less than 2 elements are left
            return;                 // and we're done

        Iterator pivot = partition(begin, end); // determine an iterator
                                                // pointing to the pivot
                                                // element
        thread lhs(quicksort, begin, pivot);    // start threads on the
                                                // left-hand side sub-array
        thread rhs(quicksort, pivot + 1, end);  // and on the right-hand
                                                // side sub-array
        lhs.join();
        rhs.join();                             // and we're done
    }

Unfortunately, this translation to a multi-threaded approach won't work for reasonably large arrays because of a phenomenon called overpopulation: more threads are started than the operating system is prepared to give us. In those cases a Resource temporarily unavailable exception is thrown, and the program ends.
Overpopulation can be avoided by using a pool of workers, where each `worker' is a thread, which in this case is responsible for handling one (sub-)array, but not for the nested calls. The pool of workers is controlled by a scheduler, receiving the requests to sort sub-arrays, and passing these requests on to the next available worker.
The main data structure of the example program developed in this section is a queue of std::pairs containing iterators of the array to be sorted (cf. Figure 26; the sources of the program are found in the C++ Annotations's yo/threading/examples/multisort directory). Two queues are being used: one queue is a task-queue, receiving the iterators of sub-arrays to be partitioned. Instead of immediately launching new threads (the lhs and rhs threads in the above example), the ranges to be sorted are pushed on the task-queue. The other queue is the work-queue: elements are moved from the task-queue to the work-queue, where they will be processed by one of the worker threads.

The program's main function starts the workforce, reads the data, pushes the array's begin and end iterators on the task queue and then starts the scheduler. Once the scheduler ends the sorted array is displayed:

    int main()
    {
        workForce();        // start the worker threads
        readData();         // read the data into vector<int> g_data

        g_taskQ.push(       // prepare the main task
            Pair(g_data.begin(), g_data.end())
        );

        scheduler();        // sort g_data
        display();          // show the sorted elements
    }

The workforce consists of a bunch of detached threads. Each thread represents a worker, implemented in the function void worker. Since the number of worker threads is fixed, overpopulation doesn't occur. Once the array has been sorted and the program stops, these detached threads simply end:
    for (size_t idx = 0; idx != g_sizeofWorkforce; ++idx)
        thread(worker).detach();
The scheduler continues for as long as there are sub-arrays to sort. When this is the case the task queue's front element is moved to the work queue. This reduces the task queue's size, and prepares an assignment for the next available worker. The scheduler now waits until a worker is available. Once a worker is available it is informed of the waiting assignment, and the scheduler waits for the next task:
    void scheduler()
    {
        while (newTask())
        {
            g_workQ.rawPushFront(g_taskQ);

            g_workforce.wait();     // wait for a worker to be available
            g_worker.notify_all();  // activate a worker
        }
    }

The function newTask simply checks whether the task queue is empty. If so, and none of the workers is currently busy sorting a sub-array, then the array has been sorted, and newTask can return false. When the task queue is empty but a worker is still busy, it may be that new sub-array dimensions are going to be placed on the task queue by an active worker. Whenever a worker is active the Semaphore g_workforce's size is less than the size of the work force:
    bool wip()
    {
        return g_workforce.size() != g_sizeofWorkforce;
    }

    bool newTask()
    {
        bool done;

        unique_lock<mutex> lk(g_taskMutex);
        while ((done = g_taskQ.empty()) && wip())
            g_taskCondition.wait(lk);

        return not done;
    }

Each detached worker thread performs a continuous loop. In the loop it waits for a notification by the scheduler. Once it receives a notification it retrieves its assignment from the work queue, and partitions the sub-array specified in its assignment. Partitioning may result in new tasks. Once this has been completed the worker has completed its assignment: it increments the available workforce and notifies the scheduler that it should check whether all tasks have been performed:
    void worker()
    {
        while (true)
        {
            g_worker.wait();        // wait for action
            partition(g_workQ.popFront());
            g_workforce.notify_all();

            lock_guard<mutex> lk(g_taskMutex);
            g_taskCondition.notify_one();
        }
    }

Sub-arrays smaller than two elements need no partitioning. All larger sub-arrays are partitioned relative to their first element. The std::partition generic algorithm does this well, but if the pivot is itself an element of the array to partition then the pivot's eventual location is undetermined: it may be found anywhere in the series of elements which are at least equal to the pivot. The two required sub-arrays, however, can easily be constructed:
- std::partition is called relative to an array's first element, partitioning the array's remaining elements and returning mid, pointing to the first element of the series of elements that are at least as large as the array's first element;
- the array's first element is swapped with the element to which mid - 1 points;
- this results in two new tasks: from array.begin() to mid - 1 (elements all smaller than the pivot), and from mid to array.end() (elements all at least as large as the pivot).

    void partition(Pair const &range)
    {
        if (range.second - range.first < 2)
            return;

        auto rhsBegin = partition(range.first + 1, range.second,
                [=](int value)
                {
                    return value < *range.first;
                }
            );
        auto lhsEnd = rhsBegin - 1;

        swap(*range.first, *lhsEnd);

        pushTask(range.first, lhsEnd);
        pushTask(rhsBegin, range.second);
    }

Asynchronously executed tasks communicate their results through shared states. Objects that contain such shared states are called asynchronous return objects. However, due to the nature of multi-threading, a thread may request the results of an asynchronous return object before these results are actually available. In those cases the requesting thread blocks, waiting for the results to become available. Asynchronous return objects offer wait and get members which, respectively, wait until the results have become available and produce the asynchronous results once they are available. The phrase that is used to indicate that the results are available is `the shared state has been made ready'.
Shared states are made ready by asynchronous providers. Asynchronous providers are simply objects or functions providing results to shared states. Making a shared state ready means that an asynchronous provider
stores a value or an exception in the shared state, and unblocks all threads waiting for the shared state to become ready (allowing, e.g., blocked wait calls to return).

Once a shared state has been made ready it contains a value, object, or exception which can be retrieved by objects having access to the shared state. While code is waiting for a shared state to become ready the value or exception that is going to be stored in the shared state may be computed. When multiple threads try to access the same shared state they must use synchronizing mechanisms (like mutexes, cf. section 20.2) to prevent access-conflicts.
Shared states use reference counting to keep track of the number of asynchronous return objects or asynchronous providers that hold references to them. These return objects and providers may release their references to their shared states (which is called `releasing the shared state'). When a return object or provider holds the last reference to the shared state and releases it, the shared state is destroyed.
On the other hand, an asynchronous provider may also abandon its shared state. In that case the provider, in sequence,

- stores an exception of the type std::future_error, holding the error condition std::broken_promise, in its shared state;
- makes its shared state ready, and
- releases its shared state.

Objects of the class std::future (see the next section) are asynchronous return objects. They can be produced by the std::async (section 20.10) family of functions, and by objects of the classes std::packaged_task (section 20.11) and std::promise (section 20.12).
So far, waiting for a thread to complete meant calling its join member. Waiting may be unwelcome: instead of just waiting our thread might also be doing something useful. It might as well pick up the results produced by a sub-thread at some point in the future.
In fact, exchanging data among threads always poses some difficulties, as it requires shared variables, and the use of locks and mutexes to prevent data corruption. Rather than waiting and using locks it would be nice if some asynchronous task could be started, allowing the initiating thread (or even other threads) to pick up the result at some point in the future, when the results are needed, without having to worry about data locks or waiting times. For situations like these C++ provides the class std::future.

Before using the class std::future the <future> header file must be included.
Objects of the class template std::future harbor the results produced by asynchronously executed tasks. The template's type parameter specifies the type of the result returned by the asynchronously executed task. This type may be void.

The asynchronously executed task may also throw an exception (ending the task). In that case the future object catches the exception, and rethrows it once its return value (i.e., the value returned by the asynchronously executed task) is requested.
In this section the members of the class template future are described. Future objects are commonly initialized through anonymous future objects returned by the factory function std::async or by the get_future members of the classes std::promise and std::packaged_task (introduced in upcoming sections). Examples of the use of std::future objects are provided in those sections.
Some of future's members return a value of the strongly typed enumeration std::future_status. This enumeration defines three symbolic constants: future_status::ready, future_status::timeout, and future_status::deferred.
Error conditions are returned through std::future_error exceptions. These error conditions are represented by the values of the strongly typed enumeration std::future_errc (covered in the next section).
The class future itself provides the following constructors:

future():
the default constructor: the resulting future object does not refer to shared results. Its valid member returns false.

future(future &&tmp) noexcept:
the move constructor: the newly created object's valid member returns what tmp.valid() would have returned prior to the constructor invocation. After calling the move constructor tmp.valid() returns false.

The class future does not offer a copy constructor or an overloaded assignment operator.

Here are the members of the class std::future:
future &operator=(future &&tmp):
the move assignment operator grabs the information from the tmp object; following this, tmp.valid() returns false.

std::shared_future<ResultType> share() &&:
returns a std::shared_future<ResultType> (see section 20.9). After calling this function the future's valid member returns false.

ResultType get():
first wait (see below) is called. Once wait has returned, the results produced by the associated asynchronous task are returned. With future<Type> specifications the returned value is the moved shared value if Type supports move assignment, otherwise a copy is returned. With future<Type &> specifications a Type & is returned; with future<void> specifications nothing is returned. If the shared value is an exception, it is thrown instead of returned. After calling this member the future object's valid member returns false.

bool valid() const:
returns true if the future object for which valid is called refers to an object returned by an asynchronous task. If valid returns false, the future object exists, but in addition to valid only its destructor and move constructor can safely be called. When other members are called while valid returns false a std::future_error exception is thrown (having the value future_errc::no_state).

void wait() const:
blocks until the results of the associated asynchronous task are available.

std::future_status wait_for(chrono::duration<Rep, Period> const &rel_time) const:
a member template, deriving Rep and Period from the actually specified duration (cf. section 4.2.2). If the results contain a deferred function nothing happens. Otherwise wait_for blocks until the results are available or until the amount of time specified by rel_time has expired. Possible return values are:
- future_status::deferred if the results contain a deferred function;
- future_status::ready if the results are available;
- future_status::timeout if the function returns because the amount of time specified by rel_time has expired.

future_status wait_until(chrono::time_point<Clock, Duration> const &abs_time) const:
a member template, deriving Clock and Duration from the actually specified abs_time (cf. section 4.2.4). If the results contain a deferred function nothing happens.
Otherwise wait_until blocks until the results are available or until the point in time specified by abs_time has passed. Possible return values are:
- future_status::deferred if the results contain a deferred function;
- future_status::ready if the results are available;
- future_status::timeout if the function returns because the point in time specified by abs_time has passed.

The class std::future<ResultType> declares the following friends: std::promise<ResultType> (cf. section 20.12), and

    template <typename Function, typename ...Args>
    std::future<typename result_of<Function(Args...)>::type>
        std::async(std::launch, Function &&fun, Args &&...args);

(cf. section 20.10).
The class std::future may return errors by throwing std::future_error exceptions. These error conditions are represented by the values of the strongly typed enumeration std::future_errc, which defines the following symbolic constants:

broken_promise:
Broken_promise is thrown when a future object was received whose value was never assigned by a promise or packaged_task. For example, an object of the class promise<int> should set the value of the future<int> object returned by its get_future member (cf. section 20.12), but if it doesn't do so, then a broken_promise exception is thrown, as illustrated by the following program:

     1: std::future<int> fun()
     2: {
     3:     return std::promise<int>().get_future();
     4: }
     5:
     6: int main()
     7: try
     8: {
     9:     fun().get();
    10: }
    11: catch (std::exception const &exc)
    12: {
    13:     std::cerr << exc.what() << '\n';
    14: }

At line 3 a promise object is created, but its value is never set. Consequently, it `breaks its promise' to produce a value: when main tries to retrieve its value (in line 9) a std::future_error exception is thrown containing the future_errc::broken_promise value.
future_already_retrieved:
Future_already_retrieved is thrown when multiple attempts are made to retrieve the future object from, e.g., a promise or packaged_task object that (eventually) should be ready. For example:

     1: int main()
     2: {
     3:     std::promise<int> promise;
     4:     promise.get_future();
     5:     promise.get_future();
     6: }

Note that after defining the std::promise object in line 3 it has merely been defined: no value is ever assigned to its future. Even though no value is assigned to the future object, it is a valid object. I.e., after some time the future should be ready, and the future's get member should produce a value. Hence, line 4 succeeds, but then, in line 5, the exception is thrown as `the future has already been retrieved'.
promise_already_satisfied is thrown when multiple attempts are made to assign a value to a promise object. Assigning a value or exception_ptr to the future of a promise object may happen only once. For example:

    1: int main()
    2: {
    3:     std::promise<int> promise;
    4:     promise.set_value(15);
    5:     promise.set_value(155);
    6: }
no_state is thrown when a member function (other than valid, see below) of a future object is called when its valid member returns false. This happens, e.g., when calling members of a default constructed future object. No_state is not thrown for future objects returned by the async factory function or returned by the get_future members of promise or packaged_task type of objects. Here is an example:

    1: int main()
    2: {
    3:     std::future<int> fut;
    4:     fut.get();
    5: }
The class std::future_error is derived from the class std::exception and offers, in addition to the char const *what() const member, the member std::error_code const &code() const, returning an std::error_code object associated with the thrown exception.
When an asynchronous task has been started (e.g., using std::async) then the return value of the asynchronously called function becomes available in its activating thread through a std::future object. The future object cannot be used by another thread. If this is required (e.g., see this chapter's final section) the future object must be converted to a std::shared_future object. Before using the class std::shared_future the <future> header file must be included.
Once a shared_future object is available, its get member (see below) can repeatedly be called to retrieve the results of the original future object. This is illustrated by the next small example:
     1: int main()
     2: {
     3:     std::promise<int> promise;
     4:     promise.set_value(15);
     5: 
     6:     auto fut = promise.get_future();
     7:     auto shared1 = fut.share();
     8: 
     9:     std::cerr << "Result: " << shared1.get() << "\n"
    10:                  "Result: " << shared1.get() << "\n"
    11:                  "Valid:  " << fut.valid() << '\n';
    12: 
    13:     auto shared2 = fut.share();
    14: 
    15:     std::cerr << "Result: " << shared2.get() << "\n"
    16:                  "Result: " << shared2.get() << '\n';
    17: }

In lines 9 and 10 the promise's results are retrieved multiple times, but having obtained the shared_future in line 7, the original future object no longer has an associated shared state. Therefore, when another attempt is made (in line 13) to obtain the shared_future, a `no associated state' exception is thrown and the program aborts. However, multiple copies of shared_future objects may co-exist. When multiple copies of shared_future objects exist (e.g., in different threads), the results of the associated asynchronous task are made ready (become available) at exactly the same moment in time.
The relationship between the classes future and shared_future resembles the relationship between the classes unique_ptr and shared_ptr: there can be only one unique_ptr pointing to data, whereas there can be many shared_ptr objects, each pointing to the same data.
The effect of calling any member of a shared_future object for which valid() == false, other than the destructor, the move-assignment operator, or valid, is undefined.
The class shared_future supports the following constructors:

shared_future() noexcept: an empty shared_future object is constructed that does not refer to shared results. After using this constructor the object's valid member returns false.

shared_future(shared_future const &other): a shared_future object is constructed that refers to the same results as other (if any). After using this constructor the object's valid member returns the same value as other.valid().

shared_future(shared_future<Result> &&tmp) noexcept: move constructs a shared_future object that refers to the results that were originally referred to by tmp (if any). After using this constructor the object's valid member returns the same value as tmp.valid() returned prior to the constructor invocation, and tmp.valid() returns false.

shared_future(future<Result> &&tmp) noexcept: move constructs a shared_future object that refers to the results that were originally referred to by tmp (if any). After using this constructor the object's valid member returns the same value as tmp.valid() returned prior to the constructor invocation, and tmp.valid() returns false.
The class's destructor destroys the shared_future object for which it is called. If the object for which the destructor is called is the last shared_future object, and no std::promise or std::packaged_task is associated with the results associated with the current object, then the results are also destroyed.

Here are the members of the class std::shared_future:
shared_future &operator=(shared_future &&tmp): tmp's results are moved to the current object. After calling the move assignment operator the current object's valid member returns the same value as tmp.valid() returned prior to the invocation of the move assignment operator, and tmp.valid() returns false;

shared_future &operator=(shared_future const &rhs): rhs's results are shared with the current object. After calling the assignment operator the current object's valid member returns the same value as rhs.valid();

Result const &shared_future::get() const: (specializations for shared_future<Result &> and shared_future<void> are also available.) This member waits until the shared results are available, and subsequently returns Result const &. Note that access to the data stored in Result, accessed through get, is not synchronized. It is the responsibility of the programmer to avoid race conditions when accessing Result's data. If Result holds an exception, it is thrown when get is called;

bool valid() const: returns true if the current object refers to shared results;

void wait() const: blocks until the shared results are available;

future_status wait_for(const chrono::duration<Rep, Period> &rel_time) const: (Rep and Period normally are derived by the compiler from the actual rel_time specification.) If the shared results contain a deferred function (cf. section 20.10) nothing happens. Otherwise wait_for blocks until the associated asynchronous task has produced its results, or until the relative time specified by rel_time has expired. The member returns future_status::deferred if the shared results contain a deferred function; future_status::ready if the shared results are available; future_status::timeout if the function returns because the amount of time specified by rel_time has expired;

future_status wait_until(const chrono::time_point<Clock, Duration> &abs_time) const: (Clock and Duration normally are derived by the compiler from the actual abs_time specification.) If the shared results contain a deferred function nothing happens.
Otherwise wait_until blocks until the shared results are available or until the point in time specified by abs_time has expired. Possible return values are: future_status::deferred if the shared results contain a deferred function; future_status::ready if the shared results are available; future_status::timeout if the function returns because the point in time specified by abs_time has expired.

In this section the function template std::async is covered. Async is used to start asynchronous tasks, returning values (or void) to the calling thread, which is hard to realize merely using the std::thread class. Before using the function async the <future> header file must be included.
When starting a thread using the facilities of the class std::thread the initiating thread at some point commonly calls the thread's join method. At that point the thread must have finished, or execution blocks until join returns. While this often is a sensible course of action, it may not always be: maybe the function implementing the thread has a return value, or it could throw an exception.

In those cases join cannot be used: if an exception leaves a thread, then your program ends. Here is an example:
     1: void thrower()
     2: {
     3:     throw std::exception();
     4: }
     5: 
     6: int main()
     7: try
     8: {
     9:     std::thread subThread(thrower);
    10: }
    11: catch (...)
    12: {
    13:     std::cerr << "Caught exception\n";
    14: }

In line 3 thrower throws an exception, leaving the thread. This exception is not caught by main's try-block (as it is defined in another thread). As a consequence, the program terminates. This scenario doesn't occur when std::async is used. Async may start a new asynchronous task, and the activating thread may retrieve the return value of the function implementing the asynchronous task, or any exception leaving that function, from a std::future object returned by the async function. Basically, async is called similarly to the way a thread is started using std::thread: it is passed a function and optionally arguments which are forwarded to the function.
Although the function implementing the asynchronous task may be passed as first argument, async's first argument may also be a value of the strongly typed enumeration std::launch:

    enum class launch
    {
        async,
        deferred
    };

When passing launch::async the asynchronous task immediately starts; when passing launch::deferred the asynchronous task is deferred. When std::launch is not specified the default value launch::async | launch::deferred is used, giving the implementation freedom of choice, usually resulting in deferring execution of the asynchronous task.
So, here is the first example again, this time using async to start the sub-thread:
     1: bool fun()
     2: {
     3:     return std::cerr << "    hello from fun\n";
     4: }
     5: int exceptionalFun()
     6: {
     7:     throw std::exception();
     8: }
     9: 
    10: int main()
    11: try
    12: {
    13:     auto fut1 = std::async(std::launch::async, fun);
    14:     auto fut2 = std::async(std::launch::async, exceptionalFun);
    15: 
    16:     std::cerr << "fun returned " << std::boolalpha << fut1.get() << '\n';
    17:     std::cerr << "exceptionalFun did not return " << fut2.get() << '\n';
    18: }
    19: catch (...)
    20: {
    21:     std::cerr << "caught exception thrown by exceptionalFun\n";
    22: }

Now the threads immediately start, but although the results are available around line 13, the thrown exception doesn't terminate the program. The first thread's return value is made available in line 16, the exception thrown by the second thread is simply caught by main's try-block (line 19).

The function template async has several overloaded versions:
The first overload expects the function (or functor) to call and the arguments to forward to it, returning a std::future holding the function's return value or the exception thrown by the function:

    template <typename Function, class ...Args>
    std::future<typename std::result_of<Function(Args ...)>::type>
        std::async(Function &&fun, Args &&...args);
The second overload additionally expects, as its first argument, a launch policy: one of the values (or a combination, constructed using the bit_or operator) of the enumeration values of the std::launch enumeration:

    template <class Function, class ...Args>
    std::future<typename std::result_of<Function(Args ...)>::type>
        std::async(std::launch policy, Function &&fun, Args &&...args);
In addition to plain functions (or functors) and std::launch values, the second argument may also be the address of a member function. In that case the (required) third argument is an object (or a pointer to an object) of that member function's class. Any remaining arguments are passed to the member function (see also the remarks below).

When calling async all arguments except for the std::launch argument must be references, pointers or move-constructible objects:

if an lvalue argument is passed to the async function template then copy construction is used to construct a copy of the argument which is then forwarded to the thread-launcher;

if an anonymous (rvalue) argument is passed to the async function template then move construction is used to forward the anonymous object to the thread launcher.

When execution is deferred the function isn't executed until the program asks for its results (e.g., when the future's get member is called). Because of the default std::launch::deferred | std::launch::async argument used by the basic async call it is likely that the function which is passed to async doesn't immediately start. The launch::deferred policy allows the implementor to defer its execution until the program explicitly asks for the function's results. Consider the following program:
     1: void fun()
     2: {
     3:     std::cerr << "    hello from fun\n";
     4: }
     5: 
     6: std::future<void> asyncCall(char const *label)
     7: {
     8:     std::cerr << label << " async call starts\n";
     9:     auto ret = std::async(fun);
    10:     std::cerr << label << " async call ends\n";
    11:     return ret;
    12: }
    13: 
    14: int main()
    15: {
    16:     asyncCall("First");
    17:     asyncCall("Second");
    18: }

Although async is called in line 9, the program's output may not show fun's output line when it is run. This is a result of the (default) use of launch::deferred: the system simply defers fun's execution until requested, which doesn't happen. But the future object that's returned by async has a member wait. Once wait returns the shared state must be available. In other words: fun must have finished. Here is what happens when after line 9 the line ret.wait() is inserted:

    First async call starts
        hello from fun
    First async call ends
    Second async call starts
        hello from fun
    Second async call ends
Actually, evaluation of fun can be requested at the point where we need its results, maybe even after calling asyncCall, as shown in the next example:
    1: int main()
    2: {
    3:     auto ret1 = asyncCall("First");
    4:     auto ret2 = asyncCall("Second");
    5: 
    6:     ret1.get();
    7:     ret2.get();
    8: }

Here the ret1 and ret2 std::future objects are created, but their fun functions aren't evaluated yet. Evaluation occurs at lines 6 and 7, resulting in the following output:

    First async call starts
    First async call ends
    Second async call starts
    Second async call ends
        hello from fun
        hello from fun
The std::async function template is used to start a thread, making its results available to the calling thread. On the other hand, we may only be able to prepare (package) a task (a thread), but may have to leave the completion of the task to another thread. Scenarios like this are realized through objects of the class std::packaged_task, which is the topic of the next section.
The class template std::packaged_task allows a program to `package' a function or functor and pass the package to a thread for further processing. The processing thread then calls the packaged function, passing it its arguments (if any). After completing the function the packaged_task's future is ready, allowing the program to retrieve the results produced by the function. Thus, functions and the results of function calls can be transferred between threads. Before using the class template packaged_task the <future> header file must be included.
Before describing the class's interface, let's first look at an example to get an idea about how a packaged_task can be used. Remember that the essence of packaged_task is that part of your program prepares (packages) a task for another thread to complete, and that the program at some point needs the result of the completed task.
To clarify what's happening here, let's first look at a real-life analogy. Every now and then I make an appointment with my garage to have my car serviced. The `package' in this case consists of the details about my car: its make and type determine the kind of actions my garage performs when servicing it. My neighbor also has a car, which also needs to be serviced every now and then. This also results in a `package' for the garage. At the appropriate time my neighbor and I take our cars to the garage (i.e., the packages are passed to another thread). The garage services the cars (i.e., calls the functions stored in the packaged_tasks [note that the tasks differ, depending on the types of the cars]), and performs some actions that are associated with it (e.g., registering that my or my neighbor's car has been serviced, or ordering replacement parts). In the meantime my neighbor and I go about our own business (the program continues while a separate thread runs as well). But by the end of the day we'd like to use our cars again (i.e., get the results associated with the packaged_task). A common result in this example is the garage's bill, which we have to pay (the program obtains the packaged_task's results).
Here is a little C++ program illustrating the use of a packaged_task (assuming the required headers and using namespace std have been specified):
     1: mutex carDetailsMutex;
     2: condition_variable condition;
     3: string carDetails;
     4: packaged_task<size_t (std::string const &)> serviceTask;
     5: 
     6: size_t volkswagen(string const &type)
     7: {
     8:     cout << "performing maintenance by the book for a " << type << '\n';
     9:     return type.size() * 75;        // the size of the bill
    10: }
    11: 
    12: size_t peugeot(string const &type)
    13: {
    14:     cout << "performing quick and dirty maintenance for a " << type << '\n';
    15:     return type.size() * 50;        // the size of the bill
    16: }
    17: 
    18: void garage()
    19: {
    20:     while (true)
    21:     {
    22:         unique_lock<mutex> lk(carDetailsMutex);
    23:         while (carDetails.empty())
    24:             condition.wait(lk);
    25: 
    26:         cout << "servicing a " << carDetails << '\n';
    27:         serviceTask(carDetails);
    28:         carDetails.clear();
    29:     }
    30: }
    31: 
    32: int main()
    33: {
    34:     thread(garage).detach();
    35: 
    36:     while (true)
    37:     {
    38:         string car;
    39:         if (not getline(cin, car) || car.empty())
    40:             break;
    41:         {
    42:             lock_guard<mutex> lk(carDetailsMutex);
    43:             carDetails = car;
    44:         }
    45:         serviceTask = packaged_task<size_t (string const &)>(
    46:                             car[0] == 'v' ? volkswagen : peugeot
    47:                       );
    48:         auto bill = serviceTask.get_future();
    49:         condition.notify_one();
    50:         cout << "Bill for servicing a " << car <<
    51:                 ": EUR " << bill.get() << '\n';
    52:     }
    53: }

At line 4 the program defines a packaged_task: serviceTask is initialized with a function (or functor) expecting a string, returning a size_t. The functions volkswagen and peugeot represent the tasks to perform when cars of the provided types come in for service; presumably they return the bill. The function void garage defines the actions performed by the garage when cars come in for service. These actions are performed by a separate detached thread, starting in line 34. In a continuous loop it waits until it obtains a lock on the carDetailsMutex and carDetails is no longer empty. Then, at line 27, it passes carDetails to the packaged_task `serviceTask'. By itself this is not identical to calling the packaged_task's function, but eventually its function will be called.
At this point the packaged_task receives its function's arguments, which it eventually forwards to its configured function. Finally, at line 28 it clears carDetails, thus preparing itself for the next request.

In main a thread running garage is started (line 34). Then the name of the next car to service is read; depending on its first character the servicing function is selected (volkswagen or peugeot), and the packaged_task, provided with the right servicing function, is constructed next (line 45). Then the task's results, available through its future, are retrieved (line 48). Although at this point the future might not be ready, the future object itself is, and it is simply returned as the bill. The garage is subsequently notified (line 49), and the bill is obtained through bill.get() in line 51. If, by this time, the car is still being serviced, the bill isn't ready yet, and bill.get() blocks until it is, and the bill for servicing a car is shown.

Having seen an example of the use of a packaged_task, let's have a look at its interface. Note that the class packaged_task is a class template: its template type parameter specifies the prototype of a function or function object implementing the task performed by the packaged_task object.

Constructors and destructor:
packaged_task() noexcept: a default packaged_task object is constructed which is not associated with a function or shared state;

explicit packaged_task<ReturnType(Args...)> task(fun): a packaged_task is constructed for a function or functor fun expecting arguments of types Args..., and returning a value of type ReturnType. The packaged_task class template specifies ReturnType (Args...) as its template type parameter. The constructed object contains a shared state, and a (move constructed) copy of fun. Optionally an Allocator may be specified as second template type parameter, in which case the first two arguments are std::allocator_arg_t, Allocator const &alloc. The type std::allocator_arg_t is a type introduced to disambiguate constructor selections, and can simply be specified as std::allocator_arg_t(). This constructor may throw a std::bad_alloc exception or exceptions thrown by fun's copy or move constructors.
packaged_task(packaged_task &&tmp) noexcept: the shared state and stored task of tmp are transferred to the newly constructed object, removing the shared state from tmp;

~packaged_task(): the destructor abandons the object's shared state (if any) and destroys the stored task.

Member functions:
future<ReturnType> get_future(): a std::future object is returned holding the results of the separately executed thread. When get_future is incorrectly called a future_error exception is thrown, containing one of the following values: future_already_retrieved if get_future was already called on a packaged_task object containing the same shared state as the current object; no_state if the current object has no shared state. Also futures that share the object's shared state may access the result returned by the object's task.

void make_ready_at_thread_exit(Args... args): the stored task is called as with void operator()(Args... args) (see below), but the shared state is only made ready when the current thread exits, once all objects of thread storage duration associated with the current thread have been destroyed.

packaged_task &operator=(packaged_task &&tmp): the shared states of the current object and tmp are swapped;

void operator()(Args... args): the args arguments are forwarded to the current object's stored task. When the stored task returns, its return value is stored in the current object's shared state. Otherwise any exception thrown by the task is stored in the object's shared state. Following this the object's shared state is made ready, and any threads blocked in a function waiting for the object's shared state to become ready are unblocked. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state. A (shared_)future object provides access to the packaged_task's results.

void reset(): the object's shared state is abandoned, and a new shared state is constructed as if executing packaged_task(std::move(funct)), where funct is the object's stored task.
This member may throw the following exceptions: bad_alloc if memory for the new shared state could not be allocated; future_error with a no_state error condition if the current object contains no shared state.

void swap(packaged_task &other) noexcept: the shared states and stored tasks of the current object and other are exchanged;

bool valid() const noexcept: returns true if the current object contains a shared state, otherwise false is returned.

The following non-member (free) function operating on packaged_task objects is available:
void swap(packaged_task<ReturnType(Args...)> &lhs, packaged_task<ReturnType(Args...)> &rhs) noexcept: calls lhs.swap(rhs).

In addition to std::packaged_task and std::async the class template std::promise can be used to obtain the results from a separate thread. Before using the class template promise the <future> header file must be included.
A promise is used to obtain the results from another thread without further synchronization requirements. Consider the following program:
    void compute(int *ret)
    {
        *ret = 9;
    }

    int main()
    {
        int ret = 0;

        std::thread(compute, &ret).detach();
        cout << ret << '\n';
    }

Chances are that this program shows the value 0: the cout statement has already been executed before the detached thread has had a chance to complete its work. In this example that problem can easily be solved by using a non-detached thread, and using the thread's join member, but when multiple threads are used that requires named threads and as many join calls. Instead, using a promise might be preferred:

     1: void compute(promise<int> &ref)
     2: {
     3:     ref.set_value(9);
     4: }
     5: 
     6: int main()
     7: {
     8:     std::promise<int> prom;
     9:     std::thread(compute, ref(prom)).detach();
    10: 
    11:     cout << prom.get_future().get() << '\n';
    12: }

This example also uses a detached thread, but its results are kept for future reference in a promise object, instead of directly being assigned to a final destination variable. The promise object contains a future object holding the computed value. The future's get member blocks until the future has been made ready, at which point the result becomes available. By then the detached thread may or may not yet have completed. If it already completed its work then get immediately returns, otherwise there will be a slight delay.

Promises are useful when implementing a multi-threaded version of some algorithm without having to use additional synchronization statements. As an example consider matrix multiplications. Each element of the resulting product matrix is computed as the inner product of two vectors: the inner product of a row of the left-hand matrix operand and a column of the right-hand matrix operand becomes element [row][column] of the resulting matrix. Since each element of the resulting matrix can be computed independently from the other elements, a multi-threaded implementation is well possible. In the following example the function innerProduct (lines 4..11) leaves its result in a promise object:
     1: int m1[2][2] = {{1, 2}, {3, 4}};
     2: int m2[2][2] = {{3, 4}, {5, 6}};
     3: 
     4: void innerProduct(promise<int> &ref, int row, int col)
     5: {
     6:     int sum = 0;
     7:     for (int idx = 0; idx != 2; ++idx)
     8:         sum += m1[row][idx] * m2[idx][col];
     9: 
    10:     ref.set_value(sum);
    11: }
    12: 
    13: int main()
    14: {
    15:     promise<int> result[2][2];
    16: 
    17:     for (int row = 0; row != 2; ++row)
    18:     {
    19:         for (int col = 0; col != 2; ++col)
    20:             thread(innerProduct, ref(result[row][col]), row, col).detach();
    21:     }
    22: 
    23:     for (int row = 0; row != 2; ++row)
    24:     {
    25:         for (int col = 0; col != 2; ++col)
    26:             cout << setw(3) << result[row][col].get_future().get();
    27:         cout << '\n';
    28:     }
    29: }

Each inner product is computed by a separate (anonymous and detached) thread (lines 17..21), which starts as soon as the run-time system allows it to start. By the time the threads have finished, the resulting inner products can be retrieved from the promises' futures. Since the futures' get members block until their results are actually available, the resulting matrix can simply be displayed by calling those members in sequence (lines 23..28).

So, a promise allows us to use a thread to compute a value (or exception, see below), which value may then be collected by another thread at some future point in time. The promise remains available, and as a consequence further synchronization of the threads and the program starting the threads is not necessary. When the promise object contains an exception, rather than a value, its future's get member rethrows the stored exception.
Here is the class promise's interface. Note that the class promise is a class template: its template type parameter ReturnType specifies the template type parameter of the std::future that can be retrieved from it.
Constructors and destructor:
promise(): a default promise object is constructed containing a shared state. The shared state may be returned by the member get_future (see below), but that future has not yet been made ready;

promise(promise &&tmp) noexcept: a promise object is constructed, transferring the ownership of tmp's shared state to the newly constructed object. After the object has been constructed, tmp no longer contains a shared state;

~promise(): the destructor abandons the object's shared state (if any) and destroys the promise object.

Member functions:
std::future<ReturnType> get_future(): a std::future object sharing the current object's shared state is returned. A future_error exception is thrown upon error, containing future_already_retrieved if get_future was already called on an object containing the same shared state as the current object; no_state if the current object has no shared state. Also futures that share the object's shared state may access the result returned by the object's task;

promise &operator=(promise &&rhs) noexcept: the shared states of the current object and rhs are swapped;

void promise<void>::set_value(): see the last set_value member's description;

void set_value(ReturnType &&value): see the last set_value member's description;

void set_value(ReturnType const &value):
void set_value(ReturnType &value): the value (moved or copied, depending on the overload that's used to pass value) is atomically stored in the shared state, which is then also made ready. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state. In addition any exceptions thrown by value's move or copy constructor may be thrown;

void set_exception(std::exception_ptr obj): exception_ptr obj (cf. section 10.9.4) is atomically stored in the shared state, making that state ready. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state;

void set_exception_at_thread_exit(exception_ptr ptr): ptr is stored in the shared state without immediately making that state ready. The state becomes ready when the current thread exits, once all objects of thread storage duration which are associated with the ending thread have been destroyed.
A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state;

void set_value_at_thread_exit(): see the last set_value_at_thread_exit member's description;

void set_value_at_thread_exit(ReturnType &&value): see the last set_value_at_thread_exit member's description;

void set_value_at_thread_exit(ReturnType const &value): see the last set_value_at_thread_exit member's description;

void set_value_at_thread_exit(ReturnType &value): stores value in the shared state without immediately making that state ready. The state becomes ready when the current thread exits, once all objects of thread storage duration which are associated with the ending thread have been destroyed. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state;

void swap(promise &other) noexcept: the shared states of the current object and other are exchanged.

The following non-member (free) function operating on promise objects is available:
void swap(promise<ReturnType> &lhs, promise<ReturnType> &rhs) noexcept: calls lhs.swap(rhs).

The final example of this chapter uses packaged_tasks. Like the multi-threaded quicksort example a worker pool is used. However, in this example the workers in fact do not know what their task is. In the current example the tasks happen to be identical, but different tasks might as well have been used, without having to update the workers.
The program uses a class Task containing a command-specification (d_command) and a task specification (d_task) (cf. Figure 27). The sources of the program are found in the yo/threading/examples/multicompile directory of the C++ Annotations.

In this program main starts by firing up its workforce in a series of threads. Following this, the compilation jobs are prepared and pushed on a task-queue by jobs, from which they're retrieved by the workers. Once the compilations have been completed (i.e., after the worker threads have joined the main thread), the results of the compilation jobs are handled by results:
    int main()
    {
        workforce();                    // start the worker threads

        jobs();                         // prepare the jobs: push all tasks
                                        // on the taskQ

        for (thread &thr: g_thread)     // wait for the workers to end
            thr.join();

        results();                      // show the results
    }

The jobs function receives the names of the files to compile from the nextCommand function, which ignores empty lines and returns non-empty lines. Eventually nextCommand returns an empty line once all lines of the standard input stream have been read:
    string nextCommand()
    {
        string ret;
        while (true)
        {
            if (not getline(cin, ret))  // no more lines
                break;

            if (not ret.empty())        // ready once there's line content
                break;
        }
        return ret;
    }

With non-empty lines jobs waits for an available worker using (line 12) the g_dispatcher semaphore. Initialized to the size of the work force, it is reduced by an active worker, and incremented by workers who have completed their tasks. If a compilation fails, then g_done is set to true and no additional compilations are performed (lines 14, 15). While jobs receives the names of the files to compile, workers may detect compilation errors. If so, the workers set the variable g_done to true. Once the jobs function's while loop ends the workers are notified once again (line 24), and since there's no task to perform anymore, they end their threads:
     1: void jobs()
     2: {
     3:     while (true)
     4:     {
     5:         string line = nextCommand();
     6:         if (line.empty())           // no command? jobs() done.
     7:         {
     8:             g_done = true;
     9:             break;
    10:         }
    11: 
    12:         g_dispatcher.wait();        // wait for an available worker
    13: 
    14:         if (g_done.load())          // if a worker found an error
    15:             break;                  // then quit anyway
    16: 
    17:         newTask(line);              // push a new task (and its
    18:                                     // results)
    19: 
    20:         g_worker.notify_all();      // inform the workers: job is
    21:                                     // available
    22:     }
    23: 
    24:     g_worker.notify_all();          // end the workers at an empty Q
    25: }

The function newTask prepares the program for the next task. First a Task object is constructed. Task contains the name of the file to compile, and a packaged_task. It encapsulates all activities that are associated with a packaged_task. Here is its (in-class) definition:
     1: using PackagedTask = packaged_task<Result (string const &fname)>;
     2: 
     3: class Task
     4: {
     5:     string d_command;
     6:     PackagedTask d_task;
     7: 
     8:     public:
     9:         Task() = default;
    10: 
    11:         Task(string const &command, PackagedTask &&tmp)
    12:         :
    13:             d_command(command),
    14:             d_task(move(tmp))
    15:         {}
    16: 
    17:         void operator()()
    18:         {
    19:             d_task(d_command);
    20:         }
    21: 
    22:         shared_future<Result> result()
    23:         {
    24:             return d_task.get_future().share();
    25:         }
    26: };

Note (lines 22-25) that result returns a shared_future. Since the dispatcher runs in a different thread than the one processing the results, the futures created by the dispatcher must be shared with the futures required by the function processing the results. Hence the shared_futures returned by Task::result.
Once a Task object has been constructed its shared_future object is pushed on the result queue. Although the actual results aren't available by this time, the results function is eventually called to process the results that were pushed on the result queue. Additionally, the Task itself is pushed on a task queue, and it will be retrieved by a worker:
class Task
{
    string d_command;
    PackagedTask d_task;

    public:
        Task() = default;
        Task(string const &command, PackagedTask &&tmp);
        void operator()();
        shared_future<Result> result();
};

void pushResultQ(shared_future<Result> const &sharedResult)
{
    lock_guard<mutex> lk(g_resultQMutex);
    g_resultQ.push(sharedResult);
}
The workers have a simple task: wait for the next task, retrieve it from the task queue, and complete that task. Whatever happens inside the tasks themselves is of no concern to the worker. When notified (normally by the jobs function) that there's a task waiting, the worker executes that task. However, once all tasks have been pushed on the task queue, jobs once again notifies the workers. In that case the task queue is empty, and the worker function ends. But just before ending it notifies its fellow workers, which in turn end, thus ending all worker threads, allowing them to join the main thread:
void worker()
{
    Task task;
    while (true)
    {
        g_worker.wait();            // wait for an available task

        if (g_taskQ.empty())        // no task? then done
            break;

        g_taskQ.popFront(task);
        g_dispatcher.notify_all();  // notify the dispatcher that
                                    // another task can be pushed
        task();
    }
    g_worker.notify_all();          // no more tasks: notify the other
}                                   // workers.
This completes the description of how tasks are handled.
The tasks themselves are now described. In the current program C++ source files are compiled. The compilation command is passed to the constructor of a CmdFork object, which starts the compiler as a child process. The result of the compilation is retrieved via its childExit member (returning the compiler's exit code) and childOutput member (returning any textual output produced by the compiler). If compilation fails, the exit value won't be zero. In this case no further compilation tasks will be issued as g_done is set to true (lines 11 and 12; the implementation of the class CmdFork is available from the C++ Annotations' yo/threading/examples/cmdfork directory). Here is the function compile:
 1: Result compile(string const &line)
 2: {
 3:     string command("/usr/bin/g++ -Wall -c " + line);
 4: 
 5:     CmdFork cmdFork(command);
 6:     cmdFork.fork();
 7: 
 8:     Result ret {cmdFork.childExit() == 0,
 9:                 line + "\n" + cmdFork.childOutput()};
10: 
11:     if (not ret.ok)
12:         g_done = true;
13: 
14:     return ret;
15: }
The results function continues for as long as newResult indicates that results are available. By design the program shows all available successfully completed compilations, and (if several workers encountered compilation errors) only the compiler's output of the first compilation error is displayed. "All available successfully completed compilations" means that, in case of a compilation error, the source files that were successfully compiled by the currently active work force are listed, but the remaining source files are not processed anymore:
void results()
{
    Result result;
    string errorDisplay;

    while (newResult(result))
    {
        if (result.ok)
            cerr << result.display;
        else if (errorDisplay.empty())
            errorDisplay = result.display;  // remember the output of
    }                                       // the first compilation error

    if (not errorDisplay.empty())           // which is displayed at the end
        cerr << errorDisplay;
}
The function newResult controls results' while-loop. It returns true as long as the result queue isn't empty, in which case the queue's front element is stored at the external Result object, and the queue's front element is removed from the queue:
bool newResult(Result &result)
{
    if (g_resultQ.empty())
        return false;

    result = g_resultQ.front().get();
    g_resultQ.pop();

    return true;
}
Transactional memory is used to simplify shared data access in multi-threaded programs. The benefits of transactional memory are best illustrated by a small program. Consider a situation where threads need to write information to a file. A plain example of such a program would be:
void fun(int value)
{
    for (size_t rept = 0; rept != 10; ++rept)
    {
        this_thread::sleep_for(chrono::seconds(1));
        cout << "fun " << value << '\n';
    }
}

int main()
{
    thread thr{ fun, 1 };
    fun(2);
    thr.join();
}
When this program is run the fun 1 and fun 2 messages are intermixed. To prevent this we traditionally define a mutex, lock it, write the message, and release the lock:
void fun(int value)
{
    static mutex guard;

    for (size_t rept = 0; rept != 10; ++rept)
    {
        this_thread::sleep_for(chrono::seconds(1));

        guard.lock();
        cout << "fun " << value << '\n';
        guard.unlock();
    }
};
Transactional memory handles the locking for us. Transactional memory is used when statements are embedded in a synchronized block. The function fun, using transactional memory, looks like this:
void fun(int value)
{
    for (size_t rept = 0; rept != 10; ++rept)
    {
        this_thread::sleep_for(chrono::seconds(1));

        synchronized
        {
            cout << "fun " << value << '\n';
        }
    }
};
To compile source files using transactional memory the g++ compiler option -fgnu-tm must be specified.
The code inside a synchronized block is executed as a single transaction, as if the block were protected by a mutex. But different from using mutexes, transactional memory is implemented in software rather than using hardware facilities.
Considering how easy it is to use transactional memory compared to the mutex-based locking mechanism, transactional memory appears too good to be true. And in a sense it is. When encountering a synchronized block the thread unconditionally executes the block's statements. At the same time it keeps a detailed log of all its actions. Once the statements have been completed the thread checks whether another thread started executing the block just before it did. If so, it reverses its actions, using the synchronized block's log. The implication of this should be clear: there's at least the overhead of maintaining the log, and if another thread started executing the synchronized block before the current thread there's the additional overhead of reverting the current thread's actions and trying again.
The advantages of transactional memory should also be clear: the programmer is no longer responsible for correctly controlling access to shared memory; the risk of encountering deadlocks has disappeared, as has all administrative overhead of defining mutexes and of locking and unlocking them. Especially for inherently slow operations like writing to files transactional memory can greatly simplify parts of your code. Consider a std::stack. Its top element can be inspected, but its pop member does not return the topmost element. To retrieve the top element and then maybe remove it traditionally requires a mutex lock surrounding determining the stack's size: if the stack is empty, release the lock and wait; if not, retrieve its topmost element, followed by removing it from the stack. Using transactional memory we get something as simple as:
bool retrieve(stack<Item> &itemStack, Item &item)
{
    synchronized
    {
        if (itemStack.empty())
            return false;

        item = std::move(itemStack.top());
        itemStack.pop();
        return true;
    }
}
Variants of synchronized are:
atomic_noexcept: the statements inside its compound statement may not throw exceptions. If they do, std::abort is called. If the earlier fun function specifies atomic_noexcept instead of synchronized the compiler generates an error about the use of the insertion operator, from which an exception may be thrown.

atomic_cancel: not yet supported by g++. If an exception other than (std::) bad_alloc, bad_array_new_length, bad_cast, bad_typeid, bad_exception, exception, or tx_exception<Type> is thrown, std::abort is called. If an acceptable exception is thrown, then the statements executed so far are undone.

atomic_commit: if an exception is thrown from its compound statement all statements executed thus far are kept (i.e., not undone).

Recently the class std::osyncstream was added to the language, allowing threads of multi-threaded programs to write information block-wise to a common stream, without having to define separate streams receiving the thread-specific information and eventually copying those streams to the destination stream. Before using osyncstream objects the <syncstream> header file must be included.
The osyncstream class publicly inherits from std::ostream, initializing its ostream base class with a std::syncbuf stream buffer (described in the next section), which performs the actual synchronization.
Information written to osyncstream objects can explicitly be copied to a destination ostream, or is automatically copied to the destination ostream by the osyncstream's destructor. Each thread may construct its own osyncstream object, handling the block-wise copying of the information it receives to the destination stream.
Constructors
osyncstream{ostream &out} constructs an osyncstream object eventually writing the information it receives to out. Below, out is called the destination stream;

osyncstream{osyncstream &&tmp}: the move constructor is available.

The default- and copy-constructors are not available.
Member functions
In addition to the members inherited from std::ostream (like the rdbuf member returning a pointer to the object's syncbuf (described in the next section)) the class osyncstream offers these members:
get_wrapped, returning a pointer to the destination stream's stream buffer;

emit, copying the received information as a block to the destination stream.

The following program illustrates how osyncstream objects can be used.
 1: #include <iostream>
 2: #include <syncstream>
 3: #include <string>
 4: #include <thread>
 5: 
 6: using namespace std;
 7: 
 8: void fun(char const *label, size_t count)
 9: {
10:     osyncstream out(cout);
11: 
12:     for (size_t idx = 0; idx != count; ++idx)
13:     {
14:         this_thread::sleep_for(1s);
15:         out << label << ": " << idx << " running...\n";
16:     }
17:     out << label << " ends\n";
18: }
19: 
20: int main(int argc, char **argv)
21: {
22:     cout << "the 1st arg specifies the #iterators "
23:             "using 3 iterations by default\n";
24: 
25:     size_t count = argc > 1 ? stoul(argv[1]) : 3;
26: 
27:     thread thr1{ fun, "first", count };
28:     thread thr2{ fun, "second", count };
29: 
30:     thr1.join();
31:     thr2.join();
32: }
The function fun (line 8) is called by main from two threads (lines 27, 28). It defines a local osyncstream out and, using short one-second pauses, writes some lines of text to out (lines 14, 15). At the end of fun the local out content is written as a block to cout (line 18). Writing out's content to cout can also explicitly be requested by calling out.emit().

An osyncstream stream in fact is only a wrapper around an ostream, using a syncbuf as its stream buffer. The std::syncbuf handles the actual synchronization. In order to use the syncbuf stream buffer the <syncstream> header file must be included. A syncbuf stream buffer collects the information it receives from an ostream in an internal buffer; its destructor and emit member flush its buffer as a block to its destination stream.
Constructors
syncbuf(), the default constructor, constructs a syncbuf object with its emit-on-sync policy (see below) set to false;

explicit syncbuf(streambuf *destbuf) constructs a std::syncbuf with its emit-on-sync policy set to false, using destbuf as the destination stream's streambuf;

syncbuf(syncbuf &&rhs), the move constructor, moves the content of rhs to the constructed syncbuf.

Member functions
In addition to the members inherited from std::streambuf the class syncbuf offers these members:
get_wrapped, returning a pointer to the destination stream's stream buffer;

emit, copying the received information as a block to the destination stream;

void set_emit_on_sync(bool how) changes the current emit-on-sync policy. By default (how == false) flush requests are recorded, but the internal buffer is only transferred to the destination's stream buffer when emit is called or the syncbuf object is destroyed. When how == true each flush request immediately flushes the internal buffer to the destination's stream buffer.

In the multi-compilation program developed earlier the function results processes the queued results by showing the names of the successfully compiled source files, and (if a compilation failed) the name and error messages of the first source file whose compilation failed. The results-queue was used to store the results in a retrievable data structure, using a mutex to ensure that the workers cannot simultaneously push results on the results-queue.
Using osyncstream objects the results-queue and its mutexed protection scheme are no longer required (the sources of the modified program are available in the C++ Annotations' directory yo/threading/examples/osyncmulticompile).
Instead of using a results-queue the program uses a single destination stream fstream g_out{ "/tmp/out", ios::trunc | ios::in | ios::out }, and its compile function defines a local osyncstream object, ensuring that its output is sent as a block to g_out:
 1: void compile(string const &line)
 2: {
 3:     if (g_done.load())
 4:         return;
 5: 
 6:     string command("/usr/bin/g++ -Wall -c " + line);
 7: 
 8:     CmdFork cmdFork(command);
 9:     cmdFork.fork();
10: 
11:     int exitValue = cmdFork.childExit();
12: 
13:     osyncstream out(g_out);
14:     out << exitValue << ' ' << line << '\n';
15: 
16:     if (exitValue != 0)
17:     {
18:         out << cmdFork.childOutput() << '\n' << g_marker << '\n';
19:         g_done = true;
20:     }
21:     // out.emit();     // handled by out's destructor
22: }
At line 13 a local osyncstream out object is defined, and the results of the compilation are written to out at lines 14 and 18. When compile ends, out's destructor writes its content as a block to g_out. At compilation errors the compiler's messages are also written to out, terminated by a marker, used by results (see below) to recognize the end of the error messages.

Since the results of the compilation are no longer transferred to another thread, there's no need for defining a shared_future<Result>. In fact, since compile handles the results of a compilation itself, its return type is void and the packaged_task itself doesn't return anything either. Therefore the class Task doesn't need a result() member anymore. Instead, its function-call operator, having completed its task, calls the task's get_future so exceptions that might have been generated by the packaged_task are properly retrieved. Here's the simplified class Task:
using PackagedTask = packaged_task<void (string const &fname)>;

class Task
{
    string d_command;
    PackagedTask d_task;

    public:
        Task() = default;

        Task(string const &command, PackagedTask &&tmp)
        :
            d_command(command),
            d_task(move(tmp))
        {}

        void operator()()
        {
            d_task(d_command);
            d_task.get_future();    // handles potential exceptions
        }
};
At the end of main the function results is called:
 1: void results()
 2: {
 3:     g_out.seekg(0);
 4: 
 5:     int value;
 6:     string line;
 7:     string errorDisplay;
 8: 
 9:     while (g_out >> value >> line)      // process g_out's content
10:     {
11:         g_out.ignore(100, '\n');
12: 
13:         if (value == 0)                 // no error: show the source file
14:         {
15:             cerr << line << '\n';
16:             continue;
17:         }
18:                                         // at compilation errors:
19:         if (not errorDisplay.empty())   // after the 1st error: skip
20:         {
21:             do
22:             {
23:                 getline(g_out, line);
24:             }
25:             while (line != g_marker);
26: 
27:             continue;
28:         }
29:                                         // first compilation error:
30:         errorDisplay = line + '\n';     // keep the name of the source
31:         while (true)                    // and its error messages
32:         {
33:             getline(g_out, line);
34: 
35:             if (line == g_marker)
36:                 break;
37: 
38:             errorDisplay += line + '\n';
39:         }
40:     }
41: 
42:     cerr << errorDisplay;               // eventually insert the error-info
43:  }                                      // (if any)
The content of g_out is processed by the while condition at line 9. The name of the first source file failing to compile, together with its error messages, is stored in errorDisplay (lines 30 thru 39). Once g_out has completely been read errorDisplay is displayed (line 42), which is either empty or contains the error messages of the first encountered compilation failure.