Multi-threading is an extensive and complex subject, and many good reference texts on the subject exist. The C++ multi-threading implementation is built upon the facilities offered by the pthreads library (cf. Nichols, B., et al.'s Pthreads Programming, O'Reilly). However, in line with C++'s current-day philosophy the multi-threading implementation offered by the language provides a high-level interface to multi-threading, and using the raw pthread building blocks is hardly ever necessary (cf. Williams, A. (2019): C++ Concurrency in Action).
This chapter covers the facilities for multi-threading as supported by C++. Although the coverage aims at providing the tools and examples allowing you to create your own multi-threaded programs, coverage necessarily is far from complete. The topic of multi-threading is too extensive for that. The mentioned reference texts provide a good starting point for any further study of multi-threading.
A thread of execution (commonly abbreviated to a thread) is a single flow of control within a program. It differs from a separately executed program, as created by the fork(1) system call, in the sense that threads all run inside one program, while fork(1) creates independent copies of a running program. Multi-threading means that multiple tasks are being executed in parallel inside one program, and no assumptions can be made as to which thread is running first or last, or at what moment in time. Especially when the number of threads does not exceed the number of cores, each thread may be active at the same time. If the number of threads exceeds the number of cores, the operating system resorts to task switching, offering each thread time slices in which it can perform its tasks. Task switching takes time, and the law of diminishing returns applies here as well: if the number of threads greatly exceeds the number of available cores (also called overpopulation), then the overhead incurred may exceed the benefit of being able to run multiple tasks in parallel.
Since all threads are running inside one single program, all threads share the program's data and code. When the same data are accessed by multiple threads, and at least one of the threads is modifying these data, access must be synchronized: threads must be prevented from reading data while these data are being modified by other threads, and multiple threads must be prevented from modifying the same data at the same time.
So how do we run a multi-threaded program in C++? Let's look at hello world, the multi-threaded way:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: void hello()
     5: {
     6:     std::cout << "hello world!\n";
     7: }
     8: 
     9: int main()
    10: {
    11:     std::thread hi(hello);
    12:     hi.join();
    13: }

- in line 2 the header file thread is included, informing the compiler about the existence of the class std::thread (cf. section 20.1.2);
- in line 11 a std::thread hi object is created. It is provided with the name of a function (hello) which will be called in a separate thread. Actually, the second thread, running hello, is immediately started when a std::thread is defined this way;
- the main function itself also represents a thread: the program's first thread. It should wait until the second thread has finished. This is realized in line 12, where hi.join() waits until the thread hi has finished its job. Since there are no further statements in main, the program itself ends immediately thereafter;
- the function hello itself, defined in lines 4 through 7, is trivial: it simply inserts the text `hello world' into cout, and terminates, thus ending the second thread.

C++'s main tool for creating multi-threaded programs is the class std::thread, and some examples of its use have already been shown at the beginning of this chapter.
Characteristics of individual threads can be queried from the std::this_thread namespace. Also, std::this_thread offers some control over the behavior of an individual thread.
To synchronize access to shared data C++ offers mutexes (implemented by the class std::mutex) and condition variables (implemented by the class std::condition_variable).
Members of these classes may throw system_error objects (cf. section 10.9) when encountering a low-level error condition.
The namespace std::this_thread contains functions that are uniquely associated with the currently running thread. Before using the namespace this_thread the <thread> header file must be included.
Inside the std::this_thread namespace several free functions are defined, providing information about the current thread or that can be used to control its behavior:
- thread::id this_thread::get_id() noexcept:
  returns an object of the type thread::id that identifies the currently active thread of execution. For an active thread the returned id is unique in the sense that it maps 1:1 to the currently active thread, and is not returned by any other thread. If the thread is currently not running then the default thread::id object is returned by the std::thread object's get_id member.

- void yield() noexcept:
  when a thread calls this_thread::yield() the current thread is briefly suspended, allowing other (waiting) threads to start.

- void sleep_for(chrono::duration<Rep, Period> const &relTime) noexcept:
  when a thread calls this_thread::sleep_for(...) it is suspended for the amount of time that's specified in its argument. E.g.,

      std::this_thread::sleep_for(std::chrono::seconds(5));
- void sleep_until(chrono::time_point<Clock, Duration> const &absTime) noexcept:
  suspends the current thread until the specified point in time absTime has been reached, returning immediately if absTime is in the past. The next example has the same effect as the previous example:

      // assume using namespace std
      this_thread::sleep_until(chrono::system_clock().now() + chrono::seconds(5));
Conversely, the sleep_until call in the next example immediately returns:
    this_thread::sleep_until(chrono::system_clock().now() - chrono::seconds(5));
Threads are implemented by the class std::thread. Each object of this class handles a separate thread. Before using thread objects the <thread> header file must be included.
Thread objects can be constructed in various ways:
- thread() noexcept:
  the default constructor constructs a thread object. As it receives no function to execute, it does not start a separate thread of execution. It is used, e.g., as a data member of a class, allowing class objects to start a separate thread at some later point in time;

- thread(thread &&tmp) noexcept:
  the move constructor takes over the thread controlled by tmp, while tmp, if it runs a thread, loses control over its thread. Following this, tmp is in its default state, and the newly created thread is responsible for calling, e.g., join.

- explicit thread(Fun &&fun, Args &&...args):
  this constructor starts a separate thread executing fun, which is called with the arguments that were specified as the thread's constructor's arguments immediately following its first (function) argument. Additional arguments are passed with their proper types and values to fun. Following the thread object's construction, a separately running thread of execution is started.
  The notation Args &&...args indicates that any additional arguments are passed as is to the function. The types of the arguments that are passed to the thread constructor and that are expected by the called function must match: values must be values, references must be references, r-value references must be r-value references (or move construction must be supported). The following example illustrates this requirement:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: using namespace std;
     5: 
     6: struct NoMove
     7: {
     8:     NoMove() = default;
     9:     NoMove(NoMove &&tmp) = delete;
    10: };
    11: 
    12: struct MoveOK
    13: {
    14:     int d_value = 10;
    15: 
    16:     MoveOK() = default;
    17:     MoveOK(MoveOK const &) = default;
    18: 
    19:     MoveOK(MoveOK &&tmp)
    20:     {
    21:         d_value = 0;
    22:         cout << "MoveOK move cons.\n";
    23:     }
    24: };
    25: 
    26: void valueArg(int value)
    27: {}
    28: void refArg(int &ref)
    29: {}
    30: void r_refArg(int &&tmp)
    31: {
    32:     tmp = 100;
    33: }
    34: void r_refNoMove(NoMove &&tmp)
    35: {}
    36: void r_refMoveOK(MoveOK &&tmp)
    37: {}
    38: 
    39: int main()
    40: {
    41:     int value = 0;
    42: 
    43:     std::thread(valueArg, value).join();
    44:     std::thread(refArg, ref(value)).join();
    45:     std::thread(r_refArg, move(value)).join();
    46: 
    47:     // std::thread(refArg, value);
    48: 
    49:     std::thread(r_refArg, value).join();
    50:     cout << "value after r_refArg: " << value << '\n';
    51: 
    52:     // std::thread(r_refNoMove, NoMove());
    53: 
    54:     NoMove noMove;
    55:     // std::thread(r_refNoMove, noMove).join();
    56: 
    57:     MoveOK moveOK;
    58:     std::thread(r_refMoveOK, moveOK).join();
    59:     cout << moveOK.d_value << '\n';
    60: }

- lines 43 through 45 show how arguments can be passed to std::thread: with the functions running the threads expecting matching argument types.
- line 47 won't compile, as the argument cannot be bound to the int reference expected by refArg. Note that this problem was solved in line 44 by using the std::ref function.
- lines 49 and 58 show that int values and class-types supporting move operations can be passed as values to functions expecting r-value references.
In this case notice that the functions expecting the r-value references do not access the provided arguments (except for the actions performed by their move constructors), but use move construction to create temporary values or objects on which the functions operate.

Lines 52 and 55 won't compile, as the NoMove struct doesn't offer a move constructor.

Member functions can also be executed in separate threads. In that case the address of the member function is passed to the thread constructor, followed by an object (or a reference or pointer to an object) of the member function's class, and optionally followed by the member function's arguments:

    struct Demo
    {
        int d_value = 0;

        void fun(int value)
        {
            d_value = value;
            cout << "fun sets value to " << value << "\n";
        }
    };

    int main()
    {
        Demo demo;
        thread thr{ &Demo::fun, ref(demo), 12 };
        thr.join();
        cout << "demo's value: " << demo.d_value << '\n';   // 12

        thr = thread{ &Demo::fun, &demo, 42 };
        thr.join();
        cout << "demo's value: " << demo.d_value << '\n';   // 42

        thr = thread{ &Demo::fun, demo, 77 };
        thr.join();
        cout << "demo's value: " << demo.d_value << '\n';   // 42: the thread
                                                            // copied demo
    }

Be careful when passing local variables as arguments to thread objects: if the thread continues to run when the function whose local variables are used terminates, then the thread suddenly uses wild pointers or wild references, as the local variables no longer exist. To prevent this from happening (illustrated by the next example) do as follows:
- pass the local variables by value to the thread constructor, or
- call join on the thread object to ensure that the thread has finished within the local variable's lifetime.

     1: #include <iostream>
     2: #include <thread>
     3: #include <string>
     4: #include <chrono>
     5: 
     6: void threadFun(std::string const &text)
     7: {
     8:     for (size_t iter = 1; iter != 6; ++iter)
     9:     {
    10:         std::cout << text << '\n';
    11:         std::this_thread::sleep_for(std::chrono::seconds(1));
    12:     }
    13: }
    14: 
    15: std::thread safeLocal()
    16: {
    17:     std::string text = "hello world";
    18:     return std::thread(threadFun, std::string{ text });
    19: }
    20: 
    21: int main()
    22: {
    23:     std::thread local(safeLocal());
    24:     local.join();
    25:     std::cout << "safeLocal has ended\n";
    26: }

In line 18 be sure not to use std::ref(text) instead of std::string{ text }. If the thread cannot be created a std::system_error exception is thrown.
Since this constructor not only accepts functions but also function objects as its first argument, a local context may be passed to the function object's constructor. Here is an example of a thread receiving a function object using a local context:
    #include <iostream>
    #include <thread>
    #include <array>

    using namespace std;

    class Functor
    {
        array<int, 30> &d_data;
        int d_value;

        public:
            Functor(array<int, 30> &data, int value)
            :
                d_data(data),
                d_value(value)
            {}

            void operator()(ostream &out)
            {
                for (auto &value: d_data)
                {
                    value = d_value++;
                    out << value << ' ';
                }
                out << '\n';
            }
    };

    int main()
    {
        array<int, 30> data;
        Functor functor{ data, 5 };

        thread funThread{ functor, ref(cout) };
        funThread.join();
    }

The class std::thread does not provide a copy constructor.

The following members are available:
- thread &operator=(thread &&tmp) noexcept:
  if the operator's left-hand side operand (lhs) is a joinable thread, then terminate is called. Otherwise, tmp is assigned to the operator's lhs and tmp's state is changed to the thread's default state (i.e., thread()).
- void detach():
  this member can only be called for threads for which joinable (see below) returns true. The thread for which detach is called continues to run. The (e.g., parent) thread calling detach continues immediately beyond the detach-call. After calling object.detach(), `object' no longer represents the (possibly still continuing but now detached) thread of execution. It is the detached thread's implementation's responsibility to release its resources when its execution ends.
  Since detach disconnects a thread from the running program, e.g., main no longer can wait for the thread's completion. As a program ends when main ends, its still running detached threads also stop, and a program may not properly finish all its threads, as demonstrated by the following example:
    #include <thread>
    #include <iostream>
    #include <chrono>

    void fun(size_t count, char const *txt)
    {
        for (; count--; )
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
            std::cout << count << ": " << txt << std::endl;
        }
    }

    int main()
    {
        std::thread first(fun, 5, "hello world");
        first.detach();

        std::thread second(fun, 5, "a second thread");
        second.detach();

        std::this_thread::sleep_for(std::chrono::milliseconds(400));
        std::cout << "leaving" << std::endl;
    }

A detached thread may very well continue to run after the function that launched it has finished. Here, too, you should be very careful not to pass local variables to the detached thread, as their references or pointers will be undefined once the function defining the local variables terminates:
    #include <iostream>
    #include <thread>
    #include <chrono>

    using namespace std;
    using namespace chrono;

    void add(int const &p1, int const &p2)
    {
        this_thread::sleep_for(milliseconds(200));
        cerr << p1 << " + " << p2 << " = " << (p1 + p2) << '\n';
    }

    void run()
    {
        int v1 = 10;
        int v2 = 20;
    //  thread(add, ref(v1), ref(v2)).detach();     // DON'T DO THIS
        thread(add, int(v1), int(v2)).detach();     // this is OK: own copies
    }

    int main()
    {
        run();
        this_thread::sleep_for(seconds(1));
    }

- id get_id() const noexcept:
  if the thread object does not represent a running thread then the default thread::id() is returned. Otherwise, the thread's unique ID (also obtainable from within the thread via this_thread::get_id()) is returned.

- unsigned thread::hardware_concurrency() noexcept:
  this static member returns the number of threads that can run concurrently on the current hardware (usually the number of cores), or 0 if that number cannot be determined.

- void join():
  this member can only be called for threads for which joinable (see below) returns true. If the thread for which join is called hasn't finished yet then the thread calling join will be suspended (also called blocked) until the thread for which join is called has completed. Following its completion the object whose join member was called no longer represents a running thread, and its get_id member will return std::thread::id(). This member was used in several examples shown so far. As noted: when main ends while a joinable thread is still running, terminate is called, aborting the program.
- bool joinable() const noexcept:
  returns the value of the expression object.get_id() != id(), where object is the thread object for which joinable was called.

- native_handle_type native_handle():
  returns the thread's native handle, which can be used with functions like pthread_getschedparam and pthread_setschedparam to get/set the thread's scheduling policy and parameters.

- void swap(thread &other) noexcept:
  the states of the thread object for which swap was called and other are swapped. Note that threads may always be swapped, even when their thread functions are currently being executed.

Things to note:
What looks like the definition of an anonymous thread object, immediately followed by a statement defining and joining another anonymous thread object, doesn't behave as expected. E.g.,

    void doSomething();

    int main()
    {
        thread(doSomething);            // nothing happens??
        thread(doSomething).join();     // doSomething is executed??
    }

This is similar to the situation we encountered in section 7.5: the first statement doesn't define an anonymous thread object at all. It simply defines the thread object doSomething. Consequently, compilation of the second statement fails, as there is no thread(thread &) constructor. When the first statement is omitted, the doSomething function is executed by the second statement. If the second statement is omitted, a default constructed thread object by the name of doSomething is defined.
In a statement like

    thread object(thread(doSomething));
the move constructor is used to transfer control from an anonymous thread executing doSomething to the thread object. Only after object's construction has completed is doSomething started in the separate thread.
A value returned by a function running as a separate thread can be obtained by using a packaged_task and a future (cf., respectively, sections 20.11 and 20.8).

A thread ends when the function executing a thread finishes. When a thread object is destroyed while its thread function is still running, terminate is called, aborting the program. Bad news: the destructors of existing objects aren't called and exceptions that are thrown are left uncaught. This happens in the following program as the thread is still active when main ends:
    #include <iostream>
    #include <thread>

    void hello()
    {
        while (true)
            std::cout << "hello world!\n";
    }

    int main()
    {
        std::thread hi(hello);
    }

There are several ways to solve this problem. One of them is discussed in the next section.

Threads may also use data that are global within each individual thread, but that are not shared between threads. The thread_local keyword provides this intermediate data level. Global variables declared as thread_local are global within each individual thread. Each thread owns a copy of the thread_local variables, and may modify them at will. A thread_local variable in one thread is completely separated from that variable in another thread. Here is an example:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: using namespace std;
     5: 
     6: thread_local int t_value = 100;
     7: 
     8: void modify(char const *label, int newValue)
     9: {
    10:     cout << label << " before: " << t_value << ". Address: " <<
    11:                                                 &t_value << '\n';
    12:     t_value = newValue;
    13:     cout << label << " after: " << t_value << '\n';
    14: }
    15: 
    16: int main()
    17: {
    18:     thread(modify, "first", 50).join();
    19:     thread(modify, "second", 20).join();
    20:     modify("main", 0);
    21: }

- in line 6 the thread_local variable t_value is defined. It is initialized to 100, and that becomes the initial value for each separately running thread;
- in lines 8 through 14 the function modify is defined. It assigns a new value to t_value;
- in lines 18 through 20 three threads (the last one being the program's own main thread) call modify. Each thread first shows its own t_value being 100, and then modifies it without affecting the values of t_value used by other threads.

Note that, although the t_value variables are unique to each thread, identical addresses may be shown for them. Since each thread uses its own stack, these variables may occupy the same relative locations within their respective stacks, giving the illusion that their physical addresses are identical.
Consider a function parent starting a child thread, doing some work itself, and then joining the child thread:

    void childActions();
    void doSomeWork();

    void parent()
    {
        thread child(childActions);
        doSomeWork();
        child.join();
    }

However, maybe doSomeWork can't complete its work, and throws an exception, to be caught outside of parent. This, unfortunately, ends parent, and child.join() is missed. Consequently, the program aborts because of a thread that hasn't been joined.
Clearly, all exceptions must be caught, join must be called, and the exception must be rethrown. But parent cannot use a function try-block, as the thread object is already out of scope once execution reaches the matching catch-clause. So we get:
    void childActions();
    void doSomeWork();

    void parent()
    {
        thread child(childActions);
        try
        {
            doSomeWork();
            child.join();
        }
        catch (...)
        {
            child.join();
            throw;
        }
    }

This is ugly: suddenly the function's code is clobbered with a try-catch clause, as well as some unwelcome code-duplication.
This situation can be avoided using object based programming. Like, e.g., unique pointers, which use their destructors to encapsulate the destruction of dynamically allocated memory, we can use a comparable technique to encapsulate thread joining in an object's destructor.
By defining the thread object inside a class we're sure that by the time our object goes out of scope, even if the childActions function throws an exception, the thread's join member is called. Here are the bare essentials of our JoinGuard class, providing the join-guarantee (using in-line member implementations for brevity):
     1: #include <thread>
     2: 
     3: class JoinGuard
     4: {
     5:     std::thread d_thread;
     6: 
     7:     public:
     8:         JoinGuard(std::thread &&threadObj)
     9:         :
    10:             d_thread(std::move(threadObj))
    11:         {}
    12:         ~JoinGuard()
    13:         {
    14:             if (d_thread.joinable())
    15:                 d_thread.join();
    16:         }
    17: };

- JoinGuard's constructor (line 8) expects a thread object, which is moved, in line 10, to JoinGuard's d_thread data member.
- When the JoinGuard object ceases to exist, its destructor (line 12) makes sure the thread is joined if it's still joinable (lines 14 and 15).

Here is an example showing how JoinGuard could be used:

     1: #include <iostream>
     2: #include "joinguard.h"
     3: 
     4: void childActions();
     5: 
     6: void doSomeWork()
     7: {
     8:     throw std::runtime_error("doSomeWork throws");
     9: }
    10: 
    11: void parent()
    12: {
    13:     JoinGuard{std::thread{childActions}};
    14:     doSomeWork();
    15: }
    16: 
    17: int main()
    18: try
    19: {
    20:     parent();
    21: }
    22: catch (std::exception const &exc)
    23: {
    24:     std::cout << exc.what() << '\n';
    25: }

- in line 4 childActions is declared. Its implementation (not provided here) defines the child thread's actions.
- the main function (lines 17 through 25) provides the function try-block to catch the exception thrown by parent;
- the parent function defines (line 13) an anonymous JoinGuard, receiving an anonymous thread object. Anonymous objects are used, as the parent function doesn't need to access them anymore.
- in line 14 doSomeWork is called, which throws an exception. This ends parent, but just before that JoinGuard's destructor makes sure that the child-thread has been joined.

As an alternative to std::thread the class std::jthread can be used. Before using jthread objects the <thread> header file must be included.
Objects of the class jthread act like thread objects, but when a jthread object is destroyed it automatically joins its thread. Moreover, in some situations jthread threads can directly be ended.
Once a jthread object receiving a function defining the thread's actions has been constructed that function immediately starts as a separate thread. If that function ends by returning a value then that value is ignored. If the function throws an exception the program ends by calling std::terminate. Alternatively, if the function should communicate a return value or an exception to, e.g., the function starting the jthread, a std::promise (cf. section 20.12) can be used, or it can modify variables which are shared with other threads (see also sections 20.2 and 20.5).
The class jthread offers these constructors:
- jthread() noexcept:
  the default constructor constructs a jthread object that doesn't start a thread. It could be used as a data member of a class, allowing class objects to start the jthread at some later point in time;

- explicit jthread(Function &&function, Args &&...args):
  starts a thread executing function. The function receives as its first argument the return value of jthread's member get_stop_token (see below), followed by the args parameters (if present). If function's first argument is not a std::stop_token then function is called merely receiving the args parameter values as its arguments. Arguments are passed to function with their proper types and values (see the example shown below at the description of the jthread member request_stop).

The class jthread supports move construction and move assignment, but does not offer copy construction and copy assignment.

The following members are available and operate like the identically named std::thread members. Refer to section 20.1.2 for their descriptions:
- void detach();
- id get_id() const noexcept;
- unsigned hardware_concurrency() noexcept;
- void join();
- bool joinable() const noexcept;
- native_handle_type native_handle();
- void swap(jthread &other) noexcept.

The following members are specific to jthread, allowing other threads to end the thread started by jthread:
- std::stop_source get_stop_source() noexcept:
  returns the jthread's std::stop_source;

- std::stop_token get_stop_token() const noexcept:
  returns the jthread's std::stop_token;

- bool request_stop() noexcept:
  requests the end of the thread started by the jthread object. The function operates atomically: it can be called from multiple threads without causing race conditions. It returns true if the stop request was successfully issued. It returns false if a stop request has already been issued, which may also happen if request_stop was issued by different threads, and another thread is still in the process of ending jthread's thread.
  When a stop request is issued by request_stop then std::stop_callback functions (see the next section) that were registered for the thread's stop state are synchronously called. If those callback functions throw exceptions then std::terminate is called. Also, any waiting condition variables that are associated with the jthread's stop state end their waiting states.

Here is a short program illustrating request_stop:
     1: #include <iostream>
     2: #include <thread>
     3: #include <chrono>
     4: using namespace std;
     5: 
     6: void fun(std::stop_token stop)
     7: {
     8:     while (not stop.stop_requested())
     9:     {
    10:         cout << "next\n";
    11:         this_thread::sleep_for(1s);
    12:     }
    13: }
    14: 
    15: int main()
    16: {
    17:     jthread thr(fun);
    18: 
    19:     this_thread::sleep_for(3s);
    20: 
    21:     thr.request_stop();
    22: 
    23:     // thr.join() not required.
    24: }

- in line 17 the jthread thread starts, receiving function fun as its argument;
- since fun defines a std::stop_token parameter, jthread passes it the thread's stop token when starting the function. Fun performs (line 8) a while loop that continues until stop's stop_requested returns true. The loop itself shows a brief output line (line 10) followed by a one-second sleep (line 11);
- the main function, having started the thread, sleeps for three seconds (line 19), and then (line 21) issues a stop-request, ending the thread.

When running the program three lines containing next are displayed.
Before using std::stop_callback objects the <stop_token> header file must be included.

In addition to merely ending thread functions via jthread's request_stop member function it's also possible to associate callback functions with request_stop, which are executed when request_stop is called. In situations where callback functions are registered when the thread function has already been stopped the callback functions are immediately called when they are being registered (registering callback functions is covered below).
Note that multiple callback functions can be registered. However, the order in which these callback functions are run once the thread is stopped is not defined. Moreover, exceptions must not leave callback functions, or the program ends by calling std::terminate.
Callback functions are registered by objects of the class std::stop_callback. The class stop_callback offers the following constructors:
- explicit stop_callback(std::stop_token const &st, Function &&cb) noexcept;
- explicit stop_callback(std::stop_token &&st, Function &&cb) noexcept;

Notes:
- Function can be the name of a (void) function without parameters or it can be an (anonymous or existing) object offering a parameter-less (void) function call operator. The functions do not necessarily have to be void functions, but their return values are ignored;
- noexcept is only used if Function is also declared as noexcept (if Function is the name of a functor-class then noexcept is used if its constructor is declared with noexcept);
- the class stop_callback does not offer copy/move construction and assignment.

Here is the example used in the previous section, this time defining a callback function. When running this program its output is
    next
    next
    next
    stopFun called via stop_callback
     1: void fun(std::stop_token stop)
     2: {
     3:     while (not stop.stop_requested())
     4:     {
     5:         cout << "next\n";
     6:         this_thread::sleep_for(1s);
     7:     }
     8: }
     9: 
    10: void stopFun()
    11: {
    12:     cout << "stopFun called via stop_callback\n";
    13: }
    14: 
    15: int main()
    16: {
    17:     jthread thr(fun);
    18: 
    19:     stop_callback sc{ thr.get_stop_token(), stopFun };
    20: 
    21:     this_thread::sleep_for(3s);
    22: 
    23:     thr.request_stop();
    24:     thr.join();
    25: }

The function fun is identical to the one shown in the previous section, but main defines (line 19) the stop_callback object sc, passing it thr's get_stop_token's return value and the address of the function stopFun, defined in lines 10 through 13. In this case once request_stop is called (line 23) the callback function stopFun is called as well.
Before using mutexes the <mutex> header file must be included.
One of the key characteristics of multi-threaded programs is that threads mayshare data. Functions running as separate threads have access to all globaldata, and may also share the local data of their parent threads. However,unless proper measures are taken, this may easily result in data corruption,as illustrated by the following simulation of some steps that could beencountered in a multi-threaded program:
    ---------------------------------------------------------------------------
    Time step:  Thread 1:       var     Thread 2:       description
    ---------------------------------------------------------------------------
        0                        5
        1       starts                                  T1 active
        2       writes var                              T1 commences writing
        3       stopped                                 Context switch
        4                                starts         T2 active
        5                                writes var     T2 commences writing
        6                       10       assigns 10     T2 writes 10
        7                                stopped        Context switch
        8       assigns 12                              T1 writes 12
        9                       12
    ---------------------------------------------------------------------------
In this example, threads 1 and 2 share variable var, initially having the value 5. At step 1 thread 1 starts, and commences writing a value into var. However, it is interrupted by a context switch, and thread 2 is started (step 4). Thread 2 also wants to write a value into var, and succeeds until time step 7, when another context switch takes place. By now var is 10. However, thread 1 was also in the process of writing a value into var, and it is given a chance to complete its work: it assigns 12 to var in time step 8. Once time step 9 is reached, thread 2 proceeds on the (erroneous) assumption that var must be equal to 10. Clearly, from the point of view of thread 2 its data have been corrupted.
In this case data corruption was caused by multiple threads accessing the samedata in an uncontrolled way. To prevent this from happening, access to shareddata should be protected in such a way that only one thread at a time mayaccess the shared data.
Mutexes are used to prevent the abovementioned kinds of problems byoffering a guarantee that data are only accessed by the thread that could lockthe mutex that is used to synchronize access to those data.
Exclusive data access completely depends on cooperation between thethreads. If thread 1 uses mutexes, but thread 2 doesn't, then thread 2 mayfreely access the common data. Of course that's bad practice, which should beavoided.
It is stressed that although using mutexes is the programmer's responsibility, their implementation isn't: mutexes offer the necessary atomic calls. When requesting a mutex-lock the thread is blocked (i.e., the mutex statement does not return) until the lock has been obtained by the requesting thread.
Apart from the class std::mutex the class std::recursive_mutex is available. When a recursive_mutex is locked multiple times by the same thread it increases its lock-count. Before other threads may access the protected data the recursive mutex must be unlocked again that number of times. Moreover, the classes std::timed_mutex and std::recursive_timed_mutex are available. Their locks expire when released, but also after a certain amount of time.
The members of the mutex classes perform atomic actions: no context switch occurs while they are active. So when two threads are trying to lock a mutex only one can succeed. In the above example: if both threads had used a mutex to control access to var, thread 1 would not have been able to assign 12 to var while thread 2 was assuming that its value was 10. We could even have two threads running purely parallel (e.g., on two separate cores). E.g.:
    -------------------------------------------------------------------------
    Time step:  Thread 1:       Thread 2:       description
    -------------------------------------------------------------------------
        1       starts          starts          T1 and T2 active
        2       locks           locks           Both threads try to lock the
                                                mutex
        3       blocks...       obtains lock    T2 obtains the lock, and T1
                                                must wait
        4       (blocked)       processes var   T2 processes var, T1 still
                                                blocked
        5       obtains lock    releases lock   T2 releases the lock, and T1
                                                immediately obtains the lock
        6       processes var                   now T1 processes var
        7       releases lock                   T1 also releases the lock
    -------------------------------------------------------------------------
Although mutexes can directly be used in programs, this rarely happens. It is more common to embed mutex handling in locking classes that make sure that the mutex is automatically unlocked again when the mutex lock is no longer needed. Therefore, this section merely offers an overview of the interfaces of the mutex classes. Examples of their use will be given in the upcoming sections (e.g., section 20.3).
All mutex classes offer the following constructors and members:
- mutex() constexpr:
  the default constructor. This constexpr constructor is the only available constructor;

- ~mutex():
  the destructor. Before destruction a mutex that is still locked must first be unlocked by its unlock member;

- void lock():
  the calling thread blocks until it owns the mutex. Unless lock is called for a recursive mutex a system_error is thrown if the thread already owns the lock. Recursive mutexes increment their internal lock count;

- bool try_lock() noexcept:
  tries to obtain ownership of the mutex. If ownership is obtained true is returned, otherwise false. If the calling thread already owns the lock true is also returned, and in this case a recursive mutex also increments its internal lock count;

- void unlock() noexcept:
  releases ownership of the mutex. A system_error is thrown if the thread does not own the lock. A recursive mutex decrements its internal lock count, releasing ownership of the mutex once the lock count has decayed to zero.

The timed-mutex classes (timed_mutex, recursive_timed_mutex) also offer these members:
- bool try_lock_for(chrono::duration<Rep, Period> const &relTime) noexcept: tries to obtain ownership of the mutex within the relative time interval relTime. If ownership was obtained true is returned, otherwise false. If the calling thread already owns the lock true is also returned, and in this case a recursive timed mutex also increments its internal lock count. The Rep and Period types are inferred from the actual relTime argument. E.g.,

        std::timed_mutex timedMutex;
        timedMutex.try_lock_for(chrono::seconds(5));
- bool try_lock_until(chrono::time_point<Clock, Duration> const &absTime) noexcept: tries to obtain ownership of the mutex until the absolute point in time absTime has passed. If ownership is obtained true is returned, otherwise false. If the calling thread already owns the lock true is also returned, and in this case a recursive timed mutex also increments its internal lock count. The Clock and Duration types are inferred from the actual absTime argument. E.g.,

        std::timed_mutex timedMutex;
        timedMutex.try_lock_until(chrono::system_clock::now() +
                                  chrono::seconds(5));
Before using std::once_flag and the std::call_once function, introduced in this section, the <mutex> header file must be included. In single threaded programs the initialization of global data does not necessarily occur at one point in the code. An example is the initialization of the object of a singleton class (cf. Gamma et al. (1995): Design Patterns, Addison-Wesley). Singleton classes may define a single static pointer data member Singleton *s_object, pointing to the singleton's object, and may offer a static member instance, implemented something like this:
    Singleton &Singleton::instance()
    {
        return s_object ? *s_object : *(s_object = new Singleton);
    }

With multi-threaded programs this approach immediately gets complex. For example, if two threads call instance at the same time, while s_object still equals 0, then both may call new Singleton, resulting in one dynamically allocated Singleton object becoming unreachable. Other threads, calling instance after s_object was initialized for the first time, may either return a reference to that object, or may return a reference to the object initialized by the second thread. Not exactly the expected behavior of a singleton.
Mutexes (cf. section 20.2) can be used to solve these kinds of problems, but they result in some overhead and inefficiency, as the mutex must be inspected at each call of Singleton::instance.
When variables must dynamically be initialized, and the initialization should take place only once, the std::once_flag type and the std::call_once function should be used.
Thecall_once function expects two or three arguments:
- its first argument is a once_flag variable, keeping track of the actual initialization status. The call_once function simply returns if the once_flag indicates that initialization already took place;
- its second argument is a callable object (e.g., a function or a lambda expression) performing the initialization;
- an argument needed by the callable object may be passed as call_once's third argument.

Using call_once the singleton's instance function can now easily be designed (using in-class implementations for brevity):

    class Singleton
    {
        static std::once_flag s_once;
        static Singleton *s_singleton;
        ...
        public:
            static Singleton *instance()
            {
                std::call_once(s_once, []{ s_singleton = new Singleton; });
                return s_singleton;
            }
        ...
    };

However, there are additional ways to initialize data, even for multi-threaded programs:
- a constructor may be declared using the constexpr keyword (cf. section 8.1.4.1), satisfying the requirements for constant initialization. In this case a static object, initialized using that constructor, is guaranteed to be initialized before any code is run, as part of the static initialization phase. This is used by std::mutex, as it eliminates the possibility of race conditions when global mutexes are initialized;
- a static local variable is initialized when its definition is first reached, as illustrated by the following program:

        #include <iostream>

        struct Cons
        {
            Cons()
            {
                std::cout << "Cons called\n";
            }
        };

        void called(char const *time)
        {
            std::cout << time << " time called() activated\n";
            static Cons cons;
        }

        int main()
        {
            std::cout << "Pre-1\n";
            called("first");
            called("second");
            std::cout << "Pre-2\n";
            Cons cons;
        }
        /*
            Displays:
                Pre-1
                first time called() activated
                Cons called
                second time called() activated
                Pre-2
                Cons called
        */

  This feature causes a thread to wait automatically if another thread is still initializing the static data (note that non-static data never cause problems, as non-static local variables only exist within their own thread of execution).

Shared mutexes (std::shared_mutex) are available after including the <shared_mutex> header file. Shared mutex types behave like timed_mutex types and additionally have the characteristics described below. The class shared_mutex provides a non-recursive mutex with shared ownership semantics, comparable to, e.g., the shared_ptr type. The behavior of a program using shared_mutexes is undefined if:
- a thread destroys a shared_mutex that is still owned by any thread;
- a thread terminates while owning a shared_mutex.

Shared mutex types provide a shared lock ownership mode. Multiple threads can simultaneously hold a shared lock on a shared_mutex type of object. But no thread can hold a shared lock while another thread holds an exclusive lock on the same shared_mutex object, and vice versa.
Shared mutexes are useful in situations where multiple threads (consumers)want to access information for reading: the consumers don't want to change thedata, but merely want to retrieve them. At some point another thread (theproducer) wants to modify the data. At that point the producer requestsexclusive access to the data, and is forced to wait until all consumers havereleased their locks. While the producer waits for the exclusive lock, newconsumers' requests for shared locks remain pending until the producer hasreleased the exclusive lock. Thus, reading is possible for many threads, butfor writing the exclusive lock guarantees that no other threads can access thedata.
The typeshared_mutex offers the following members providing shared lockownership. To obtain exclusive ownership omit the_shared from thefollowing member functions:
- void lock_shared(): blocks until shared ownership of the mutex has been obtained;
- void unlock_shared(): releases the calling thread's shared ownership of the mutex;
- bool try_lock_shared(): tries to obtain shared ownership without blocking: try_lock_shared immediately returns. Returns true if the shared ownership lock was acquired, false otherwise. An implementation may fail to obtain the lock even if it is not held by any other thread. Initially the calling thread may not yet own the mutex;
- bool try_lock_shared_for(rel_time): tries to obtain shared ownership within the relative time interval rel_time. If the time specified by rel_time is less than or equal to rel_time.zero(), the member attempts to obtain ownership without blocking (as if by calling try_lock_shared()). The member shall return within the time interval specified by rel_time only if it has obtained shared ownership of the mutex object. Returns true if the shared ownership lock was acquired, false otherwise. Initially the calling thread may not yet own the mutex;
- bool try_lock_shared_until(abs_time): tries to obtain shared ownership until the absolute point in time abs_time has passed. If the time specified by abs_time has already passed then the member attempts to obtain ownership without blocking (as if by calling try_lock_shared()). Returns true if the shared ownership lock was acquired, false otherwise. Initially the calling thread may not yet own the mutex.

Before the classes described in this section can be used the <mutex> header file must be included. Whenever threads share data, and at least one of the threads may change common data, mutexes should be used to prevent threads from simultaneously accessing the same data.
Usually locks are released at the end of action blocks. This requires explicit calls to the mutexes' unlock function, which introduces problems comparable to those we've seen with the thread's join member.
To simplify locking and unlocking two mutex wrapper classes are available:
- std::lock_guard: objects of this class template own a mutex during their lifetimes, calling unlock of the mutexes they control when they go out of scope;
- std::unique_lock: a class template offering a more elaborate interface than lock_guard.

The class lock_guard offers a limited, but useful interface:
- lock_guard<Mutex>(Mutex &mutex): when defining a lock_guard object the mutex type (e.g., std::mutex, std::timed_mutex, std::shared_mutex) is specified, and a mutex of the indicated type is provided as its argument. The construction blocks until the lock_guard object owns the lock. The lock_guard's destructor automatically releases the mutex lock;
- lock_guard<Mutex>(Mutex &mutex, std::adopt_lock_t): this constructor immediately transfers control over the mutex's lock to the lock_guard. The mutex lock is released again by the lock_guard's destructor. At construction time the mutex must already be owned by the calling thread. Here is an illustration of how it can be used:

        void threadAction(std::mutex &mut, int &sharedInt)
        {
            std::lock_guard<std::mutex> lg{ mut, std::adopt_lock_t() };
            // do something with sharedInt
        }

  The function threadAction receives a reference to a mutex. Assume the calling thread already owns the mutex's lock; control over the lock is then transferred to the lock_guard. Even though we don't explicitly use the lock_guard object, an object should be defined to prevent the compiler from destroying an anonymous object before the function ends; the lock is released by the lock_guard's destructor;
- mutex_type: lock_guard<Mutex> types also define the type mutex_type: it is a synonym of the Mutex type that is passed to the lock_guard's constructor.

Here is a simple example of a multi-threaded program using lock_guards to prevent information inserted into cout from getting mixed:
    bool oneLine(istream &in, mutex &mut, int nr)
    {
        lock_guard<mutex> lg(mut);

        string line;
        if (not getline(in, line))
            return false;

        cout << nr << ": " << line << endl;
        return true;
    }

    void io(istream &in, mutex &mut, int nr)
    {
        while (oneLine(in, mut, nr))
            this_thread::yield();
    }

    int main(int argc, char **argv)
    {
        ifstream in(argv[1]);
        mutex ioMutex;

        thread t1(io, ref(in), ref(ioMutex), 1);
        thread t2(io, ref(in), ref(ioMutex), 2);
        thread t3(io, ref(in), ref(ioMutex), 3);

        t1.join();
        t2.join();
        t3.join();
    }

As with lock_guard, a mutex-type must be specified when defining objects of the class std::unique_lock. The class unique_lock is much more elaborate than the basic lock_guard class template. Its interface does not define a copy constructor or overloaded assignment operator, but it does define a move constructor and a move assignment operator. In the following overview of unique_lock's interface Mutex refers to the mutex-type that is specified when defining a unique_lock:
- unique_lock() noexcept: the default constructor does not refer to a mutex object. It must be assigned a mutex (e.g., using move-assignment) before it can do anything useful;
- explicit unique_lock(Mutex &mutex): this constructor associates the unique_lock with an existing Mutex object, and calls mutex.lock();
- unique_lock(Mutex &mutex, defer_lock_t) noexcept: this constructor associates the unique_lock with an existing Mutex object, but does not call mutex.lock(). Call it by passing a defer_lock_t object as the constructor's second argument, e.g.,

        unique_lock<mutex> ul(mutexObj, defer_lock_t());
- unique_lock(Mutex &mutex, try_to_lock_t) noexcept: this constructor associates the unique_lock with an existing Mutex object, and calls mutex.try_lock(): the constructor won't block if the mutex cannot be locked;
- unique_lock(Mutex &mutex, adopt_lock_t) noexcept: this constructor associates the unique_lock with an existing Mutex object, and assumes that the current thread has already locked the mutex;
- unique_lock(Mutex &mutex, chrono::duration<Rep, Period> const &relTime) noexcept: this constructor tries to lock the Mutex object by calling mutex.try_lock_for(relTime). The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex). It could be called like this:

        std::unique_lock<std::timed_mutex> ulock(timedMutex,
                                                 std::chrono::seconds(5));
- unique_lock(Mutex &mutex, chrono::time_point<Clock, Duration> const &absTime) noexcept: this constructor tries to lock the Mutex object by calling mutex.try_lock_until(absTime). The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex). This constructor could be called like this:

        std::unique_lock<std::timed_mutex> ulock(
                timedMutex,
                std::chrono::system_clock::now() + std::chrono::seconds(5)
        );
- void lock(): blocks until the lock of the mutex managed by the unique_lock is obtained. If no mutex is currently managed, then a system_error exception is thrown;
- Mutex *mutex() const noexcept: returns a pointer to the mutex object associated with the unique_lock (a nullptr is returned if no mutex object is currently associated with the unique_lock object);
- explicit operator bool() const noexcept: returns true if the unique_lock owns a locked mutex, otherwise false is returned;
- unique_lock &operator=(unique_lock &&tmp) noexcept: the move assignment operator calls the unlock member, whereafter tmp's state is transferred to the left-hand operand;
- bool owns_lock() const noexcept: returns true if the unique_lock owns the mutex, otherwise false is returned;
- Mutex *release() noexcept: returns a pointer to the mutex object associated with the unique_lock object, discarding that association;
- void swap(unique_lock &other) noexcept: swaps the states of the current unique_lock and other;
- bool try_lock(): tries to lock the mutex associated with the unique_lock, returning true if this succeeds, and false otherwise. If no mutex is currently associated with the unique_lock object, then a system_error exception is thrown;
- bool try_lock_for(chrono::duration<Rep, Period> const &relTime): tries to lock the Mutex object managed by the unique_lock object by calling the mutex's try_lock_for(relTime) member. The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex);
- bool try_lock_until(chrono::time_point<Clock, Duration> const &absTime): tries to lock the Mutex object managed by the unique_lock object by calling the mutex's try_lock_until(absTime) member. The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex);
- void unlock(): unlocks the mutex; a system_error exception is thrown if the unique_lock object does not own the mutex.

In addition to the members of the classes std::lock_guard and std::unique_lock the functions std::lock and std::try_lock are available. These functions can be used to prevent deadlocks, the topic of the next section.
A pitfall when using lock_guards is defining one as an anonymous object:

    void Class::notLocked()
    {
        lock_guard<mutex>{ d_mutex };
        // using data available to multiple threads
    }

In cases like these, since the lock_guard is defined as an anonymous object, it's immediately destroyed after its construction, offering no guard against multiple threads using the shared data. Traditionally this situation is solved by explicitly defining a named object. The object's name is irrelevant, because it's used nowhere else, resulting in constructions like
    void Class::lockedOK()
    {
        lock_guard<mutex> guard{ d_mutex };
        // using data available to multiple threads
    }

But in this context the name is irrelevant and not used anywhere else in the function. Since the C++26 standard, however, a generalized alternative approach is available. It's called a name-independent declaration: very simple (and broadly applicable), requiring --std=c++26 or beyond, which is supported since g++-14.
The 'name' _ (a single underscore) results in a name-independent declaration. It's definitely not a name we would use for 'common' variables, but starting with the C++26 standard a variable named `_' implies a name-independent declaration. So in lockedOK we can now write:
    void Class::lockedOK()
    {
        lock_guard<mutex> _{ d_mutex };
        // using data available to multiple threads
    }

and we never have to think again about how to name a required (but not otherwise used) variable or object. As long as there's no ambiguity it's even possible to define multiple `_' variables or objects. As long as they're not used by name it's possible to define, e.g.,

    void neverUsed()
    {
        int _{ 12 };
        int _{ 43 };            // a different int because of the initialization
        auto _ = 42;            // 'auto' also works fine
        auto _("hello world");  // a string...
    }

But in practice, use name-independent declarations as illustrated in the above lockedOK function.

C++ offers the std::lock and std::try_lock functions that can be used to help prevent deadlock situations. Before these functions can be used the <mutex> header file must be included.
In the following overviewL1 &l1, ... represents one or morereferences to objects of lockable types:
- void std::lock(L1 &l1, ...): locks all li objects. If a lock could not be obtained for at least one of the objects, then all locks obtained so far are released, even if the object for which no lock could be obtained threw an exception;
- int std::try_lock(L1 &l1, ...): tries to lock all li objects by calling their try_lock members. If all locks could be obtained, then -1 is returned. Otherwise the (0-based) index of the first argument which could not be locked is returned, releasing all previously obtained locks.

As an example consider the following little multi-threaded program: the threads use mutexes to obtain unique access to cout and to an int value. However, fun1 first locks cout (line 7), and then value (line 10); fun2 first locks value (line 16) and then cout (line 19). Clearly, if fun1 has locked cout, fun2 can't obtain that lock until fun1 has released it. Unfortunately, fun2 has locked value, and the functions only release their locks when returning. But in order to access the information in value fun1 must obtain a lock on value, which it can't, as fun2 has already locked value: the threads are waiting for each other, and neither thread gives in.
     1: int value;
     2: mutex valueMutex;
     3: mutex coutMutex;
     4: 
     5: void fun1()
     6: {
     7:     lock_guard<mutex> lg1(coutMutex);
     8:     cout << "fun 1 locks cout\n";
     9: 
    10:     lock_guard<mutex> lg2(valueMutex);
    11:     cout << "fun 1 locks value\n";
    12: }
    13: 
    14: void fun2()
    15: {
    16:     lock_guard<mutex> lg1(valueMutex);
    17:     cerr << "fun 2 locks value\n";
    18: 
    19:     lock_guard<mutex> lg2(coutMutex);
    20:     cout << "fun 2 locks cout\n";
    21: }
    22: 
    23: int main()
    24: {
    25:     thread t1(fun1);
    26:     fun2();
    27:     t1.join();
    28: }

A good recipe for avoiding deadlocks is to prevent nested (or multiple) mutex lock calls. But if multiple mutexes must be used, always obtain the locks in the same order. Rather than doing this yourself, std::lock and std::try_lock should be used whenever possible to obtain multiple mutex locks. These functions accept multiple arguments, which must be lockable types like unique_lock, or even a plain mutex. The previous deadlocking program can be modified to call std::lock to lock both mutexes. In this example using one single mutex would also work, but the modified program now looks as similar as possible to the previous program. Note how in lines 10 and 21 a different ordering of the unique_locks' arguments is used: it is not necessary to use an identical argument order when calling std::lock or std::try_lock.
     1: int value;
     2: mutex valueMutex;
     3: mutex coutMutex;
     4: 
     5: void fun1()
     6: {
     7:     unique_lock<mutex> lg1(coutMutex, defer_lock);
     8:     unique_lock<mutex> lg2(valueMutex, defer_lock);
     9: 
    10:     lock(lg1, lg2);
    11: 
    12:     cout << "fun 1 locks cout\n";
    13:     cout << "fun 1 locks value\n";
    14: }
    15: 
    16: void fun2()
    17: {
    18:     unique_lock<mutex> lg1(coutMutex, defer_lock);
    19:     unique_lock<mutex> lg2(valueMutex, defer_lock);
    20: 
    21:     lock(lg2, lg1);
    22: 
    23:     cout << "fun 2 locks cout\n";
    24:     cout << "fun 2 locks value\n";
    25: }
    26: 
    27: int main()
    28: {
    29:     thread t1(fun1);
    30:     thread t2(fun2);
    31:     t1.join();
    32:     t2.join();
    33: }

Shared locks are available through the class std::shared_lock, after including the <shared_mutex> header file. An object of the type std::shared_lock controls the shared ownership of a lockable object within a scope. Shared ownership of the lockable object may be acquired at construction time or thereafter, and once acquired, it may be transferred to another shared_lock object. Objects of type shared_lock cannot be copied, but move construction and assignment is supported.
The behavior of a program is undefined if the contained pointer to a mutex(pm) has a non-zero value and the lockable object pointed to bypm doesnot exist for the entire remaining lifetime of theshared_lockobject. The supplied mutex type must be ashared_mutex or a type havingthe same characteristics.
The typeshared_lock offers the following constructors, destructor andoperators:
- shared_lock() noexcept: the default constructor constructs a shared_lock which is not owned by a thread and for which pm == 0;
- explicit shared_lock(mutex_type &mut): this constructor calls mut.lock_shared(). The calling thread may not already own the lock. Following the construction pm == &mut, and the lock is owned by the current thread;
- shared_lock(mutex_type &mut, defer_lock_t) noexcept: sets pm to &mut, but the calling thread does not own the lock;
- shared_lock(mutex_type &mut, try_to_lock_t): calls mut.try_lock_shared(). The calling thread may not already own the lock. Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared;
- shared_lock(mutex_type &mut, adopt_lock_t): following the construction pm == &mut, and the lock is owned by the current thread;
- shared_lock(mutex_type &mut, chrono::time_point<Clock, Duration> const &abs_time): here Clock and Duration are types specifying a clock and absolute time (cf. section 4.2). It can be called if the calling thread does not already own the mutex. It calls mut.try_lock_shared_until(abs_time). Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_until;
- shared_lock(mutex_type &mut, chrono::duration<Rep, Period> const &rel_time): here Rep and Period are types specifying a relative time (cf. section 4.2). It can be called if the calling thread does not already own the mutex. It calls mut.try_lock_shared_for(rel_time). Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_for;
- shared_lock(shared_lock &&tmp) noexcept: the move constructor transfers the state of tmp to the newly constructed shared_lock. Following the construction tmp.pm == 0 and tmp no longer owns the lock;
- ~shared_lock(): if the current object owns the lock then pm->unlock_shared() is called;
- shared_lock &operator=(shared_lock &&tmp) noexcept: the move assignment operator calls pm->unlock_shared and then transfers the information in tmp to the current shared_lock object.
  Following this, tmp.pm == 0 and tmp no longer owns the lock;
- explicit operator bool() const noexcept: returns whether the shared_lock object owns the lock.

The following members are provided:
- void lock(): calls pm->lock_shared(), after which the current thread owns the shared lock. Exceptions may be thrown from lock_shared, and otherwise if pm == 0 or if the current thread already owns the lock;
- mutex_type *mutex() const noexcept: returns pm;
- mutex_type *release() noexcept: returns the previous value of pm, which equals zero after calling this member. Also, the current object no longer owns the lock;
- void swap(shared_lock &other) noexcept: swaps the states of the current and other shared_lock objects. There is also a free function swap, a function template, swapping two shared_lock<Mutex> objects, where Mutex represents the mutex type for which the shared lock objects were instantiated: void swap(shared_lock<Mutex> &one, shared_lock<Mutex> &two) noexcept;
- bool try_lock(): calls pm->try_lock_shared(), returning this call's return value. Exceptions may be thrown from try_lock_shared, and otherwise if pm == 0 or if the current thread already owns the lock;
- bool try_lock_for(chrono::duration<Rep, Period> const &rel_time): here Rep and Period are types specifying a relative time (cf. section 4.2). It calls pm->try_lock_shared_for(rel_time). Following the call the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_for. Exceptions may be thrown from try_lock_shared_for, and otherwise if pm == 0 or if the current thread already owns the lock;
- bool try_lock_until(chrono::time_point<Clock, Duration> const &abs_time): here Clock and Duration are types specifying a clock and absolute time (cf. section 4.2). It calls pm->try_lock_shared_until(abs_time), returning its return value. Following the call the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_until.
  Exceptions may be thrown from try_lock_shared_until, and otherwise if pm == 0 or if the current thread already owns the lock;
- void unlock(): calls pm->unlock_shared(), releasing the current object's shared ownership of the mutex.

The class scoped_lock can be used to lock multiple mutexes at once, where the scoped_lock ensures that deadlocks are avoided. The scoped_lock also has a default constructor, performing no actions, so it's up to the software engineer to define scoped_lock objects with at least one mutex. Before using scoped_lock objects the <mutex> header file must be included. Adapting the example from section 20.3.2: both functions define a scoped_lock (note that the order in which the mutexes are specified isn't relevant), and deadlocks do not occur:
    int value;
    mutex valueMutex;
    mutex coutMutex;

    void fun1()
    {
        scoped_lock sl{ coutMutex, valueMutex };
        cout << "fun 1 locks cout\n";
        sleep(1);
        cout << "fun 1 locks value\n";
    }

    void fun2()
    {
        scoped_lock sl{ valueMutex, coutMutex };
        cout << "fun 2 locks value\n";
        sleep(1);
        cout << "fun 2 locks cout\n";
    }

    int main()
    {
        thread t1(fun1);
        fun2();
        t1.join();
    }
    // Displays:
    //    fun 2 locks value
    //    fun 2 locks cout
    //    fun 1 locks cout
    //    fun 1 locks value

Thus, instead of using lock_guard objects, scoped_lock objects can be used. It's a matter of taste whether lock_guards or scoped_locks should be preferred when only one mutex is used. Maybe scoped_lock should be preferred, since it always works.
Before condition variables can be used the<condition_variable> headerfile must be included.
To start our discussion, consider a classic producer-consumer scenario: theproducer generates items which are consumed by a consumer. The producer canonly produce a certain number of items before its storage capacity has filledup and the client cannot consume more items than the producer has produced.
At some point the producer's storage capacity has filled to the brim, and theproducer has to wait until the client has at least consumed some items,thereby creating space in the producer's storage. Similarly, the consumercannot start consuming until the producer has at least produced some items.
Implementing this scenario only using mutexes (data locking) is not anattractive option, as merely using mutexes forces a program to implement thescenario usingpolling: processes must continuously (re)acquire themutex's lock, determine whether they can perform some action, followed by therelease of the lock. Often there's no action to perform, and the process isbusy acquiring and releasing the mutex's lock. Polling forces threads to waituntil they can lock the mutex, even though continuation might already bepossible. The polling interval could be reduced, but that too isn't anattractive option, as that increases the overhead associated with handling themutexes (also called `busy waiting').
Condition variables can be used to prevent polling. Threads can use conditionvariables tonotify waiting threads that there is something for them todo. This way threads can synchronize on data values (states).
As data values may be modified by multiple threads, threads still need to usemutexes, but only for controlling access to the data. In addition, conditionvariables allow threads torelease ownership of mutexes until a certainvalue has been obtained, until a preset amount of time has been passed, oruntil a preset point in time has been reached.
The prototypical setup of threads using condition variables looks like this:
The consumer:

    lock the mutex
    while the required condition has not yet been attained (i.e., is false):
        wait until being notified (this automatically releases the mutex's
        lock)
    once the mutex's lock has been reacquired, and the required condition
    has been attained:
        process the data
    release the mutex's lock
The producer:

    lock the mutex
    while the required condition has not yet been attained:
        do something to attain the required condition
    notify waiting threads (that the required condition has been attained)
    release the mutex's lock
This protocol hides a subtle initial synchronization requirement. The consumer will miss the producer's notification if it (i.e., the consumer) hasn't yet entered its waiting state. So waiting (consumer) threads should start before notifying (producer) threads. Once threads have started, no assumptions can be made anymore about the order in which any of the condition variable's members (notify_one, notify_all, wait, wait_for, and wait_until) are called.
Condition variables come in two flavors: objects of the class std::condition_variable are used in combination with objects of type unique_lock<mutex>. Because of optimizations which are available for this specific combination, using condition_variables is somewhat more efficient than using the more generally applicable class std::condition_variable_any, which may be used with any (e.g., user supplied) lock type.
Condition variable classes (covered in detail in the next two sections) offermembers likewait, wait_for, wait_until, notify_one andnotify_allthat may concurrently be called. The notifying members are always atomicallyexecuted. Execution of thewait members consists of three atomic parts:
- the mutex's lock is released and the thread blocks (this ends the first atomic part of the wait call);
- the thread's blocking state ends (e.g., because it was notified or because a timeout occurred);
- the mutex's lock is reacquired: when returning from the wait-members the previously waiting thread has reacquired the mutex's lock.

In addition to the condition variable classes the following free function and enum type are provided:
- void std::notify_all_at_thread_exit(condition_variable &cond, unique_lock<mutex> lockObject): once the calling thread has ended, all threads waiting on cond are notified. It is good practice to exit the thread as soon as possible after calling notify_all_at_thread_exit. Waiting threads must verify that the thread they were waiting for has indeed ended. This is usually realized by first obtaining the lock on lockObject, followed by verifying that the condition they were waiting for is true and that the lock was not reacquired before notify_all_at_thread_exit was called.
- std::cv_status: the cv_status enum is used by several member functions of the condition variable classes (cf. sections 20.4.1 and 20.4.2):

        namespace std
        {
            enum class cv_status
            {
                no_timeout,
                timeout
            };
        }

The class std::condition_variable merely offers a default constructor. No copy constructor or overloaded assignment operator is provided. Before using the class condition_variable the <condition_variable> header file must be included.
The class's destructor requires that no thread is blocked by the threaddestroying thecondition_variable. So all threads waiting on acondition_variable must be notified before acondition_variableobject's lifetime ends. Callingnotify_all (see below) before acondition_variable's lifetime ends takes care of that, as thecondition_variable's thread releases its lock of themutex variable,allowing one of the notified threads to lock the mutex.
In the following member-descriptions a type Predicate indicates that a provided Predicate argument can be called as a function without arguments, returning a bool. Also, other member functions are frequently referred to. It is tacitly assumed that all members referred to below are called using the same condition variable object.
The class condition_variable supports several wait members, which block the thread until notified by another thread (or after a configurable waiting time). However, wait members may also spuriously unblock, without having been notified. Therefore, after returning from wait members threads should verify that the required condition is actually true. If not, calling wait again may be appropriate. The next piece of pseudo code illustrates this scheme:
    while (conditionNotTrue())
        condVariable.wait(uniqueLock);
The classcondition_variable's members are:
- void notify_one() noexcept: one of the threads blocked in a wait member called by other threads returns. Which one actually returns cannot be predicted;
- void notify_all() noexcept: all threads blocked in wait members called by other threads unblock their wait states. Of course, only one of them will subsequently succeed in reacquiring the condition variable's lock object;
- void wait(unique_lock<mutex> &uniqueLock): before calling wait the current thread must have acquired the lock of uniqueLock. Calling wait releases the lock, and the current thread is blocked until it has received a notification from another thread, and has reacquired the lock;
- void wait(unique_lock<mutex> &uniqueLock, Predicate pred): this member is defined as template <typename Predicate>. The template's type is automatically derived from the function's argument type and does not have to be specified explicitly. Before calling wait the current thread must have acquired the lock of uniqueLock. As long as pred returns false, wait(uniqueLock) is called.
- cv_status wait_for(unique_lock<mutex> &uniqueLock, std::chrono::duration<Rep, Period> const &relTime): this member is defined as template <typename Rep, typename Period>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. E.g., to wait for at most 5 seconds wait_for can be called like this:

        cond.wait_for(uniqueLock, std::chrono::seconds(5));
This member returns when being notified or when the time interval specified byrelTime has passed.
When returning due to a timeout,std::cv_status::timeout is returned, otherwisestd::cv_status::no_timeout is returned.
Threads should verify that the required data condition has been met afterwait_for has returned.
- bool wait_for(unique_lock<mutex> &uniqueLock, chrono::duration<Rep, Period> const &relTime, Predicate pred): this member is defined as template <typename Rep, typename Period, typename Predicate>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. As long as pred returns false, the previous wait_for member is called. If that member returns cv_status::timeout, then the value returned by pred is returned, otherwise true.
cv_status wait_until(unique_lock<mutex> &uniqueLock, chrono::time_point<Clock, Duration> const &absTime):
this member is defined as a member template: template <typename Clock, typename Duration>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. E.g., to wait until 5 minutes after the current time wait_until can be called like this:

    cond.wait_until(uniqueLock, chrono::system_clock::now() + std::chrono::minutes(5));
This function acts identically to the wait_for(unique_lock<mutex> &uniqueLock, chrono::duration<Rep, Period> const &relTime) member described earlier, but uses an absolute point in time rather than a relative time specification.

This member returns when being notified or when the point in time specified by absTime has passed. When returning due to a timeout, std::cv_status::timeout is returned, otherwise std::cv_status::no_timeout is returned.
bool wait_until(unique_lock<mutex> &lock, chrono::time_point<Clock, Duration> const &absTime, Predicate pred):
this member is defined as a member template: template <typename Clock, typename Duration, typename Predicate>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. As long as pred returns false, the previous wait_until member is called. If that member returns cv_status::timeout, then pred's return value is returned, otherwise true.
Unlike condition_variable, the class std::condition_variable_any can be used with any (e.g., user supplied) lock type, and not just with the stl-provided unique_lock<mutex>. Before using the class condition_variable_any the <condition_variable> header file must be included.
The functionality that is offered by condition_variable_any is identical to the functionality offered by the class condition_variable, albeit that the lock-type that is used by condition_variable_any is not predefined. The class condition_variable_any therefore requires the specification of the lock-type that must be used by its objects.

In the interface shown below this lock-type is referred to as Lock. Most of condition_variable_any's members are defined as member templates, defining a Lock type as one of their parameters. The requirements of these lock-types are identical to those of the stl-provided unique_lock, and user-defined lock-type implementations should provide at least the interface and semantics that are also provided by unique_lock.

This section merely presents the interface of the class condition_variable_any. As its interface offers the same members as condition_variable (allowing, where applicable, passing any lock-type instead of just unique_lock to corresponding members), the reader is referred to the previous section for a description of the semantics of the class members.
Like condition_variable, the class condition_variable_any only offers a default constructor. No copy constructor or overloaded assignment operator is provided.

Also, like condition_variable, the class's destructor requires that no thread is blocked by the current thread. This implies that all other (waiting) threads must have been notified; those threads may, however, subsequently block on the lock specified in their wait calls.

Note that, in addition to Lock, the types Clock, Duration, Period, Predicate, and Rep are template types, defined just like the identically named types mentioned in the previous section.
Assuming that MyMutex is a user defined mutex type, and that MyLock is a user defined lock-type (cf. section 20.3 for details about lock-types), then a condition_variable_any object can be defined and used like this:

    MyMutex mut;
    MyLock<MyMutex> ul(mut);
    condition_variable_any cva;

    cva.wait(ul);
These are the class condition_variable_any's members:

    void notify_one() noexcept;
    void notify_all() noexcept;
    void wait(Lock &lock);
    void wait(Lock &lock, Predicate pred);
    cv_status wait_until(Lock &lock,
                    const chrono::time_point<Clock, Duration> &absTime);
    bool wait_until(Lock &lock,
                    const chrono::time_point<Clock, Duration> &absTime,
                    Predicate pred);
    cv_status wait_for(Lock &lock,
                    const chrono::duration<Rep, Period> &relTime);
    bool wait_for(Lock &lock,
                    const chrono::duration<Rep, Period> &relTime,
                    Predicate pred);

In a classic producer-consumer design the two threads perform the following loops:

consumer loop:
- wait until there's an item in store, then reduce the number of stored items
- remove the item from the store
- increment the number of available storage locations
- do something with the retrieved item

producer loop:
- produce the next item
- wait until there's room to store the item, then reduce the number of available storage locations
- store the item
- increment the number of stored items
It is important that the two storage administrative tasks (registering the number of available items and available storage locations) are either performed by the consumer or by the producer. The `waiting' of both threads can be implemented using a class
Semaphore, offering the members wait and notify_all. For a more extensive discussion of semaphores see Tanenbaum, A.S. (2016) Structured Computer Organization, Pearson Prentice-Hall. As a brief summary: semaphores restrict the number of threads that can access a resource of limited size. They ensure that the number of threads adding items to the resource (the producers) can never exceed the resource's maximum size, and that the number of threads retrieving items from the resource (the consumers) can never exceed the resource's current size. Thus, in a producer/consumer design two semaphores are used: one controlling access to the resource by the producers, and one controlling access to the resource by the consumers.
For example, say we have ten producing threads, as well as ten consumers, and a lockable queue that must not grow bigger than 1000 items. Producers try to push one item at a time; consumers try to pop one.
The data member containing the actual count is called d_available. It is protected by mutex d_mutex. In addition a condition_variable d_condition is defined:

    mutable std::mutex d_mutex;     // mutable because of its use in
                                    // 'size_t size() const'
    std::condition_variable d_condition;
    size_t d_available;
The waiting process is implemented through its member function wait:

     1: void Semaphore::wait()
     2: {
     3:     std::unique_lock<std::mutex> lk(d_mutex);   // get the lock
     4:     while (d_available == 0)
     5:         d_condition.wait(lk);   // internally releases the lock
     6:                                 // and waits, on exit
     7:                                 // acquires the lock again
     8:     --d_available;              // dec. available
     9: }                               // the lock is released

In line 5 d_condition.wait releases the lock. It waits until receiving a notification, and re-acquires the lock just before returning. Consequently, wait's code always has complete and unique control over d_available.

What about notifying a waiting thread? This is handled in lines 4 and 5 of the member function notify_all:
     1: void Semaphore::notify_all()
     2: {
     3:     std::lock_guard<std::mutex> lk(d_mutex);    // get the lock
     4:     if (d_available++ == 0)
     5:         d_condition.notify_all();   // use notify_one to notify one
     6:                                     // other thread
     7: }                                   // the lock is released

At line 4 d_available is always incremented; by using a postfix increment it can simultaneously be tested for being zero. If it was initially zero then d_available is now one. A thread waiting until d_available exceeds zero may now continue. The waiting threads are notified by calling d_condition.notify_all; in situations where at most one other thread can be waiting, notify_one can be used instead.

Using the facilities of the class Semaphore, whose constructor expects an initial value of its d_available data member, the classic consumer-producer paradigm can now be implemented using multi-threading (a more elaborate example of the producer-consumer program is found in the yo/threading/examples/events.cc file in the C++ Annotations's source archive):
    Semaphore available(10);
    Semaphore filled(0);
    std::queue<size_t> itemQueue;
    std::mutex qMutex;

    void consumer()
    {
        while (true)
        {
            filled.wait();
            size_t item;
            {                       // mutex lock the queue
                std::lock_guard lg(qMutex);
                item = itemQueue.front();
                itemQueue.pop();
            }
            available.notify_all();
            process(item);          // not implemented here
        }
    }

    void producer()
    {
        size_t item = 0;
        while (true)
        {
            ++item;
            available.wait();
            {                       // mutex lock the queue with
                                    // multiple consumers
                std::lock_guard lg(qMutex);
                itemQueue.push(item);
            }
            filled.notify_all();
        }
    }

    int main()
    {
        thread consume(consumer);
        thread produce(producer);
        consume.join();
        produce.join();
    }

Note that a mutex is used to avoid simultaneous access to the queue by multiple threads. Consider the situation where the queue contains 5 items: in that situation the semaphores allow the consumer and the producer to access the queue, but to avoid corrupting the queue only one of them may modify the queue at a time. This is realized by both threads obtaining the std::mutex qMutex lock before modifying the queue.

Before using atomic data types the <atomic> header file must be included. When data are shared among multiple threads, data corruption is usually prevented using mutexes. To increment a simple int using this strategy, code as shown below is commonly used:
    {
        lock_guard<mutex> lk{ intVarMutex };
        ++intVar;
    }

The compound statement is used to limit the lock_guard's lifetime, so that intVar's mutex is only locked for a short while.
This scheme is not complex, but having to define a lock_guard for every single use of a simple variable, and having to define a matching mutex for each such variable, is annoying and cumbersome.
C++ offers a way out through the use of atomic data types. Atomic data types are available for all basic types, and also for (trivial) user defined types. Trivial types are (see also section 23.6.2) all scalar types, arrays of elements of a trivial type, and classes whose constructors, copy constructors, and destructors all have default implementations, and whose non-static data members are themselves of trivial types.

The class template std::atomic<Type> is available for all built-in types, including pointer types. E.g., std::atomic<bool> defines an atomic bool type. For many types alternative, somewhat shorter type names are available. E.g., instead of std::atomic<unsigned short> the type std::atomic_ushort can be used. Refer to the atomic header file for a complete list of alternate names.

If Trivial is a user-defined trivial type then std::atomic<Trivial> defines an atomic variant of Trivial: such a type does not require a separate mutex to synchronize access by multiple threads.
Objects of the class template std::atomic<Type> cannot directly be copied or assigned to each other. However, they can be initialized by values of type Type, and values of type Type can also directly be assigned to std::atomic<Type> objects. Moreover, since atomic<Type> types offer conversion operators returning their Type values, an atomic<Type> object can also be assigned to or initialized by another atomic<Type> object using a static_cast:

    atomic<int> a1 = 5;
    atomic<int> a2{ static_cast<int>(a1) };

The class std::atomic<Type> provides several public members, shown below. Non-member (free) functions operating on atomic<Type> objects are also available.
The std::memory_order enumeration defines the following symbolic constants, which are used to specify ordering constraints of atomic operations:

memory_order_acq_rel:
the operation must be a read-modify-write operation, combining memory_order_acquire and memory_order_release;

memory_order_acquire:
the operation is an acquire operation. It synchronizes with a release operation that wrote the same memory location;

memory_order_consume:
the operation is a consume operation on the involved memory location;

memory_order_relaxed:
no ordering constraints are provided by the operation;

memory_order_release:
the operation is a release operation. It synchronizes with acquire operations on the same location;

memory_order_seq_cst:
the default memory order specification for all operations. Memory storing operations use memory_order_release, memory load operations use memory_order_acquire, and read-modify-write operations use memory_order_acq_rel.

The memory order cannot be specified for the overloaded operators provided by atomic<Type>. Otherwise, most atomic member functions may also be given a final memory_order argument. Where this is not available it is explicitly mentioned at the function's description.
Here are the standard std::atomic<Type> member functions:

bool compare_exchange_strong(Type &currentValue, Type newValue) noexcept:
the object's value is compared to currentValue using byte-wise comparisons. If equal (and true is returned) then newValue is stored in the atomic object; if unequal (and false is returned) the object's current value is stored in currentValue;

bool compare_exchange_weak(Type &currentValue, Type newValue) noexcept:
the object's value is compared to currentValue using byte-wise comparisons. If equal (and true is returned), then newValue is stored in the atomic object; if unequal, or if newValue cannot be atomically assigned to the current object, false is returned and the object's current value is stored in currentValue;

Type exchange(Type newValue) noexcept:
the object's current value is returned, after which newValue is assigned to the current object;

bool is_lock_free() const noexcept:
if the object's operations are lock-free true is returned, otherwise false. This member has no memory_order parameter;

Type load() const noexcept:
the object's value is returned;

operator Type() const noexcept:
the object's value is returned;

void store(Type newValue) noexcept:
newValue is assigned to the current object. Note that the standard assignment operator can also be used.

In addition to the above members, integral atomic types `Integral' (essentially the atomic variants of all built-in integral types) also offer the following member functions:
Integral fetch_add(Integral value) noexcept:
value is added to the object's value, and the object's value at the time of the call is returned;

Integral fetch_sub(Integral value) noexcept:
value is subtracted from the object's value, and the object's value at the time of the call is returned;

Integral fetch_and(Integral mask) noexcept:
the bit-and operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;

Integral fetch_or(Integral mask) noexcept:
the bit-or operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;

Integral fetch_xor(Integral mask) noexcept:
the bit-xor operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;

Integral operator++() noexcept:
the prefix increment operator, returning the object's new value;

Integral operator++(int) noexcept:
the postfix increment operator, returning the object's value before it was incremented;

Integral operator--() noexcept:
the prefix decrement operator, returning the object's new value;

Integral operator--(int) noexcept:
the postfix decrement operator, returning the object's value before it was decremented;

Integral operator+=(Integral value) noexcept:
value is added to the object's current value and the object's new value is returned;

Integral operator-=(Integral value) noexcept:
value is subtracted from the object's current value and the object's new value is returned;

Integral operator&=(Integral mask) noexcept:
the bit-and operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned;

Integral operator|=(Integral mask) noexcept:
the bit-or operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned;

Integral operator^=(Integral mask) noexcept:
the bit-xor operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned.

Some of the free functions have names ending in _explicit. The _explicit functions define an additional parameter `memory_order order', which is not available for the non-_explicit functions (e.g., atomic_load(atomic<Type> *ptr) and atomic_load_explicit(atomic<Type> *ptr, memory_order order)).
Here are the free functions that are available for all atomic types:
bool std::atomic_compare_exchange_strong(_explicit)(std::atomic<Type> *ptr, Type *oldValue, Type newValue) noexcept:
returns ptr->compare_exchange_strong(*oldValue, newValue);

bool std::atomic_compare_exchange_weak(_explicit)(std::atomic<Type> *ptr, Type *oldValue, Type newValue) noexcept:
returns ptr->compare_exchange_weak(*oldValue, newValue);

Type std::atomic_exchange(_explicit)(std::atomic<Type> *ptr, Type newValue) noexcept:
returns ptr->exchange(newValue);

void std::atomic_init(std::atomic<Type> *ptr, Type init) noexcept:
stores init non-atomically in *ptr. The object pointed to by ptr must have been default constructed, and as yet no member functions must have been called for it. This function has no memory_order parameter;

bool std::atomic_is_lock_free(std::atomic<Type> const *ptr) noexcept:
returns ptr->is_lock_free(). This function has no memory_order parameter;

Type std::atomic_load(_explicit)(std::atomic<Type> *ptr) noexcept:
returns ptr->load();

void std::atomic_store(_explicit)(std::atomic<Type> *ptr, Type value) noexcept:
calls ptr->store(value).

In addition to the above-mentioned free functions, atomic<Integral> types also offer the following free functions:
Integral std::atomic_fetch_add(_explicit)(std::atomic<Integral> *ptr, Integral value) noexcept:
returns ptr->fetch_add(value);

Integral std::atomic_fetch_sub(_explicit)(std::atomic<Integral> *ptr, Integral value) noexcept:
returns ptr->fetch_sub(value);

Integral std::atomic_fetch_and(_explicit)(std::atomic<Integral> *ptr, Integral mask) noexcept:
returns ptr->fetch_and(mask);

Integral std::atomic_fetch_or(_explicit)(std::atomic<Integral> *ptr, Integral mask) noexcept:
returns ptr->fetch_or(mask);

Integral std::atomic_fetch_xor(_explicit)(std::atomic<Integral> *ptr, Integral mask) noexcept:
returns ptr->fetch_xor(mask).

The well-known quicksort algorithm sorts an array of n elements. Briefly, it works like this: one of the array's elements is selected (the pivot), and the remaining elements are partitioned into a group of elements smaller than the pivot and a group of elements at least as large as the pivot (a generic algorithm partition performing the partition is available). This leaves us with two (possibly empty) sub-arrays: one to the left of the pivot element, and one to the right of the pivot element; quicksort is then recursively applied to both sub-arrays. Converting this algorithm to a multi-threaded algorithm appears to be a simple task:
    void quicksort(Iterator begin, Iterator end)
    {
        if (end - begin < 2)        // less than 2 elements are left
            return;                 // and we're done

        Iterator pivot = partition(begin, end); // determine an iterator
                                                // pointing to the pivot
                                                // element
        thread lhs(quicksort, begin, pivot);    // start threads on the
                                                // left-hand side sub-array
        thread rhs(quicksort, pivot + 1, end);  // and on the right-hand
                                                // side sub-array
        lhs.join();
        rhs.join();                             // and we're done
    }

Unfortunately, this translation to a multi-threaded approach won't work for reasonably large arrays because of a phenomenon called overpopulation: more threads are started than the operating system is prepared to give us. In those cases a Resource temporarily unavailable exception is thrown, and the program ends.
Overpopulation can be avoided by using a pool of workers, where each `worker' is a thread, which in this case is responsible for handling one (sub-)array, but not for the nested calls. The pool of workers is controlled by a scheduler, receiving the requests to sort sub-arrays, and passing these requests on to the next available worker.
The main data structure of the example program developed in this section is a queue of std::pairs containing iterators of the array to be sorted (cf. Figure 26; the sources of the program are found in the C++ Annotations's yo/threading/examples/multisort directory). Two queues are being used: one queue is a task-queue, receiving the iterators of sub-arrays to be partitioned. Instead of immediately launching new threads (the lhs and rhs threads in the above example), the ranges to be sorted are pushed on the task-queue. The other queue is the work-queue: elements are moved from the task-queue to the work-queue, where they will be processed by one of the worker threads.

The program's main function starts the workforce, reads the data, pushes the array's begin and end iterators on the task queue and then starts the scheduler. Once the scheduler ends the sorted array is displayed:

    int main()
    {
        workForce();        // start the worker threads
        readData();         // read the data into vector<int> g_data

        g_taskQ.push(       // prepare the main task
            Pair(g_data.begin(), g_data.end())
        );

        scheduler();        // sort g_data
        display();          // show the sorted elements
    }

The workforce consists of a bunch of detached threads. Each thread represents a worker, implemented in the function void worker. Since the number of worker threads is fixed, overpopulation doesn't occur. Once the array has been sorted and the program stops, these detached threads simply end:
    for (size_t idx = 0; idx != g_sizeofWorkforce; ++idx)
        thread(worker).detach();
The scheduler continues for as long as there are sub-arrays to sort. When this is the case the task queue's front element is moved to the work queue. This reduces the task queue's size, and prepares an assignment for the next available worker. The scheduler now waits until a worker is available. Once a worker is available it is informed of the waiting assignment, and the scheduler waits for the next task:
    void scheduler()
    {
        while (newTask())
        {
            g_workQ.rawPushFront(g_taskQ);

            g_workforce.wait();     // wait for a worker to be available
            g_worker.notify_all();  // activate a worker
        }
    }

The function newTask simply checks whether the task queue is empty. If so, and none of the workers is currently busy sorting a sub-array, then the array has been sorted, and newTask can return false. When the task queue is empty but a worker is still busy, it may be that new sub-array dimensions are going to be placed on the task queue by an active worker. Whenever a worker is active the Semaphore g_workforce's size is less than the size of the work force:
    bool wip()
    {
        return g_workforce.size() != g_sizeofWorkforce;
    }

    bool newTask()
    {
        bool done;

        unique_lock<mutex> lk(g_taskMutex);
        while ((done = g_taskQ.empty()) && wip())
            g_taskCondition.wait(lk);

        return not done;
    }

Each detached worker thread performs a continuous loop. In the loop it waits for a notification by the scheduler. Once it receives a notification it retrieves its assignment from the work queue, and partitions the sub-array specified in its assignment. Partitioning may result in new tasks. Once this has been completed the worker has completed its assignment: it increments the available workforce and notifies the scheduler that it should check whether all tasks have been performed:
    void worker()
    {
        while (true)
        {
            g_worker.wait();        // wait for action
            partition(g_workQ.popFront());
            g_workforce.notify_all();

            lock_guard<mutex> lk(g_taskMutex);
            g_taskCondition.notify_one();
        }
    }

Sub-arrays smaller than two elements need no partitioning. All larger sub-arrays are partitioned relative to their first element. The std::partition generic algorithm does this well, but if the pivot is itself an element of the array to partition then the pivot's eventual location is undetermined: it may be found anywhere in the series of elements which are at least equal to the pivot. The two required sub-arrays, however, can easily be constructed:
- std::partition is called relative to an array's first element, partitioning the array's remaining elements and returning mid, pointing to the first element of the series of elements that are at least as large as the array's first element;
- the array's first element is swapped with the element to which mid - 1 points;
- this results in two new tasks: from array.begin() to mid - 1 (elements all smaller than the pivot), and from mid to array.end() (elements all at least as large as the pivot).

    void partition(Pair const &range)
    {
        if (range.second - range.first < 2)
            return;

        auto rhsBegin = partition(range.first + 1, range.second,
                [=](int value)
                {
                    return value < *range.first;
                }
            );
        auto lhsEnd = rhsBegin - 1;

        swap(*range.first, *lhsEnd);

        pushTask(range.first, lhsEnd);
        pushTask(rhsBegin, range.second);
    }

Asynchronously executed tasks communicate their results through shared states. Objects that contain such shared states are called asynchronous return objects. However, due to the nature of multi-threading, a thread may request the results of an asynchronous return object before these results are actually available. In those cases the requesting thread blocks, waiting for the results to become available. Asynchronous return objects offer wait and get members which, respectively, wait until the results have become available and produce the asynchronous results once they are available. The phrase that is used to indicate that the results are available is `the shared state has been made ready'.
Shared states are made ready by asynchronous providers. Asynchronous providers are simply objects or functions providing results to shared states. Making a shared state ready means that an asynchronous provider
stores a value or an exception in the shared state, and unblocks all threads waiting for the shared state to become ready (allowing, e.g., blocked wait calls to return).

Once a shared state has been made ready it contains a value, object, or exception which can be retrieved by objects having access to the shared state. While code is waiting for a shared state to become ready the value or exception that is going to be stored in the shared state may be computed. When multiple threads try to access the same shared state they must use synchronizing mechanisms (like mutexes, cf. section 20.2) to prevent access-conflicts.
Shared states use reference counting to keep track of the number of asynchronous return objects or asynchronous providers that hold references to them. These return objects and providers may release their references to their shared states (which is called `releasing the shared state'). When a return object or provider holds the last reference to the shared state and releases it, the shared state is destroyed.
On the other hand, an asynchronous provider may also abandon its shared state. In that case the provider, in sequence,

- stores an exception of the type std::future_error, holding the error condition std::broken_promise, in its shared state;
- makes its shared state ready, and
- releases its shared state.

Objects of the class std::future (see the next section) are asynchronous return objects. They can be produced by the std::async (section 20.10) family of functions, and by objects of the classes std::packaged_task (section 20.11) and std::promise (section 20.12).
So far, waiting for a thread to complete meant calling its join member. Waiting may be unwelcome: instead of just waiting our thread might also be doing something useful. It might as well pick up the results produced by a sub-thread at some point in the future.
In fact, exchanging data among threads always poses some difficulties, as it requires shared variables, and the use of locks and mutexes to prevent data corruption. Rather than waiting and using locks it would be nice if some asynchronous task could be started, allowing the initiating thread (or even other threads) to pick up the result at some point in the future, when the results are needed, without having to worry about data locks or waiting times. For situations like these C++ provides the class std::future.

Before using the class std::future the <future> header file must be included.
Objects of the class template std::future harbor the results produced by asynchronously executed tasks. The template's type parameter specifies the type of the result returned by the asynchronously executed task. This type may be void.

The asynchronously executed task may also throw an exception (ending the task). In that case the future object catches the exception, and rethrows it once its return value (i.e., the value returned by the asynchronously executed task) is requested.
In this section the members of the class template future are described. Future objects are commonly initialized through anonymous future objects returned by the factory function std::async or by the get_future members of the classes std::promise and std::packaged_task (introduced in upcoming sections). Examples of the use of std::future objects are provided in those sections.
Some of future's members return a value of the strongly typed enumeration std::future_status. This enumeration defines three symbolic constants: future_status::ready, future_status::timeout, and future_status::deferred.
Error conditions are returned through std::future_error exceptions. These error conditions are represented by the values of the strongly typed enumeration std::future_errc (covered in the next section).
The class future itself provides the following constructors:

future():
the default constructor: the resulting future object does not refer to shared results. Its valid member returns false.

future(future &&tmp) noexcept:
the move constructor: the newly created object's valid member returns what tmp.valid() would have returned prior to the constructor invocation. After calling the move constructor tmp.valid() returns false.

The class future does not offer a copy constructor or an overloaded assignment operator.

Here are the members of the class std::future:
future &operator=(future &&tmp):
the move assignment operator grabs the information from the tmp object; following this, tmp.valid() returns false.

std::shared_future<ResultType> share() &&:
returns a std::shared_future<ResultType> (see section 20.9). After calling this function the future's valid member returns false.

ResultType get():
first wait (see below) is called. Once wait has returned, the results produced by the associated asynchronous task are returned. With future<Type> specifications the returned value is the moved shared value if Type supports move assignment, otherwise a copy is returned. With future<Type &> specifications a Type & is returned; with future<void> specifications nothing is returned. If the shared value is an exception, it is thrown instead of returned. After calling this member the future object's valid member returns false.

bool valid() const:
returns true if the future object for which valid is called refers to an object returned by an asynchronous task. If valid returns false, the future object exists, but in addition to valid only its destructor and move constructor can safely be called. When other members are called while valid returns false a std::future_error exception is thrown (having the value future_errc::no_state).

void wait() const:
blocks until the results of the associated asynchronous task are available.

std::future_status wait_for(chrono::duration<Rep, Period> const &rel_time) const:
a member template, deriving Rep and Period from the actually specified duration (cf. section 4.2.2). If the results contain a deferred function nothing happens. Otherwise wait_for blocks until the results are available or until the amount of time specified by rel_time has expired. Possible return values are:
- future_status::deferred if the results contain a deferred function;
- future_status::ready if the results are available;
- future_status::timeout if the function returns because the amount of time specified by rel_time has expired.

future_status wait_until(chrono::time_point<Clock, Duration> const &abs_time) const:
a member template, deriving Clock and Duration from the actually specified abs_time (cf. section 4.2.4). If the results contain a deferred function nothing happens.
Otherwise wait_until blocks until the results are available or until the point in time specified by abs_time has passed. Possible return values are:
- future_status::deferred if the results contain a deferred function;
- future_status::ready if the results are available;
- future_status::timeout if the function returns because the point in time specified by abs_time has passed.

The class std::future<ResultType> declares the following friends: std::promise<ResultType> (cf. section 20.12), and

    template <typename Function, typename ...Args>
    std::future<typename result_of<Function(Args...)>::type>
        std::async(std::launch, Function &&fun, Args &&...args);

(cf. section 20.10).
The class std::future may return errors by throwing std::future_error exceptions. These error conditions are represented by the values of the strongly typed enumeration std::future_errc, which defines the following symbolic constants:

broken_promise:
Broken_promise is thrown when a future object was received whose value was never assigned by a promise or packaged_task. For example, an object of the class promise<int> should set the value of the future<int> object returned by its get_future member (cf. section 20.12), but if it doesn't do so, then a broken_promise exception is thrown, as illustrated by the following program:

     1: std::future<int> fun()
     2: {
     3:     return std::promise<int>().get_future();
     4: }
     5:
     6: int main()
     7: try
     8: {
     9:     fun().get();
    10: }
    11: catch (std::exception const &exc)
    12: {
    13:     std::cerr << exc.what() << '\n';
    14: }

At line 3 a promise object is created, but its value is never set. Consequently, it `breaks its promise' to produce a value: when main tries to retrieve its value (in line 9) a std::future_error exception is thrown containing the future_errc::broken_promise value.
future_already_retrieved:
Future_already_retrieved is thrown when multiple attempts are made to retrieve the future object from, e.g., a promise or packaged_task object that (eventually) should be ready. For example:

     1: int main()
     2: {
     3:     std::promise<int> promise;
     4:     promise.get_future();
     5:     promise.get_future();
     6: }

Note that after defining the std::promise object in line 3 it has merely been defined: no value is ever assigned to its future. Even though no value is assigned to the future object, it is a valid object. I.e., after some time the future should be ready, and the future's get member should produce a value. Hence, line 4 succeeds, but then, in line 5, the exception is thrown as `the future has already been retrieved'.
promise_already_satisfied is thrown when multiple attempts are made to assign a value to a promise object. Assigning a value or exception_ptr to the future of a promise object may happen only once. For example:

    1: int main()
    2: {
    3:     std::promise<int> promise;
    4:     promise.set_value(15);
    5:     promise.set_value(155);
    6: }
no_state is thrown when a member function (other than valid, see below) of a future object is called when its valid member returns false. This happens, e.g., when calling members of a default constructed future object. No_state is not thrown for future objects returned by the async factory function or returned by the get_future members of promise or packaged_task type of objects. Here is an example:

    1: int main()
    2: {
    3:     std::future<int> fut;
    4:     fut.get();
    5: }
The class std::future_error is derived from the class std::exception and offers, in addition to the char const *what() const member, the member std::error_code const &code() const, returning an std::error_code object associated with the thrown exception.
When an asynchronous task has been started (e.g., using std::async) then the return value of the asynchronously called function becomes available in its activating thread through a std::future object. The future object cannot be used by another thread. If this is required (e.g., see this chapter's final section) the future object must be converted to a std::shared_future object. Before using the class std::shared_future the <future> header file must be included.
Once a shared_future object is available, its get member (see below) can repeatedly be called to retrieve the results of the original future object. This is illustrated by the next small example:
     1: int main()
     2: {
     3:     std::promise<int> promise;
     4:     promise.set_value(15);
     5: 
     6:     auto fut = promise.get_future();
     7:     auto shared1 = fut.share();
     8: 
     9:     std::cerr << "Result: " << shared1.get() << "\n"
    10:                  "Result: " << shared1.get() << "\n"
    11:                  "Valid:  " << fut.valid() << '\n';
    12: 
    13:     auto shared2 = fut.share();
    14: 
    15:     std::cerr << "Result: " << shared2.get() << "\n"
    16:                  "Result: " << shared2.get() << '\n';
    17: }

In lines 9 and 10 the promise's results are retrieved multiple times, but having obtained the shared_future in line 7, the original future object no longer has an associated shared state. Therefore, when another attempt is made (in line 13) to obtain the shared_future, a `no associated state' exception is thrown and the program aborts. However, multiple copies of shared_future objects may co-exist. When multiple copies of shared_future objects exist (e.g., in different threads), the results of the associated asynchronous task are made ready (become available) at exactly the same moment in time.
The relationship between the classes future and shared_future resembles the relationship between the classes unique_ptr and shared_ptr: there can be only one unique_ptr pointing to data, whereas there can be many shared_ptr objects, each pointing to the same data.
The effect of calling any member of a shared_future object for which valid() == false, other than the destructor, the move-assignment operator, or valid, is undefined.
The class shared_future supports the following constructors:

shared_future() noexcept: an empty shared_future object is constructed that does not refer to shared results. After using this constructor the object's valid member returns false.

shared_future(shared_future const &other): a shared_future object is constructed that refers to the same results as other (if any). After using this constructor the object's valid member returns the same value as other.valid().

shared_future(shared_future<Result> &&tmp) noexcept: move constructs a shared_future object that refers to the results that were originally referred to by tmp (if any). After using this constructor the object's valid member returns the same value as tmp.valid() returned prior to the constructor invocation, and tmp.valid() returns false.

shared_future(future<Result> &&tmp) noexcept: move constructs a shared_future object that refers to the results that were originally referred to by tmp (if any). After using this constructor the object's valid member returns the same value as tmp.valid() returned prior to the constructor invocation, and tmp.valid() returns false.
The class's destructor destroys the shared_future object for which it is called. If the object for which the destructor is called is the last shared_future object, and no std::promise or std::packaged_task is associated with the results associated with the current object, then the results are also destroyed.

Here are the members of the class std::shared_future:
shared_future &operator=(shared_future &&tmp): tmp's results are moved to the current object. After calling the move assignment operator the current object's valid member returns the same value as tmp.valid() returned prior to the invocation of the move assignment operator, and tmp.valid() returns false;

shared_future &operator=(shared_future const &rhs): rhs's results are shared with the current object. After calling the assignment operator the current object's valid member returns the same value as rhs.valid();

Result const &shared_future::get() const: (specializations for shared_future<Result &> and shared_future<void> are also available.) This member waits until the shared results are available, and subsequently returns Result const &. Note that access to the data stored in Result, accessed through get, is not synchronized. It is the responsibility of the programmer to avoid race conditions when accessing Result's data. If Result holds an exception, it is thrown when get is called;

bool valid() const: returns true if the current object refers to shared results;

void wait() const: blocks until the shared results are available;

future_status wait_for(const chrono::duration<Rep, Period> &rel_time) const: (Rep and Period normally are derived by the compiler from the actual rel_time specification.) If the shared results contain a deferred function (cf. section 20.10) nothing happens. Otherwise wait_for blocks until the associated asynchronous task has produced its results, or until the relative time specified by rel_time has expired. The member returns future_status::deferred if the shared results contain a deferred function; future_status::ready if the shared results are available; future_status::timeout if the function returns because the amount of time specified by rel_time has expired;

future_status wait_until(const chrono::time_point<Clock, Duration> &abs_time) const: (Clock and Duration normally are derived by the compiler from the actual abs_time specification.) If the shared results contain a deferred function nothing happens.
Otherwise wait_until blocks until the shared results are available or until the point in time specified by abs_time has expired. Possible return values are: future_status::deferred if the shared results contain a deferred function; future_status::ready if the shared results are available; future_status::timeout if the function returns because the point in time specified by abs_time has expired.

In this section the function template std::async is covered. Async is used to start asynchronous tasks, returning values (or void) to the calling thread, which is hard to realize merely using the std::thread class. Before using the function async the <future> header file must be included.
When starting a thread using the facilities of the class std::thread the initiating thread at some point commonly calls the thread's join method. At that point the thread must have finished, or execution blocks until join returns. While this often is a sensible course of action, it may not always be: maybe the function implementing the thread has a return value, or it could throw an exception.

In those cases join cannot be used: if an exception leaves a thread, then your program ends. Here is an example:
     1: void thrower()
     2: {
     3:     throw std::exception();
     4: }
     5: 
     6: int main()
     7: try
     8: {
     9:     std::thread subThread(thrower);
    10: }
    11: catch (...)
    12: {
    13:     std::cerr << "Caught exception\n";
    14: }

In line 3 thrower throws an exception, leaving the thread. This exception is not caught by main's try-block (as it is defined in another thread). As a consequence, the program terminates. This scenario doesn't occur when std::async is used. Async may start a new asynchronous task, and the activating thread may retrieve the return value of the function implementing the asynchronous task, or any exception leaving that function, from a std::future object returned by the async function. Basically, async is called similarly to the way a thread is started using std::thread: it is passed a function and optionally arguments which are forwarded to the function.
Although the function implementing the asynchronous task may be passed as first argument, async's first argument may also be a value of the strongly typed enumeration std::launch:

    enum class launch
    {
        async,
        deferred
    };

When passing launch::async the asynchronous task immediately starts; when passing launch::deferred the asynchronous task is deferred. When std::launch is not specified the default value launch::async | launch::deferred is used, giving the implementation freedom of choice, usually resulting in deferring execution of the asynchronous task.
So, here is the first example again, this time using async to start the sub-thread:
     1: bool fun()
     2: {
     3:     return std::cerr << "    hello from fun\n";
     4: }
     5: int exceptionalFun()
     6: {
     7:     throw std::exception();
     8: }
     9: 
    10: int main()
    11: try
    12: {
    13:     auto fut1 = std::async(std::launch::async, fun);
    14:     auto fut2 = std::async(std::launch::async, exceptionalFun);
    15: 
    16:     std::cerr << "fun returned " << std::boolalpha << fut1.get() << '\n';
    17:     std::cerr << "exceptionalFun did not return " << fut2.get() << '\n';
    18: }
    19: catch (...)
    20: {
    21:     std::cerr << "caught exception thrown by exceptionalFun\n";
    22: }

Now the threads immediately start, but although the results are available around line 13, the thrown exception doesn't terminate the program. The first thread's return value is made available in line 16, the exception thrown by the second thread is simply caught by main's try-block (line 19).

The function template async has several overloaded versions:
The first overload expects the function (or functor) to call and the arguments to forward to it, returning a std::future holding the function's return value or the exception thrown by the function:

    template <typename Function, class ...Args>
    std::future<typename std::result_of<Function(Args ...)>::type>
        std::async(Function &&fun, Args &&...args);
The second overload additionally expects, as its first argument, a launch policy: one of the values (or a combination, constructed using the bit_or operator) of the enumeration values of the std::launch enumeration:

    template <class Function, class ...Args>
    std::future<typename std::result_of<Function(Args ...)>::type>
        std::async(std::launch policy, Function &&fun, Args &&...args);
In addition to plain functions (or functors) and std::launch values, the second argument may also be the address of a member function. In that case the (required) third argument is an object (or a pointer to an object) of that member function's class. Any remaining arguments are passed to the member function (see also the remarks below).

When calling async all arguments except for the std::launch argument must be references, pointers or move-constructible objects:

if an lvalue argument is passed to the async function template then copy construction is used to construct a copy of the argument which is then forwarded to the thread-launcher;

if an anonymous (rvalue) argument is passed to the async function template then move construction is used to forward the anonymous object to the thread launcher.

When execution is deferred the function isn't executed until the program asks for its results (e.g., when the future's get member is called). Because of the default std::launch::deferred | std::launch::async argument used by the basic async call it is likely that the function which is passed to async doesn't immediately start. The launch::deferred policy allows the implementor to defer its execution until the program explicitly asks for the function's results. Consider the following program:
     1: void fun()
     2: {
     3:     std::cerr << "    hello from fun\n";
     4: }
     5: 
     6: std::future<void> asyncCall(char const *label)
     7: {
     8:     std::cerr << label << " async call starts\n";
     9:     auto ret = std::async(fun);
    10:     std::cerr << label << " async call ends\n";
    11:     return ret;
    12: }
    13: 
    14: int main()
    15: {
    16:     asyncCall("First");
    17:     asyncCall("Second");
    18: }

Although async is called in line 9, the program's output may not show fun's output line when it is run. This is a result of the (default) use of launch::deferred: the system simply defers fun's execution until requested, which doesn't happen. But the future object that's returned by async has a member wait. Once wait returns the shared state must be available. In other words: fun must have finished. Here is what happens when after line 9 the line ret.wait() is inserted:

    First async call starts
        hello from fun
    First async call ends
    Second async call starts
        hello from fun
    Second async call ends
Actually, evaluation of fun can be requested at the point where we need its results, maybe even after calling asyncCall, as shown in the next example:
    1: int main()
    2: {
    3:     auto ret1 = asyncCall("First");
    4:     auto ret2 = asyncCall("Second");
    5: 
    6:     ret1.get();
    7:     ret2.get();
    8: }

Here the ret1 and ret2 std::future objects are created, but their fun functions aren't evaluated yet. Evaluation occurs at lines 6 and 7, resulting in the following output:

    First async call starts
    First async call ends
    Second async call starts
    Second async call ends
        hello from fun
        hello from fun
The std::async function template is used to start a thread, making its results available to the calling thread. On the other hand, we may only be able to prepare (package) a task (a thread), but may have to leave the completion of the task to another thread. Scenarios like this are realized through objects of the class std::packaged_task, which is the topic of the next section.
The class template std::packaged_task allows a program to `package' a function or functor and pass the package to a thread for further processing. The processing thread then calls the packaged function, passing it its arguments (if any). After completing the function the packaged_task's future is ready, allowing the program to retrieve the results produced by the function. Thus, functions and the results of function calls can be transferred between threads. Before using the class template packaged_task the <future> header file must be included.
Before describing the class's interface, let's first look at an example to get an idea about how a packaged_task can be used. Remember that the essence of packaged_task is that part of your program prepares (packages) a task for another thread to complete, and that the program at some point needs the result of the completed task.
To clarify what's happening here, let's first look at a real-life analogy. Every now and then I make an appointment with my garage to have my car serviced. The `package' in this case consists of the details about my car: its make and type determine the kind of actions my garage performs when servicing it. My neighbor also has a car, which also needs to be serviced every now and then. This also results in a `package' for the garage. At the appropriate time my neighbor and I take our cars to the garage (i.e., the packages are passed to another thread). The garage services the cars (i.e., calls the functions stored in the packaged_tasks [note that the tasks differ, depending on the types of the cars]), and performs some actions that are associated with it (e.g., registering that my or my neighbor's car has been serviced, or ordering replacement parts). In the meantime my neighbor and I go about our own business (the program continues while a separate thread runs as well). But by the end of the day we'd like to use our cars again (i.e., get the results associated with the packaged_task). A common result in this example is the garage's bill, which we have to pay (the program obtains the packaged_task's results).
Here is a little C++ program illustrating the use of a packaged_task (assuming the required headers and using namespace std have been specified):
     1: mutex carDetailsMutex;
     2: condition_variable condition;
     3: string carDetails;
     4: packaged_task<size_t (std::string const &)> serviceTask;
     5: 
     6: size_t volkswagen(string const &type)
     7: {
     8:     cout << "performing maintenance by the book for a " << type << '\n';
     9:     return type.size() * 75;        // the size of the bill
    10: }
    11: 
    12: size_t peugeot(string const &type)
    13: {
    14:     cout << "performing quick and dirty maintenance for a " << type << '\n';
    15:     return type.size() * 50;        // the size of the bill
    16: }
    17: 
    18: void garage()
    19: {
    20:     while (true)
    21:     {
    22:         unique_lock<mutex> lk(carDetailsMutex);
    23:         while (carDetails.empty())
    24:             condition.wait(lk);
    25: 
    26:         cout << "servicing a " << carDetails << '\n';
    27:         serviceTask(carDetails);
    28:         carDetails.clear();
    29:     }
    30: }
    31: 
    32: int main()
    33: {
    34:     thread(garage).detach();
    35: 
    36:     while (true)
    37:     {
    38:         string car;
    39:         if (not getline(cin, car) || car.empty())
    40:             break;
    41:         {
    42:             lock_guard<mutex> lk(carDetailsMutex);
    43:             carDetails = car;
    44:         }
    45:         serviceTask = packaged_task<size_t (string const &)>(
    46:                             car[0] == 'v' ? volkswagen : peugeot
    47:                       );
    48:         auto bill = serviceTask.get_future();
    49:         condition.notify_one();
    50:         cout << "Bill for servicing a " << car <<
    51:                 ": EUR " << bill.get() << '\n';
    52:     }
    53: }

At line 4 the program defines a packaged_task: serviceTask is initialized with a function (or functor) expecting a string, returning a size_t. The functions volkswagen and peugeot represent the tasks to perform when cars of the provided types come in for service; presumably they return the bill. The function void garage defines the actions performed by the garage when cars come in for service. These actions are performed by a separate detached thread, starting in line 34. In a continuous loop it waits until it obtains a lock on the carDetailsMutex and carDetails is no longer empty. Then, at line 27, it passes carDetails to the packaged_task `serviceTask'. By itself this is not identical to calling the packaged_task's function, but eventually its function will be called.
At this point the packaged_task receives its function's arguments, which it eventually forwards to its configured function. Finally, at line 28 it clears carDetails, thus preparing itself for the next request.

In main a thread running garage is started (line 34). Then the name of the next car to service is read; depending on its first character the servicing function is selected (volkswagen or peugeot), and the packaged_task, provided with the right servicing function, is constructed next (line 45). Then the task's results, available through its future, are retrieved (line 48). Although at this point the future might not be ready, the future object itself is, and it is simply returned as the bill. The garage is subsequently notified (line 49), and the bill is obtained through bill.get() in line 51. If, by this time, the car is still being serviced, the bill isn't ready yet, and bill.get() blocks until it is, and the bill for servicing a car is shown.

Having seen an example of the use of a packaged_task, let's have a look at its interface. Note that the class packaged_task is a class template: its template type parameter specifies the prototype of a function or function object implementing the task performed by the packaged_task object.

Constructors and destructor:
packaged_task() noexcept: a default packaged_task object is constructed which is not associated with a function or shared state;

explicit packaged_task<ReturnType(Args...)> task(fun): a packaged_task is constructed for a function or functor fun expecting arguments of types Args..., and returning a value of type ReturnType. The packaged_task class template specifies ReturnType (Args...) as its template type parameter. The constructed object contains a shared state, and a (move constructed) copy of fun. Optionally an Allocator may be specified as second template type parameter, in which case the first two arguments are std::allocator_arg_t, Allocator const &alloc. The type std::allocator_arg_t is a type introduced to disambiguate constructor selections, and can simply be specified as std::allocator_arg_t(). This constructor may throw a std::bad_alloc exception or exceptions thrown by fun's copy or move constructors.
packaged_task(packaged_task &&tmp) noexcept: the shared state and stored task of tmp are transferred to the newly constructed object, removing the shared state from tmp;

~packaged_task(): the destructor abandons the object's shared state (if any) and destroys the stored task.

Member functions:
future<ReturnType> get_future(): a std::future object is returned holding the results of the separately executed thread. When get_future is incorrectly called a future_error exception is thrown, containing one of the following values: future_already_retrieved if get_future was already called on a packaged_task object containing the same shared state as the current object; no_state if the current object has no shared state. Also futures that share the object's shared state may access the result returned by the object's task.

void make_ready_at_thread_exit(Args... args): the stored task is called as with void operator()(Args... args) (see below), but the shared state is only made ready when the current thread exits, once all objects of thread storage duration associated with the current thread have been destroyed.

packaged_task &operator=(packaged_task &&tmp): the shared states of the current object and tmp are swapped;

void operator()(Args... args): the args arguments are forwarded to the current object's stored task. When the stored task returns, its return value is stored in the current object's shared state. Otherwise any exception thrown by the task is stored in the object's shared state. Following this the object's shared state is made ready, and any threads blocked in a function waiting for the object's shared state to become ready are unblocked. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state. A (shared_)future object provides access to the packaged_task's results.

void reset(): the object's shared state is abandoned, and a new shared state is constructed as if executing packaged_task(std::move(funct)), where funct is the object's stored task.
This member may throw the following exceptions: bad_alloc if memory for the new shared state could not be allocated; future_error with a no_state error condition if the current object contains no shared state.

void swap(packaged_task &other) noexcept: the shared states and stored tasks of the current object and other are exchanged;

bool valid() const noexcept: returns true if the current object contains a shared state, otherwise false is returned.

The following non-member (free) function operating on packaged_task objects is available:
void swap(packaged_task<ReturnType(Args...)> &lhs, packaged_task<ReturnType(Args...)> &rhs) noexcept: calls lhs.swap(rhs).

In addition to std::packaged_task and std::async the class template std::promise can be used to obtain the results from a separate thread. Before using the class template promise the <future> header file must be included.
A promise is used to obtain the results from another thread without further synchronization requirements. Consider the following program:
    void compute(int *ret)
    {
        *ret = 9;
    }

    int main()
    {
        int ret = 0;

        std::thread(compute, &ret).detach();
        cout << ret << '\n';
    }

Chances are that this program shows the value 0: the cout statement has already been executed before the detached thread has had a chance to complete its work. In this example that problem can easily be solved by using a non-detached thread, and using the thread's join member, but when multiple threads are used that requires named threads and as many join calls. Instead, using a promise might be preferred:

     1: void compute(promise<int> &ref)
     2: {
     3:     ref.set_value(9);
     4: }
     5: 
     6: int main()
     7: {
     8:     std::promise<int> prom;
     9:     std::thread(compute, ref(prom)).detach();
    10: 
    11:     cout << prom.get_future().get() << '\n';
    12: }

This example also uses a detached thread, but its results are kept for future reference in a promise object, instead of directly being assigned to a final destination variable. The promise object contains a future object holding the computed value. The future's get member blocks until the future has been made ready, at which point the result becomes available. By then the detached thread may or may not yet have completed. If it already completed its work then get immediately returns, otherwise there will be a slight delay.

Promises are useful when implementing a multi-threaded version of some algorithm without having to use additional synchronization statements. As an example consider matrix multiplications. Each element of the resulting product matrix is computed as the inner product of two vectors: the inner product of a row of the left-hand matrix operand and a column of the right-hand matrix operand becomes element [row][column] of the resulting matrix. Since each element of the resulting matrix can be computed independently from the other elements, a multi-threaded implementation is well possible. In the following example the function innerProduct (lines 4..11) leaves its result in a promise object:
     1: int m1[2][2] = {{1, 2}, {3, 4}};
     2: int m2[2][2] = {{3, 4}, {5, 6}};
     3: 
     4: void innerProduct(promise<int> &ref, int row, int col)
     5: {
     6:     int sum = 0;
     7:     for (int idx = 0; idx != 2; ++idx)
     8:         sum += m1[row][idx] * m2[idx][col];
     9: 
    10:     ref.set_value(sum);
    11: }
    12: 
    13: int main()
    14: {
    15:     promise<int> result[2][2];
    16: 
    17:     for (int row = 0; row != 2; ++row)
    18:     {
    19:         for (int col = 0; col != 2; ++col)
    20:             thread(innerProduct, ref(result[row][col]), row, col).detach();
    21:     }
    22: 
    23:     for (int row = 0; row != 2; ++row)
    24:     {
    25:         for (int col = 0; col != 2; ++col)
    26:             cout << setw(3) << result[row][col].get_future().get();
    27:         cout << '\n';
    28:     }
    29: }

Each inner product is computed by a separate (anonymous and detached) thread (lines 17..21), which starts as soon as the run-time system allows it to start. By the time the threads have finished, the resulting inner products can be retrieved from the promises' futures. Since the futures' get members block until their results are actually available, the resulting matrix can simply be displayed by calling those members in sequence (lines 23..28).

So, a promise allows us to use a thread to compute a value (or exception, see below), which value may then be collected by another thread at some future point in time. The promise remains available, and as a consequence further synchronization of the threads and the program starting the threads is not necessary. When the promise object contains an exception, rather than a value, its future's get member rethrows the stored exception.
Here is the class promise's interface. Note that the class promise is a class template: its template type parameter ReturnType specifies the template type parameter of the std::future that can be retrieved from it.
Constructors and destructor:
promise(): a default promise object is constructed containing a shared state. The shared state may be returned by the member get_future (see below), but that future has not yet been made ready;

promise(promise &&tmp) noexcept: a promise object is constructed, transferring the ownership of tmp's shared state to the newly constructed object. After the object has been constructed, tmp no longer contains a shared state;

~promise(): the destructor abandons the object's shared state (if any) and destroys the promise object.

Member functions:
std::future<ReturnType> get_future(): a std::future object sharing the current object's shared state is returned. A future_error exception is thrown upon error, containing future_already_retrieved if get_future was already called on an object containing the same shared state as the current object; no_state if the current object has no shared state. Also futures that share the object's shared state may access the result returned by the object's task;

promise &operator=(promise &&rhs) noexcept: the shared states of the current object and rhs are swapped;

void promise<void>::set_value(): see the last set_value member's description;

void set_value(ReturnType &&value): see the last set_value member's description;

void set_value(ReturnType const &value):
void set_value(ReturnType &value): the value (moved or copied, depending on the overload that's used to pass value) is atomically stored in the shared state, which is then also made ready. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state. In addition any exceptions thrown by value's move or copy constructor may be thrown;

void set_exception(std::exception_ptr obj): exception_ptr obj (cf. section 10.9.4) is atomically stored in the shared state, making that state ready. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state;

void set_exception_at_thread_exit(exception_ptr ptr): ptr is stored in the shared state without immediately making that state ready. The state becomes ready when the current thread exits, once all objects of thread storage duration which are associated with the ending thread have been destroyed.
A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state;

void set_value_at_thread_exit(): see the last set_value_at_thread_exit member's description;

void set_value_at_thread_exit(ReturnType &&value): see the last set_value_at_thread_exit member's description;

void set_value_at_thread_exit(ReturnType const &value): see the last set_value_at_thread_exit member's description;

void set_value_at_thread_exit(ReturnType &value): stores value in the shared state without immediately making that state ready. The state becomes ready when the current thread exits, once all objects of thread storage duration which are associated with the ending thread have been destroyed. A future_error exception is thrown upon error, containing promise_already_satisfied if the shared state has already been made ready; no_state if the current object does not have any shared state;

void swap(promise &other) noexcept: the shared states of the current object and other are exchanged.

The following non-member (free) function operating on promise objects is available:
void swap(promise<ReturnType> &lhs, promise<ReturnType> &rhs) noexcept: calls lhs.swap(rhs).

The final example of this chapter uses packaged_tasks. Like the multi-threaded quicksort example a worker pool is used. However, in this example the workers in fact do not know what their task is. In the current example the tasks happen to be identical, but different tasks might as well have been used, without having to update the workers.
The program uses a class Task containing a command-specification (d_command) and a task specification (d_task) (cf. Figure 27). The sources of the program are found in the yo/threading/examples/multicompile directory of the C++ Annotations.

In this program main starts by firing up its workforce in a series of threads. Following this, the compilation jobs are prepared and pushed on a task-queue by jobs, from which they're retrieved by the workers. Once the compilations have been completed (i.e., after the worker threads have joined the main thread), the results of the compilation jobs are handled by results:
    int main()
    {
        workforce();                    // start the worker threads

        jobs();                         // prepare the jobs: push all tasks
                                        // on the taskQ

        for (thread &thr: g_thread)     // wait for the workers to end
            thr.join();

        results();                      // show the results
    }

The jobs function receives the names of the files to compile from the nextCommand function, which ignores empty lines and returns non-empty lines. Eventually nextCommand returns an empty line once all lines of the standard input stream have been read:
    string nextCommand()
    {
        string ret;
        while (true)
        {
            if (not getline(cin, ret))  // no more lines
                break;

            if (not ret.empty())        // ready once there's line content
                break;
        }
        return ret;
    }

With non-empty lines jobs waits for an available worker using (line 12) the g_dispatcher semaphore. Initialized to the size of the work force, it is reduced by an active worker, and incremented by workers who have completed their tasks. If a compilation fails, then g_done is set to true and no additional compilations are performed (lines 14, 15). While jobs receives the names of the files to compile, workers may detect compilation errors. If so, the workers set the variable g_done to true. Once the jobs function's while loop ends the workers are notified once again (line 24), and since there's no task to perform anymore, they end their threads:
     1: void jobs()
     2: {
     3:     while (true)
     4:     {
     5:         string line = nextCommand();
     6:         if (line.empty())           // no command? jobs() done.
     7:         {
     8:             g_done = true;
     9:             break;
    10:         }
    11: 
    12:         g_dispatcher.wait();        // wait for an available worker
    13: 
    14:         if (g_done.load())          // if a worker found an error
    15:             break;                  // then quit anyway
    16: 
    17:         newTask(line);              // push a new task (and its
    18:                                     // results)
    19: 
    20:         g_worker.notify_all();      // inform the workers: job is
    21:                                     // available
    22:     }
    23: 
    24:     g_worker.notify_all();          // end the workers at an empty Q
    25: }

The function newTask prepares the program for the next task. First a Task object is constructed. Task contains the name of the file to compile, and a packaged_task. It encapsulates all activities that are associated with a packaged_task. Here is its (in-class) definition:
     1: using PackagedTask = packaged_task<Result (string const &fname)>;
     2: 
     3: class Task
     4: {
     5:     string d_command;
     6:     PackagedTask d_task;
     7: 
     8:     public:
     9:         Task() = default;
    10: 
    11:         Task(string const &command, PackagedTask &&tmp)
    12:         :
    13:             d_command(command),
    14:             d_task(move(tmp))
    15:         {}
    16: 
    17:         void operator()()
    18:         {
    19:             d_task(d_command);
    20:         }
    21: 
    22:         shared_future<Result> result()
    23:         {
    24:             return d_task.get_future().share();
    25:         }
    26: };

Note (lines 22-25) that result returns a shared_future. Since the dispatcher runs in a different thread than the one processing the results, the futures created by the dispatcher must be shared with the futures required by the function processing the results. Hence the shared_futures returned by Task::result.
Once a Task object has been constructed its shared_future object is pushed on the result queue. Although the actual results aren't available by this time, the results function is eventually called to process the results that were pushed on the result queue. Additionally, the Task itself is pushed on a task queue, and it will be retrieved by a worker:
class Task
{
    string d_command;
    PackagedTask d_task;

    public:
        Task() = default;
        Task(string const &command, PackagedTask &&tmp);
        void operator()();
        shared_future<Result> result();
};

void pushResultQ(shared_future<Result> const &sharedResult)
{
    lock_guard<mutex> lk(g_resultQMutex);
    g_resultQ.push(sharedResult);
}
The workers have a simple task: wait for the next task, retrieve it from the task queue, and complete that task. Whatever happens inside the tasks themselves is of no concern to the worker. When notified (normally by the jobs function) that there's a task waiting, the worker executes that task. However, once all tasks have been pushed on the task queue, jobs once again notifies the workers. In that case the task queue is empty, and the worker function ends. But just before ending it notifies its fellow workers, which in turn end, thus ending all worker threads, allowing them to join the main thread:
void worker()
{
    Task task;
    while (true)
    {
        g_worker.wait();            // wait for an available task

        if (g_taskQ.empty())        // no task? then done
            break;

        g_taskQ.popFront(task);
        g_dispatcher.notify_all();  // notify the dispatcher that
                                    // another task can be pushed
        task();
    }
    g_worker.notify_all();          // no more tasks: notify the other
}                                   // workers.
This completes the description of how tasks are handled.
The tasks themselves are now described. In the current program C++ source files are compiled. The compilation command is passed to the constructor of a CmdFork object, which starts the compiler as a child process. The result of the compilation is retrieved via its childExit member (returning the compiler's exit code) and childOutput member (returning any textual output produced by the compiler). If compilation fails, the exit value won't be zero. In this case no further compilation tasks will be issued as g_done is set to true (lines 11 and 12; the implementation of the class CmdFork is available from the C++ Annotations' yo/threading/examples/cmdfork directory). Here is the function compile:
 1: Result compile(string const &line)
 2: {
 3:     string command("/usr/bin/g++ -Wall -c " + line);
 4: 
 5:     CmdFork cmdFork(command);
 6:     cmdFork.fork();
 7: 
 8:     Result ret {cmdFork.childExit() == 0,
 9:                 line + "\n" + cmdFork.childOutput()};
10: 
11:     if (not ret.ok)
12:         g_done = true;
13: 
14:     return ret;
15: }
The results function continues for as long as newResult indicates that results are available. By design the program shows all available successfully completed compilations, and (if several workers encountered compilation errors) only the compiler's output of the first compilation error is displayed. "All available successfully completed compilations" means that, in case of a compilation error, the source files that were successfully compiled by the currently active work force are listed, but the remaining source files are not processed anymore:
void results()
{
    Result result;
    string errorDisplay;

    while (newResult(result))
    {
        if (result.ok)
            cerr << result.display;
        else if (errorDisplay.empty())
            errorDisplay = result.display;  // remember the output of
    }                                       // the first compilation error

    if (not errorDisplay.empty())           // which is displayed at the end
        cerr << errorDisplay;
}
The function newResult controls results' while-loop. It returns true as long as the result queue isn't empty, in which case the queue's front element is stored at the external Result object, and the queue's front element is removed from the queue:
bool newResult(Result &result)
{
    if (g_resultQ.empty())
        return false;

    result = g_resultQ.front().get();
    g_resultQ.pop();

    return true;
}
Transactional memory is used to simplify shared data access in multi-threaded programs. The benefits of transactional memory are best illustrated by a small program. Consider a situation where threads need to write information to a file. A plain example of such a program would be:
void fun(int value)
{
    for (size_t rept = 0; rept != 10; ++rept)
    {
        this_thread::sleep_for(chrono::seconds(1));
        cout << "fun " << value << '\n';
    }
}

int main()
{
    thread thr{ fun, 1 };
    fun(2);
    thr.join();
}
When this program is run the fun 1 and fun 2 messages are intermixed. To prevent this we traditionally define a mutex, lock it, write the message, and release the lock:
void fun(int value)
{
    static mutex guard;

    for (size_t rept = 0; rept != 10; ++rept)
    {
        this_thread::sleep_for(chrono::seconds(1));

        guard.lock();
        cout << "fun " << value << '\n';
        guard.unlock();
    }
};
Transactional memory handles the locking for us. Transactional memory is used when statements are embedded in a synchronized block. The function fun, using transactional memory, looks like this:
void fun(int value)
{
    for (size_t rept = 0; rept != 10; ++rept)
    {
        this_thread::sleep_for(chrono::seconds(1));

        synchronized
        {
            cout << "fun " << value << '\n';
        }
    }
};
To compile source files using transactional memory the g++ compiler option -fgnu-tm must be specified.
The code inside a synchronized block is executed as a single transaction, as if the block were protected by a mutex. But different from using mutexes, transactional memory is implemented in software rather than using hardware facilities.
Considering how easy it is to use transactional memory compared to the mutex-based locking mechanism, transactional memory appears too good to be true. And in a sense it is. When encountering a synchronized block the thread unconditionally executes the block's statements. At the same time it keeps a detailed log of all its actions. Once the statements have been completed the thread checks whether another thread started executing the block just before it did. If so, it reverses its actions, using the synchronized block's log. The implication of this should be clear: there's at least the overhead of maintaining the log, and if another thread started executing the synchronized block before the current thread there's the additional overhead of reverting the current thread's actions and trying again.
The advantages of transactional memory should also be clear: the programmer is no longer responsible for correctly controlling access to shared memory; the risk of encountering deadlocks has disappeared, as has all administrative overhead of defining mutexes and of locking and unlocking them. Especially for inherently slow operations like writing to files transactional memory can greatly simplify parts of your code. Consider a std::stack. Its top element can be inspected, but its pop member does not return the topmost element. To retrieve the top element and then maybe remove it traditionally requires a mutex lock surrounding determining the stack's size: if the stack is empty, release the lock and wait; if not, retrieve its topmost element, followed by removing it from the stack. Using transactional memory we get something as simple as:
bool retrieve(stack<Item> &itemStack, Item &item)
{
    synchronized
    {
        if (itemStack.empty())
            return false;

        item = std::move(itemStack.top());
        itemStack.pop();
        return true;
    }
}
Variants of synchronized are:
atomic_noexcept: the statements inside its compound statement may not throw exceptions. If they do, std::abort is called. If the earlier fun function specifies atomic_noexcept instead of synchronized the compiler generates an error about the use of the insertion operator, from which an exception may be thrown.

atomic_cancel: not yet supported by g++. If an exception other than (std::) bad_alloc, bad_array_new_length, bad_cast, bad_typeid, bad_exception, exception, or tx_exception<Type> is thrown, std::abort is called. If an acceptable exception is thrown, then the statements executed so far are undone.

atomic_commit: if an exception is thrown from its compound statement all statements executed thus far are kept (i.e., not undone).

Recently the class std::osyncstream was added to the language, allowing threads of multi-threaded programs to write information block-wise to a common stream, without having to define separate streams receiving the thread-specific information and eventually copying those streams to the destination stream. Before using osyncstream objects the <syncstream> header file must be included.
The osyncstream class publicly inherits from std::ostream, initializing its ostream base class with a std::syncbuf stream buffer (described in the next section), which performs the actual synchronization.
Information written to osyncstream objects can explicitly be copied to a destination ostream, or is automatically copied to the destination ostream by the osyncstream's destructor. Each thread may construct its own osyncstream object, handling the block-wise copying of the information it receives to the destination stream.
Constructors
osyncstream{ostream &out} constructs an osyncstream object eventually writing the information it receives to out. Below, out is called the destination stream;

osyncstream{osyncstream &&tmp}: the move constructor is available.

The default- and copy-constructors are not available.
Member functions
In addition to the members inherited from std::ostream (like the rdbuf member returning a pointer to the object's syncbuf (described in the next section)) the class osyncstream offers these members:
get_wrapped, returning a pointer to the destination stream's stream buffer;

emit, copying the received information as a block to the destination stream.

The following program illustrates how osyncstream objects can be used.
 1: #include <iostream>
 2: #include <syncstream>
 3: #include <string>
 4: #include <thread>
 5: 
 6: using namespace std;
 7: 
 8: void fun(char const *label, size_t count)
 9: {
10:     osyncstream out(cout);
11: 
12:     for (size_t idx = 0; idx != count; ++idx)
13:     {
14:         this_thread::sleep_for(1s);
15:         out << label << ": " << idx << " running...\n";
16:     }
17:     out << label << " ends\n";
18: }
19: 
20: int main(int argc, char **argv)
21: {
22:     cout << "the 1st arg specifies the #iterators "
23:             "using 3 iterations by default\n";
24: 
25:     size_t count = argc > 1 ? stoul(argv[1]) : 3;
26: 
27:     thread thr1{ fun, "first", count };
28:     thread thr2{ fun, "second", count };
29: 
30:     thr1.join();
31:     thr2.join();
32: }
The function fun (line 8) is called by main from two threads (lines 27, 28). It defines a local osyncstream out and, using short one-second pauses, writes some lines of text to out (lines 14, 15). At the end of fun the local out content is written as a block to cout (line 18). Writing out's content to cout can also explicitly be requested by calling out.emit().

An osyncstream stream in fact is only a wrapper around an ostream, using a syncbuf as its stream buffer. The std::syncbuf handles the actual synchronization. In order to use the syncbuf stream buffer the <syncstream> header file must be included. A syncbuf stream buffer collects the information it receives from an ostream in an internal buffer; its destructor and emit member flush its buffer as a block to its destination stream.
Constructors
syncbuf(), the default constructor, constructs a syncbuf object with its emit-on-sync policy (see below) set to false;

explicit syncbuf(streambuf *destbuf) constructs a std::syncbuf with its emit-on-sync policy set to false, using destbuf as the destination stream's streambuf;

syncbuf(syncbuf &&rhs), the move constructor, moves the content of rhs to the constructed syncbuf.

Member functions
In addition to the members inherited from std::streambuf the class syncbuf offers these members:
get_wrapped, returning a pointer to the destination stream's stream buffer;

emit, copying the received information as a block to the destination stream;

void set_emit_on_sync(bool how) changes the current emit-on-sync policy. By default (how == false) flush requests are recorded, but the internal buffer is only transferred to the destination's stream buffer when emit is called or the syncbuf object is destroyed. When how == true each flush request immediately flushes the internal buffer to the destination's stream buffer.

In the multi-compilation program developed earlier the function results processes the queued results by showing the names of the successfully compiled source files, and (if a compilation failed) the name and error messages of the first source file whose compilation failed. The results-queue was used to store the results in a retrievable data structure, using a mutex to ensure that the workers cannot simultaneously push results on the results-queue.
Using osyncstream objects the results-queue and its mutexed protection scheme are no longer required (the sources of the modified program are available in the C++ Annotations' directory yo/threading/examples/osyncmulticompile).
Instead of using a results-queue the program uses a single destination stream fstream g_out{ "/tmp/out", ios::trunc | ios::in | ios::out }, and its compile function defines a local osyncstream object, ensuring that its output is sent as a block to g_out:
 1: void compile(string const &line)
 2: {
 3:     if (g_done.load())
 4:         return;
 5: 
 6:     string command("/usr/bin/g++ -Wall -c " + line);
 7: 
 8:     CmdFork cmdFork(command);
 9:     cmdFork.fork();
10: 
11:     int exitValue = cmdFork.childExit();
12: 
13:     osyncstream out(g_out);
14:     out << exitValue << ' ' << line << '\n';
15: 
16:     if (exitValue != 0)
17:     {
18:         out << cmdFork.childOutput() << '\n' << g_marker << '\n';
19:         g_done = true;
20:     }
21:     // out.emit();     // handled by out's destructor
22: }
At line 13 a local osyncstream out object is defined, and the results of the compilation are written to out at lines 14 and 18. When compile ends, out's destructor writes its content as a block to g_out. At compilation errors the compiler's messages are also written to out, terminated by a marker, used by results (see below) to recognize the end of the error messages.

Since the results of the compilation are no longer transferred to another thread, there's no need for defining a shared_future<Result>. In fact, since compile handles the results of a compilation itself, its return type is void and the packaged_task itself doesn't return anything either. Therefore the class Task doesn't need a result() member anymore. Instead, its function-call operator, having completed its task, calls the task's get_future so exceptions that might have been generated by the packaged_task are properly retrieved. Here's the simplified class Task:
using PackagedTask = packaged_task<void (string const &fname)>;

class Task
{
    string d_command;
    PackagedTask d_task;

    public:
        Task() = default;

        Task(string const &command, PackagedTask &&tmp)
        :
            d_command(command),
            d_task(move(tmp))
        {}

        void operator()()
        {
            d_task(d_command);
            d_task.get_future();    // handles potential exceptions
        }
};
At the end of main the function results is called:
 1: void results()
 2: {
 3:     g_out.seekg(0);
 4: 
 5:     int value;
 6:     string line;
 7:     string errorDisplay;
 8: 
 9:     while (g_out >> value >> line)      // process g_out's content
10:     {
11:         g_out.ignore(100, '\n');
12: 
13:         if (value == 0)                 // no error: show the source file
14:         {
15:             cerr << line << '\n';
16:             continue;
17:         }
18:                                         // at compilation errors:
19:         if (not errorDisplay.empty())   // after the 1st error: skip
20:         {
21:             do
22:             {
23:                 getline(g_out, line);
24:             }
25:             while (line != g_marker);
26: 
27:             continue;
28:         }
29:                                         // first compilation error:
30:         errorDisplay = line + '\n';     // keep the name of the source
31:         while (true)                    // and its error messages
32:         {
33:             getline(g_out, line);
34: 
35:             if (line == g_marker)
36:                 break;
37: 
38:             errorDisplay += line + '\n';
39:         }
40:     }
41: 
42:     cerr << errorDisplay;               // eventually insert the error-info
43:  }                                      // (if any)
The content of g_out is processed by the while condition at line 9. The name of the first source file failing to compile, together with its error messages, is stored in errorDisplay (lines 30 thru 39). Once g_out has completely been read errorDisplay is displayed (line 42), which is either empty or contains the error messages of the first encountered compilation failure.