structs is illustrated in various examples; the concept of a class is introduced; casting is covered in detail; many new types are introduced and several important notational extensions to C are discussed.

Even though const is part of the C grammar, its use is more important, much more common, and more strictly enforced in C++ than in C.

The const keyword is a modifier stating that the value of a variable or of an argument may not be modified. In the following example the intent is to change the value of a variable ival, which fails:

    int main()
    {
        int const ival = 3;     // a constant int
                                // initialized to 3

        ival = 4;               // assignment produces
                                // an error message
    }

This example shows how ival may be initialized to a given value in its definition; attempts to change the value later (in an assignment) are not permitted.

Variables that are declared const can, in contrast to C, be used to specify the size of an array, as in the following example:

    int const size = 20;
    char buf[size];             // 20 chars big

Another use of the keyword const is seen in the declaration of pointers, e.g., in pointer-arguments. In the declaration
char const *buf;
buf is a pointer variable pointing to chars. Whatever is pointed to by buf may not be changed through buf: the chars are declared as const. The pointer buf itself however may be changed. A statement like *buf = 'a'; is therefore not allowed, while ++buf is.
In the declaration
char *const buf;
buf itself is a const pointer which may not be changed. Whatever chars are pointed to by buf may be changed at will.
Finally, the declaration
char const *const buf;
is also possible; here, neither the pointer nor what it points to may be changed.
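The first two declarations can be exercised in a small sketch. The helper names countChars and firstToUpper are hypothetical and only serve as illustration; the commented-out statements are the ones a compiler would be expected to reject.

```cpp
#include <cassert>

// Hypothetical helper: walks a read-only string through a pointer to const.
// The pointer itself is modified (++cp); the chars it points to are not.
int countChars(char const *cp)
{
    int count = 0;
    while (*cp != 0)
    {
        ++count;
        ++cp;                   // OK: the pointer is not const
        // *cp = 'x';           // would not compile: the chars are const
    }
    return count;
}

// Hypothetical helper: writes through a const pointer.
// The chars may be changed; the pointer may not be re-seated.
void firstToUpper(char *const cp)
{
    if (*cp >= 'a' && *cp <= 'z')
        *cp = *cp - 'a' + 'A';  // OK: the chars are not const
    // ++cp;                    // would not compile: the pointer is const
}
```

Passing a modifiable array such as char word[] = "hello"; to both helpers is allowed: adding const to what a pointer points to is always permitted.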
The rule of thumb for the placement of the keyword const is the following: what's written to the left of const may not be changed.
Although simple, this rule of thumb is, unfortunately, not often used. For example, Bjarne Stroustrup states (in https://www.stroustrup.com/bs_faq2.html#constplacement):

    Should I put "const" before or after the type?

    I put it before, but that's a matter of taste. "const T" and "T const"
    were always (both) allowed and equivalent. For example:

        const int a = 1;    // OK
        int const b = 2;    // also OK

    My guess is that using the first version will confuse fewer programmers
    (``is more idiomatic'').

But applying this simple `before' placement rule for the keyword const produces unexpected (i.e., unwanted) results, as we will shortly see. Furthermore, the `idiomatic' before-placement also conflicts with the notion of const functions, which we will encounter in section 7.7. With const functions the keyword const is also placed behind rather than before the name of the function.

The definition or declaration (whether or not containing const) should always be read from the variable or function identifier back to the type identifier:

    ``Buf is a const pointer to const characters''

This rule of thumb is especially useful in cases where confusion may occur. In examples of C++ code published in other places one often encounters the reverse: const preceding what should not be altered. That this may result in sloppy code is indicated by our second example above:

    char const *buf;
What must remain constant here? According to the sloppy interpretation, the pointer cannot be altered (as const precedes the pointer). In fact, the char values are the constant entities here, as becomes clear when we try to compile the following program:

    int main()
    {
        char const *buf = "hello";

        ++buf;                  // accepted by the compiler
        *buf = 'u';             // rejected by the compiler
    }

Compilation fails on the statement *buf = 'u'; and not on the statement ++buf.
Marshall Cline's C++ FAQ gives the same rule (paragraph 18.5), in a similar context:

    [18.5] What's the difference between "const Fred* p", "Fred* const p" and
    "const Fred* const p"?

    You have to read pointer declarations right-to-left.

Marshall Cline's advice can be improved, though. Here's a recipe that will effortlessly dissect even the most complex declaration:

    char const *(* const (*(*ip)())[])[]

    ip                          Start at the variable's name:
                                    'ip' is
    ip)                         Hitting a closing paren: revert
                                    -->
    (*ip)                       Find the matching open paren:
                                    <-          'a pointer to'
    (*ip)())                    The next unmatched closing par:
                                    -->         'a function (not expecting arguments)'
    (*(*ip)())                  Find the matching open paren:
                                    <-          'returning a pointer to'
    (*(*ip)())[])               The next closing par:
                                    -->         'an array of'
    (* const (*(*ip)())[])      Find the matching open paren:
                                    <--------   'const pointers to'
    (* const (*(*ip)())[])[]    Read until the end:
                                    ->          'an array of'
    char const *(* const (*(*ip)())[])[]
                                Read backwards what's left:
                                    <-----------  'pointers to const chars'

Collecting all the parts, we get for char const *(* const (*(*ip)())[])[]: ip is a pointer to a function (not expecting arguments), returning a pointer to an array of const pointers to an array of pointers to const chars. This is what ip represents; the recipe can be used to parse any declaration you ever encounter.
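The outcome of the recipe can be cross-checked by the compiler. The sketch below rebuilds the type of ip one step at a time, using alias names (CharPtrArray, ConstPtrToArr, ArrOfConstPtr, Function) that are made up for this illustration, and then asks the compiler whether both spellings denote the same type. The variable ip is only declared, never defined or used.

```cpp
#include <type_traits>

// Build the type step by step, in the order in which the recipe reads it
// (from the inside out, so the aliases appear in reverse order).
using CharPtrArray  = char const *[];       // an array of pointers to const chars
using ConstPtrToArr = CharPtrArray *const;  // a const pointer to such an array
using ArrOfConstPtr = ConstPtrToArr[];      // an array of those const pointers
using Function      = ArrOfConstPtr *();    // a function (not expecting arguments)
                                            // returning a pointer to that array

extern char const *(* const (*(*ip)())[])[];    // declaration only

// ip is a pointer to such a function:
static_assert(std::is_same<decltype(ip), Function *>::value,
              "the recipe's reading matches the declaration");
```

If the reading were wrong in any step the static_assert would be expected to fail at compile time.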
sin operating on degrees, but does not want to lose the capability of using the standard sin function, operating on radians.

Namespaces are covered extensively in chapter 4. For now it should be noted that most compilers require the explicit declaration of a standard namespace: std. So, unless otherwise indicated, all examples in the Annotations implicitly use the

    using namespace std;

declaration. So, if you actually intend to compile examples given in the C++ Annotations, make sure that the sources start with the above using declaration.
::). This operator can be used in situations where a global variable exists having the same name as a local variable:

    #include <stdio.h>

    double counter = 50;                // global variable

    int main()
    {
        for (int counter = 1;           // this refers to the
             counter != 10;             // local variable
             ++counter)
        {
            printf("%f\n",              // the quotient is a double
                    ::counter           // global variable
                    /                   // divided by
                    counter);           // local variable
        }
    }

In the above program the scope operator is used to address a global variable instead of the local variable having the same name. In C++ the scope operator is used extensively, but it is seldom used to reach a global variable shadowed by an identically named local variable. Its main purpose is encountered in chapter 7.
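The same idea can be condensed into a single function whose parameter hides a global variable. This is only a sketch: the function name globalDividedBy is hypothetical, and an int global is used here (rather than the double above) to keep the arithmetic exact.

```cpp
#include <cassert>

int counter = 50;                   // global variable

// Hypothetical helper: divides the global counter by a local one
// having the same name.
int globalDividedBy(int counter)    // the parameter hides the global variable
{
    return ::counter / counter;     // ::counter is the global variable,
}                                   // counter is the parameter
```

Inside the function a plain counter always denotes the parameter; only the :: prefix reaches the global variable.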
    #include <iostream>
    using namespace std;

    int main()
    {
        int     ival;
        char    sval[30];

        cout << "Enter a number:\n";
        cin >> ival;
        cout << "And now a string:\n";
        cin >> sval;

        cout << "The number is: " << ival << "\n"
                "And the string is: " << sval << '\n';
    }

This program reads a number and a string from the cin stream (usually the keyboard) and prints these data to cout. With respect to streams, please note:
- The standard streams are declared in the header file iostream. In the examples in the C++ Annotations this header file is often not mentioned explicitly. Nonetheless, it must be included (either directly or indirectly) when these streams are used. Comparable to the use of the using namespace std; clause, the reader is expected to #include <iostream> with all the examples in which the standard streams are used.
- The streams cout, cin and cerr are variables of so-called class-types. Such variables are commonly called objects. Classes are discussed in detail in chapter 7 and are used extensively in C++.
- The stream cin extracts data from a stream and copies the extracted information to variables (e.g., ival in the above example) using the extraction operator (two consecutive > characters: >>). Later in the Annotations we will describe how operators in C++ can perform quite different actions than what they are defined to do by the language, as is the case here. Function overloading has already been mentioned. In C++ operators can also have multiple definitions, which is called operator overloading.
- The operators used with cin, cout and cerr (i.e., >> and <<) also manipulate variables of different types. In the above example cout << ival results in the printing of an integer value, whereas cout << "Enter a number" results in the printing of a string. The actions of the operators therefore depend on the types of supplied variables.
- A line is normally terminated by inserting "\n" or '\n'. But when inserting the endl symbol the line is terminated followed by the flushing of the stream's internal buffer. Thus, endl can usually be avoided in favor of '\n', resulting in somewhat more efficient code.

The stream objects cin, cout and cerr are not part of the C++ grammar proper. The streams are part of the definitions in the header file iostream. This is comparable to functions like printf that are not part of the C grammar, but were originally written by people who considered such functions important and collected them in a run-time library.

A program may still use the old-style functions like printf and scanf rather than the new-style streams. The two styles can even be mixed.
But streams offer several clear advantages and in many C++ programs have completely replaced the old-style C functions. Some advantages of using streams are:

- The format strings used with printf and scanf can define wrong format specifiers for their arguments, for which the compiler sometimes can't warn. In contrast, argument checking with cin, cout and cerr is performed by the compiler. Consequently it isn't possible to err by providing an int argument in places where, according to the format string, a string argument should appear. With streams there are no format strings.
- The functions printf and scanf (and other functions using format strings) in fact implement a mini-language which is interpreted at run-time. In contrast, with streams the C++ compiler knows exactly which in- or output action to perform given the arguments used. No mini-language here.
- The insertion and extraction operators can be overloaded for new types, whereas the mini-language of printf cannot be extended.

The iostream library has a lot more to offer than just cin, cout and cerr. In chapter 6 iostreams are covered in greater detail. Even though printf and friends can still be used in C++ programs, streams have practically replaced the old-style C I/O functions like printf. If you think you still need to use printf and related functions, think again: in that case you've probably not yet completely grasped the possibilities of stream objects.

Functions can be defined as part of structs (see section 2.5.13). Such functions are called member functions. This section briefly discusses how to define such functions.

The code fragment below shows a struct having data fields for a person's name and address. A function print is included in the struct's definition:
    struct Person
    {
        char name[80];
        char address[80];

        void print();
    };

When defining the member function print the structure's name (Person) and the scope resolution operator (::) are used:

    void Person::print()
    {
        cout << "Name:      " << name << "\n"
                "Address:   " << address << '\n';
    }

The implementation of Person::print shows how the fields of the struct can be accessed without using the structure's type name. Here the function Person::print prints a variable name. Since Person::print is itself a part of struct Person, the variable name implicitly refers to a field of that same type.
This struct Person could be used as follows:

    Person person;

    strcpy(person.name, "Karel");
    strcpy(person.address, "Marskramerstraat 33");
    person.print();

The advantage of member functions is that the called function automatically accesses the data fields of the structure for which it was invoked. In the statement person.print() the object person is the `substrate': the variables name and address that are used in the code of print refer to the data stored in the person object.
C++ has three keywords that are related to data hiding: private, protected and public. These keywords can be used in the definition of structs. The keyword public allows all subsequent fields of a structure to be accessed by all code; the keyword private only allows code that is part of the struct itself to access subsequent fields. The keyword protected is discussed in chapter 13, and is somewhat outside of the scope of the current discussion.

In a struct all fields are public, unless explicitly stated otherwise. Using this knowledge we can expand the struct Person:
    struct Person
    {
        private:
            char d_name[80];
            char d_address[80];
        public:
            void setName(char const *n);
            void setAddress(char const *a);
            void print();
            char const *name();
            char const *address();
    };

As the data fields d_name and d_address are in a private section they are only accessible to the member functions which are defined in the struct: these are the functions setName, setAddress, etc. As an illustration consider the following code:

    Person fbb;

    fbb.setName("Frank");           // OK, setName is public
    strcpy(fbb.d_name, "Knarf");    // error, fbb.d_name is private

Data integrity is implemented as follows: the actual data of a struct Person are mentioned in the structure definition. The data are accessed by the outside world using special functions that are also part of the definition. These member functions control all traffic between the data fields and other parts of the program and are therefore also called `interface' functions. The thus implemented data hiding is illustrated in Figure 2.

The members setName and setAddress are declared with char const * parameters. This indicates that the functions will not alter the strings which are supplied as their arguments. Analogously, the members name and address return char const *s: the compiler prevents callers of those members from modifying the information made accessible through the return values of those members.

Two examples of member functions of the struct Person are shown below:

    void Person::setName(char const *n)
    {
        strncpy(d_name, n, 79);
        d_name[79] = 0;
    }

    char const *Person::name()
    {
        return d_name;
    }

The power of member functions and of the concept of data hiding results from the abilities of member functions to perform special tasks, e.g., checking the validity of the data. In the above example setName copies only up to 79 characters from its argument to the data member d_name, thereby avoiding a buffer overflow.
Another illustration of the concept of data hiding is the following. As an alternative to member functions that keep their data in memory a library could be developed featuring member functions storing data on file. To convert a program storing Person structures in memory to one that stores the data on disk no special modifications are required. After recompilation and linking the program to a new library it is converted from storage in memory to storage on disk. This example illustrates a broader concept than data hiding; it illustrates encapsulation. Data hiding is a kind of encapsulation. Encapsulation in general results in reduced coupling of different sections of a program. This in turn greatly enhances reusability and maintainability of the resulting software. By having the structure encapsulate the actual storage medium the program using the structure becomes independent of the actual storage medium that is used.
Though data hiding can be implemented using structs, more often (almost always) classes are used instead. A class is a kind of struct, except that a class uses private access by default, whereas structs use public access by default. The definition of a class Person is therefore identical to the one shown above, except that the keyword class has replaced struct while the initial private: clause can be omitted. Our typographic suggestion for class names (and other type names defined by the programmer) is to start with a capital character to be followed by the remainder of the type name using lower case letters (e.g., Person).
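As a minimal sketch of the class variant, the fragment below reduces Person to its name field. It assumes the same 80-character buffer and the same truncating setName as shown earlier; the omitted members would follow the same pattern.

```cpp
#include <cassert>
#include <cstring>

class Person
{
    char d_name[80];                // private: the default in a class

    public:
        void setName(char const *n);
        char const *name();
};

void Person::setName(char const *n)
{
    std::strncpy(d_name, n, 79);    // copy at most 79 characters
    d_name[79] = 0;                 // the string always ends in a 0-byte
}

char const *Person::name()
{
    return d_name;
}
```

Code outside of the class can only reach d_name through setName and name, so an overlong argument is silently truncated instead of overflowing the buffer.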
struct, which then require a pointer to the struct as one of their arguments. An imaginary C header file showing this concept is:

    /* definition of a struct PERSON    This is C   */
    typedef struct
    {
        char name[80];
        char address[80];
    } PERSON;

    /* some functions to manipulate PERSON structs */

    /* initialize fields with a name and address    */
    void initialize(PERSON *p, char const *nm,
                       char const *adr);

    /* print information    */
    void print(PERSON const *p);

    /* etc..    */

In C++, the declarations of the involved functions are put inside the definition of the struct or class. The argument denoting which struct is involved is no longer needed.

    class Person
    {
        char d_name[80];
        char d_address[80];

        public:
            void initialize(char const *nm, char const *adr);
            void print();
            // etc..
    };

In C++ the struct parameter is not used. A C function call such as:

    PERSON x;

    initialize(&x, "some name", "some address");

becomes in C++:

    Person x;

    x.initialize("some name", "some address");

    int int_value;
    int &ref = int_value;
In the above example a variable int_value is defined. Subsequently a reference ref is defined, which (due to its initialization) refers to the same memory location as int_value. In the definition of ref, the reference operator & indicates that ref is not itself an int but a reference to one. The two statements

    ++int_value;
    ++ref;

have the same effect: they increment int_value's value. Whether that location is called int_value or ref does not matter.
References serve an important function in C++ as a means to pass modifiable arguments to functions. E.g., in standard C, a function that increases the value of its argument by five and returns nothing needs a pointer parameter:

    void increase(int *valp)    // expects a pointer
    {                           // to an int
        *valp += 5;
    }

    int main()
    {
        int x = 0;

        increase(&x);           // pass x's address
    }

This construction can also be used in C++ but the same effect is also achieved using a reference:

    void increase(int &valr)    // expects a reference
    {                           // to an int
        valr += 5;
    }

    int main()
    {
        int x = 0;

        increase(x);            // passed as reference
    }

It is arguable whether code such as the above should be preferred over C's method, though. The statement increase(x) suggests that not x itself but a copy is passed. Yet the value of x changes because of the way increase() is defined. However, references can also be used to pass objects that are only inspected (without the need for a copy or a const *) or to pass objects whose modification is an accepted side-effect of their use. In those cases using references is strongly preferred over existing alternatives like copy by value or passing pointers.
Behind the scenes references are implemented using pointers. So, as far as the compiler is concerned references in C++ are just const pointers. With references, however, the programmer does not need to know or to bother about levels of indirection. An important distinction between plain pointers and references is of course that with references no explicit indirection takes place: an assignment to a reference is an assignment to the variable it refers to. For example:

    extern int *ip;
    extern int &ir;

    ip = 0;     // reassigns ip, now a 0-pointer
    ir = 0;     // ir unchanged, the int variable it refers to
                // is now 0.
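The difference can be made observable with two small functions. This is a sketch using hypothetical helper names (assignThroughReference and assignToPointer) and local variables instead of the extern declarations above.

```cpp
#include <cassert>

// Hypothetical helper: assigning to a reference writes to the
// variable the reference refers to.
int assignThroughReference()
{
    int value = 7;
    int &ref = value;       // ref is another name for value
    ref = 0;                // value itself becomes 0
    return value;
}

// Hypothetical helper: assigning to a pointer only changes
// where the pointer points.
int assignToPointer()
{
    int value = 7;
    int *ptr = &value;
    ptr = nullptr;          // value is untouched
    return ptr == nullptr ? value : -1;
}
```

The first function returns 0, the second returns the original 7.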
In order to prevent confusion, we suggest adhering to the following:

    void some_func(int val)
    {
        cout << val << '\n';
    }

    int main()
    {
        int x = 0;

        some_func(x);       // a copy is passed
    }

    void by_pointer(int *valp)
    {
        *valp += 5;
    }

    void by_reference(string const &str)
    {
        cout << str;        // no modification of str
    }

    int main()
    {
        int x = 7;
        by_pointer(&x);     // a pointer is passed
                            // x might be changed
        string str("hello");
        by_reference(str);  // str is not altered
    }

References play an important role in cases where the argument is not changed by the function but where it is undesirable to copy the argument to initialize the parameter. Such a situation occurs when a large object is passed as argument, or is returned by the function. In these cases the copying operation tends to become a significant factor, as the entire object must be copied. In these cases references are preferred.
If the argument isn't modified by the function, or if the caller shouldn't modify the returned information, the const keyword should be used. Consider the following example:

    struct Person                   // some large structure
    {
        char    name[80];
        char    address[90];
        double  salary;
    };

    Person person[50];              // database of persons

                                    // printperson expects a
                                    // reference to a structure
                                    // but won't change it
    void printperson(Person const &subject)
    {
        cout << "Name: " << subject.name << '\n' <<
                "Address: " << subject.address << '\n';
    }
                                    // get a person by index value
    Person const &personIdx(int index)
    {
        return person[index];       // a reference is returned,
    }                               // not a copy of person[index]

    int main()
    {
        Person boss = {};           // all fields zeroed

        printperson(boss);          // no pointer is passed,
                                    // so `boss' won't be
                                    // altered by the function
        printperson(personIdx(5));
                                    // references, not copies
                                    // are passed here
    }

References could result in extremely `ugly' code. A function may return a reference to a variable, as in the following example:
    int &func()
    {
        static int value;
        return value;
    }

This allows the use of the following constructions:

    func() = 20;
    func() += func();

It is probably superfluous to note that such constructions should normally not be used. Nonetheless, there are situations where it is useful to return a reference. We have actually already seen an example of this phenomenon in our previous discussion of streams. In a statement like cout << "Hello" << '\n'; the insertion operator returns a reference to cout. So, in this statement first the "Hello" is inserted into cout, producing a reference to cout. Through this reference the '\n' is then inserted in the cout object, again producing a reference to cout, which is then ignored.
Several differences between pointers and references are pointed out in the following list:

- A reference cannot be defined without being initialized. Given a definition like int &ref; what would ref refer to?
- References may be declared external. These references were initialized elsewhere.
- When the address operator & is used with a reference, the expression yields the address of the variable to which the reference applies. In contrast, ordinary pointers are variables themselves, so the address of a pointer variable has nothing to do with the address of the variable pointed to.

const & types.

C++ introduces a new reference type called an rvalue reference, which is defined as typename &&.

The name rvalue reference is derived from assignment statements, where the variable to the left of the assignment operator is called an lvalue and the expression to the right of the assignment operator is called an rvalue. Rvalues are often temporary, anonymous values, like values returned by functions.
In this parlance the C++ reference should be considered an lvalue reference (using the notation typename &). They can be contrasted to rvalue references (using the notation typename &&).

The key to understanding rvalue references is the concept of an anonymous variable. An anonymous variable has no name and this is the distinguishing feature for the compiler to associate it automatically with an rvalue reference if it has a choice. Before introducing some interesting constructions let's first have a look at some standard situations where lvalue references are used. The following function returns a temporary (anonymous) value:
    int intVal()
    {
        return 5;
    }

Although intVal's return value can be assigned to an int variable it requires copying, which might become prohibitive when a function does not return an int but instead some large object. A non-const reference or a pointer cannot be used either to collect the anonymous return value as the return value won't survive beyond that. So the first and last of the following statements are illegal (as noted by the compiler):

    int &ir = intVal();         // fails: refers to a temporary
    int const &ic = intVal();   // OK: immutable temporary
    int *ip = &intVal();        // fails: no lvalue available

Apparently it is not possible to modify the temporary returned by intVal. But now consider these functions:
    void receive(int &value)    // note: lvalue reference
    {
        cout << "int value parameter\n";
    }
    void receive(int &&value)   // note: rvalue reference
    {
        cout << "int R-value parameter\n";
    }

and let's call these functions from main:

    int main()
    {
        receive(18);
        int value = 5;
        receive(value);
        receive(intVal());
    }

This program produces the following output:

    int R-value parameter
    int value parameter
    int R-value parameter
The program's output shows the compiler selecting receive(int &&value) in all cases where it receives an anonymous int as its argument. Note that this includes receive(18): a value 18 has no name and thus receive(int &&value) is called. Internally, it actually uses a temporary variable to store the 18, as is shown by the following example which modifies receive:

    void receive(int &&value)
    {
        ++value;
        cout << "int R-value parameter, now: " << value << '\n';
                    // displays 19 and 6, respectively.
    }

Contrasting receive(int &value) with receive(int &&value) has nothing to do with int &value not being a const reference. If receive(int const &value) is used the same results are obtained. Bottom line: the compiler selects the overloaded function using the rvalue reference if the function is passed an anonymous value.
The compiler runs into problems if void receive(int &value) is replaced by void receive(int value), though. When confronted with the choice between a value parameter and a reference parameter (either lvalue or rvalue) it cannot make a decision and reports an ambiguity. In practical contexts this is not a problem. Rvalue references were added to the language in order to be able to distinguish the two forms of references: named values (for which lvalue references are used) and anonymous values (for which rvalue references are used).

It is this distinction that allows the implementation of move semantics and perfect forwarding. At this point the concept of move semantics cannot yet fully be discussed (but see section 9.7 for a more thorough discussion) but it is very well possible to illustrate the underlying ideas.
Consider the situation where a function returns a struct Data containing a pointer to a dynamically allocated NTBS. We agree that Data objects are only used after initialization, for which two init functions are available. As an aside: when Data objects are no longer required the memory pointed at by text must again be returned to the operating system; assume that that task is properly performed.

    struct Data
    {
        char *text;

        void init(char const *txt);     // initialize text from txt

        void init(Data const &other)
        {
            text = strdup(other.text);
        }
    };

There's also this interesting function:
Data dataFactory(char const *text);
Its implementation is irrelevant, but it returns a (temporary) Data object initialized with text. Such temporary objects cease to exist once the statement in which they are created ends.
Now we'll useData:
    int main()
    {
        Data d1;

        d1.init(dataFactory("object"));
    }

Here the init function duplicates the NTBS stored in the temporary object. Immediately thereafter the temporary object ceases to exist. If you think about it, then you realize that that's a bit over the top:

- the dataFactory function uses init to initialize the text variable of its temporary Data object. For that it uses strdup;
- the d1.init function then also uses strdup to initialize d1.text;
- the statement thus involves two strdup calls, but the temporary Data object thereafter is never used again.

To handle cases like these rvalue references were introduced. We add the following function to the struct Data:
    void init(Data &&tmp)
    {
        text = tmp.text;        // (1)
        tmp.text = 0;           // (2)
    }

Now, when the compiler translates d1.init(dataFactory("object")) it notices that dataFactory returns a (temporary) object, and because of that it uses the init(Data &&tmp) function. As we know that the tmp object ceases to exist after executing the statement in which it is used, the d1 object (at (1)) grabs the temporary object's text value, and then (at (2)) assigns 0 to tmp.text so that the temporary object's free(text) action does no harm.

Thus, struct Data suddenly has become move-aware and implements move semantics, removing the (extra copy) drawback of the previous approach: instead of making an extra copy of the temporary object's NTBS the pointer value is simply transferred to its new owner.
Historically, the C programming language distinguished between lvalues and rvalues. The terminology was based on assignment expressions, where the expression to the left of the assignment operator receives a value (e.g., it referred to a location in memory where a value could be written into, like a variable), while the expression to the right of the assignment operator only had to represent a value (it could be a temporary variable, a constant value or the value stored in a variable):
lvalue = rvalue;
C++ adds to this basic distinction several new ways of referring to expressions:

- lvalue: an lvalue in C++ has the same meaning as in C. It refers to a location where a value can be stored, like a variable, a reference to a variable, or a dereferenced pointer.
- xvalue: an xvalue indicates an expiring value. An expiring value refers to an object (cf. chapter 7) just before its lifetime ends. Such objects normally have to make sure that resources they own (like dynamically allocated memory) also cease to exist, but such resources may, just before the object's lifetime ends, be moved to another location, thus preventing their destruction.
- glvalue: a glvalue is a generalized lvalue. A generalized lvalue refers to anything that may receive a value. It is either an lvalue or an xvalue.
- prvalue: a prvalue is a pure rvalue: a literal value (like 1.2e3) or an immutable object (e.g., the value returned from a function returning a constant std::string (cf. chapter 5)).

An expression's value is an xvalue if it is:

- the result of calling a function whose return type is an rvalue reference to an object type;
- a cast to an rvalue reference to an object type;
- a class member access expression designating a non-static data member of a non-reference type, in which the object expression is an xvalue;
- a .* (pointer-to-member) expression (cf. chapter 16) in which the left-hand side operand is an xvalue and the right-hand side operand is a pointer to a data member.

Here is a small example. Consider this simple struct:
    struct Demo
    {
        int d_value;
    };

In addition we have these function declarations and definitions:

    Demo &&operator+(Demo const &lhs, Demo const &rhs);
    Demo &&factory();

    Demo demo;
    Demo &&rref = static_cast<Demo &&>(demo);

Expressions like

    factory();
    factory().d_value;
    static_cast<Demo &&>(demo);
    demo + demo
are xvalues. However, the expression
rref;
is an lvalue.
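These classifications can be verified by the compiler. The sketch below relies on a property of decltype: when applied to a parenthesized expression, decltype((expr)) yields T && for an xvalue, T & for an lvalue, and plain T for a prvalue. The function factory is only declared, as it is never actually called.

```cpp
#include <type_traits>

struct Demo
{
    int d_value;
};

Demo &&factory();                   // declared only: never called
Demo demo;
Demo &&rref = static_cast<Demo &&>(demo);

static_assert(std::is_same<decltype((factory())), Demo &&>::value,
              "calling a function returning Demo && gives an xvalue");
static_assert(std::is_same<decltype((factory().d_value)), int &&>::value,
              "a data member of an xvalue is an xvalue");
static_assert(std::is_same<decltype((static_cast<Demo &&>(demo))),
                           Demo &&>::value,
              "a cast to an rvalue reference gives an xvalue");
static_assert(std::is_same<decltype((rref)), Demo &>::value,
              "a named rvalue reference is an lvalue");
static_assert(std::is_same<decltype((Demo{})), Demo>::value,
              "a temporary created in place is a prvalue");
```

The fourth assertion covers the case that is easily overlooked: although rref was declared using &&, the expression rref has a name and is therefore an lvalue.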
In many situations it's not particularly important to know what kind of lvalue or what kind of rvalue is actually used. In the C++ Annotations the term lhs (left hand side) is frequently used to indicate an operand that's written to the left of a binary operator, while the term rhs (right hand side) is frequently used to indicate an operand that's written to the right of a binary operator. Lhs and rhs operands could actually be glvalues (e.g., when representing ordinary variables), but they could also be prvalues (e.g., numeric values added together using the addition operator). Whether lhs and rhs operands are glvalues or prvalues can always be determined from the context in which they are used.
int values, thereby bypassing type safety. E.g., values of different enumeration types may be compared for (in)equality, albeit through a (static) type cast.

Another problem with the current enum type is that their values are not restricted to the enum type name itself, but to the scope where the enumeration is defined. As a consequence, two enumerations having the same scope cannot define values having identical names.

Such problems are solved by defining enum classes. An enum class can be defined as in the following example:
    enum class SafeEnum
    {
        NOT_OK,     // 0, by implication
        OK          = 10,
        MAYBE_OK    // 11, by implication
    };

Enum classes use int values by default, but the used value type can easily be changed using the : type notation, as in:

    enum class CharEnum: unsigned char
    {
        NOT_OK,
        OK
    };

To use a value defined in an enum class its enumeration name must be provided as well. E.g., OK is not defined, CharEnum::OK is.
Using the data type specification (noting that it defaults to int) it is possible to use enum class forward declarations. E.g.,

    enum Enum1;                 // Illegal: no size available
    enum Enum2: unsigned int;   // Legal: explicitly declared type

    enum class Enum3;           // Legal: default int type is used
    enum class Enum4: char;     // Legal: explicitly declared type
A sequence of symbols of a strongly typed enumeration can also be indicated in a switch using the ellipsis syntax (case ranges are a GNU compiler extension, not part of standard C++), as shown in the next example:

    SafeEnum enumValue();

    switch (enumValue())
    {
        case SafeEnum::NOT_OK ... SafeEnum::OK:
            cout << "Status is known\n";
        break;

        default:
            cout << "Status unknown\n";
        break;
    }

C++ extends this concept by introducing the type initializer_list<Type> where Type is replaced by the type name of the values used in the initializer list. Initializer lists in C++ are, like their counterparts in C, recursive, so they can also be used with multi-dimensional arrays, structs and classes.
Before using the initializer_list the <initializer_list> header file must be included.

Like in C, initializer lists consist of a list of values surrounded by curly braces. But unlike C, functions can define initializer list parameters. E.g.,
    void values(std::initializer_list<int> iniValues)
    {
    }

A function like values could be called as follows:

    values({2, 3, 5, 7, 11, 13});

The initializer list appears as an argument which is a list of values surrounded by curly braces. Due to the recursive nature of initializer lists a two-dimensional series of values can also be passed, as shown in the next example:

    void values2(std::initializer_list<std::initializer_list<int>> iniValues)
    {}

    values2({{1, 2}, {2, 3}, {3, 5}, {4, 7}, {5, 11}, {6, 13}});

Initializer lists are constant expressions and cannot be modified. However, their size and values may be retrieved using their size, begin, and end members as follows:
void values(initializer_list<int> iniValues) { cout << "Initializer list having " << iniValues.size() << "values\n"; for ( initializer_list<int>::const_iterator begin = iniValues.begin(); begin != iniValues.end(); ++begin ) cout << "Value: " << *begin << '\n'; }Initializer lists can also be used to initialize objects of classes(cf. section7.5, which also summarizes the facilities ofinitializer lists).
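As a small sketch of how a function might actually use its initializer list parameter, the (hypothetical) helpers sum and count below visit the list's values with a range-based for-loop, which uses the list's begin and end members behind the scenes. The names are assumptions made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical helper: adds all values of an initializer list.
inline int sum(std::initializer_list<int> values)
{
    int total = 0;
    for (int value: values)         // uses the list's begin() and end()
        total += value;
    return total;
}

// Hypothetical helper: counts the values in a two-dimensional list.
inline std::size_t count(
            std::initializer_list<std::initializer_list<int>> rows)
{
    std::size_t nValues = 0;
    for (auto const &row: rows)     // each row is itself a list
        nValues += row.size();
    return nValues;
}
```

Calls look like sum({2, 3, 5, 7, 11, 13}) and count({{1, 2}, {2, 3}, {3, 5}}): the curly braces surrounding the values form the argument.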
Narrowing conversions (implicit conversions that may lose information) are not allowed when specifying values of initializer lists. Narrowing conversions are encountered when values are used of a type whose range is larger than the type specified when defining the initializer list. For example
float or double values to define initializer lists of int values;
float to define initializer lists of float values;
Some examples:
    initializer_list<int> ii{ 1.2 };        // 1.2 isn't an int value
    initializer_list<unsigned> iu{ ~0ULL }; // unsigned long long doesn't fit
    struct Data
    {
        int d_first;
        double d_second;
        std::string d_third;
    };
    Data data{ .d_first = 1, .d_third = "hello" };
In this example, d_first and d_third are explicitly initialized, while d_second is implicitly initialized to its default value (so: 0.0).
In C++ it is not allowed to reorder the initialization of members in a designated initialization list. So, Data data{ .d_third = "hello", .d_first = 1 } is an error, but Data data{ .d_third = "hello" } is OK, as there is no ordering conflict in the latter example (this also initializes d_first and d_second to 0).
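The rules for designated initialization can be checked with a short sketch. The struct Record and the function makeRecord are hypothetical names, used here so the sketch stands on its own; designated initialization is formally part of the C++20 standard, so a compiler supporting that standard is assumed.

```cpp
#include <cassert>
#include <string>

// Hypothetical struct, analogous to the Data struct used above.
struct Record
{
    int         d_first;
    double      d_second;
    std::string d_third;
};

// The designators follow the declaration order of the members;
// members that aren't mentioned (d_second) are initialized to 0.
inline Record makeRecord()
{
    return Record{ .d_first = 1, .d_third = "hello" };
}
```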
Likewise, a union can be initialized using designated initialization, as illustrated by the next example:
    union Data
    {
        int d_first;
        double d_second;
        std::string *d_third;
    };
    // initialize the union's d_third field:
    Data data{ .d_third = new string{ "hello" } };
uint32_t value of IP4 packets contain:
Rather than using complex bit and bit-shift operations, these fields inside integral values can be specified using bit-fields. E.g.,
    struct FirstIP4word
    {
        uint32_t version:   4;
        uint32_t header:    4;
        uint32_t tos:       8;
        uint32_t length:   16;
    };
The total size of a FirstIP4word object is 32 bits, or four bytes. To show the version of a FirstIP4word first object, simply do:
cout << first.version << '\n';
and to set its header length to 10 simply do
first.header = 10;
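The following sketch shows what assigning to a bit-field amounts to. It reuses the FirstIP4word layout; the function storeHeader is a hypothetical helper added for illustration. Since header is an unsigned field of 4 bits, only the lowest 4 bits of an assigned value are stored.

```cpp
#include <cassert>
#include <cstdint>

// Same layout as the FirstIP4word struct shown above.
struct FirstIP4word
{
    std::uint32_t version:   4;
    std::uint32_t header:    4;
    std::uint32_t tos:       8;
    std::uint32_t length:   16;
};

// Hypothetical helper: stores a value in the 4-bit header field and
// returns the value that was actually stored.
inline unsigned storeHeader(FirstIP4word &word, unsigned value)
{
    word.header = value;    // only the lowest 4 bits are kept
    return word.header;
}
```

Storing 10 returns 10, but storing 17 (binary 10001) returns 1, as the fifth bit doesn't fit in the field.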
Bit fields are already available in C. The C++20 standard allows them to be initialized by default by using initialization expressions in their definitions. E.g.,
    struct FirstIP4word
    {
        uint32_t version:   4 = 1;  // version now 1, by default
        uint32_t header:    4 = 10; // TCP header length now 10, by default
        uint32_t tos:       8;
        uint32_t length:   16;
    };
The initialization expressions are evaluated when the object using the bit-fields is defined. Also, when a variable is used to initialize a bit-field the variable must at least have been declared when the struct containing bit-fields is defined. E.g.,
    extern int value;
    struct FirstIP4word
    {
        ...
        uint32_t length: 16 = value;    // OK: value has been declared
    };
auto can be used to simplify type definitions of variables and return types of functions if the compiler is able to determine the proper types of such variables or functions.
Using auto as a storage class specifier is no longer supported by C++: a variable definition like auto int var results in a compilation error.
The keyword auto is used in situations where it is very hard to determine the variable's type. These situations are encountered, e.g., in the context of templates (cf. chapters 18 through 23). It is also used in situations where a known type is a very long one but also automatically available to the compiler. In such cases the programmer uses auto to avoid having to type long type definitions.
At this point in the Annotations only simple examples can be given. Refer to section 21.1.2 for additional information about auto (and the related decltype function).
When defining and initializing a variable int variable = 5 the type of the initializing expression is well known: it's an int, and unless the programmer's intentions are different this could be used to define variable's type (a somewhat contrived example as in this case it reduces rather than improves the clarity of the code):
    auto variable = 5;
However, it is attractive to use auto. In chapter 5 the iterator concept is introduced (see also chapters 12 and 18). Iterators frequently have long type definitions, like
std::vector<std::string>::const_reverse_iterator
Functions may return objects having such types. Since the compiler knows about these types we may exploit this knowledge by using auto. Assume that a function begin() is declared like this:
std::vector<std::string>::const_reverse_iterator begin();
Rather than writing a long variable definition (at // 1, below) a much shorter definition (at // 2) can be used:
    std::vector<std::string>::const_reverse_iterator iter = begin();    // 1
    auto iter = begin();                                                // 2
It's also easy to define and initialize additional variables of such types. When initializing such variables iter can be used to initialize those variables, and auto can be used, so the compiler deduces their types:
auto start = iter;
When defining variables using auto the variable's type is deduced from the variable's initializing expression. Plain types and pointer types are used as-is, but when the initializing expression is a reference type, then the reference's basic type (without the reference, omitting const or volatile specifications) is used.
If a reference type is required then auto & or auto && can be used. Likewise, const and/or pointer specifications can be used in combination with the auto keyword itself. Here are some examples:
    int value;
    auto another = value;   // 'int another' is defined
    string const &text();
    auto str = text();      // text's plain type is string, so
                            // string str, NOT string const str
                            // is defined
    str += "...";           // so, this is OK
    int *ip = &value;
    auto ip2 = ip;          // int *ip2 is defined.
    int *const &ptr = ip;
    auto ip3 = ptr;         // int *ip3 is defined, omitting const &
    auto const &ip4 = ptr;  // int *const &ip4 is defined.
In the next to last auto specification, the tokens (reading right to left) from the reference to the basic type are omitted: here const & was appended to ptr's basic type (int *). Hence, int *ip3 is defined.
In the last auto specification auto also produces int *, but in the type definition const & is added to the type produced by auto, so int *const &ip4 is defined.
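The deduction rules can be verified by the compiler itself. The following sketch combines decltype with the std::is_same type trait (both covered later in the Annotations) in static_assert statements, which only compile if the deduced types are as described. The functions text and deductionDemo are hypothetical and serve only this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical function returning a reference to a const string.
inline std::string const &text()
{
    static std::string const str{ "text" };
    return str;
}

inline bool deductionDemo()
{
    int value = 0;
    int *ip = &value;
    int *const &ptr = ip;

    auto str = text();          // std::string: const and & are omitted
    auto ip3 = ptr;             // int *: const and & are omitted
    auto const &ip4 = ptr;      // int *const &: const & added to int *

    static_assert(std::is_same<decltype(str), std::string>::value, "");
    static_assert(std::is_same<decltype(ip3), int *>::value, "");
    static_assert(std::is_same<decltype(ip4), int *const &>::value, "");

    str += "...";               // OK: str is a modifiable copy
    return ip3 == ip4 && str == "text...";
}
```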
The auto keyword can also be used to postpone the definition of a function's return type. The declaration of a function intArrPtr returning a pointer to arrays of 10 ints looks like this:
int (*intArrPtr())[10];
Such a declaration is fairly complex. E.g., among other complexities it requires `protection of the pointer' using parentheses in combination with the function's parameter list. In situations like these the specification of the return type can be postponed using the auto return type, followed by the specification of the function's return type after any other specification the function might receive (e.g., as a const member (cf. section 7.7) or following its noexcept specification (cf. section 23.8)).
Using auto to declare the above function, the declaration becomes:
auto intArrPtr() -> int (*)[10];
A return type specification using auto is called a late-specified return type.
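Both notations declare the same function, which the following sketch illustrates by first declaring intArrPtr in both ways and then defining it. The array rows is a hypothetical data source, added so the function has something to point at.

```cpp
#include <cassert>

// Hypothetical data: two arrays of 10 ints each.
static int rows[2][10] = { {0, 1, 2}, {10, 11, 12} };

int (*intArrPtr())[10];             // traditional declaration
auto intArrPtr() -> int (*)[10];    // same function, late-specified
                                    // return type

auto intArrPtr() -> int (*)[10]
{
    return rows;        // int[2][10] decays to int (*)[10]
}
```

The returned pointer can be dereferenced or indexed: (*intArrPtr())[1] is the second int of the first array, intArrPtr()[1][2] is the third int of the second array.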
Since the C++14 standard late return type specifications are no longer required for functions returning auto. Such functions can now simply be declared like this:
auto autoReturnFunction();
In this case some restrictions apply, both to the function definitions and the function declarations:
Functions returning auto cannot be used before the compiler has seen their definitions. So they cannot be used after mere declarations;
If functions returning auto are implemented as recursive functions then at least one return statement must have been seen before the recursive call. E.g.,
    auto fibonacci(size_t n)
    {
        if (n <= 1)
            return n;
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
doubles, ints, strings, etc. When functions need to return multiple values a return by argument construction is often used, where addresses of variables that live outside of the called function are passed to functions, allowing the functions to assign new values to those variables.
When multiple values should be returned from a function a struct can be used, but pairs (cf. section 12.2) or tuples (cf. section 22.6) can also be used. Here's a simple example, where a function fun returns a struct having two data fields:
    struct Return
    {
        int first;
        double second;
    };
    Return fun()
    {
        return Return{ 1, 12.5 };
    }
(Briefly forward referencing to sections 12.2 and 22.6: the struct definition can completely be omitted if fun returns a pair or tuple. In those cases the following code remains valid.)
A function calling fun traditionally defines a variable of the same type as fun's return type, and then uses that variable's fields to access first and second. If you don't like the typing, auto can also be used:
    int main()
    {
        auto r1 = fun();
        cout << r1.first;
    }
Instead of referring to the elements of the returned struct, pair or tuple structured binding declarations can also be used. Here, auto is followed by a (square brackets surrounded) comma-separated list of variables, where each variable is defined, and receives the value of the corresponding field or element of the called function's return value. So, the above main function can also be written like this:
    int main()
    {
        auto [one, two] = fun();
        cout << one;        // one and two: now defined
    }
Merely specifying auto results in fun's return value being copied, and the structured bindings variables will refer to the copied value. But structured binding declarations can also be used in combination with (lvalue/rvalue) references. The following ensures that rone and rtwo refer to the elements of fun's anonymous return value:
    int main()
    {
        auto &&[rone, rtwo] = fun();
    }
If the called function returns a value that survives the function call itself, then structured binding declarations can use lvalue references. E.g.,
    Return &fun2()
    {
        static Return ret{ 4, 5 };
        return ret;
    }
    int main()
    {
        auto &[lone, ltwo] = fun2();    // OK: referring to ret's fields
    }
To use structured binding declarations it is not required to use function calls. The object providing the data can also anonymously be defined:
    int main()
    {
        auto const &[lone, ltwo] = Return{ 4, 5 };
        // or:
        auto &&[lone, ltwo] = Return{ 4, 5 };
    }
The object doesn't even have to make its data members publicly available. In section TUPLES using structured bindings not necessarily referring to data members is covered.
Another application is found in situations where nested statements of for or selection statements benefit from using locally defined variables of various types. Such variables can easily be defined using structured binding declarations that are initialized from anonymous structs, pairs or tuples. Here is an example illustrating this:
    // define a struct:
    struct Three
    {
        size_t year;
        double firstAmount;
        double interest;
    };
    // define an array of Three objects, and process each in turn:
    Three array[10];
    fill(array);            // not implemented here
    for (auto &[year, amount, interest]: array)
        cout << "Year " << year << ": amount = " << amount << '\n';
When using structured bindings the structured binding declaration must specify all elements that are available. So if a struct has four data members the structured binding declaration must define four elements. To avoid warnings of unused variables at least one of the variables of the structured binding declaration must be used.
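The difference between copying bindings and reference bindings can be illustrated with a std::pair (cf. section 12.2). In this sketch the functions lookup and bindingDemo are hypothetical names, not part of the Annotations' own examples.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical function returning two values as a std::pair.
inline std::pair<int, std::string> lookup()
{
    return { 42, "answer" };
}

inline int bindingDemo()
{
    auto [number, name] = lookup();     // refer to a copy of the pair
    number += static_cast<int>(name.size());    // 42 + 6

    std::pair<int, int> point{ 1, 2 };
    auto &[x, y] = point;               // x, y refer to point's members
    x = 10;                             // modifies point.first

    return number + point.first + y;    // 48 + 10 + 2
}
```

Since x is bound through an lvalue reference, the assignment x = 10 is visible in point.first; the function therefore returns 60.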
typedef is commonly used to define shorthand notations for complex types. Assume we want to define a shorthand for `a pointer to a function expecting a double and an int, and returning an unsigned long long int'. Such a function could be:
    unsigned long long int compute(double, int);
A pointer to such a function has the following form:
unsigned long long int (*pf)(double, int);
If this kind of pointer is frequently used, consider defining it using typedef: simply put typedef in front of it and the pointer's name is turned into the name of a type. It could be capitalized to let it stand out more clearly as the name of a type:
typedef unsigned long long int (*PF)(double, int);
After having defined this type, it can be used to declare or define such pointers:
    PF pf = compute;        // initialize the pointer to a function like
                            // 'compute'
    void fun(PF pf);        // fun expects a pointer to a function like
                            // 'compute'
However, including the pointer in the typedef might not be a very good idea, as it masks the fact that pf is a pointer. After all, PF pf looks more like `int x' than `int *x'. To document that pf is in fact a pointer, slightly change the typedef:
    typedef unsigned long long int FUN(double, int);
    FUN *pf = compute;      // now pf clearly is a pointer.
The scope of typedefs is restricted to compilation units. Therefore, typedefs are usually embedded in header files which are then included by multiple source files in which the typedefs should be used.
In addition to typedef C++ offers the using keyword to associate a type and an identifier. In practice typedef and using can be used interchangeably. The using keyword arguably results in more readable type definitions. Consider the following three (equivalent) definitions:
the traditional typedef:
    typedef unsigned long long int FUN(double, int);
using to improve the visibility (for humans) of the type name, by moving the type name to the front of the definition:
    using FUN = unsigned long long int (double, int);
using in combination with a late-specified return type:
    using FUN = auto (double, int) -> unsigned long long int;
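That the three definitions are indeed equivalent can be checked at compile time. In the following sketch the names FUN1, FUN2 and FUN3 (used instead of a single FUN, so the three can be compared) and the functions compute and call are assumptions made for this illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef unsigned long long int FUN1(double, int);
using FUN2 = unsigned long long int (double, int);
using FUN3 = auto (double, int) -> unsigned long long int;

// All three names denote the same function type.
static_assert(std::is_same<FUN1, FUN2>::value, "");
static_assert(std::is_same<FUN2, FUN3>::value, "");

// Hypothetical function matching that function type.
inline unsigned long long int compute(double factor, int value)
{
    return static_cast<unsigned long long int>(factor * value);
}

// Hypothetical function expecting a pointer to such a function.
inline unsigned long long int call(FUN3 *pf)
{
    return pf(2.5, 4);
}
```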
for (init; cond; inc) statement
Often the initialization, condition, and increment parts are fairly obvious, as in situations where all elements of an array or vector must be processed. Many languages offer the foreach statement for that and C++ offers the std::for_each generic algorithm (cf. section 19.1.18).
In addition to the traditional syntax C++ adds new syntax for the for-statement: the range-based for-loop. This new syntax can be used to process all elements of a range in turn. Three types of ranges are distinguished:
plain arrays (e.g., int array[10]);
types offering begin() and end() functions returning so-called iterators (cf. section 18.2).
    // assume int array[30]
    for (auto &element: array)
        statement
The part to the left of the colon is called the for range declaration. The declared variable (element) is a formal name; use any identifier you like. The variable is only available within the nested statement, and it refers to (or is a copy of) each of the elements of the range, from the first element up to the last.
There's no formal requirement to use auto, but using auto is extremely useful in many situations. Not only in situations where the range refers to elements of some complex type, but also in situations where you know what you can do with the elements in the range, but don't care about their exact type names. In the above example int could also have been used.
The reference symbol (&) is important in the following cases:
structs (or classes, cf. chapter 7)
BigStruct elements:
    struct BigStruct
    {
        double array[100];
        int last;
    };
Inefficient, because you don't need to make copies of the array's elements. Instead, use references to elements:
    BigStruct data[100];    // assume properly initialized elsewhere
    int countUsed()
    {
        int sum = 0;
                            // const &: the elements aren't modified
        for (auto const &element: data)
            sum += element.last;
        return sum;
    }
Range-based for-loops can also benefit from structured bindings. If struct Element holds an int key and a double value, and all the values of positive keys should be added then the following code snippet accomplishes that:
    Element elems[100];     // somehow initialized
    double sum = 0;
    for (auto const &[key, value]: elems)
    {
        if (key > 0)
            sum += value;
    }
The C++20 standard also supports an optional initialization section (like the ones already available for if and switch statements) for range-based for-loops. Assume the elements of an array must be inserted into cout, but before each element we want to display the element's index. The index variable is not used outside the for-statement, and the extension offered in the C++20 standard allows us to localize the index variable. Here is an example:
    // localize idx: only visible in the for-stmnt
    for (size_t idx = 0; auto const &element: data)
        cout << idx++ << ": " << element << '\n';
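Here is a self-contained variant of this construction, writing to a string stream so its output can be inspected. The function numbered and its array of three ints are hypothetical; a compiler supporting range-based for-loops with initialization sections is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical function: returns the elements of a small array,
// each preceded by its index.
inline std::string numbered()
{
    int const data[] = { 10, 20, 30 };
    std::ostringstream out;

    // idx is only visible inside the for-statement
    for (std::size_t idx = 0; auto const &element: data)
        out << idx++ << ": " << element << '\n';

    return out.str();
}
```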
\n, \\ and \", and ending in 0-bytes. Such series of ASCII-characters are commonly known as null-terminated byte strings (singular: NTBS, plural: NTBSs). C's NTBS is the foundation upon which an enormous amount of code has been built.
In some cases it is attractive to be able to avoid having to use escape sequences (e.g., in the context of XML). C++ allows this using raw string literals.
Raw string literals start with an R, followed by a double quote, optionally followed by a label (which is an arbitrary sequence of non-blank characters), followed by an opening parenthesis. The raw string ends at the closing parenthesis, followed by the label (if specified when starting the raw string literal), which is in turn followed by a double quote. Here are some examples:
    R"(A Raw \ "String")"
    R"delimiter(Another \ Raw "(String))delimiter"
In the first case, everything between "( and )" is part of the string. Escape sequences aren't supported so the text \ " within the first raw string literal defines three characters: a backslash, a blank character and a double quote. The second example shows a raw string defined between the markers "delimiter( and )delimiter".
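To see what these raw string literals contain, they can be compared to ordinary string literals in which the backslashes and double quotes are written as escape sequences. The variable names in this sketch are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstring>

// The raw string literals from the examples above:
char const raw[]     = R"(A Raw \ "String")";
char const labeled[] = R"delimiter(Another \ Raw "(String))delimiter";

// The same text using escape sequences:
char const escaped[]        = "A Raw \\ \"String\"";
char const labeledEscaped[] = "Another \\ Raw \"(String)";
```

Note that the second raw string literal ends in a closing parenthesis: the first ) following String is part of the text, as it isn't followed by the label.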
Raw string literals come in very handy when long, complex ascii-character sequences (e.g., usage-info or long html-sequences) are used. In the end they are just that: long NTBSs. Those long raw string literals should be separated from the code that uses them, thus maintaining the readability of the using code.
As an illustration: the bisonc++ parser generator supports an option --prompt. When specified, the code generated by bisonc++ inserts prompting code when debugging is requested. Directly inserting the raw string literal into the function processing the prompting code results in code that is very hard to read:
    void prompt(ostream &out)
    {
        if (d_genDebug)
            out << (d_options.prompt() ? R"(
            if (d_debug__)
            {
                s_out__ << "\n================\n"
                           "? " << dflush__;
                std::string s;
                getline(std::cin, s);
            }
    )"          : R"(
            if (d_debug__)
                s_out__ << '\n';
    )"
                ) << '\n';
    }
Readability is greatly enhanced by defining the raw string literals as named NTBSs, defined in the source file's anonymous namespace (cf. chapter 4):
    namespace {
    char const noPrompt[] = R"(
            if (d_debug__)
                s_out__ << '\n';
    )";
    char const doPrompt[] = R"(
            if (d_debug__)
            {
                s_out__ << "\n================\n"
                           "? " << dflush__;
                std::string s;
                getline(std::cin, s);
            }
    )";
    } // anonymous namespace
    void prompt(ostream &out)
    {
        if (d_genDebug)
            out << (d_options.prompt() ? doPrompt : noPrompt) << '\n';
    }
0b or 0B. E.g., to represent the (decimal) value 5 the notation 0b101 can also be used.
The binary constants come in handy in the context of, e.g., bit-flags, as they immediately show which bit-fields are set, while other notations are less informative.
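As a brief sketch of binary constants used as bit-flags: the enum Flags and the function isSet below are hypothetical, but they show how the binary notation makes it immediately clear which bit each flag selects.

```cpp
#include <cassert>

// Hypothetical bit-flags, each selecting a single bit.
enum Flags: unsigned
{
    READ    = 0b001,
    WRITE   = 0b010,
    EXEC    = 0b100
};

static_assert(0b101 == 5, "0b101 is the decimal value 5");
static_assert(0B101 == 0x5, "binary and hexadecimal notations agree");

// Returns true if flag's bit is set in value.
inline bool isSet(unsigned value, Flags flag)
{
    return (value & flag) != 0;
}
```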
for repetition statements start with an optional initialization clause. The initialization clause allows us to localize variables to the scope of the for statements. Initialization clauses can also be used in selection statements.
Consider the situation where an action should be performed if the next line read from the standard input stream equals go!. Traditionally, when used inside a function, intending to localize the string to contain the content of the next line as much as possible, constructions like the following had to be used:
    void function()
    {
        // ... any set of statements
        {
            string line;    // localize line
            if (getline(cin, line))
                action();
        }
        // ... any set of statements
    }
Since init ; clauses can also be used for selection statements (if and switch statements) (note that with selection statements the semicolon is part of the initialization clause, which is different from the optional init (no semicolon) clause in for statements), we can rephrase the above example as follows:
    void function()
    {
        // ... any set of statements
        if (string line; getline(cin, line))
            action();
        // ... any set of statements
    }
Note that a variable may still also be defined in the actual condition clauses. This is true for both the extended if and switch statement. However, before using the condition clauses an initialization clause may be used to define additional variables (plural, as it may contain a comma-separated list of variables, similar to the syntax that's available for for-statements).
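The following sketch completes the example by also comparing the line that was read to go!. The function nextLineIsGo is a hypothetical name; it receives the stream to read from as its argument, so it can be used with cin as well as with other input streams.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical function: returns true if the next line read from
// 'in' equals "go!"
inline bool nextLineIsGo(std::istream &in)
{
    // line only exists within the if-statement
    if (std::string line; std::getline(in, line) && line == "go!")
        return true;

    return false;           // here line is no longer available
}
```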
The following attributes are recognized:
[[carries_dependency]]:
[[deprecated]]:
This attribute (and its alternative form [[deprecated("reason")]]) is available since the C++14 standard. It indicates that the use of the name or entity declared with this attribute is allowed, but discouraged for some reason. This attribute can be used for classes, typedef-names, variables, non-static data members, functions, enumerations, and template specializations. An existing non-deprecated entity may be redeclared deprecated, but once an entity has been declared deprecated it cannot be redeclared as `undeprecated'. When encountering the [[deprecated]] attribute the compiler generates a warning, e.g.,
    demo.cc:12:24: warning: 'void deprecatedFunction()' is deprecated
                    [-Wdeprecated-declarations]
         deprecatedFunction();
    demo.cc:5:21: note: declared here
     [[deprecated]] void deprecatedFunction()
When using the alternative form (e.g., [[deprecated("do not use")]] void fun()) the compiler generates a warning showing the text between the double quotes, e.g.,
    demo.cc:12:24: warning: 'void deprecatedFunction()' is deprecated:
                    do not use [-Wdeprecated-declarations]
         deprecatedFunction();
    demo.cc:5:38: note: declared here
     [[deprecated("do not use")]] void deprecatedFunction()
[[fallthrough]]
When statements nested under case entries in switch statements continue into subsequent case or default entries the compiler issues a `falling through' warning. If falling through is intentional the attribute [[fallthrough]], which then must be followed by a semicolon, should be used. Here is an annotated example:
    void function(int selector)
    {
        switch (selector)
        {
            case 1:
            case 2: // no falling through, but merged entry points
                cout << "cases 1 and 2\n";
            [[fallthrough]];    // no warning: intentionally falling through
            case 3:
                cout << "case 3\n";
            case 4: // a warning is issued: falling through not
                    // announced.
                cout << "case 4\n";
            [[fallthrough]];    // error: there's nothing beyond
        }
    }
[[maybe_unused]]
This attribute can be applied to a class, typedef-name, variable, parameter, non-static data member, a function, an enumeration or an enumerator. When it is applied to an entity no warning is generated when the entity is not used. Example:
    void fun([[maybe_unused]] size_t argument)
    {
        // argument isn't used, but no warning
        // telling you so is issued
    }
[[nodiscard]]
The attribute [[nodiscard]] may be specified when declaring a function, class or enumeration. If a function is declared [[nodiscard]] or if a function returns an entity previously declared using [[nodiscard]] then the return value of such a function may only be ignored when explicitly cast to void. Otherwise, when the return value is not used a warning is issued. Example:
    [[nodiscard]] int importantInt();
    struct [[nodiscard]] ImportantStruct { ... };
    ImportantStruct factory();
    int main()
    {
        importantInt();     // warning issued
        factory();          // warning issued
    }
[[noreturn]]:
[[noreturn]] indicates that the function does not return. The behavior is undefined if a function declared with this attribute actually returns. The following standard functions have this attribute: std::_Exit, std::abort, std::exit, std::quick_exit, std::unexpected, std::terminate, std::rethrow_exception, std::throw_with_nested, std::nested_exception::rethrow_nested. Here is an example of a function declaration and definition using the [[noreturn]] attribute:
    [[noreturn]] void doesntReturn();
    [[noreturn]] void doesntReturn()
    {
        exit(0);
    }
The C++20 standard added the three-way comparison operator <=>, also known as the spaceship operator, to C++. In C++ operators can be defined for class-types, among which equality and comparison operators (the familiar set of ==, !=, <, <=, > and >= operators). To provide classes with all comparison operators merely the equality and the spaceship operator need to be defined.
Its priority is less than the priorities of the bit-shift operators << and >> and larger than the priorities of the ordering operators <, <=, >, and >=.
Section 11.7.2 covers the construction of the three-way comparison operator.
void, char, short, int, long, float and double. C++ extends these built-in types with several additional built-in types: the types bool, wchar_t, long long and long double (Cf. ANSI/ISO draft (1995), par. 27.6.2.4.1 for examples of these very long types). The type long long is merely a double-long long datatype. The type long double is merely a double-long double datatype. These built-in types as well as pointer variables are called primitive types in the C++ Annotations.
There is a subtle issue to be aware of when converting applications developed for 32-bit architectures to 64-bit architectures. When converting 32-bit programs to 64-bit programs, only long types and pointer types change in size from 32 bits to 64 bits; integers of type int remain at their size of 32 bits. This may cause data truncation when assigning pointer or long types to int types. Also, problems with sign extension can occur when assigning expressions using types shorter than the size of an int to an unsigned long or to a pointer.
Except for these built-in types the class-type string is available for handling character strings. The datatypes bool, and wchar_t are covered in the following sections, the datatype string is covered in chapter 5. Note that recent versions of C may also have adopted some of these newer data types (notably bool and wchar_t). Traditionally, however, C doesn't support them, hence they are mentioned here.
Now that these new types are introduced, let's refresh your memory about letters that can be used in literal constants of various types. They are:
b or B: in addition to its use as a hexadecimal value, it can also be used to define a binary constant. E.g., 0b101 equals the decimal value 5. The 0b prefix can be used to specify binary constants starting with the C++14 standard.
E or e: the exponentiation character in floating point literal values. For example: 1.23E+3. Here, E should be pronounced (and interpreted) as: times 10 to the power. Therefore, 1.23E+3 represents the value 1230.
F can be used as postfix to a non-integral numeric constant to indicate a value of type float, rather than double, which is the default. For example: 12.F (the dot transforms 12 into a floating point value); 1.23E+3F (see the previous example. 1.23E+3 is a double value, whereas 1.23E+3F is a float value).
L can be used as prefix to indicate a character string whose elements are wchar_t-type characters. For example: L"hello world".
L can be used as postfix to an integral value to indicate a value of type long, rather than int, which is the default. Note that there is no letter indicating a short type. For that a static_cast<short>() must be used.
p, to specify the power in hexadecimal floating point numbers. E.g. 0x10p4. The exponent itself is read as a decimal constant and can therefore not start with 0x. The exponent part is interpreted as a power of 2. So 0x10p2 is (decimal) equal to 64: 16 * 2^2.
U can be used as postfix to an integral value to indicate an unsigned value, rather than an int. It may also be combined with the postfix L to produce an unsigned long int value.
x and a until f characters can be used to specify hexadecimal constants (optionally using capital letters).
bool represents boolean (logical) values, for which the (now reserved) constants true and false may be used. Except for these reserved values, integral values may also be assigned to variables of type bool, which are then implicitly converted to true and false according to the following conversion rules (assume intValue is an int-variable, and boolValue is a bool-variable):
    // from int to bool:
    boolValue = intValue ? true : false;
    // from bool to int:
    intValue = boolValue ? 1 : 0;
Furthermore, when bool values are inserted into streams then true is represented by 1, and false is represented by 0. Consider the following example:
    cout << "A true value: "  << true << "\n"
            "A false value: " << false << '\n';
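The conversions and the stream representation can be tried out using a string stream. In this sketch the functions showBools and toBool are hypothetical. The std::boolalpha manipulator, which is not covered in this section, is included as an aside: it switches a stream to the textual representation of bool values.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical function: returns the stream representations of
// bool values.
inline std::string showBools()
{
    std::ostringstream out;
    out << true << ' ' << false << ' '  // default: 1 and 0
        << std::boolalpha               // switch to textual form
        << true << ' ' << false;        // now: true and false
    return out.str();
}

// Hypothetical function: any non-zero int converts to true.
inline bool toBool(int intValue)
{
    bool boolValue = intValue;
    return boolValue;
}
```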
The bool data type is found in other programming languages as well. Pascal has its type Boolean; Java has a boolean type. Different from these languages, C++'s type bool acts like a kind of int type. It is primarily a documentation-improving type, having just two values true and false. Actually, these values can be interpreted as enum values for 1 and 0. Doing so would ignore the philosophy behind the bool data type, but nevertheless: assigning true to an int variable neither produces warnings nor errors.
Using the bool-type is usually clearer than using int. Consider the following prototypes:
    bool exists(char const *fileName);  // (1)
    int  exists(char const *fileName);  // (2)
With the first prototype, readers expect the function to return true if the given filename is the name of an existing file. However, with the second prototype some ambiguity arises: intuitively the return value 1 is appealing, as it allows constructions like
    if (exists("myfile"))
        cout << "myfile exists";
On the other hand, many system functions (like access, stat, and many others) return 0 to indicate a successful operation, reserving other values to indicate various types of errors.
As a rule of thumb I suggest the following: if a function should inform its caller about the success or failure of its task, let the function return a bool value. If the function should return success or various types of errors, let the function return enum values, documenting the situation by its various symbolic constants. Only when the function returns a conceptually meaningful integral value (like the sum of two int values), let the function return an int value.
The wchar_t type is an extension of the char built-in type, to accommodate wide character values (but see also the next section). The g++ compiler reports sizeof(wchar_t) as 4, which easily accommodates all Unicode character values (the Unicode code space contains 1,114,112 code points, ranging from U+0000 through U+10FFFF).
Note that Java's char data type is somewhat comparable to C++'s wchar_t type. Java's char type is 2 bytes wide, though. On the other hand, Java's byte data type is comparable to C++'s char type: one byte. Confusing?
L (e.g., L"hello") defines a wchar_t string literal.
C++ also supports 8, 16 and 32 bit Unicode encoded strings. Furthermore, two new data types are introduced: char16_t and char32_t storing, respectively, a UTF-16 and a UTF-32 unicode value.
A char type value fits in a utf_8 unicode value. For character sets exceeding 256 different values wider types (like char16_t or char32_t) should be used.
String literals for the various types of unicode encodings (and associated variables) can be defined as follows:
    char     utf_8[] = u8"This is UTF-8 encoded.";
    char16_t utf16[] = u"This is UTF-16 encoded.";
    char32_t utf32[] = U"This is UTF-32 encoded.";
Alternatively, unicode constants may be defined using the \u escape sequence, followed by a hexadecimal value. Depending on the type of the unicode variable (or constant) a UTF-8, UTF-16 or UTF-32 value is used. E.g.,
    char     utf_8[] = u8"\u2018";
    char16_t utf16[] = u"\u2018";
    char32_t utf32[] = U"\u2018";
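The different encodings of the same character show up in the sizes of the string literals. The following sketch checks these sizes for the \u2018 character used above; the sizes include the terminating 0-value, and the variable names are merely chosen for this illustration.

```cpp
#include <cassert>

char16_t const utf16[] = u"\u2018";
char32_t const utf32[] = U"\u2018";

// UTF-8 uses three bytes to encode \u2018, plus the terminating 0:
static_assert(sizeof(u8"\u2018") == 4, "");

// UTF-16 and UTF-32 each need one element, plus the terminating 0:
static_assert(sizeof(utf16) == 2 * sizeof(char16_t), "");
static_assert(sizeof(utf32) == 2 * sizeof(char32_t), "");
```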
Unicode strings can be delimited by double quotes but raw string literals can also be used.
long long int. On 32 bit systems it has at least 64 usable bits.
The size_t type is not really a built-in primitive data type, but a data type that is promoted by POSIX as a typename to be used for non-negative integral values answering questions like `how much' and `how many', in which case it should be used instead of unsigned int. It is not a specific C++ type, but also available in, e.g., C. Usually it is defined implicitly when a (any) system header file is included. The header file `officially' defining size_t in the context of C++ is cstddef.
Using size_t has the advantage of being a conceptual type, rather than a standard type that is then modified by a modifier. Thus, it improves the self-documenting value of source code.
Several suffixes can be used to explicitly specify the intended representation of integral constants, like 42UL defining 42 as an unsigned long int. Likewise, since C++23, suffixes uz or zu can be used to specify that an integral constant is represented as a size_t, as in: cout << 42uz.
Sometimes functions explicitly require unsigned int to be used. E.g., on amd-architectures the X-windows function XQueryPointer explicitly requires a pointer to an unsigned int variable as one of its arguments. In such situations a pointer to a size_t variable can't be used, but the address of an unsigned int must be provided. Such situations are exceptional, though.
Other useful bit-represented types also exist. E.g., uint32_t is guaranteed to hold 32-bit unsigned values. Analogously, int32_t holds 32-bit signed values. Corresponding types exist for 8, 16 and 64 bit values. These types are defined in the header file cstdint and can be very useful when you need to specify or use integral value types of fixed sizes.
Traditionally the char type has been used to store plain 8-bit values, but char is often a signed type and when inserting a char variable into a stream the character's representation instead of its value is used. Maybe more important is the inherent confusion when using char type variables when only using its (unsigned) value: a char documents to the reader that text is used instead of mere 8-bit values, as used by the smallest addressable memory locations.
Different from the char type the std::byte type intends to merely represent an 8-bit value. In order to use std::byte the <cstddef> header file must be included.
The byte is defined as a strongly typed enum, simply embedding an unsigned char:
enum class byte: unsigned char {};
As a byte is an enum without predefined enum values plain assignments can only be used between byte values. Byte variables can be initialized using curly braces around an existing byte or around fixed values of at most 8 bits (see #1 in the following example). If the specified value doesn't fit in 8 bits (#2) or if the specified value is neither a byte nor an unsigned char type variable (#3) the compiler reports an error.
Assignments or assignment-like initializations using rvalues which are bytes initialized using parentheses with values not fitting in 8 bits are accepted (#4, #5). In these cases, the specified values are truncated to their lowest 8 bits. Here are the illustrations:
byte value{ 0x23 }; // #1 (see the text above)
// byte error{ 0x123 }; // #2
char ch = 0xfb;
// byte error{ ch }; // #3
byte b1 = byte( ch ); // #4
value = byte( 0x123 ); // #5
The byte type supports all bit-wise operations, but the right-hand operand of the bit-wise operator must also be a byte. E.g.,
value &= byte(0xf0);
Byte type values can also be ordered and compared for (in)equality. Unfortunately, no other operations are supported. E.g., bytes cannot be added and cannot be inserted into or extracted from streams, which makes std::byte somewhat less useful than ordinary types (like unsigned int, uint16_t). When needed such operations can be supported using casts (covered in section 3.5), but it's considered good practice to avoid casts whenever possible. However, C++ allows us to define a byte-type that does behave like an ordinary numeric type, including inserting and extracting its values into and from streams. In section 11.4 such a type is developed.
1'000'000
3.141'592'653'589'793'238'5
''123 // won't compile
1''23 // won't compile either
(typename)expression
here typename is the name of a valid type, and expression is an expression.
The use of C style casts is discouraged in C++. C++ programs should merely use the new style C++ casts as they offer the compiler facilities to verify the sensibility of the cast, facilities which are not offered by the classic C-style cast.
A cast should not be confused with the often used constructor notation:
typename(expression)
the constructor notation is not a cast, but a request to the compiler to construct an (anonymous) variable of type typename from expression.
If casts are really necessary one of several new-style casts should be used. These new-style casts are introduced in the upcoming sections.
The static_cast<type>(expression) is used to convert `conceptually comparable or related types' to each other. Here as well as in other C++ style casts type is the type to which the type of expression should be cast.
Here are some examples of situations where the static_cast can (or should) be used:
Converting an int to a double.
This happens, for example when the quotient of two int values must be computed without losing the fraction part of the division. The sqrt function called in the following fragment returns 2:
int x = 19;
int y = 4;
sqrt(x / y);
whereas it returns 2.179 when a static_cast is used, as in:
sqrt(static_cast<double>(x) / y);
The important point to notice here is that a static_cast is allowed to change the representation of its expression into the representation that's used by the destination type.
Also note that the division is put outside of the cast expression. If the division is performed within the cast's expression (as in static_cast<double>(x / y)) an integer division has already been performed before the cast has had a chance to convert the type of an operand to double.
Converting enum values to int values (in any direction).
Here the two types use identical representations, but different semantics. Assigning an ordinary enum value to an int doesn't require a cast, but when the enum is a strongly typed enum a cast is required. Conversely, a static_cast is required when assigning an int value to a variable of some enum type. Here is an example:
enum class Enum
{
    VALUE
};
cout << static_cast<int>(Enum::VALUE); // show the numeric value
The static_cast is used in the context of class inheritance (cf. chapter 13) to convert a pointer to a so-called `derived class' to a pointer to its `base class'. It cannot be used for casting unrelated types to each other (e.g., a static_cast cannot be used to cast a pointer to a short to a pointer to an int).
A void * is a generic pointer. It is frequently used by functions in the C library (e.g., memcpy(3)). Since it is the generic pointer it is related to any other pointer, and a static_cast should be used to convert a void * to an intended destination pointer. This is a somewhat awkward left-over from C, which should probably only be used in that context. Here is an example:
The qsort function from the C library expects a pointer to a (comparison) function having two void const * parameters. In fact, these parameters point to data elements of the array to be sorted, and so the comparison function must cast the void const * parameters to pointers to the elements of the array to be sorted. So, if the array is an int array[] and the compare function's parameters are void const *p1 and void const *p2 then the compare function obtains the address of the int pointed to by p1 by using:
static_cast<int const *>(p1);
Changing the signedness of an int-typed variable (remember that a static_cast is allowed to change the expression's representation!).
Here is an example: the C function tolower requires an int representing the value of an unsigned char. But on many platforms char is a signed type. To call tolower using an available char ch we should use:
tolower(static_cast<unsigned char>(ch))
The const keyword has been given a special place in casting. Normally anything const is const for a good reason. Nonetheless situations may be encountered where the const can be ignored. For these special situations the const_cast should be used. Its syntax is:
const_cast<type>(expression)
A const_cast<type>(expression) expression is used to undo the const attribute of a (pointer) type.
The need for a const_cast may occur in combination with functions from the standard C library which traditionally weren't always as const-aware as they should be. A function strfun(char *s) might be available, performing some operation on its char *s parameter without actually modifying the characters pointed to by s. Passing char const hello[] = "hello"; to strfun produces a diagnostic like
passing `const char *' as argument 1 of `strfun(char *)' discards const
A const_cast is the appropriate way to prevent this diagnostic:
strfun(const_cast<char *>(hello));
The reinterpret_cast is used to change the interpretation of information. It is somewhat reminiscent of the static_cast, but reinterpret_cast should only be used when it is known that the information as defined in fact is or can be interpreted as something completely different. Its syntax is:
reinterpret_cast<pointer type>(pointer expression)
Think of the reinterpret_cast as a cast offering a poor-man's union: the same memory location may be interpreted in completely different ways.
The reinterpret_cast is used, for example, in combination with the write function that is available for streams. In C++ streams are the preferred interface to, e.g., disk-files. The standard streams like std::cin and std::cout also are stream objects.
Streams intended for writing (`output streams' like cout) offer write members having the prototype
write(char const *buffer, int length)
To write the value stored within a double variable to a stream in its un-interpreted binary form the stream's write member is used. However, as a double * and a char * point to variables using different and unrelated representations, a static_cast cannot be used. In this case a reinterpret_cast is required. To write the raw bytes of a variable double value to cout we use:
cout.write(reinterpret_cast<char const *>(&value), sizeof(double));
All casts are potentially dangerous, but the reinterpret_cast is the most dangerous of them all. Effectively we tell the compiler: back off, we know what we're doing, so stop fussing. All bets are off, and we'd better know what we're doing in situations like these. As a case in point consider the following code:
int value = 0x12345678; // assume a 32-bits int
cout << "Value's first byte has value: " << hex <<
    static_cast<int>(
        *reinterpret_cast<unsigned char *>(&value)
    );
The above code produces different results on little and big endian computers. Little endian computers show the value 78, big endian computers the value 12. Also note that the different representations used by little and big endian computers render the previous example (cout.write(...)) non-portable over computers of different architectures.
As a rule of thumb: if circumstances arise in which casts have to be used, clearly document the reasons for their use in your code, making double sure that the cast does not eventually cause a program to misbehave. Also: avoid reinterpret_casts unless you have to use them.
dynamic_cast<type>(expression)
Different from the static_cast, whose actions are completely determined at compile-time, the dynamic_cast's actions are determined at run-time to convert a pointer to an object of some class (e.g., Base) to a pointer to an object of another class (e.g., Derived) which is found further down its so-called class hierarchy (this is also called downcasting).
At this point in the Annotations a dynamic_cast cannot yet be discussed extensively, but we return to this topic in section 14.6.1.
In the context of the class shared_ptr, which is covered in section 18.4, several more new-style casts are available. Actual coverage of these specialized casts is postponed until section 18.4.5.
These specialized casts are:
static_pointer_cast, returning a shared_ptr to the base-class section of a derived class object;
const_pointer_cast, returning a shared_ptr to a non-const object from a shared_ptr to a constant object;
dynamic_pointer_cast, returning a shared_ptr to a derived class object from a shared_ptr to a base class object.

alignas          char16_t    double        long       reinterpret_cast  true
alignof          char32_t    dynamic_cast  module     requires          try
and              class       else          mutable    return            typedef
and_eq           co_await    enum          namespace  short             typeid
asm              co_return   explicit      new        signed            typename
atomic_cancel    co_yield    export        noexcept   sizeof            union
atomic_commit    compl       extern        not        static            unsigned
atomic_noexcept  concept     false         not_eq     static_assert     using
auto             const       float         nullptr    static_cast       virtual
bitand           const_cast  for           operator   struct            void
bitor            constexpr   friend        or         switch            volatile
bool             continue    goto          or_eq      synchronized      wchar_t
break            decltype    if            private    template          while
case             default     import        protected  this              xor
catch            delete      inline        public     thread_local      xor_eq
char             do          int           register   throw
Notes:
The keyword register is no longer used, but it remains a reserved identifier. In other words, definitions like
register int index;
result in compilation errors. Also, register is no longer considered a storage class specifier (storage class specifiers are extern, thread_local, mutable and static).
The operator keywords and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor and xor_eq are symbolic alternatives for, respectively, &&, &=, &, |, ~, !, !=, ||, |=, ^ and ^=.
C++ also recognizes the special identifiers final, override, transaction_safe, and transaction_safe_override. These identifiers are special in the sense that they acquire special meanings when declaring classes or polymorphic functions. Section 14.4 provides further details.
Keywords can only be used for their intended purpose and cannot be used as names for other entities (e.g., variables, functions, class-names, etc.). In addition to keywords identifiers starting with an underscore and living in the global namespace (i.e., not using any explicit namespace or using the mere :: namespace specification) or living in the std namespace are reserved identifiers in the sense that their use is a prerogative of the implementor.