Clang Language Extensions¶
Introduction¶
This document describes the language extensions provided by Clang. In addition to the language extensions listed here, Clang aims to support a broad range of GCC extensions. Please see the GCC manual for more information on these extensions.
Feature Checking Macros¶
Language extensions can be very useful, but only if you know you can depend on them. To allow fine-grained feature checks, we support three builtin function-like macros. These let you test for a feature directly in your code without having to resort to something like autoconf or fragile “compiler version checks”.
__has_builtin¶
This function-like macro takes a single identifier argument that is the name of a builtin function, a builtin pseudo-function (taking one or more type arguments), or a builtin template. It evaluates to 1 if the builtin is supported or 0 if not. It can be used like this:
#ifndef __has_builtin         // Optional of course.
  #define __has_builtin(x) 0  // Compatibility with non-clang compilers.
#endif

...
#if __has_builtin(__builtin_trap)
  __builtin_trap();
#else
  abort();
#endif
...
Note
Prior to Clang 10, __has_builtin could not be used to detect most builtin pseudo-functions.
__has_builtin should not be used to detect support for a builtin macro; use #ifdef instead.
__has_constexpr_builtin¶
This function-like macro takes a single identifier argument that is the name of a builtin function, a builtin pseudo-function (taking one or more type arguments), or a builtin template. It evaluates to 1 if the builtin is supported and can be constant evaluated or 0 if not. It can be used for writing conditionally constexpr code like this:
#ifndef __has_constexpr_builtin         // Optional of course.
  #define __has_constexpr_builtin(x) 0  // Compatibility with non-clang compilers.
#endif

...
#if __has_constexpr_builtin(__builtin_fmax)
  constexpr
#endif
  double money_fee(double amount) {
    return __builtin_fmax(amount * 0.03, 10.0);
  }
...
For example, __has_constexpr_builtin is used in libcxx’s implementation of the <cmath> header file to conditionally make a function constexpr whenever the constant evaluation of the corresponding builtin (for example, std::fmax calls __builtin_fmax) is supported in Clang.
__has_feature and __has_extension¶
These function-like macros take a single identifier argument that is the name of a feature. __has_feature evaluates to 1 if the feature is both supported by Clang and standardized in the current language standard or 0 if not (but see below), while __has_extension evaluates to 1 if the feature is supported by Clang in the current language (either as a language extension or a standard language feature) or 0 if not. They can be used like this:
#ifndef __has_feature         // Optional of course.
  #define __has_feature(x) 0  // Compatibility with non-clang compilers.
#endif
#ifndef __has_extension
  #define __has_extension __has_feature // Compatibility with pre-3.0 compilers.
#endif

...
#if __has_feature(cxx_rvalue_references)
// This code will only be compiled with the -std=c++11 and -std=gnu++11
// options, because rvalue references are only standardized in C++11.
#endif

#if __has_extension(cxx_rvalue_references)
// This code will be compiled with the -std=c++11, -std=gnu++11, -std=c++98
// and -std=gnu++98 options, because rvalue references are supported as a
// language extension in C++98.
#endif
For backward compatibility, __has_feature can also be used to test for support for non-standardized features, i.e. features not prefixed c_, cxx_ or objc_.
Another use of __has_feature is to check for compiler features not related to the language standard, such as AddressSanitizer.
If the -pedantic-errors option is given, __has_extension is equivalent to __has_feature.
The feature tag is described along with the language feature below.
The feature name or extension name can also be specified with a preceding and following __ (double underscore) to avoid interference from a macro with the same name. For instance, __cxx_rvalue_references__ can be used instead of cxx_rvalue_references.
__has_cpp_attribute¶
This function-like macro is available in C++20 by default, and is provided as an extension in earlier language standards. It takes a single argument that is the name of a double-square-bracket-style attribute. The argument can either be a single identifier or a scoped identifier. If the attribute is supported, a nonzero value is returned. If the attribute is a standards-based attribute, this macro returns a nonzero value based on the year and month in which the attribute was voted into the working draft. See WG21 SD-6 for the list of values returned for standards-based attributes. If the attribute is not supported by the current compilation target, this macro evaluates to 0. It can be used like this:
#ifndef __has_cpp_attribute         // For backwards compatibility
  #define __has_cpp_attribute(x) 0
#endif

...
#if __has_cpp_attribute(clang::fallthrough)
#define FALLTHROUGH [[clang::fallthrough]]
#else
#define FALLTHROUGH
#endif
...
The attribute scope tokens clang and _Clang are interchangeable, as are the attribute scope tokens gnu and __gnu__. Attribute tokens in either of these namespaces can be specified with a preceding and following __ (double underscore) to avoid interference from a macro with the same name. For instance, gnu::__const__ can be used instead of gnu::const.
__has_c_attribute¶
This function-like macro takes a single argument that is the name of an attribute exposed with the double square-bracket syntax in C mode. The argument can either be a single identifier or a scoped identifier. If the attribute is supported, a nonzero value is returned. If the attribute is not supported by the current compilation target, this macro evaluates to 0. It can be used like this:
#ifndef __has_c_attribute         // Optional of course.
  #define __has_c_attribute(x) 0  // Compatibility with non-clang compilers.
#endif

...
#if __has_c_attribute(fallthrough)
  #define FALLTHROUGH [[fallthrough]]
#else
  #define FALLTHROUGH
#endif
...
The attribute scope tokens clang and _Clang are interchangeable, as are the attribute scope tokens gnu and __gnu__. Attribute tokens in either of these namespaces can be specified with a preceding and following __ (double underscore) to avoid interference from a macro with the same name. For instance, gnu::__const__ can be used instead of gnu::const.
__has_attribute¶
This function-like macro takes a single identifier argument that is the name of a GNU-style attribute. It evaluates to 1 if the attribute is supported by the current compilation target, or 0 if not. It can be used like this:
#ifndef __has_attribute         // Optional of course.
  #define __has_attribute(x) 0  // Compatibility with non-clang compilers.
#endif

...
#if __has_attribute(always_inline)
#define ALWAYS_INLINE __attribute__((always_inline))
#else
#define ALWAYS_INLINE
#endif
...
The attribute name can also be specified with a preceding and following __ (double underscore) to avoid interference from a macro with the same name. For instance, __always_inline__ can be used instead of always_inline.
__has_declspec_attribute¶
This function-like macro takes a single identifier argument that is the name of an attribute implemented as a Microsoft-style __declspec attribute. It evaluates to 1 if the attribute is supported by the current compilation target, or 0 if not. It can be used like this:
#ifndef __has_declspec_attribute         // Optional of course.
  #define __has_declspec_attribute(x) 0  // Compatibility with non-clang compilers.
#endif

...
#if __has_declspec_attribute(dllexport)
#define DLLEXPORT __declspec(dllexport)
#else
#define DLLEXPORT
#endif
...
The attribute name can also be specified with a preceding and following __ (double underscore) to avoid interference from a macro with the same name. For instance, __dllexport__ can be used instead of dllexport.
__is_identifier¶
This function-like macro takes a single identifier argument that might be either a reserved word or a regular identifier. It evaluates to 1 if the argument is just a regular identifier and not a reserved word, in the sense that it can then be used as the name of a user-defined function or variable. Otherwise it evaluates to 0. It can be used like this:
...
#ifdef __is_identifier          // Compatibility with non-clang compilers.
  #if __is_identifier(__wchar_t)
    typedef wchar_t __wchar_t;
  #endif
#endif

__wchar_t WideCharacter;
...
Include File Checking Macros¶
Not all development systems have the same include files. The __has_include and __has_include_next macros allow you to check for the existence of an include file before doing a possibly failing #include directive. Include file checking macros must be used as expressions in #if or #elif preprocessing directives.
__has_include¶
This function-like macro takes a single file name string argument that is the name of an include file. It evaluates to 1 if the file can be found using the include paths, or 0 otherwise:
// Note the two possible file name string formats.
#if __has_include("myinclude.h") && __has_include(<stdint.h>)
# include "myinclude.h"
#endif
To test for this feature, use #if defined(__has_include):
// To avoid problem with non-clang compilers not having this macro.
#if defined(__has_include)
#if __has_include("myinclude.h")
# include "myinclude.h"
#endif
#endif
__has_include_next¶
This function-like macro takes a single file name string argument that is the name of an include file. It is like __has_include except that it looks for the second instance of the given file found in the include paths. It evaluates to 1 if the second instance of the file can be found using the include paths, or 0 otherwise:
// Note the two possible file name string formats.
#if __has_include_next("myinclude.h") && __has_include_next(<stdint.h>)
# include_next "myinclude.h"
#endif

// To avoid problem with non-clang compilers not having this macro.
#if defined(__has_include_next)
#if __has_include_next("myinclude.h")
# include_next "myinclude.h"
#endif
#endif
Note that __has_include_next, like the GNU extension #include_next directive, is intended for use in headers only, and will issue a warning if used in the top-level compilation file. A warning will also be issued if an absolute path is used in the file argument.
__has_warning¶
This function-like macro takes a string literal that represents a command line option for a warning and returns true if that is a valid warning option.
#if __has_warning("-Wformat")
...
#endif
Builtin Macros¶
__BASE_FILE__
Defined to a string that contains the name of the main input file passed to Clang.
__FILE_NAME__
Clang-specific extension that functions similar to __FILE__ but only renders the last path component (the filename) instead of an invocation dependent full path to that file.
__COUNTER__
Defined to an integer value that starts at zero and is incremented each time the __COUNTER__ macro is expanded.
__INCLUDE_LEVEL__
Defined to an integral value that is the include depth of the file currently being translated. For the main file, this value is zero.
__TIMESTAMP__
Defined to the date and time of the last modification of the current source file.
__clang__
Defined when compiling with Clang.
__clang_major__
Defined to the major marketing version number of Clang (e.g., the 2 in 2.0.1). Note that marketing version numbers should not be used to check for language features, as different vendors use different numbering schemes. Instead, use the Feature Checking Macros.
__clang_minor__
Defined to the minor version number of Clang (e.g., the 0 in 2.0.1). Note that marketing version numbers should not be used to check for language features, as different vendors use different numbering schemes. Instead, use the Feature Checking Macros.
__clang_patchlevel__
Defined to the marketing patch level of Clang (e.g., the 1 in 2.0.1).
__clang_version__
Defined to a string that captures the Clang marketing version, including the Subversion tag or revision number, e.g., “1.5 (trunk 102332)”.
__clang_literal_encoding__
Defined to a narrow string literal that represents the current encoding of narrow string literals, e.g., "hello". This macro typically expands to “UTF-8” (but may change in the future if the -fexec-charset="Encoding-Name" option is implemented).
__clang_wide_literal_encoding__
Defined to a narrow string literal that represents the current encoding of wide string literals, e.g., L"hello". This macro typically expands to “UTF-16” or “UTF-32” (but may change in the future if the -fwide-exec-charset="Encoding-Name" option is implemented).
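As a rough illustration of two of these macros, the sketch below checks that consecutive expansions of __COUNTER__ differ by one, and uses __FILE_NAME__ when the compiler defines it. The helper names counter_step and this_file_name are hypothetical, and the fallback path assumes '/' is the path separator.

```c
#include <assert.h>
#include <string.h>

/* Two consecutive expansions of __COUNTER__ differ by exactly one. */
int counter_step(void) {
    int first = __COUNTER__;
    int second = __COUNTER__;
    return second - first;
}

/* Last path component of the current source file. */
const char *this_file_name(void) {
#ifdef __FILE_NAME__            /* builtin macros are detected with #ifdef */
    const char *name = __FILE_NAME__;
#else
    /* Fallback for compilers without __FILE_NAME__ (assumes '/' separators). */
    const char *slash = strrchr(__FILE__, '/');
    const char *name = slash ? slash + 1 : __FILE__;
#endif
    assert(name[0] != '\0');
    return name;
}
```

Because __COUNTER__ is shared by the whole translation unit, only the difference between two expansions is stable; the absolute values depend on earlier expansions.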
Implementation-defined keywords¶
__datasizeof¶
__datasizeof behaves like sizeof, except that it returns the size of the type ignoring tail padding.
_BitInt, _ExtInt¶
Clang supports the C23 _BitInt(N) feature as an extension in older C modes and in C++. This type was previously implemented in Clang with the same semantics, but spelled _ExtInt(N). This spelling has been deprecated in favor of the standard type.
Note: the ABI for _BitInt(N) is still in the process of being stabilized, so this type should not yet be used in interfaces that require ABI stability.
C keywords supported in all language modes¶
Clang supports _Alignas, _Alignof, _Atomic, _Complex, _Generic, _Imaginary, _Noreturn, _Static_assert, _Thread_local, and _Float16 in all language modes with the C semantics.
__alignof, __alignof__¶
Unlike _Alignof and alignof, __alignof and __alignof__ return the preferred alignment of a type. This may be larger than the required alignment for improved performance.
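A minimal sketch of the difference, assuming a C11 compiler that accepts both spellings. The function name preferred_covers_required is hypothetical, and the actual values are target dependent, so only the relationship between them is checked.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero when the preferred alignment of double is at least
   as large as its required alignment. */
int preferred_covers_required(void) {
    size_t required = _Alignof(double);      /* required (ABI) alignment */
    size_t preferred = __alignof__(double);  /* preferred alignment */
    assert(required > 0);
    return preferred >= required;
}
```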
__extension__¶
__extension__ suppresses extension diagnostics in the statement it is prepended to.
__auto_type¶
__auto_type behaves the same as auto in C++11 but is available in all language modes.
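For instance, a C function might let the compiler deduce local variable types from their initializers. This is only a sketch; the function name sum_of_halves is hypothetical.

```c
#include <assert.h>

/* Sums values[i] / 2 over the array, with deduced local types. */
double sum_of_halves(const double *values, int count) {
    assert(count >= 0);
    __auto_type total = 0.0;                 /* deduced as double */
    for (__auto_type i = 0; i < count; ++i)  /* i is deduced as int */
        total += values[i] / 2;
    return total;
}
```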
__imag, __imag__¶
__imag and __imag__ can be used to get the imaginary part of a complex value.
__real, __real__¶
__real and __real__ can be used to get the real part of a complex value.
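The following sketch shows both operators on a double _Complex value, including their use on the left-hand side of an assignment. The helper names conjugate, squared_magnitude and check_conjugate are hypothetical.

```c
#include <assert.h>

/* Returns z with the sign of its imaginary part flipped. */
double _Complex conjugate(double _Complex z) {
    double _Complex result = z;
    __imag__ result = -__imag__ z;   /* write only the imaginary part */
    return result;
}

/* Returns re*re + im*im for the given complex value. */
double squared_magnitude(double _Complex z) {
    double re = __real__ z;
    double im = __imag__ z;
    return re * re + im * im;
}

/* Small self-check exercising the two helpers on 3 + 4i. */
void check_conjugate(void) {
    double _Complex z;
    __real__ z = 3.0;
    __imag__ z = 4.0;
    assert(__imag__ conjugate(z) == -4.0);
    assert(squared_magnitude(z) == 25.0);
}
```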
__asm, __asm__¶
__asm and __asm__ are alternate spellings for asm, but available in all language modes.
__complex, __complex__¶
__complex and __complex__ are alternate spellings for _Complex.
__const, __const__, __volatile, __volatile__, __restrict, __restrict__¶
These are alternate spellings for their non-underscore counterparts, but are available in all language modes.
__decltype¶
__decltype is an alternate spelling for decltype, but is also available in C++ modes before C++11.
__inline, __inline__¶
__inline and __inline__ are alternate spellings for inline, but are available in all language modes.
__nullptr¶
__nullptr is an alternate spelling for nullptr. It is available in all C and C++ language modes.
__signed, __signed__¶
__signed and __signed__ are alternate spellings for signed. __unsigned and __unsigned__ are not supported.
__typeof, __typeof__, __typeof_unqual, __typeof_unqual__¶
__typeof and __typeof__ are alternate spellings for typeof, but are available in all language modes. These spellings result in the type of the operand, retaining all qualifiers.
__typeof_unqual and __typeof_unqual__ are alternate spellings for the C23 typeof_unqual type specifier, but are available in all language modes. These spellings result in the type of the operand, stripping all qualifiers.
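As an illustration, __typeof__ lets a macro declare a temporary with the same type as its argument. The SWAP macro and the sort_pair function below are hypothetical names used only for this sketch.

```c
#include <assert.h>

/* Swaps two lvalues of the same type; tmp_ takes the type of the first. */
#define SWAP(a, b)                    \
    do {                              \
        __typeof__(a) tmp_ = (a);     \
        (a) = (b);                    \
        (b) = tmp_;                   \
    } while (0)

/* Orders two doubles so that *lo <= *hi afterwards. */
void sort_pair(double *lo, double *hi) {
    assert(lo && hi);
    if (*lo > *hi)
        SWAP(*lo, *hi);
}
```

Because __typeof__ retains qualifiers, a const-qualified first argument would give a const temporary; that is harmless here since the temporary is only initialized, never reassigned.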
__char16_t, __char32_t¶
__char16_t and __char32_t are alternate spellings for char16_t and char32_t respectively, but are also available in C++ modes before C++11. They are only supported in C++. __char8_t is not available.
Vectors and Extended Vectors¶
Supports the GCC, OpenCL, AltiVec, NEON and SVE vector extensions.
OpenCL vector types are created using the ext_vector_type attribute. It supports the V.xyzw syntax and other tidbits as seen in OpenCL. An example is:
typedef float float4 __attribute__((ext_vector_type(4)));
typedef float float2 __attribute__((ext_vector_type(2)));

float4 foo(float2 a, float2 b) {
  float4 c;
  c.xz = a;
  c.yw = b;
  return c;
}
Query for this feature with __has_attribute(ext_vector_type).
Passing the -maltivec option to clang enables support for AltiVec vector syntax and functions. For example:
vector float foo(vector int a) {
  vector int b;
  b = vec_add(a, a) + a;
  return (vector float)b;
}
NEON vector types are created using the neon_vector_type and neon_polyvector_type attributes. For example:
typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
typedef __attribute__((neon_polyvector_type(16))) poly8_t poly8x16_t;

int8x8_t foo(int8x8_t a) {
  int8x8_t v;
  v = a;
  return v;
}
GCC vector types are created using the vector_size(N) attribute. The argument N specifies the number of bytes that will be allocated for an object of this type. The size has to be a multiple of the size of the vector element type. For example:
// OK: This declares a vector type with four 'int' elements
typedef int int4 __attribute__((vector_size(4 * sizeof(int))));

// ERROR: '11' is not a multiple of sizeof(int)
typedef int int_impossible __attribute__((vector_size(11)));

int4 foo(int4 a) {
  int4 v;
  v = a;
  return v;
}
Boolean Vectors¶
Clang also supports the ext_vector_type attribute with boolean element types in C and C++. For example:
// legal for Clang, error for GCC:
typedef bool bool4 __attribute__((ext_vector_type(4)));
// Objects of bool4 type hold 8 bits, sizeof(bool4) == 1

bool4 foo(bool4 a) {
  bool4 v;
  v = a;
  return v;
}
Boolean vectors are a Clang extension of the ext vector type. Boolean vectors are intended, though not guaranteed, to map to vector mask registers. The size parameter of a boolean vector type is the number of bits in the vector. The boolean vector is dense and each bit in the boolean vector is one vector element.
The semantics of boolean vectors borrows from C bit-fields with the following differences:
Distinct boolean vectors are always distinct memory objects (there is no packing).
Only the operators ?:, !, ~, |, &, ^ and comparison are allowed on boolean vectors.
Casting a scalar bool value to a boolean vector type means broadcasting the scalar value onto all lanes (same as general ext_vector_type).
It is not possible to access or swizzle elements of a boolean vector (different than general ext_vector_type).
The size and alignment are both the number of bits rounded up to the next power of two, but the alignment is at most the maximum vector alignment of the target.
Vector Literals¶
Vector literals can be used to create vectors from a set of scalars, or vectors. Either the parentheses form or the braces form can be used. In the parentheses form the number of literal values specified must be one, i.e. referring to a scalar value, or must match the size of the vector type being created. If a single scalar literal value is specified, the scalar literal value will be replicated to all the components of the vector type. In the braces form any number of literals can be specified. For example:
typedef int v4si __attribute__((__vector_size__(16)));
typedef float float4 __attribute__((ext_vector_type(4)));
typedef float float2 __attribute__((ext_vector_type(2)));

v4si vsi = (v4si){1, 2, 3, 4};
float4 vf = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
vector int vi1 = (vector int)(1);    // vi1 will be (1, 1, 1, 1).
vector int vi2 = (vector int){1};    // vi2 will be (1, 0, 0, 0).
vector int vi3 = (vector int)(1, 2); // error
vector int vi4 = (vector int){1, 2}; // vi4 will be (1, 2, 0, 0).
vector int vi5 = (vector int)(1, 2, 3, 4);
float4 vf2 = (float4)((float2)(1.0f, 2.0f), (float2)(3.0f, 4.0f));
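The braces form can also be tried with a plain GCC-style vector type, which needs no AltiVec or OpenCL support. In this sketch the names first_two_set and lane are hypothetical; the unspecified trailing elements are expected to be zero, matching the vi4 case above.

```c
#include <assert.h>

typedef int v4si __attribute__((__vector_size__(4 * sizeof(int))));

/* Braces form with fewer literals than lanes: the rest are zero. */
v4si first_two_set(void) {
    return (v4si){1, 2};
}

/* Reads one lane with a bounds check. */
int lane(v4si v, int index) {
    assert(index >= 0 && index < 4);
    return v[index];
}
```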
Vector Operations¶
The table below shows the support for each operation by vector extension. A dash indicates that an operation is not accepted according to a corresponding specification.
Operator | OpenCL | AltiVec | GCC | NEON | SVE |
---|---|---|---|---|---|
[] | yes | yes | yes | yes | yes |
unary operators +, - | yes | yes | yes | yes | yes |
++, -- | yes | yes | yes | no | no |
+, -, *, /, % | yes | yes | yes | yes | yes |
bitwise operators &, \|, ^, ~ | yes | yes | yes | yes | yes |
>>, << | yes | yes | yes | yes | yes |
!, &&, \|\| | yes | – | yes | yes | yes |
==, !=, >, <, >=, <= | yes | yes | yes | yes | yes |
= | yes | yes | yes | yes | yes |
?:[1] | yes | – | yes | yes | yes |
sizeof | yes | yes | yes | yes | yes[2] |
C-style cast | yes | yes | yes | no | no |
reinterpret_cast | yes | no | yes | no | no |
static_cast | yes | no | yes | no | no |
const_cast | no | no | no | no | no |
address &v[i] | no | no | no[3] | no | no |
See also __builtin_shufflevector, __builtin_convertvector.
[1] The ternary operator (?:) behaves differently depending on the condition operand’s vector type. If the condition is a GNU vector (i.e. __vector_size__), a NEON vector or an SVE vector, it’s only available in C++ and uses normal bool conversions (that is, != 0). If it’s an extension (OpenCL) vector, it’s only available in C and OpenCL C, and it selects based on the signedness of the condition operands (OpenCL v1.1 s6.3.9).
[2] sizeof can only be used on vector length specific SVE types.
[3] Clang does not allow the address of an element to be taken while GCC allows this. This is intentional for vectors with a boolean element type and not implemented otherwise.
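A short sketch of a few of these operators on a GCC-style vector in C follows. It relies on an assumption not stated in the table: for GCC vectors, a comparison yields a vector of signed integers whose lanes are -1 where the comparison holds and 0 elsewhere. The function name count_greater is hypothetical.

```c
#include <assert.h>

typedef int v4si __attribute__((vector_size(4 * sizeof(int))));

/* Counts the lanes in which a is greater than b. */
int count_greater(v4si a, v4si b) {
    v4si mask = a > b;          /* lane-wise comparison */
    int count = 0;
    for (int i = 0; i < 4; ++i) {
        /* Assumed lane encoding: -1 for true, 0 for false. */
        assert(mask[i] == 0 || mask[i] == -1);
        count -= mask[i];       /* subscript access to one lane */
    }
    return count;
}
```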
Vector Builtins¶
Note: The implementation of vector builtins is work-in-progress and incomplete.
In addition to the operators mentioned above, Clang provides a set of builtins to perform additional operations on certain scalar and vector types.
Let T be one of the following types:
an integer type (as in C23 6.2.5p22), but excluding enumerated types and bool
the standard floating types float or double
a half-precision floating point type, if one is supported on the target
a vector type.
For scalar types, consider the operation applied to a vector with a single element.
Vector Size
To determine the number of elements in a vector, use __builtin_vectorelements(). For fixed-sized vectors, e.g., defined via __attribute__((vector_size(N))) or ARM NEON’s vector types (e.g., uint16x8_t), this returns the constant number of elements at compile-time. For scalable vectors, e.g., SVE or RISC-V V, the number of elements is not known at compile-time and is determined at runtime. This builtin can be used, e.g., to increment the loop-counter in vector-type agnostic loops.
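The sketch below uses __builtin_vectorelements as a loop bound for a fixed-size vector, guarded by __has_builtin so that it still compiles where the builtin is missing. The sizeof-based fallback and the names v4f and sum_lanes are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#ifndef __has_builtin
#define __has_builtin(x) 0      /* Compatibility with other compilers. */
#endif

typedef float v4f __attribute__((vector_size(4 * sizeof(float))));

/* Adds up all lanes of v. */
float sum_lanes(v4f v) {
#if __has_builtin(__builtin_vectorelements)
    const size_t lanes = __builtin_vectorelements(v4f);
#else
    const size_t lanes = sizeof(v) / sizeof(v[0]);   /* fallback */
#endif
    assert(lanes == sizeof(v) / sizeof(v[0]));
    float total = 0.0f;
    for (size_t i = 0; i < lanes; ++i)
        total += v[i];
    return total;
}
```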
Elementwise Builtins
Each builtin returns a vector equivalent to applying the specified operation elementwise to the input.
Unless specified otherwise operation(±0) = ±0 and operation(±infinity) = ±infinity
The integer elementwise intrinsics, including __builtin_elementwise_popcount, __builtin_elementwise_bitreverse, __builtin_elementwise_add_sat, __builtin_elementwise_sub_sat can be called in a constexpr context.
No implicit promotion of integer types takes place. The mixing of integer types of different sizes and signs is forbidden in binary and ternary builtins.
Name | Operation | Supported element types |
---|---|---|
T __builtin_elementwise_abs(T x) | return the absolute value of a number x; the absolute value of the most negative integer remains the most negative integer | signed integer and floating point types |
T __builtin_elementwise_fma(T x, T y, T z) | fused multiply add, (x * y) + z. | floating point types |
T __builtin_elementwise_ceil(T x) | return the smallest integral value greater than or equal to x | floating point types |
T __builtin_elementwise_sin(T x) | return the sine of x interpreted as an angle in radians | floating point types |
T __builtin_elementwise_cos(T x) | return the cosine of x interpreted as an angle in radians | floating point types |
T __builtin_elementwise_tan(T x) | return the tangent of x interpreted as an angle in radians | floating point types |
T __builtin_elementwise_asin(T x) | return the arcsine of x interpreted as an angle in radians | floating point types |
T __builtin_elementwise_acos(T x) | return the arccosine of x interpreted as an angle in radians | floating point types |
T __builtin_elementwise_atan(T x) | return the arctangent of x interpreted as an angle in radians | floating point types |
T __builtin_elementwise_atan2(T y, T x) | return the arctangent of y/x | floating point types |
T __builtin_elementwise_sinh(T x) | return the hyperbolic sine of angle x in radians | floating point types |
T __builtin_elementwise_cosh(T x) | return the hyperbolic cosine of angle x in radians | floating point types |
T __builtin_elementwise_tanh(T x) | return the hyperbolic tangent of angle x in radians | floating point types |
T __builtin_elementwise_floor(T x) | return the largest integral value less than or equal to x | floating point types |
T __builtin_elementwise_log(T x) | return the natural logarithm of x | floating point types |
T __builtin_elementwise_log2(T x) | return the base 2 logarithm of x | floating point types |
T __builtin_elementwise_log10(T x) | return the base 10 logarithm of x | floating point types |
T __builtin_elementwise_popcount(T x) | return the number of 1 bits in x | integer types |
T __builtin_elementwise_pow(T x, T y) | return x raised to the power of y | floating point types |
T __builtin_elementwise_bitreverse(T x) | return the integer represented after reversing the bits of x | integer types |
T __builtin_elementwise_exp(T x) | returns the base-e exponential, e^x, of the specified value | floating point types |
T __builtin_elementwise_exp2(T x) | returns the base-2 exponential, 2^x, of the specified value | floating point types |
T __builtin_elementwise_exp10(T x) | returns the base-10 exponential, 10^x, of the specified value | floating point types |
T __builtin_elementwise_sqrt(T x) | return the square root of a floating-point number | floating point types |
T __builtin_elementwise_roundeven(T x) | round x to the nearest integer value in floating point format, rounding halfway cases to even (that is, to the nearest value that is an even integer), regardless of the current rounding direction. | floating point types |
T __builtin_elementwise_round(T x) | round x to the nearest integer value in floating point format, rounding halfway cases away from zero, regardless of the current rounding direction. May raise floating-point exceptions. | floating point types |
T __builtin_elementwise_trunc(T x) | return the integral value nearest to but no larger in magnitude than x | floating point types |
T __builtin_elementwise_nearbyint(T x) | round x to the nearest integer value in floating point format, rounding according to the current rounding direction. May not raise the inexact floating-point exception. This is treated the same as | floating point types |
T __builtin_elementwise_rint(T x) | round x to the nearest integer value in floating point format, rounding according to the current rounding direction. May raise floating-point exceptions. This is treated the same as | floating point types |
T __builtin_elementwise_canonicalize(T x) | return the platform specific canonical encoding of a floating-point number | floating point types |
T __builtin_elementwise_copysign(T x, T y) | return the magnitude of x with the sign of y. | floating point types |
T __builtin_elementwise_fmod(T x, T y) | return the floating-point remainder of (x/y) whose sign matches the sign of x. | floating point types |
T __builtin_elementwise_max(T x, T y) | return x or y, whichever is larger. For floating point types, follows semantics of maxNum in IEEE 754-2008. See LangRef for the comparison. | integer and floating point types |
T __builtin_elementwise_min(T x, T y) | return x or y, whichever is smaller. For floating point types, follows semantics of minNum in IEEE 754-2008. See LangRef for the comparison. | integer and floating point types |
T __builtin_elementwise_maxnum(T x, T y) | return x or y, whichever is larger. Follows IEEE 754-2008 semantics (maxNum) with +0.0 > -0.0. See LangRef for the comparison. | floating point types |
T __builtin_elementwise_minnum(T x, T y) | return x or y, whichever is smaller. Follows IEEE 754-2008 semantics (minNum) with +0.0 > -0.0. See LangRef for the comparison. | floating point types |
T __builtin_elementwise_add_sat(T x, T y) | return the sum of x and y, clamped to the range of representable values for the signed/unsigned integer type. | integer types |
T __builtin_elementwise_sub_sat(T x, T y) | return the difference of x and y, clamped to the range of representable values for the signed/unsigned integer type. | integer types |
T __builtin_elementwise_maximum(T x, T y) | return x or y, whichever is larger. Follows IEEE 754-2019 semantics, see LangRef for the comparison. | floating point types |
T __builtin_elementwise_minimum(T x, T y) | return x or y, whichever is smaller. Follows IEEE 754-2019 semantics, see LangRef for the comparison. | floating point types |
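As an example of one row of this table, the sketch below applies __builtin_elementwise_add_sat to a vector of unsigned bytes, guarded by __has_builtin. The scalar loop in the #else branch is an assumed portable equivalent for compilers without the builtin, and the names u8x16, saturating_add and check_saturating_add are hypothetical.

```c
#include <assert.h>

#ifndef __has_builtin
#define __has_builtin(x) 0      /* Compatibility with other compilers. */
#endif

typedef unsigned char u8x16 __attribute__((vector_size(16)));

/* Lane-wise a + b, clamped to the range of unsigned char. */
u8x16 saturating_add(u8x16 a, u8x16 b) {
#if __has_builtin(__builtin_elementwise_add_sat)
    return __builtin_elementwise_add_sat(a, b);
#else
    u8x16 result;
    for (int i = 0; i < 16; ++i) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        result[i] = (unsigned char)(sum > 255u ? 255u : sum);
    }
    return result;
#endif
}

/* Small self-check: 250 + 10 saturates, 1 + 2 does not. */
void check_saturating_add(void) {
    u8x16 a = {250, 1};
    u8x16 b = {10, 2};
    u8x16 r = saturating_add(a, b);
    assert(r[0] == 255);
    assert(r[1] == 3);
}
```

Both operands have the same element type and signedness, in line with the rule above that mixed integer types are not accepted.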
Reduction Builtins
Each builtin returns a scalar equivalent to applying the specified operation(x, y) as recursive even-odd pairwise reduction to all vector elements. operation(x, y) is repeatedly applied to each non-overlapping even-odd element pair with indices i * 2 and i * 2 + 1 with i in [0, Number of elements / 2). If the number of elements is not a power of 2, the vector is widened with neutral elements for the reduction at the end to the next power of 2.
These reductions support both fixed-sized and scalable vector types.
The integer reduction intrinsics, including __builtin_reduce_max, __builtin_reduce_min, __builtin_reduce_add, __builtin_reduce_mul, __builtin_reduce_and, __builtin_reduce_or, and __builtin_reduce_xor, can be called in a constexpr context.
Example:
__builtin_reduce_add([e3, e2, e1, e0]) = __builtin_reduce_add([e3 + e2, e1 + e0]) = (e3 + e2) + (e1 + e0)
Let VT be a vector type and ET the element type of VT.
Name | Operation | Supported element types |
---|---|---|
ET __builtin_reduce_max(VT a) | return the largest element of the vector. The floating point result will always be a number unless all elements of the vector are NaN. | integer and floating point types |
ET __builtin_reduce_min(VT a) | return the smallest element of the vector. The floating point result will always be a number unless all elements of the vector are NaN. | integer and floating point types |
ET __builtin_reduce_add(VT a) | + | integer types |
ET __builtin_reduce_mul(VT a) | * | integer types |
ET __builtin_reduce_and(VT a) | & | integer types |
ET __builtin_reduce_or(VT a) | \| | integer types |
ET __builtin_reduce_xor(VT a) | ^ | integer types |
ET __builtin_reduce_maximum(VT a) | return the largest element of the vector. Follows IEEE 754-2019 semantics, see LangRef for the comparison. | floating point types |
ET __builtin_reduce_minimum(VT a) | return the smallest element of the vector. Follows IEEE 754-2019 semantics, see LangRef for the comparison. | floating point types |
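The even-odd pairwise scheme described above can be modeled in scalar code. The sketch below is not Clang's implementation; it is a plain C model of the description for addition, where the neutral element is 0. The name pairwise_reduce_add and the MAX_LANES limit are hypothetical.

```c
#include <assert.h>

#define MAX_LANES 64   /* arbitrary limit for this sketch */

/* Scalar model of an even-odd pairwise reduction with operation +. */
int pairwise_reduce_add(const int *elems, int n) {
    int buf[MAX_LANES];
    int width = 1;
    assert(n > 0 && n <= MAX_LANES);

    /* Widen to the next power of two, padding with the neutral element 0. */
    while (width < n)
        width *= 2;
    for (int i = 0; i < width; ++i)
        buf[i] = (i < n) ? elems[i] : 0;

    /* Combine elements 2*i and 2*i+1 until one value remains. */
    while (width > 1) {
        for (int i = 0; i < width / 2; ++i)
            buf[i] = buf[2 * i] + buf[2 * i + 1];
        width /= 2;
    }
    return buf[0];
}
```

For integer addition the pairing order does not change the result, so this model mainly documents the shape of the reduction tree.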
Matrix Types¶
Clang provides an extension for matrix types, which is currently being implemented. See the draft specification for more details.
For example, the code below uses the matrix types extension to multiply two 4x4 float matrices and add the result to a third 4x4 matrix.
typedef float m4x4_t __attribute__((matrix_type(4, 4)));

m4x4_t f(m4x4_t a, m4x4_t b, m4x4_t c) {
  return a + b * c;
}
The matrix type extension also supports operations on a matrix and a scalar.
typedef float m4x4_t __attribute__((matrix_type(4, 4)));

m4x4_t f(m4x4_t a) {
  return (a + 23) * 12;
}
The matrix type extension supports division of a matrix by a scalar but not of a matrix by a matrix.
typedef float m4x4_t __attribute__((matrix_type(4, 4)));

m4x4_t f(m4x4_t a) {
  a = a / 3.0;
  return a;
}
The matrix type extension supports compound assignments for addition, subtraction, and multiplication on matrices and on a matrix and a scalar, provided their types are consistent.
typedef float m4x4_t __attribute__((matrix_type(4, 4)));

m4x4_t f(m4x4_t a, m4x4_t b) {
  a += b;
  a -= b;
  a *= b;
  a += 23;
  a -= 12;
  return a;
}
The matrix type extension supports explicit casts. Implicit type conversion between matrix types is not allowed.
typedef int ix5x5 __attribute__((matrix_type(5, 5)));
typedef float fx5x5 __attribute__((matrix_type(5, 5)));

fx5x5 f1(ix5x5 i, fx5x5 f) {
  return (fx5x5) i;
}

template <typename X>
using matrix_5_5 = X __attribute__((matrix_type(5, 5)));

void f2() {
  matrix_5_5<double> d;
  matrix_5_5<int> i;
  i = (matrix_5_5<int>)d;
  i = static_cast<matrix_5_5<int>>(d);
}
Half-Precision Floating Point¶
Clang supports three half-precision (16-bit) floating point types: __fp16, _Float16 and __bf16. These types are supported in all language modes, but their support differs between targets. A target is said to have “native support” for a type if the target processor offers instructions for directly performing basic arithmetic on that type. In the absence of native support, a type can still be supported if the compiler can emulate arithmetic on the type by promoting to float; see below for more information on this emulation.
__fp16 is supported on all targets. The special semantics of this type mean that no arithmetic is ever performed directly on __fp16 values; see below.
_Float16 is supported on the following targets:
32-bit ARM (natively on some architecture versions)
64-bit ARM (AArch64) (natively on ARMv8.2a and above)
AMDGPU (natively)
NVPTX (natively)
SPIR (natively)
X86 (if SSE2 is available; natively if AVX512-FP16 is also available)
RISC-V (natively if Zfh or Zhinx is available)
SystemZ (emulated)
LoongArch (emulated)
__bf16 is supported on the following targets (currently never natively):
32-bit ARM
64-bit ARM (AArch64)
RISC-V
X86 (when SSE2 is available)
LoongArch
(For X86, SSE2 is available on 64-bit and all recent 32-bit processors.)
__fp16 and _Float16 both use the binary16 format from IEEE 754-2008, which provides a 5-bit exponent and an 11-bit significand (counting the implicit leading 1). __bf16 uses the bfloat16 format, which provides an 8-bit exponent and an 8-bit significand; this is the same exponent range as float, just with greatly reduced precision.
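The two layouts can be made concrete by decoding raw 16-bit patterns by hand. The following is a minimal sketch that uses no Clang extension at all. The helper names half_bits_to_float and bfloat16_bits_to_float are hypothetical, and the code assumes that float is the IEEE 754 binary32 format.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Decode a binary16 bit pattern: 1 sign bit, 5 exponent bits (bias 15) and
// 10 stored significand bits (11 counting the implicit leading 1).
inline float half_bits_to_float(std::uint16_t bits) {
  const int sign = (bits >> 15) & 0x1;
  const int exponent = (bits >> 10) & 0x1F;
  const int fraction = bits & 0x3FF;
  float magnitude;
  if (exponent == 0) {
    // Zero or subnormal: no implicit leading 1, value is fraction * 2^-24.
    magnitude = std::ldexp(static_cast<float>(fraction), -24);
  } else if (exponent == 0x1F) {
    // All-ones exponent encodes infinity (fraction == 0) or NaN.
    magnitude = fraction == 0 ? std::numeric_limits<float>::infinity()
                              : std::numeric_limits<float>::quiet_NaN();
  } else {
    // Normal: (1024 + fraction) / 1024 * 2^(exponent - 15).
    magnitude = std::ldexp(static_cast<float>(1024 + fraction), exponent - 25);
  }
  return sign ? -magnitude : magnitude;
}

// A bfloat16 bit pattern has the same sign and 8-bit exponent fields as
// binary32, so widening only appends 16 zero bits to the significand.
inline float bfloat16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float result;
  std::memcpy(&result, &widened, sizeof result);
  return result;
}
```

For instance, the pattern 0x7BFF decodes to 65504, the largest finite binary16 value, while the bfloat16 pattern 0x4049 decodes to 3.140625, which shows how few significand bits remain.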
_Float16 and __bf16 follow the usual rules for arithmetic floating-point types. Most importantly, this means that arithmetic operations on operands of these types are formally performed in the type and produce values of the type. __fp16 does not follow those rules: most operations immediately promote operands of type __fp16 to float, and so arithmetic operations are defined to be performed in float and so result in a value of type float (unless further promoted because of other operands). See below for more information on the exact specifications of these types.
When compiling arithmetic on _Float16 and __bf16 for a target without native support, Clang will perform the arithmetic in float, inserting extensions and truncations as necessary. This can be done in a way that exactly matches the operation-by-operation behavior of native support, but that can require many extra truncations and extensions. By default, when emulating _Float16 and __bf16 arithmetic using float, Clang does not truncate intermediate operands back to their true type unless the operand is the result of an explicit cast or assignment. This is generally much faster but can generate different results from strict operation-by-operation emulation. Usually the results are more precise. This is permitted by the C and C++ standards under the rules for excess precision in intermediate operands; see the discussion of evaluation formats in the C standard and [expr.pre] in the C++ standard.
The use of excess precision can be independently controlled for these two types with the -ffloat16-excess-precision= and -fbfloat16-excess-precision= options. Valid values include:
none: meaning to perform strict operation-by-operation emulation
standard: meaning that excess precision is permitted under the rules described in the standard, i.e. never across explicit casts or statements
fast: meaning that excess precision is permitted whenever the optimizer sees an opportunity to avoid truncations; currently this has no effect beyond standard
The _Float16 type is an interchange floating type specified in ISO/IEC TS 18661-3:2015 (“Floating-point extensions for C”). It will be supported on more targets as they define ABIs for it.
The __bf16 type is a non-standard extension, but it generally follows the rules for arithmetic interchange floating types from ISO/IEC TS 18661-3:2015. In previous versions of Clang, it was a storage-only type that forbade arithmetic operations. It will be supported on more targets as they define ABIs for it.
The __fp16 type was originally an ARM extension and is specified by the ARM C Language Extensions. Clang uses the binary16 format from IEEE 754-2008 for __fp16, not the ARM alternative format. Operators that expect arithmetic operands immediately promote __fp16 operands to float.
It is recommended that portable code use _Float16 instead of __fp16, as it has been defined by the C standards committee and has behavior that is more familiar to most programmers.
Because __fp16 operands are always immediately promoted to float, the common real type of __fp16 and _Float16 for the purposes of the usual arithmetic conversions is float.
A literal can be given _Float16 type using the suffix f16. For example, 3.14f16.
Because default argument promotion only applies to the standard floating-point types, _Float16 values are not promoted to double when passed as variadic or untyped arguments. As a consequence, some caution must be taken when using certain library facilities with _Float16; for example, there is no printf format specifier for _Float16, and (unlike float) it will not be implicitly promoted to double when passed to printf, so the programmer must explicitly cast it to double before using it with an %f or similar specifier.
Messages on deprecated and unavailable Attributes¶
An optional string message can be added to the deprecated and unavailable attributes. For example:
void explode(void) __attribute__((deprecated("extremely unsafe, use 'combust' instead!!!")));
If the deprecated or unavailable declaration is used, the message will be incorporated into the appropriate diagnostic:
harmless.c:4:3: warning: 'explode' is deprecated: extremely unsafe, use 'combust' instead!!! [-Wdeprecated-declarations]
  explode();
  ^
Query for this feature with __has_extension(attribute_deprecated_with_message) and __has_extension(attribute_unavailable_with_message).
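One way to use the query is to wrap the attribute in a macro that drops the message when it is not supported. The sketch below is illustrative only: MYLIB_DEPRECATED is a hypothetical macro name, and the plain deprecated fallback assumes a GCC-compatible compiler.

```cpp
// Compatibility shim for compilers that do not provide __has_extension.
#ifndef __has_extension
#define __has_extension(x) 0
#endif

// MYLIB_DEPRECATED is a hypothetical name chosen for this sketch.
#if __has_extension(attribute_deprecated_with_message)
#define MYLIB_DEPRECATED(msg) __attribute__((deprecated(msg)))
#elif defined(__GNUC__)
// Assumption: GCC-compatible compilers accept the attribute without a message.
#define MYLIB_DEPRECATED(msg) __attribute__((deprecated))
#else
#define MYLIB_DEPRECATED(msg)
#endif

// The replacement API that callers should migrate to.
inline int combust(int fuel) { return fuel * 2; }

// The old entry point still works but is diagnosed at each use.
MYLIB_DEPRECATED("extremely unsafe, use 'combust' instead")
inline int explode(int fuel) { return combust(fuel); }
```

Calling explode still compiles, but each use should produce a deprecation warning that carries the message where the compiler supports it.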
Attributes on Enumerators¶
Clang allows attributes to be written on individual enumerators. This allows enumerators to be deprecated, made unavailable, etc. The attribute must appear after the enumerator name and before any initializer, like so:
enum OperationMode {
  OM_Invalid,
  OM_Normal,
  OM_Terrified __attribute__((deprecated)),
  OM_AbortOnError __attribute__((deprecated)) = 4
};
Attributes on the enum declaration do not apply to individual enumerators.
Query for this feature with __has_extension(enumerator_attributes).
C++11 Attributes on using-declarations¶
Clang allows C++-style [[]] attributes to be written on using-declarations. For instance:
[[clang::using_if_exists]] using foo::bar;
using foo::baz [[clang::using_if_exists]];
You can test for support for this extension with __has_extension(cxx_attributes_on_using_declarations).
‘User-Specified’ System Frameworks¶
Clang provides a mechanism by which frameworks can be built in such a way that they will always be treated as being “system frameworks”, even if they are not present in a system framework directory. This can be useful to system framework developers who want to be able to test building other applications with development builds of their framework, including the manner in which the compiler changes warning behavior for system headers.
Framework developers can opt-in to this mechanism by creating a “.system_framework” file at the top-level of their framework. That is, the framework should have contents like:
.../TestFramework.framework
.../TestFramework.framework/.system_framework
.../TestFramework.framework/Headers
.../TestFramework.framework/Headers/TestFramework.h
...
Clang will treat the presence of this file as an indicator that the framework should be treated as a system framework, regardless of how it was found in the framework search path. For consistency, we recommend that such files never be included in installed versions of the framework.
Checks for Standard Language Features¶
The __has_feature macro can be used to query if certain standard language features are enabled. The __has_extension macro can be used to query if language features are available as an extension when compiling for a standard which does not provide them. The features which can be tested are listed here.
Since Clang 3.4, the C++ SD-6 feature test macros are also supported. These are macros with names of the form __cpp_<feature_name>, and are intended to be a portable way to query the supported features of the compiler. See the C++ status page for information on the version of SD-6 supported by each Clang release, and the macros provided by that revision of the recommendations.
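The two query styles can be combined so that a header works with compilers that provide only one of them. The following is a minimal sketch; MYLIB_HAS_LAMBDAS, Increment and add_two are hypothetical names used only for illustration.

```cpp
// Compatibility shim for compilers that do not provide __has_feature.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// Prefer the portable SD-6 macro and fall back to Clang's __has_feature.
#if defined(__cpp_lambdas) || __has_feature(cxx_lambdas)
#define MYLIB_HAS_LAMBDAS 1
#else
#define MYLIB_HAS_LAMBDAS 0
#endif

#if MYLIB_HAS_LAMBDAS
inline int add_two(int value) {
  auto increment = [](int v) { return v + 1; };
  return increment(increment(value));
}
#else
// Hand-written function object for language modes without lambdas.
struct Increment {
  int operator()(int v) const { return v + 1; }
};
inline int add_two(int value) {
  Increment increment;
  return increment(increment(value));
}
#endif
```

Both branches expose the same add_two function, so callers do not need to know which language mode was detected.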
C++98¶
The features listed below are part of the C++98 standard. These features are enabled by default when compiling C++ code.
C++ exceptions¶
Use __has_feature(cxx_exceptions) to determine if C++ exceptions have been enabled. For example, compiling code with -fno-exceptions disables C++ exceptions.
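A library that must build both with and without exceptions can branch on this query. The sketch below is illustrative only: MYLIB_HAS_EXCEPTIONS and parse_int are hypothetical names, and the SD-6 macro __cpp_exceptions is checked as well so the code also works on compilers that lack __has_feature.

```cpp
#include <cerrno>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Compatibility shim for compilers that do not provide __has_feature.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(cxx_exceptions) || defined(__cpp_exceptions)
#define MYLIB_HAS_EXCEPTIONS 1
#else
#define MYLIB_HAS_EXCEPTIONS 0
#endif

// Hypothetical helper: parses a decimal integer and reports failure through
// the return value, so callers never depend on exceptions being enabled.
inline bool parse_int(const std::string& text, int& out) {
#if MYLIB_HAS_EXCEPTIONS
  try {
    std::size_t used = 0;
    const int value = std::stoi(text, &used);
    if (used != text.size()) return false;  // Trailing characters.
    out = value;
    return true;
  } catch (const std::exception&) {
    return false;  // Not a number, or out of range.
  }
#else
  // With -fno-exceptions, avoid std::stoi because it reports errors by throwing.
  if (text.empty()) return false;
  char* end = nullptr;
  errno = 0;
  const long value = std::strtol(text.c_str(), &end, 10);
  if (end == text.c_str() || *end != '\0') return false;
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
  out = static_cast<int>(value);
  return true;
#endif
}
```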
C++ RTTI¶
Use __has_feature(cxx_rtti) to determine if C++ RTTI has been enabled. For example, compiling code with -fno-rtti disables the use of RTTI.
C++11¶
The features listed below are part of the C++11 standard. As a result, all these features are enabled with the -std=c++11 or -std=gnu++11 option when compiling C++ code.
C++11 SFINAE includes access control¶
Use __has_feature(cxx_access_control_sfinae) or __has_extension(cxx_access_control_sfinae) to determine whether access-control errors (e.g., calling a private constructor) are considered to be template argument deduction errors (aka SFINAE errors), per C++ DR1170.
C++11 alias templates¶
Use __has_feature(cxx_alias_templates) or __has_extension(cxx_alias_templates) to determine if support for C++11’s alias declarations and alias templates is enabled.
C++11 alignment specifiers¶
Use __has_feature(cxx_alignas) or __has_extension(cxx_alignas) to determine if support for alignment specifiers using alignas is enabled.
Use __has_feature(cxx_alignof) or __has_extension(cxx_alignof) to determine if support for the alignof keyword is enabled.
C++11 attributes¶
Use __has_feature(cxx_attributes) or __has_extension(cxx_attributes) to determine if support for attribute parsing with C++11’s square bracket notation is enabled.
C++11 generalized constant expressions¶
Use __has_feature(cxx_constexpr) to determine if support for generalized constant expressions (e.g., constexpr) is enabled.
C++11 decltype()¶
Use __has_feature(cxx_decltype) or __has_extension(cxx_decltype) to determine if support for the decltype() specifier is enabled. C++11’s decltype does not require type-completeness of a function call expression. Use __has_feature(cxx_decltype_incomplete_return_types) or __has_extension(cxx_decltype_incomplete_return_types) to determine if support for this feature is enabled.
C++11 default template arguments in function templates¶
Use __has_feature(cxx_default_function_template_args) or __has_extension(cxx_default_function_template_args) to determine if support for default template arguments in function templates is enabled.
C++11 defaulted functions¶
Use __has_feature(cxx_defaulted_functions) or __has_extension(cxx_defaulted_functions) to determine if support for defaulted function definitions (with = default) is enabled.
C++11 delegating constructors¶
Use __has_feature(cxx_delegating_constructors) to determine if support for delegating constructors is enabled.
C++11 deleted functions¶
Use __has_feature(cxx_deleted_functions) or __has_extension(cxx_deleted_functions) to determine if support for deleted function definitions (with = delete) is enabled.
C++11 explicit conversion functions¶
Use __has_feature(cxx_explicit_conversions) to determine if support for explicit conversion functions is enabled.
C++11 generalized initializers¶
Use __has_feature(cxx_generalized_initializers) to determine if support for generalized initializers (using braced lists and std::initializer_list) is enabled.
C++11 implicit move constructors/assignment operators¶
Use __has_feature(cxx_implicit_moves) to determine if Clang will implicitly generate move constructors and move assignment operators where needed.
C++11 inheriting constructors¶
Use __has_feature(cxx_inheriting_constructors) to determine if support for inheriting constructors is enabled.
C++11 inline namespaces¶
Use __has_feature(cxx_inline_namespaces) or __has_extension(cxx_inline_namespaces) to determine if support for inline namespaces is enabled.
C++11 lambdas¶
Use __has_feature(cxx_lambdas) or __has_extension(cxx_lambdas) to determine if support for lambdas is enabled.
C++11 local and unnamed types as template arguments¶
Use __has_feature(cxx_local_type_template_args) or __has_extension(cxx_local_type_template_args) to determine if support for local and unnamed types as template arguments is enabled.
C++11 noexcept¶
Use __has_feature(cxx_noexcept) or __has_extension(cxx_noexcept) to determine if support for noexcept exception specifications is enabled.
C++11 in-class non-static data member initialization¶
Use __has_feature(cxx_nonstatic_member_init) to determine whether in-class initialization of non-static data members is enabled.
C++11 nullptr¶
Use __has_feature(cxx_nullptr) or __has_extension(cxx_nullptr) to determine if support for nullptr is enabled.
C++11 override control¶
Use __has_feature(cxx_override_control) or __has_extension(cxx_override_control) to determine if support for the override control keywords is enabled.
C++11 reference-qualified functions¶
Use __has_feature(cxx_reference_qualified_functions) or __has_extension(cxx_reference_qualified_functions) to determine if support for reference-qualified functions (e.g., member functions with & or && applied to *this) is enabled.
C++11 range-based for loop¶
Use __has_feature(cxx_range_for) or __has_extension(cxx_range_for) to determine if support for the range-based for loop is enabled.
C++11 raw string literals¶
Use __has_feature(cxx_raw_string_literals) to determine if support for raw string literals (e.g., R"x(foo\bar)x") is enabled.
C++11 rvalue references¶
Use __has_feature(cxx_rvalue_references) or __has_extension(cxx_rvalue_references) to determine if support for rvalue references is enabled.
C++11 static_assert()¶
Use __has_feature(cxx_static_assert) or __has_extension(cxx_static_assert) to determine if support for compile-time assertions using static_assert is enabled.
C++11 thread_local¶
Use __has_feature(cxx_thread_local) to determine if support for thread_local variables is enabled.
C++11 type inference¶
Use __has_feature(cxx_auto_type) or __has_extension(cxx_auto_type) to determine if C++11 type inference using the auto specifier is supported. If this is disabled, auto will instead be a storage class specifier, as in C or C++98.
C++11 strongly typed enumerations¶
Use __has_feature(cxx_strong_enums) or __has_extension(cxx_strong_enums) to determine if support for strongly typed, scoped enumerations is enabled.
C++11 trailing return type¶
Use __has_feature(cxx_trailing_return) or __has_extension(cxx_trailing_return) to determine if support for the alternate function declaration syntax with trailing return type is enabled.
C++11 Unicode string literals¶
Use __has_feature(cxx_unicode_literals) to determine if support for Unicode string literals is enabled.
C++11 unrestricted unions¶
Use __has_feature(cxx_unrestricted_unions) to determine if support for unrestricted unions is enabled.
C++11 user-defined literals¶
Use __has_feature(cxx_user_literals) to determine if support for user-defined literals is enabled.
C++11 variadic templates¶
Use __has_feature(cxx_variadic_templates) or __has_extension(cxx_variadic_templates) to determine if support for variadic templates is enabled.
C++14¶
The features listed below are part of the C++14 standard. As a result, all these features are enabled with the -std=c++14 or -std=gnu++14 option when compiling C++ code.
C++14 binary literals¶
Use __has_feature(cxx_binary_literals) or __has_extension(cxx_binary_literals) to determine whether binary literals (for instance, 0b10010) are recognized. Clang supports this feature as an extension in all language modes.
C++14 contextual conversions¶
Use __has_feature(cxx_contextual_conversions) or __has_extension(cxx_contextual_conversions) to determine if the C++14 rules are used when performing an implicit conversion for an array bound in a new-expression, the operand of a delete-expression, an integral constant expression, or a condition in a switch statement.
C++14 decltype(auto)¶
Use __has_feature(cxx_decltype_auto) or __has_extension(cxx_decltype_auto) to determine if support for the decltype(auto) placeholder type is enabled.
C++14 default initializers for aggregates¶
Use __has_feature(cxx_aggregate_nsdmi) or __has_extension(cxx_aggregate_nsdmi) to determine if support for default initializers in aggregate members is enabled.
C++14 digit separators¶
Use __cpp_digit_separators to determine if support for digit separators using single quotes (for instance, 10'000) is enabled. At this time, there is no corresponding __has_feature name.
C++14 generalized lambda capture¶
Use __has_feature(cxx_init_captures) or __has_extension(cxx_init_captures) to determine if support for lambda captures with explicit initializers is enabled (for instance, [n(0)] { return ++n; }).
C++14 generic lambdas¶
Use __has_feature(cxx_generic_lambdas) or __has_extension(cxx_generic_lambdas) to determine if support for generic (polymorphic) lambdas is enabled (for instance, [] (auto x) { return x + 1; }).
C++14 relaxed constexpr¶
Use __has_feature(cxx_relaxed_constexpr) or __has_extension(cxx_relaxed_constexpr) to determine if variable declarations, local variable modification, and control flow constructs are permitted in constexpr functions.
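A function body that needs the relaxed rules can be made conditionally constexpr. The following is a minimal sketch: MYLIB_CONSTEXPR14 and count_set_bits are hypothetical names, and the SD-6 check assumes that __cpp_constexpr is at least 201304L when the C++14 rules apply.

```cpp
// Compatibility shim for compilers that do not provide __has_feature.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(cxx_relaxed_constexpr) || \
    (defined(__cpp_constexpr) && __cpp_constexpr >= 201304L)
#define MYLIB_CONSTEXPR14 constexpr
#else
// Without relaxed constexpr the function is still usable at run time.
#define MYLIB_CONSTEXPR14 inline
#endif

// Uses a local variable, mutation and a loop, none of which are allowed in
// a C++11 constexpr function.
MYLIB_CONSTEXPR14 unsigned count_set_bits(unsigned value) {
  unsigned count = 0;
  while (value != 0) {
    count += value & 1u;
    value >>= 1;
  }
  return count;
}
```

Callers that only need a run-time result can use count_set_bits unconditionally; only uses in constant expressions depend on the detected mode.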
C++14 return type deduction¶
Use __has_feature(cxx_return_type_deduction) or __has_extension(cxx_return_type_deduction) to determine if support for return type deduction for functions (using auto as a return type) is enabled.
C++14 runtime-sized arrays¶
Use __has_feature(cxx_runtime_array) or __has_extension(cxx_runtime_array) to determine if support for arrays of runtime bound (a restricted form of variable-length arrays) is enabled. Clang’s implementation of this feature is incomplete.
C++14 variable templates¶
Use __has_feature(cxx_variable_templates) or __has_extension(cxx_variable_templates) to determine if support for templated variable declarations is enabled.
C++ type aware allocators¶
Use __has_extension(cxx_type_aware_allocators) to determine the existence of support for the future C++2d type aware allocator feature. For full details, see C++ Type Aware Allocators.
C11¶
The features listed below are part of the C11 standard. As a result, all these features are enabled with the -std=c11 or -std=gnu11 option when compiling C code. Additionally, because these features are all backward-compatible, they are available as extensions in all language modes.
C11 alignment specifiers¶
Use __has_feature(c_alignas) or __has_extension(c_alignas) to determine if support for alignment specifiers using _Alignas is enabled.
Use __has_feature(c_alignof) or __has_extension(c_alignof) to determine if support for the _Alignof keyword is enabled.
C11 atomic operations¶
Use __has_feature(c_atomic) or __has_extension(c_atomic) to determine if support for atomic types using _Atomic is enabled. Clang also provides a set of builtins which can be used to implement the <stdatomic.h> operations on _Atomic types. Use __has_include(<stdatomic.h>) to determine if C11’s <stdatomic.h> header is available.
Clang will use the system’s <stdatomic.h> header when one is available, and will otherwise use its own. When using its own, implementations of the atomic operations are provided as macros. In the cases where C11 also requires a real function, this header provides only the declaration of that function (along with a shadowing macro implementation), and you must link to a library which provides a definition of the function if you use it instead of the macro.
C11 generic selections¶
Use __has_feature(c_generic_selections) or __has_extension(c_generic_selections) to determine if support for generic selections is enabled.
As an extension, the C11 generic selection expression is available in all languages supported by Clang. The syntax is the same as that given in the C11 standard.
In C, type compatibility is decided according to the rules given in the appropriate standard, but in C++, which lacks the type compatibility rules used in C, types are considered compatible only if they are equivalent.
Clang also supports an extended form of _Generic with a controlling type rather than a controlling expression. Unlike with a controlling expression, a controlling type argument does not undergo any conversions and thus is suitable for use when trying to match qualified types, incomplete types, or function types. Variable-length array types lack the necessary compile-time information to resolve which association they match with and thus are not allowed as a controlling type argument.
Use __has_extension(c_generic_selection_with_controlling_type) to determine if support for this extension is enabled.
C11 _Static_assert()¶
Use __has_feature(c_static_assert) or __has_extension(c_static_assert) to determine if support for compile-time assertions using _Static_assert is enabled.
C11 _Thread_local¶
Use __has_feature(c_thread_local) or __has_extension(c_thread_local) to determine if support for _Thread_local variables is enabled.
C2y¶
The features listed below are part of the C2y standard. As a result, all these features are enabled with the -std=c2y or -std=gnu2y option when compiling C code.
C2y _Countof¶
Use __has_feature(c_countof) (in C2y or later mode) or __has_extension(c_countof) (in C23 or earlier mode) to determine if support for the _Countof operator is enabled. This feature is not available in C++ mode.
Modules¶
Use __has_feature(modules) to determine if Modules have been enabled. For example, compiling code with -fmodules enables the use of Modules.
More information can be found here.
Language Extensions Back-ported to Previous Standards¶
Feature | Feature Test Macro | Introduced In | Backported To |
---|---|---|---|
variadic templates | __cpp_variadic_templates | C++11 | C++03 |
Alias templates | __cpp_alias_templates | C++11 | C++03 |
Non-static data member initializers | __cpp_nsdmi | C++11 | C++03 |
Range-based for loop | __cpp_range_based_for | C++11 | C++03 |
RValue references | __cpp_rvalue_references | C++11 | C++03 |
Attributes | __cpp_attributes | C++11 | C++03 |
Lambdas | __cpp_lambdas | C++11 | C++03 |
Generalized lambda captures | __cpp_init_captures | C++14 | C++03 |
Generic lambda expressions | __cpp_generic_lambdas | C++14 | C++03 |
variable templates | __cpp_variable_templates | C++14 | C++03 |
Binary literals | __cpp_binary_literals | C++14 | C++03 |
Relaxed constexpr | __cpp_constexpr | C++14 | C++11 |
Static assert with no message | __cpp_static_assert >= 201411L | C++17 | C++11 |
Pack expansion in generalized lambda-capture | __cpp_init_captures | C++17 | C++03 |
if constexpr | __cpp_if_constexpr | C++17 | C++11 |
fold expressions | __cpp_fold_expressions | C++17 | C++03 |
Lambda capture of *this by value | __cpp_capture_star_this | C++17 | C++03 |
Attributes on enums | __cpp_enumerator_attributes | C++17 | C++03 |
Guaranteed copy elision | __cpp_guaranteed_copy_elision | C++17 | C++03 |
Hexadecimal floating literals | __cpp_hex_float | C++17 | C++03 |
inline variables | __cpp_inline_variables | C++17 | C++03 |
Attributes on namespaces | __cpp_namespace_attributes | C++17 | C++11 |
Structured bindings | __cpp_structured_bindings | C++17 | C++03 |
template template arguments | __cpp_template_template_args | C++17 | C++03 |
Familiar template syntax for generic lambdas | __cpp_generic_lambdas | C++20 | C++03 |
| __cpp_multidimensional_subscript | C++20 | C++03 |
Designated initializers | __cpp_designated_initializers | C++20 | C++03 |
Conditional explicit | __cpp_conditional_explicit | C++20 | C++03 |
using enum | __cpp_using_enum | C++20 | C++03 |
if consteval | __cpp_if_consteval | C++23 | C++20 |
static operator() | __cpp_static_call_operator | C++23 | C++03 |
Attributes on Lambda-Expressions | | C++23 | C++11 |
Attributes on Structured Bindings | __cpp_structured_bindings | C++26 | C++03 |
Packs in Structured Bindings | __cpp_structured_bindings | C++26 | C++03 |
Structured binding declaration as a condition | __cpp_structured_bindings | C++26 | C++98 |
Static assert with user-generated message | __cpp_static_assert >= 202306L | C++26 | C++11 |
Pack Indexing | __cpp_pack_indexing | C++26 | C++03 |
| __cpp_deleted_function | C++26 | C++03 |
Variadic Friends | __cpp_variadic_friend | C++26 | C++03 |
Trivial Relocatability | __cpp_trivial_relocatability | C++26 | C++03 |
Designated initializers (N494) | | C99 | C89 |
Array & element qualification (N2607) | | C23 | C89 |
Attributes (N2335) | | C23 | C89 |
| | C23 | C89, C++ |
Octal literals prefixed with | | C2y | C89, C++ |
| | C2y | C89 |
Builtin type aliases¶
Clang provides a few builtin aliases to improve the throughput of certain metaprogramming facilities.
__builtin_common_type¶
template <template <class... Args> class BaseTemplate,
          template <class TypeMember> class HasTypeMember,
          class HasNoTypeMember,
          class... Ts>
using __builtin_common_type = ...;
This alias is used for implementing std::common_type. If std::common_type should contain a type member, it is an alias to HasTypeMember<TheCommonType>. Otherwise it is an alias to HasNoTypeMember. The BaseTemplate is usually std::common_type. Ts are the arguments to std::common_type.
__type_pack_element¶
template <std::size_t Index, class... Ts>
using __type_pack_element = ...;
This alias returns the type at Index in the parameter pack Ts.
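A library can use the alias when it is available and fall back to a standard facility otherwise. The following is a minimal sketch; nth_type is a hypothetical name, and the fallback assumes that std::tuple_element gives the same answer at the cost of extra template instantiations.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Compatibility shim for compilers that do not provide __has_builtin.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__type_pack_element)
// Direct lookup by the compiler, with no recursive instantiation.
template <std::size_t Index, class... Ts>
using nth_type = __type_pack_element<Index, Ts...>;
#else
// Portable fallback built on the standard library.
template <std::size_t Index, class... Ts>
using nth_type = typename std::tuple_element<Index, std::tuple<Ts...>>::type;
#endif

static_assert(std::is_same<nth_type<1, int, double, char>, double>::value,
              "index 1 selects the second type in the pack");
```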
__make_integer_seq¶
template <template <class IntSeqT, IntSeqT... Ints> class IntSeq, class T, T N>
using __make_integer_seq = ...;
This alias returns IntSeq instantiated with IntSeqT = T and Ints being the pack 0, ..., N - 1.
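As an illustration of the description above, the alias can be instantiated with std::integer_sequence as the IntSeq template. This is a sketch only: make_int_seq is a hypothetical name, and the code assumes C++14 or later for std::integer_sequence and for the std::make_integer_sequence fallback.

```cpp
#include <type_traits>
#include <utility>

// Compatibility shim for compilers that do not provide __has_builtin.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__make_integer_seq)
// IntSeq = std::integer_sequence, IntSeqT = T, Ints = 0, ..., N - 1.
template <class T, T N>
using make_int_seq = __make_integer_seq<std::integer_sequence, T, N>;
#else
// Portable fallback for compilers without the builtin alias.
template <class T, T N>
using make_int_seq = std::make_integer_sequence<T, N>;
#endif

static_assert(std::is_same<make_int_seq<int, 3>,
                           std::integer_sequence<int, 0, 1, 2>>::value,
              "N = 3 yields the pack 0, 1, 2");
```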
Type Trait Primitives¶
Type trait primitives are special builtin constant expressions that can be used by the standard C++ library to facilitate or simplify the implementation of user-facing type traits in the <type_traits> header.
They are not intended to be used directly by user code because they are implementation-defined and subject to change; as such they’re tied closely to the supported set of system headers, currently:
LLVM’s own libc++
GNU libstdc++
The Microsoft standard C++ library
Clang supports the GNU C++ type traits and a subset of the Microsoft Visual C++ type traits, as well as nearly all of the Embarcadero C++ type traits.
The following type trait primitives are supported by Clang. Those traits marked (C++) provide implementations for type traits specified by the C++ standard; __X(...) has the same semantics and constraints as the corresponding std::X_t<...> or std::X_v<...> type trait.
__array_rank(type)
(Embarcadero):Returns the number of levels of array in the typetype
:0
iftype
is not an array type, and__array_rank(element)+1
iftype
is an array ofelement
.__array_extent(type,dim)
(Embarcadero):Thedim
’th array bound in the typetype
, or0
ifdim>=__array_rank(type)
.__builtin_is_implicit_lifetime
(C++, GNU, Microsoft)__builtin_is_virtual_base_of
(C++, GNU, Microsoft)__can_pass_in_regs
(C++)Returns whether a class can be passed in registers under the currentABI. This type can only be applied to unqualified class types.This is not a portable type trait.__has_nothrow_assign
(GNU, Microsoft, Embarcadero):Deprecated, use__is_nothrow_assignable
instead.__has_nothrow_move_assign
(GNU, Microsoft):Deprecated, use__is_nothrow_assignable
instead.__has_nothrow_copy
(GNU, Microsoft):Deprecated, use__is_nothrow_constructible
instead.__has_nothrow_constructor
(GNU, Microsoft):Deprecated, use__is_nothrow_constructible
instead.__has_trivial_assign
(GNU, Microsoft, Embarcadero):Deprecated, use__is_trivially_assignable
instead.__has_trivial_move_assign
(GNU, Microsoft):Deprecated, use__is_trivially_assignable
instead.__has_trivial_copy
(GNU, Microsoft):Deprecated, use__is_trivially_copyable
instead.__has_trivial_constructor
(GNU, Microsoft):Deprecated, use__is_trivially_constructible
instead.__has_trivial_move_constructor
(GNU, Microsoft):Deprecated, use__is_trivially_constructible
instead.__has_trivial_destructor
(GNU, Microsoft, Embarcadero):Deprecated, use__is_trivially_destructible
instead.__has_unique_object_representations
(C++, GNU)__has_virtual_destructor
(C++, GNU, Microsoft, Embarcadero)__is_abstract
(C++, GNU, Microsoft, Embarcadero)__is_aggregate
(C++, GNU, Microsoft)__is_arithmetic
(C++, Embarcadero)__is_array
(C++, Embarcadero)__is_assignable
(C++, MSVC 2015)__is_base_of
(C++, GNU, Microsoft, Embarcadero)__is_bounded_array
(C++, GNU, Microsoft, Embarcadero)__is_class
(C++, GNU, Microsoft, Embarcadero)__is_complete_type(type)
(Embarcadero):Returntrue
iftype
is a complete type.Warning: this trait is dangerous because it can return different values atdifferent points in the same program.__is_compound
(C++, Embarcadero)__is_const
(C++, Embarcadero)__is_constructible
(C++, MSVC 2013)__is_convertible
(C++, Embarcadero)__is_nothrow_convertible
(C++, GNU)__is_convertible_to
(Microsoft):Synonym for__is_convertible
.__is_destructible
(C++, MSVC 2013)__is_empty
(C++, GNU, Microsoft, Embarcadero)__is_enum
(C++, GNU, Microsoft, Embarcadero)__is_final
(C++, GNU, Microsoft)__is_floating_point
(C++, Embarcadero)__is_function
(C++, Embarcadero)__is_fundamental
(C++, Embarcadero)__is_integral
(C++, Embarcadero)__is_interface_class
(Microsoft):Returnsfalse
, even for types defined with__interface
.__is_layout_compatible
(C++, GNU, Microsoft)__is_literal
(Clang):Synonym for__is_literal_type
.__is_literal_type
(C++, GNU, Microsoft):Note, the corresponding standard trait was deprecated in C++17and removed in C++20.__is_lvalue_reference
(C++, Embarcadero)__is_member_object_pointer
(C++, Embarcadero)__is_member_function_pointer
(C++, Embarcadero)__is_member_pointer
(C++, Embarcadero)__is_nothrow_assignable
(C++, MSVC 2013)__is_nothrow_constructible
(C++, MSVC 2013)__is_nothrow_destructible
(C++, MSVC 2013)__is_object
(C++, Embarcadero)__is_pod
(C++, GNU, Microsoft, Embarcadero):Note, the corresponding standard trait was deprecated in C++20.__is_pointer
(C++, Embarcadero)__is_pointer_interconvertible_base_of
(C++, GNU, Microsoft)__is_polymorphic
(C++, GNU, Microsoft, Embarcadero)__is_reference
(C++, Embarcadero)__is_rvalue_reference
(C++, Embarcadero)__is_same
(C++, Embarcadero)__is_same_as
(GCC): Synonym for__is_same
.__is_scalar
(C++, Embarcadero)__is_scoped_enum
(C++, GNU, Microsoft, Embarcadero)__is_sealed
(Microsoft):Synonym for__is_final
.__is_signed
(C++, Embarcadero):Returns false for enumeration types, and returns true for floating-pointtypes. Note, before Clang 10, returned true for enumeration types if theunderlying type was signed, and returned false for floating-point types.__is_standard_layout
(C++, GNU, Microsoft, Embarcadero)
__is_trivial (C++, GNU, Microsoft, Embarcadero)
__is_trivially_assignable (C++, GNU, Microsoft)
__is_trivially_constructible (C++, GNU, Microsoft)
__is_trivially_copyable (C++, GNU, Microsoft)
__is_trivially_destructible (C++, MSVC 2013)
__is_trivially_relocatable (Clang) (Deprecated, use __builtin_is_cpp_trivially_relocatable instead). Returns true if moving an object of the given type, and then destroying the source object, is known to be functionally equivalent to copying the underlying bytes and then dropping the source object on the floor. This is true of trivial types, C++26 relocatable types, and types which were made trivially relocatable via the clang::trivial_abi attribute. This trait is deprecated and should be replaced by __builtin_is_cpp_trivially_relocatable. Note however that it is generally unsafe to relocate a C++-relocatable type with memcpy or memmove; use __builtin_trivially_relocate.
__builtin_is_cpp_trivially_relocatable (C++): Returns true if an object is trivially relocatable, as defined by the C++26 standard [meta.unary.prop]. Note that when relocating, the calling code should ensure that if the object is polymorphic, the dynamic type is of the most derived type. Padding bytes should not be copied.
__builtin_is_replaceable (C++): Returns true if an object is replaceable, as defined by the C++26 standard [meta.unary.prop].
__is_trivially_equality_comparable (Clang): Returns true if comparing two objects of the provided type is known to be equivalent to comparing their object representations. Note that types containing padding bytes are never trivially equality comparable.
__is_unbounded_array (C++, GNU, Microsoft, Embarcadero)
__is_union (C++, GNU, Microsoft, Embarcadero)
__is_unsigned (C++, Embarcadero): Returns false for enumeration types. Note, before Clang 13, returned true for enumeration types if the underlying type was unsigned.
__is_void (C++, Embarcadero)
__is_volatile (C++, Embarcadero)
__reference_binds_to_temporary(T, U) (Clang): Determines whether a reference of type T bound to an expression of type U would bind to a materialized temporary object. If T is not a reference type the result is false. Note this trait will also return false when the initialization of T from U is ill-formed. Deprecated, use __reference_constructs_from_temporary.
__reference_constructs_from_temporary(T, U) (C++): Returns true if a reference T can be direct-initialized from a temporary of non-cv-qualified type U.
__reference_converts_from_temporary(T, U) (C++): Returns true if a reference T can be copy-initialized from a temporary of non-cv-qualified type U.
__underlying_type (C++, GNU, Microsoft)
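As a rough illustration of how a few of the traits listed above behave, the sketch below applies them to some made-up types. Pod, Owner, IntOrFloat and Small are hypothetical names chosen for this example, not part of Clang. It assumes a compiler that provides __is_trivially_copyable, __is_union and __underlying_type; the __reference_constructs_from_temporary checks are guarded because older compilers may not provide that trait.

```cpp
#ifndef __has_builtin
#define __has_builtin(x) 0  // Compatibility with compilers lacking __has_builtin.
#endif

// Hypothetical types used only for illustration.
struct Pod { int a; float b; };
struct Owner {
  int *p;
  Owner(const Owner &other);  // user-provided copy constructor (never called here)
};
union IntOrFloat { int i; float f; };
enum Small : unsigned char { First, Second };

static_assert(__is_trivially_copyable(Pod), "Pod can be copied bytewise");
static_assert(!__is_trivially_copyable(Owner), "Owner has a non-trivial copy");
static_assert(__is_union(IntOrFloat), "IntOrFloat is a union");
static_assert(!__is_union(Pod), "Pod is a struct");

// __underlying_type names a type, so it can appear where a type is expected.
static_assert(sizeof(__underlying_type(Small)) == sizeof(unsigned char),
              "Small is stored as unsigned char");

#if __has_builtin(__reference_constructs_from_temporary)
// Converting long to int creates a temporary that the reference would bind to.
static_assert(__reference_constructs_from_temporary(const int &, long), "");
// An int lvalue binds directly, so no temporary is involved.
static_assert(!__reference_constructs_from_temporary(const int &, int &), "");
#endif
```

All checks here are compile-time assertions, so a translation unit containing them either compiles cleanly or reports which expectation failed.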
In addition, the following expression traits are supported:
__is_lvalue_expr(e) (Embarcadero): Returns true if e is an lvalue expression. Deprecated, use __is_lvalue_reference(decltype((e))) instead.
__is_rvalue_expr(e) (Embarcadero): Returns true if e is a prvalue expression. Deprecated, use !__is_reference(decltype((e))) instead.
There are multiple ways to detect support for a type trait __X in the compiler, depending on the oldest version of Clang you wish to support.
From Clang 10 onwards, __has_builtin(__X) can be used.
From Clang 6 onwards, !__is_identifier(__X) can be used.
From Clang 3 onwards, __has_feature(X) can be used, but only supports the following traits:
__has_nothrow_assign
__has_nothrow_copy
__has_nothrow_constructor
__has_trivial_assign
__has_trivial_copy
__has_trivial_constructor
__has_trivial_destructor
__has_virtual_destructor
__is_abstract
__is_base_of
__is_class
__is_constructible
__is_convertible_to
__is_empty
__is_enum
__is_final
__is_literal
__is_standard_layout
__is_pod
__is_polymorphic
__is_sealed
__is_trivial
__is_trivially_assignable
__is_trivially_constructible
__is_trivially_copyable
__is_union
__underlying_type
A simplistic usage example as might be seen in standard C++ headers follows:
#if __has_builtin(__is_convertible_to)
template<typename From, typename To>
struct is_convertible_to {
  static const bool value = __is_convertible_to(From, To);
};
#else
// Emulate type trait for compatibility with other compilers.
#endif
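The first two detection methods can be combined so that one check covers both newer and older Clang releases. The following is a minimal sketch under stated assumptions: HAS_TYPE_TRAIT and the compat namespace are hypothetical names introduced here, and the fallback branch simply defers to the standard library rather than emulating the trait by hand.

```cpp
#include <type_traits>

#ifndef __has_builtin
#define __has_builtin(x) 0      // Compilers without __has_builtin.
#endif
#ifndef __is_identifier
#define __is_identifier(x) 1    // Compilers without __is_identifier: assume no keyword.
#endif

// Hypothetical helper: true if either mechanism reports the trait.
#define HAS_TYPE_TRAIT(x) (__has_builtin(x) || !__is_identifier(x))

namespace compat {
#if HAS_TYPE_TRAIT(__is_trivially_copyable)
// The builtin was detected, so use it directly.
template <typename T>
struct is_trivially_copyable {
  static const bool value = __is_trivially_copyable(T);
};
#else
// The builtin could not be detected; defer to the standard library.
template <typename T>
struct is_trivially_copyable : std::is_trivially_copyable<T> {};
#endif
}  // namespace compat
```

Either branch gives compat::is_trivially_copyable<T>::value the same meaning, so code using it does not need to know which detection path was taken.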
__builtin_structured_binding_size (C++)¶
The __builtin_structured_binding_size(T) type trait returns the structured binding size ([dcl.struct.bind]) of type T. This is equivalent to the size of the pack p in auto&& [...p] = declval<T&>();. If the argument cannot be decomposed, __builtin_structured_binding_size(T) is not a valid expression (__builtin_structured_binding_size is SFINAE-friendly).
Builtin arrays, builtin SIMD vectors, builtin complex types, tuple-like types, and decomposable class types are decomposable types.
A type is considered a valid tuple-like if std::tuple_size_v<T> is a valid expression, even if there is no valid std::tuple_element specialization or suitable get function for that type.
template<std::size_t Idx, typename T>
requires (Idx < __builtin_structured_binding_size(T))
decltype(auto) constexpr get_binding(T&& obj) {
    auto&& [...p] = std::forward<T>(obj);
    return p...[Idx];
}
struct S { int a = 0, b = 42; };
static_assert(__builtin_structured_binding_size(S) == 2);
static_assert(get_binding<1>(S{}) == 42);
Blocks¶
The syntax and high level language feature description is in BlockLanguageSpec. Implementation and ABI details for the clang implementation are in Block-ABI-Apple.
Query for this feature with __has_extension(blocks).
ASM Goto with Output Constraints¶
Outputs may be used along any branches from the asm goto whether the branches are taken or not.
Query for this feature with __has_extension(gnu_asm_goto_with_outputs).
Prior to clang-16, the output may only be used safely when the indirect branches are not taken. Query for this difference with __has_extension(gnu_asm_goto_with_outputs_full).
When using tied-outputs (i.e. outputs that are inputs and outputs, not just outputs) with the +r constraint, there is a hidden input that’s created before the label, so numeric references to operands must account for that.
int foo(int x) {
    // %0 and %1 both refer to x
    // %l2 refers to err
    asm goto("# %0 %1 %l2" : "+r"(x) : : : err);
    return x;
  err:
    return -1;
}
This was changed to match GCC in clang-13; for better portability, symbolic references can be used instead of numeric references.
int foo(int x) {
    asm goto("# %[x] %l[err]" : [x]"+r"(x) : : : err);
    return x;
  err:
    return -1;
}
ASM Goto versus Branch Target Enforcement¶
Some target architectures implement branch target enforcement, by requiring indirect (register-controlled) branch instructions to jump only to locations marked by a special instruction (such as AArch64 bti).
The assembler code inside an asm goto statement is expected not to use a branch instruction of that kind to transfer control to any of its destination labels. Therefore, using a label in an asm goto statement does not cause clang to put a bti or equivalent instruction at the label.
Constexpr strings in GNU ASM statements¶
In C++11 mode (and greater), Clang supports specifying the template, constraints, and clobber strings with a parenthesized constant expression producing an object with the following member functions
constexpr const char* data() const;
constexpr size_t size() const;
such as std::string, std::string_view, std::vector<char>. This mechanism follows the same rules as static_assert messages in C++26, see [dcl.pre]/p12.
Query for this feature with __has_extension(gnu_asm_constexpr_strings).
void foo() {
  asm((std::string_view("nop")) ::: (std::string_view("memory")));
}
Objective-C Features¶
Related result types¶
According to Cocoa conventions, Objective-C methods with certain names (“init”, “alloc”, etc.) always return objects that are an instance of the receiving class’s type. Such methods are said to have a “related result type”, meaning that a message send to one of these methods will have the same static type as an instance of the receiver class. For example, given the following classes:
@interface NSObject
+ (id)alloc;
- (id)init;
@end

@interface NSArray : NSObject
@end
and this common initialization pattern
NSArray *array = [[NSArray alloc] init];
the type of the expression [NSArray alloc] is NSArray* because alloc implicitly has a related result type. Similarly, the type of the expression [[NSArray alloc] init] is NSArray*, since init has a related result type and its receiver is known to have the type NSArray *. If neither alloc nor init had a related result type, the expressions would have had type id, as declared in the method signature.
A method with a related result type can be declared by using the type instancetype as its result type. instancetype is a contextual keyword that is only permitted in the result type of an Objective-C method, e.g.
@interface A
+ (instancetype)constructAnA;
@end
The related result type can also be inferred for some methods. To determine whether a method has an inferred related result type, the first word in the camel-case selector (e.g., “init” in “initWithObjects”) is considered, and the method will have a related result type if its return type is compatible with the type of its class and if:
the first word is “alloc” or “new”, and the method is a class method, or
the first word is “autorelease”, “init”, “retain”, or “self”, and the method is an instance method.
If a method with a related result type is overridden by a subclass method, the subclass method must also return a type that is compatible with the subclass type. For example:
@interface NSString : NSObject
- (NSUnrelated *)init; // incorrect usage: NSUnrelated is not NSString or a superclass of NSString
@end
Related result types only affect the type of a message send or property access via the given method. In all other respects, a method with a related result type is treated the same way as a method that returns id.
Use __has_feature(objc_instancetype) to determine whether the instancetype contextual keyword is available.
Automatic reference counting¶
Clang provides support for automated reference counting in Objective-C, which eliminates the need for manual retain/release/autorelease message sends. There are three feature macros associated with automatic reference counting: __has_feature(objc_arc) indicates the availability of automated reference counting in general, while __has_feature(objc_arc_weak) indicates that automated reference counting also includes support for __weak pointers to Objective-C objects. __has_feature(objc_arc_fields) indicates that C structs are allowed to have fields that are pointers to Objective-C objects managed by automatic reference counting.
Weak references¶
Clang supports ARC-style weak and unsafe references in Objective-C even outside of ARC mode. Weak references must be explicitly enabled with the -fobjc-weak option; use __has_feature(objc_arc_weak) to test whether they are enabled. Unsafe references are enabled unconditionally. ARC-style weak and unsafe references cannot be used when Objective-C garbage collection is enabled.
Except as noted below, the language rules for the __weak and __unsafe_unretained qualifiers (and the weak and unsafe_unretained property attributes) are just as laid out in the ARC specification. In particular, note that some classes do not support forming weak references to their instances, and note that special care must be taken when storing weak references in memory where initialization and deinitialization are outside the responsibility of the compiler (such as in malloc-ed memory).
Loading from a __weak variable always implicitly retains the loaded value. In non-ARC modes, this retain is normally balanced by an implicit autorelease. This autorelease can be suppressed by performing the load in the receiver position of a -retain message send (e.g. [weakReference retain]); note that this performs only a single retain (the retain done when primitively loading from the weak reference).
For the most part, __unsafe_unretained in non-ARC modes is just the default behavior of variables and therefore is not needed. However, it does have an effect on the semantics of block captures: normally, copying a block which captures an Objective-C object or block pointer causes the captured pointer to be retained or copied, respectively, but that behavior is suppressed when the captured variable is qualified with __unsafe_unretained.
Note that the __weak qualifier formerly meant the GC qualifier in all non-ARC modes and was silently ignored outside of GC modes. It now means the ARC-style qualifier in all non-GC modes and is no longer allowed if not enabled by either -fobjc-arc or -fobjc-weak. It is expected that -fobjc-weak will eventually be enabled by default in all non-GC Objective-C modes.
Enumerations with a fixed underlying type¶
Clang provides support for C++11 enumerations with a fixed underlying type within Objective-C and C prior to C23. For example, one can write an enumeration type as:
typedef enum : unsigned char { Red, Green, Blue } Color;
This specifies that the underlying type, which is used to store the enumeration value, is unsigned char.
Use __has_feature(objc_fixed_enum) to determine whether support for fixed underlying types is available in Objective-C.
Use __has_extension(c_fixed_enum) to determine whether support for fixed underlying types is available in C prior to C23. This will also report true in C23 and later modes as the functionality is available even if it’s not an extension in those modes.
Use __has_feature(c_fixed_enum) to determine whether support for fixed underlying types is available in C23 and later.
Interoperability with C++11 lambdas¶
Clang provides interoperability between C++11 lambdas and blocks-based APIs, by permitting a lambda to be implicitly converted to a block pointer with the corresponding signature. For example, consider an API such as NSArray’s array-sorting method:
- (NSArray *)sortedArrayUsingComparator:(NSComparator)cmptr;
NSComparator is simply a typedef for the block pointer NSComparisonResult (^)(id, id), and parameters of this type are generally provided with block literals as arguments. However, one can also use a C++11 lambda so long as it provides the same signature (in this case, accepting two parameters of type id and returning an NSComparisonResult):
NSArray *array = @[@"string 1", @"string 21", @"string 12",
                   @"String 11", @"String 02"];
const NSStringCompareOptions comparisonOptions
  = NSCaseInsensitiveSearch | NSNumericSearch |
    NSWidthInsensitiveSearch | NSForcedOrderingSearch;
NSLocale *currentLocale = [NSLocale currentLocale];
NSArray *sorted
  = [array sortedArrayUsingComparator:[=](id s1, id s2) -> NSComparisonResult {
             NSRange string1Range = NSMakeRange(0, [s1 length]);
             return [s1 compare:s2 options:comparisonOptions
                          range:string1Range locale:currentLocale];
     }];
NSLog(@"sorted: %@", sorted);
This code relies on an implicit conversion from the type of the lambda expression (an unnamed, local class type called the closure type) to the corresponding block pointer type. The conversion itself is expressed by a conversion operator in that closure type that produces a block pointer with the same signature as the lambda itself, e.g.,
operator NSComparisonResult (^)(id, id)() const;
This conversion function returns a new block that simply forwards the two parameters to the lambda object (which it captures by copy), then returns the result. The returned block is first copied (with Block_copy) and then autoreleased. As an optimization, if a lambda expression is immediately converted to a block pointer (as in the first example, above), then the block is not copied and autoreleased: rather, it is given the same lifetime as a block literal written at that point in the program, which avoids the overhead of copying a block to the heap in the common case.
The conversion from a lambda to a block pointer is only available in Objective-C++, and not in C++ with blocks, due to its use of Objective-C memory management (autorelease).
Object Literals and Subscripting¶
Clang provides support for Object Literals and Subscripting in Objective-C, which simplifies common Objective-C programming patterns, makes programs more concise, and improves the safety of container creation. There are several feature macros associated with object literals and subscripting: __has_feature(objc_array_literals) tests the availability of array literals; __has_feature(objc_dictionary_literals) tests the availability of dictionary literals; __has_feature(objc_subscripting) tests the availability of object subscripting.
Objective-C Autosynthesis of Properties¶
Clang provides support for autosynthesis of declared properties. Using this feature, clang provides default synthesis of those properties not declared @dynamic and not having user provided backing getter and setter methods. __has_feature(objc_default_synthesize_properties) checks for availability of this feature in the version of clang being used.
Objective-C retaining behavior attributes¶
In Objective-C, functions and methods are generally assumed to follow the Cocoa Memory Management conventions for ownership of object arguments and return values. However, there are exceptions, and so Clang provides attributes to allow these exceptions to be documented. These are used by ARC and the static analyzer. Some exceptions may be better described using the objc_method_family attribute instead.
Usage: The ns_returns_retained, ns_returns_not_retained, ns_returns_autoreleased, cf_returns_retained, and cf_returns_not_retained attributes can be placed on methods and functions that return Objective-C or CoreFoundation objects. They are commonly placed at the end of a function prototype or method declaration:
id foo() __attribute__((ns_returns_retained));

- (NSString *)bar:(int)x __attribute__((ns_returns_retained));
The *_returns_retained attributes specify that the returned object has a +1 retain count. The *_returns_not_retained attributes specify that the return object has a +0 retain count, even if the normal convention for its selector would be +1. ns_returns_autoreleased specifies that the returned object is +0, but is guaranteed to live at least as long as the next flush of an autorelease pool.
Usage: The ns_consumed and cf_consumed attributes can be placed on a parameter declaration; they specify that the argument is expected to have a +1 retain count, which will be balanced in some way by the function or method. The ns_consumes_self attribute can only be placed on an Objective-C method; it specifies that the method expects its self parameter to have a +1 retain count, which it will balance in some way.
void foo(__attribute__((ns_consumed)) NSString *string);

- (void) bar __attribute__((ns_consumes_self));
- (void) baz:(id) __attribute__((ns_consumed)) x;
Further examples of these attributes are available in the static analyzer’s list of annotations for analysis.
Query for these features with __has_attribute(ns_consumed), __has_attribute(ns_returns_retained), etc.
Objective-C @available¶
It is possible to use the newest SDK but still build a program that can run on older versions of macOS and iOS by passing -mmacos-version-min= / -miphoneos-version-min=.
Before LLVM 5.0, when calling a function that exists only in the OS that’s newer than the target OS (as determined by the minimum deployment version), programmers had to carefully check if the function exists at runtime, using null checks for weakly-linked C functions, +class for Objective-C classes, and -respondsToSelector: or +instancesRespondToSelector: for Objective-C methods. If such a check was missed, the program would compile fine, run fine on newer systems, but crash on older systems.
As of LLVM 5.0, -Wunguarded-availability uses the availability attributes together with the new @available() keyword to assist with this issue. When a method that’s introduced in the OS newer than the target OS is called, a -Wunguarded-availability warning is emitted if that call is not guarded:
void my_fun(NSSomeClass* var) {
  // If fancyNewMethod was added in e.g. macOS 10.12, but the code is
  // built with -mmacos-version-min=10.11, then this unconditional call
  // will emit a -Wunguarded-availability warning:
  [var fancyNewMethod];
}
To fix the warning and to avoid the crash on macOS 10.11, wrap it in if(@available()):
void my_fun(NSSomeClass* var) {
  if (@available(macOS 10.12, *)) {
    [var fancyNewMethod];
  } else {
    // Put fallback behavior for old macOS versions (and for non-mac
    // platforms) here.
  }
}
The * is required and means that platforms not explicitly listed will take the true branch, and the compiler will emit -Wunguarded-availability warnings for unlisted platforms based on those platforms’ deployment targets. More than one platform can be listed in @available():
void my_fun(NSSomeClass* var) {
  if (@available(macOS 10.12, iOS 10, *)) {
    [var fancyNewMethod];
  }
}
If the caller of my_fun() already checks that my_fun() is only called on 10.12, then add an availability attribute to it, which will also suppress the warning and require that calls to my_fun() are checked:
API_AVAILABLE(macos(10.12)) void my_fun(NSSomeClass* var) {
  [var fancyNewMethod];  // Now ok.
}
@available() is only available in Objective-C code. To use the feature in C and C++ code, use the __builtin_available() spelling instead.
If existing code uses null checks or -respondsToSelector:, it should be changed to use @available() (or __builtin_available) instead.
-Wunguarded-availability is disabled by default, but -Wunguarded-availability-new, which only emits this warning for APIs that have been introduced in macOS >= 10.13, iOS >= 11, watchOS >= 4 and tvOS >= 11, is enabled by default.
Objective-C++ ABI: protocol-qualifier mangling of parameters¶
Starting with LLVM 3.4, Clang produces a new mangling for parameters whose type is a qualified-id (e.g., id<Foo>). This mangling allows such parameters to be differentiated from those with the regular unqualified id type.
This was a non-backward compatible mangling change to the ABI. This change allows proper overloading, and also prevents mangling conflicts with template parameters of protocol-qualified type.
Query the presence of this new mangling with __has_feature(objc_protocol_qualifier_mangling).
Initializer lists for complex numbers in C¶
Clang supports an extension which allows the following in C:
#include <math.h>
#include <complex.h>
complex float x = { 1.0f, INFINITY }; // Init to (1, Inf)
This construct is useful because there is no way to separately initialize the real and imaginary parts of a complex variable in standard C, given that clang does not support _Imaginary. (Clang also supports the __real__ and __imag__ extensions from gcc, which help in some cases, but are not usable in static initializers.)
Note that this extension does not allow eliding the braces; the meaning of the following two lines is different:
complex float x[] = { { 1.0f, 1.0f } }; // [0] = (1, 1)
complex float x[] = { 1.0f, 1.0f }; // [0] = (1, 0), [1] = (1, 0)
This extension also works in C++ mode, as far as that goes, but does not apply to the C++ std::complex. (In C++11, list initialization allows the same syntax to be used with std::complex with the same meaning.)
For GCC compatibility, __builtin_complex(re, im) can also be used to construct a complex number from the given real and imaginary components.
OpenCL Features¶
Clang supports internal OpenCL extensions documented below.
__cl_clang_bitfields¶
With this extension it is possible to enable bitfields in structs or unions using the OpenCL extension pragma mechanism detailed in the OpenCL Extension Specification, section 1.2.
Use of bitfields in OpenCL kernels can result in reduced portability as struct layout is not guaranteed to be consistent when compiled by different compilers. If structs with bitfields are used as kernel function parameters, it can result in incorrect functionality when the layout is different between the host and device code.
Example of Use:
#pragma OPENCL EXTENSION __cl_clang_bitfields : enable
struct with_bitfield {
  unsigned int i : 5; // compiled - no diagnostic generated
};

#pragma OPENCL EXTENSION __cl_clang_bitfields : disable
struct without_bitfield {
  unsigned int i : 5; // error - bitfields are not supported
};
__cl_clang_function_pointers¶
With this extension it is possible to enable various language features that rely on function pointers using the regular OpenCL extension pragma mechanism detailed in the OpenCL Extension Specification, section 1.2.
In C++ for OpenCL this also enables:
Use of member function pointers;
Unrestricted use of references to functions;
Virtual member functions.
Such functionality is not conformant and is not guaranteed to compile correctly in all circumstances. It can be used if:
the kernel source does not contain call expressions to (member-) function pointers, or virtual functions. For example this extension can be used in metaprogramming algorithms to be able to specify/detect types generically.
the generated kernel binary does not contain indirect calls because they are eliminated using compiler optimizations e.g. devirtualization.
the selected target supports the function pointer like functionality e.g. most CPU targets.
Example of Use:
#pragma OPENCL EXTENSION __cl_clang_function_pointers : enable
void foo()
{
  void (*fp)(); // compiled - no diagnostic generated
}

#pragma OPENCL EXTENSION __cl_clang_function_pointers : disable
void bar()
{
  void (*fp)(); // error - pointers to function are not allowed
}
__cl_clang_variadic_functions¶
With this extension it is possible to enable variadic arguments in functions using the regular OpenCL extension pragma mechanism detailed in the OpenCL Extension Specification, section 1.2.
This is not conformant behavior and it can only be used portably when the functions with variadic prototypes do not get generated in binary e.g. the variadic prototype is used to specify a function type with any number of arguments in metaprogramming algorithms in C++ for OpenCL.
This extension can also be used when the kernel code is intended for targets supporting the variadic arguments e.g. majority of CPU targets.
Example of Use:
#pragma OPENCL EXTENSION __cl_clang_variadic_functions : enable
void foo(int a, ...); // compiled - no diagnostic generated

#pragma OPENCL EXTENSION __cl_clang_variadic_functions : disable
void bar(int a, ...); // error - variadic prototype is not allowed
__cl_clang_non_portable_kernel_param_types¶
With this extension it is possible to enable the use of some restricted types in kernel parameters specified in C++ for OpenCL v1.0 s2.4. The restrictions can be relaxed using the regular OpenCL extension pragma mechanism detailed in the OpenCL Extension Specification, section 1.2.
This is not conformant behavior and it can only be used when the kernel arguments are not accessed on the host side or the data layout/size between the host and device is known to be compatible.
Example of Use:
// Plain Old Data type.
struct Pod {
  int a;
  int b;
};

// Not POD type because of the constructor.
// Standard layout type because there is only one access control.
struct OnlySL {
  int a;
  int b;
  OnlySL() : a(0), b(0) {}
};

// Not standard layout type because of two different access controls.
struct NotSL {
  int a;
private:
  int b;
};

#pragma OPENCL EXTENSION __cl_clang_non_portable_kernel_param_types : enable
kernel void kernel_main(
  Pod a,
  OnlySL b,
  global NotSL *c,
  global OnlySL *d
);
#pragma OPENCL EXTENSION __cl_clang_non_portable_kernel_param_types : disable
Remove address space builtin function¶
__remove_address_space allows deriving types in C++ for OpenCL that have address space qualifiers removed. This utility only affects address space qualifiers; other type qualifiers such as const or volatile remain unchanged.
Example of Use:
template<typename T>
void foo(T *par){
  T var1; // error - local function variable with global address space
  __private T var2; // error - conflicting address space qualifiers
  __private __remove_address_space<T>::type var3; // var3 is __private int
}

void bar(){
  __global int* ptr;
  foo(ptr);
}
Legacy 1.x atomics with generic address space¶
Clang allows use of atomic functions from the OpenCL 1.x standards with the generic address space pointer in C++ for OpenCL mode.
This is a non-portable feature and might not be supported by all targets.
Example of Use:
void foo(__generic volatile unsigned int* a) {
  atomic_add(a, 1);
}
WebAssembly Features¶
Clang supports the WebAssembly features documented below. For further information related to the semantics of the builtins, please refer to the WebAssembly Specification. In this section, when we refer to reference types, we are referring to WebAssembly reference types, not C++ reference types unless stated otherwise.
__builtin_wasm_table_set¶
This builtin function stores a value in a WebAssembly table. It takes three arguments. The first argument is the table to store a value into, the second argument is the index at which to store the value, and the third argument is a value of reference type to store in the table. It returns nothing.
static __externref_t table[0];
extern __externref_t JSObj;

void store(int index) {
  __builtin_wasm_table_set(table, index, JSObj);
}
__builtin_wasm_table_get¶
This builtin function is the counterpart to __builtin_wasm_table_set and loads a value from a WebAssembly table of reference typed values. It takes two arguments. The first argument is a table of reference typed values and the second argument is an index from which to load the value. It returns the loaded reference typed value.
static __externref_t table[0];

__externref_t load(int index) {
  __externref_t Obj = __builtin_wasm_table_get(table, index);
  return Obj;
}
__builtin_wasm_table_size¶
This builtin function returns the size of the WebAssembly table. It takes the table as an argument and returns an unsigned integer (size_t) with the current table size.
typedef void (*__funcref funcref_t)();
static funcref_t table[0];

size_t getSize() {
  return __builtin_wasm_table_size(table);
}
__builtin_wasm_table_grow¶
This builtin function grows the WebAssembly table by a certain amount. Currently, as all WebAssembly tables created in C/C++ are zero-sized, this always needs to be called to grow the table.
It takes three arguments. The first argument is the WebAssembly table to grow. The second argument is the reference typed value to store in the new table entries (the initialization value), and the third argument is the amount to grow the table by. It returns the previous table size or -1. It will return -1 if not enough space could be allocated.
typedef void (*__funcref funcref_t)();
static funcref_t table[0];

// grow returns the new table size or -1 on error.
int grow(funcref_t fn, int delta) {
  int prevSize = __builtin_wasm_table_grow(table, fn, delta);
  if (prevSize == -1)
    return -1;
  return prevSize + delta;
}
__builtin_wasm_table_fill¶
This builtin function sets all the entries of a WebAssembly table to a given reference typed value. It takes four arguments. The first argument is the WebAssembly table, the second argument is the index that starts the range, the third argument is the value to set in the new entries, and the fourth and last argument is the size of the range. It returns nothing.
static __externref_t table[0];

// resets a table by setting all of its entries to a given value.
void reset(__externref_t Obj) {
  int Size = __builtin_wasm_table_size(table);
  __builtin_wasm_table_fill(table, 0, Obj, Size);
}
__builtin_wasm_table_copy¶
This builtin function copies elements from a source WebAssembly table to a possibly overlapping destination region. It takes five arguments. The first argument is the destination WebAssembly table, and the second argument is the source WebAssembly table. The third argument is the destination index from where the copy starts, the fourth argument is the source index from where the copy starts, and the fifth and last argument is the number of elements to copy. It returns nothing.
static __externref_t tableSrc[0];
static __externref_t tableDst[0];

// Copy nelem elements from [src, src + nelem - 1] in tableSrc to
// [dst, dst + nelem - 1] in tableDst
void copy(int dst, int src, int nelem) {
  __builtin_wasm_table_copy(tableDst, tableSrc, dst, src, nelem);
}
Builtin Functions¶
Clang supports a number of builtin library functions with the same syntax as GCC, including things like __builtin_nan, __builtin_constant_p, __builtin_choose_expr, __builtin_types_compatible_p, __builtin_assume_aligned, __sync_fetch_and_add, etc. In addition to the GCC builtins, Clang supports a number of builtins that GCC does not, which are listed here.
Please note that Clang does not and will not support all of the GCC builtins for vector operations. Instead of using builtins, you should use the functions defined in target-specific header files like <xmmintrin.h>, which define portable wrappers for these. Many of the Clang versions of these functions are implemented directly in terms of extended vector support instead of builtins, in order to reduce the number of builtins that we need to implement.
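To give a feel for the GCC-compatible builtins named above, the following sketch wraps three of them in small helpers. The helper names (next_ticket, quiet_nan, sum4) are hypothetical and exist only for this illustration; the builtins themselves are used with the same syntax as in GCC.

```cpp
// Atomically increments *counter and returns the value it held before.
inline int next_ticket(int *counter) {
  return __sync_fetch_and_add(counter, 1);
}

// Returns a quiet NaN without including <cmath>.
inline double quiet_nan() { return __builtin_nan(""); }

// Sums four floats. The caller promises that p is 16-byte aligned;
// passing a pointer that is not so aligned is undefined behavior.
inline float sum4(const float *p) {
  const float *a =
      static_cast<const float *>(__builtin_assume_aligned(p, 16));
  return a[0] + a[1] + a[2] + a[3];
}
```

Note that __builtin_assume_aligned returns void *, so in C++ the result has to be cast back to the element type before use.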
__builtin_alloca¶
__builtin_alloca is used to dynamically allocate memory on the stack. Memory is automatically freed upon function termination.
Syntax:
__builtin_alloca(size_t n)
Example of Use:
void init(float* data, size_t nbelems);
void process(float* data, size_t nbelems);
void foo(size_t n) {
  auto mem = (float*)__builtin_alloca(n * sizeof(float));
  init(mem, n);
  process(mem, n);
  /* mem is automatically freed at this point */
}
Description:
__builtin_alloca is meant to be used to allocate a dynamic amount of memory on the stack. This amount is subject to stack allocation limits.
Query for this feature with __has_builtin(__builtin_alloca).
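Because the amount is subject to stack allocation limits, a common defensive pattern is to use the stack only for small requests and fall back to the heap otherwise. The sketch below illustrates that idea; kMaxStackBytes and sum_via_scratch are hypothetical names, and the 1024-byte threshold is an arbitrary assumption rather than a limit defined by Clang.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical threshold; a sensible value depends on the platform's stack size.
static const std::size_t kMaxStackBytes = 1024;

// Copies n ints into scratch memory and returns their sum.
// Returns 0 if a heap allocation was needed and failed.
inline long sum_via_scratch(const int *src, std::size_t n) {
  const std::size_t bytes = n * sizeof(int);
  const bool on_stack = bytes <= kMaxStackBytes;
  int *buf = static_cast<int *>(on_stack ? __builtin_alloca(bytes)
                                         : std::malloc(bytes));
  if (!buf)
    return 0;
  if (bytes)
    std::memcpy(buf, src, bytes);
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += buf[i];
  if (!on_stack)
    std::free(buf);  // stack memory is released when the function returns
  return sum;
}
```

Only the heap path needs an explicit free; memory obtained from __builtin_alloca must never be passed to free.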
__builtin_alloca_with_align¶
__builtin_alloca_with_align is used to dynamically allocate memory on the stack while controlling its alignment. Memory is automatically freed upon function termination.
Syntax:
__builtin_alloca_with_align(size_t n, size_t align)
Example of Use:
#include <climits>  // CHAR_BIT

void init(float* data, size_t nbelems);
void process(float* data, size_t nbelems);
void foo(size_t n) {
  auto mem = (float*)__builtin_alloca_with_align(
                      n * sizeof(float),
                      CHAR_BIT * alignof(float));
  init(mem, n);
  process(mem, n);
  /* mem is automatically freed at this point */
}
Description:
__builtin_alloca_with_align is meant to be used to allocate a dynamic amount of memory on the stack. It is similar to __builtin_alloca but accepts a second argument whose value is the alignment constraint, as a power of 2 in bits.
Query for this feature with __has_builtin(__builtin_alloca_with_align).
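Since the alignment is given in bits rather than bytes, it is easy to pass a value that is eight times too small. The sketch below requests a 64-byte (512-bit) alignment and checks the resulting address. scratch_is_64_byte_aligned is a hypothetical helper, and the fallback branch is only an assumption about how a compiler without this builtin could be handled.

```cpp
#include <cstddef>
#include <cstdint>

#ifndef __has_builtin
#define __has_builtin(x) 0  // Compatibility with compilers lacking __has_builtin.
#endif

// Reports whether an n-byte stack buffer requested with 64-byte alignment
// really starts on a 64-byte boundary.
inline bool scratch_is_64_byte_aligned(std::size_t n) {
#if __has_builtin(__builtin_alloca_with_align)
  // The second argument is in bits: 64 bytes * 8 bits per byte = 512.
  void *p = __builtin_alloca_with_align(n, 64 * 8);
#else
  // Fallback sketch: over-allocate and round the address up by hand.
  std::uintptr_t raw =
      reinterpret_cast<std::uintptr_t>(__builtin_alloca(n + 63));
  void *p = reinterpret_cast<void *>((raw + 63) & ~std::uintptr_t(63));
#endif
  return reinterpret_cast<std::uintptr_t>(p) % 64 == 0;
}
```

The alignment argument must be a constant expression, which is why it is written as a literal product here instead of being passed in as a parameter.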
__builtin_assume
¶
__builtin_assume is used to provide the optimizer with a boolean invariant that is defined to be true.
Syntax:
__builtin_assume(bool)
Example of Use:
int foo(int x) {
  __builtin_assume(x != 0);
  // The optimizer may short-circuit this check using the invariant.
  if (x == 0)
    return do_something();
  return do_something_else();
}
Description:
The boolean argument to this function is defined to be true. The optimizer may analyze the form of the expression provided as the argument and deduce from it information that can be used to optimize the program. If the condition is violated during execution, the behavior is undefined. The argument itself is never evaluated, so any side effects of the expression will be discarded.
Query for this feature with __has_builtin(__builtin_assume).
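As an illustration of combining the feature check with the builtin, the following is a minimal sketch rather than a recommended idiom. The ASSUME macro and the average function are hypothetical names introduced here; the fallback to a plain assert is an assumption about what a project might want when the builtin is unavailable.

```c
#include <assert.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* ASSUME is a hypothetical wrapper name, not something Clang provides. */
#if __has_builtin(__builtin_assume)
#define ASSUME(cond) __builtin_assume(cond)
#else
#define ASSUME(cond) assert(cond) /* Fallback: check instead of assume. */
#endif

/* Average of the first n elements; callers promise n > 0. */
static int average(const int *values, int n) {
    ASSUME(n > 0); /* With the builtin, passing n <= 0 is undefined behavior. */
    long sum = 0;
    for (int i = 0; i < n; ++i)
        sum += values[i];
    return (int)(sum / n);
}
```

Because the argument is never evaluated by the builtin, the condition should be free of side effects, as it is here.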
__builtin_assume_separate_storage
¶
__builtin_assume_separate_storage is used to provide the optimizer with the knowledge that its two arguments point to separately allocated objects.
Syntax:
__builtin_assume_separate_storage(const volatile void *, const volatile void *)
Example of Use:
int foo(int *x, int *y) {
  __builtin_assume_separate_storage(x, y);
  *x = 0;
  *y = 1;
  // The optimizer may optimize this to return 0 without reloading from *x.
  return *x;
}
Description:
The arguments to this function are assumed to point into separately allocated storage (either different variable definitions or different dynamic storage allocations). The optimizer may use this fact to aid in alias analysis. If the arguments point into the same storage, the behavior is undefined. Note that the definition of “storage” here refers to the outermost enclosing allocation of any particular object (so for example, it’s never correct to call this function passing the addresses of fields in the same struct, elements of the same array, etc.).
Query for this feature with __has_builtin(__builtin_assume_separate_storage).
__builtin_assume_dereferenceable
¶
__builtin_assume_dereferenceable is used to provide the optimizer with the knowledge that the pointer argument P is dereferenceable up to at least the specified number of bytes.
Syntax:
__builtin_assume_dereferenceable(const void *, size_t)
Example of Use:
int foo(int *x, int y) {
  __builtin_assume_dereferenceable(x, sizeof(int));
  int z = 0;
  if (y == 1) {
    // The optimizer may execute the load of x unconditionally due to
    // __builtin_assume_dereferenceable guaranteeing sizeof(int) bytes can
    // be loaded speculatively without trapping.
    z = *x;
  }
  return z;
}
Description:
The arguments to this function provide a start pointer P and a size S. S must be at least 1 and a constant. The optimizer may assume that S bytes are dereferenceable starting at P. Note that this does not necessarily imply that P is non-null, as nullptr can be dereferenced in some cases. The assumption also does not imply that P is not dereferenceable past S bytes.
Query for this feature with __has_builtin(__builtin_assume_dereferenceable).
__builtin_offsetof
¶
__builtin_offsetof is used to implement the offsetof macro, which calculates the offset (in bytes) to a given member of the given type.
Syntax:
__builtin_offsetof(type-name, member-designator)
Example of Use:
struct S {
  char c;
  int i;
  struct T {
    float f[2];
  } t;
};

const int offset_to_i = __builtin_offsetof(struct S, i);
const int ext1 = __builtin_offsetof(struct U { int i; }, i); // C extension
const int offset_to_subobject = __builtin_offsetof(struct S, t.f[1]);
Description:
This builtin is usable in an integer constant expression which returns a value of type size_t. The value returned is the offset in bytes to the subobject designated by the member-designator from the beginning of an object of type type-name. Clang extends the required standard functionality in the following way:
In C language modes, the first argument may be the definition of a new type. Any type declared this way is scoped to the nearest scope containing the call to the builtin.
Query for this feature with __has_builtin(__builtin_offsetof).
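A common use of a member offset is recovering a pointer to an enclosing structure from a pointer to one of its members. The sketch below illustrates that pattern; struct task, struct list_node and task_from_link are hypothetical names chosen for this example, not part of Clang.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next;
};

struct task {
    int id;
    char name[12];
    struct list_node link; /* Node embedded in the enclosing structure. */
};

/* Recover the enclosing struct task from a pointer to its embedded link. */
static struct task *task_from_link(struct list_node *node) {
    assert(node != NULL);
    size_t offset = __builtin_offsetof(struct task, link);
    return (struct task *)((char *)node - offset);
}
```

Subtracting the offset from the member address yields the address of the start of the object, which is why the result has type size_t and is measured in bytes.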
__builtin_get_vtable_pointer
¶
__builtin_get_vtable_pointer loads and authenticates the primary vtable pointer from an instance of a polymorphic C++ class. This builtin is needed for directly loading the vtable pointer on platforms using Pointer Authentication.
Syntax:
__builtin_get_vtable_pointer(PolymorphicClass *)
Example of Use:
struct PolymorphicClass {
  virtual ~PolymorphicClass();
};

PolymorphicClass anInstance;
const void *vtablePointer = __builtin_get_vtable_pointer(&anInstance);
Description:
The __builtin_get_vtable_pointer builtin loads the primary vtable pointer from a polymorphic C++ type. If the target platform authenticates vtable pointers, this builtin will perform the authentication and produce the underlying raw pointer. The object being queried must be polymorphic, and so must also be a complete type.
Query for this feature with __has_builtin(__builtin_get_vtable_pointer).
__builtin_call_with_static_chain
¶
__builtin_call_with_static_chain is used to perform a static call while setting the static chain register.
Syntax:
T __builtin_call_with_static_chain(T expr, void *ptr)
Example of Use:
auto v = __builtin_call_with_static_chain(foo(3), foo);
Description:
This builtin returns expr after checking that expr is a non-member static call expression. The call to that expression is made while using ptr as a function pointer stored in a dedicated register to implement the static chain calling convention, as used by some languages to implement closures or nested functions.
Query for this feature with __has_builtin(__builtin_call_with_static_chain).
__builtin_readcyclecounter
¶
__builtin_readcyclecounter is used to access the cycle counter register (or a similar low-latency, high-accuracy clock) on those targets that support it.
Syntax:
__builtin_readcyclecounter()
Example of Use:
unsigned long long t0 = __builtin_readcyclecounter();
do_something();
unsigned long long t1 = __builtin_readcyclecounter();
unsigned long long cycles_to_do_something = t1 - t0; // assuming no overflow
Description:
The __builtin_readcyclecounter() builtin returns the cycle counter value, which may be either global or process/thread-specific depending on the target. As the backing counters often overflow quickly (on the order of seconds) this should only be used for timing small intervals. When not supported by the target, the return value is always zero. This builtin takes no arguments and produces an unsigned long long result.
Query for this feature with __has_builtin(__builtin_readcyclecounter). Note that even if present, its use may depend on run-time privilege or other OS controlled state.
__builtin_readsteadycounter
¶
__builtin_readsteadycounter is used to access the fixed frequency counter register (or a similar steady-rate clock) on those targets that support it. The function is similar to __builtin_readcyclecounter above except that the frequency is fixed, making it suitable for measuring elapsed time.
Syntax:
__builtin_readsteadycounter()
Example of Use:
unsigned long long t0 = __builtin_readsteadycounter();
do_something();
unsigned long long t1 = __builtin_readsteadycounter();
unsigned long long secs_to_do_something = (t1 - t0) / tick_rate;
Description:
The __builtin_readsteadycounter() builtin returns the value of the fixed frequency counter. When not supported by the target, the return value is always zero. This builtin takes no arguments and produces an unsigned long long result. The builtin does not guarantee any particular frequency, only that it is stable. Knowledge of the counter’s true frequency will need to be provided by the user.
Query for this feature with __has_builtin(__builtin_readsteadycounter).
__builtin_cpu_supports
¶
Syntax:
int __builtin_cpu_supports(const char *features);
Example of Use:
if (__builtin_cpu_supports("sve"))
  sve_code();
Description:
The __builtin_cpu_supports function detects if the run-time CPU supports the features specified in its string argument. It returns a positive integer if all features are supported and 0 otherwise. Feature names are target specific. On AArch64, features are combined using +, like this: __builtin_cpu_supports("flagm+sha3+lse+rcpc2+fcma+memtag+bti+sme2"). If a feature name is not supported, Clang will issue a warning and replace the builtin with the constant 0.
Query for this feature with __has_builtin(__builtin_cpu_supports).
__builtin_dump_struct
¶
Syntax:
__builtin_dump_struct(&some_struct, some_printf_func, args...);
Examples:
struct S {
  int x, y;
  float f;
  struct T {
    int i;
  } t;
};

void func(struct S *s) {
  __builtin_dump_struct(s, printf);
}
Example output:
struct S {
  int x = 100
  int y = 42
  float f = 3.141593
  struct T t = {
    int i = 1997
  }
}
#include <string>
struct T { int a, b; };
constexpr void constexpr_sprintf(std::string &out, const char *format,
                                 auto ...args) {
  // ...
}
constexpr std::string dump_struct(auto &x) {
  std::string s;
  __builtin_dump_struct(&x, constexpr_sprintf, s);
  return s;
}
static_assert(dump_struct(T{1, 2}) == R"(struct T {
  int a = 1
  int b = 2
}
)");
Description:
The __builtin_dump_struct function is used to print the fields of a simple structure and their values for debugging purposes. The first argument of the builtin should be a pointer to a complete record type to dump. The second argument f should be some callable expression, and can be a function object or an overload set. The builtin calls f, passing any further arguments args... followed by a printf-compatible format string and the corresponding arguments. f may be called more than once, and f and args will be evaluated once per call. In C++, f may be a template or overload set and resolve to different functions for each call.
In the format string, a suitable format specifier will be used for builtin types that Clang knows how to format. This includes standard builtin types, as well as aggregate structures, void* (printed with %p), and const char* (printed with %s). A *%p specifier will be used for a field that Clang doesn’t know how to format, and the corresponding argument will be a pointer to the field. This allows a C++ templated formatting function to detect this case and implement custom formatting. A * will otherwise not precede a format specifier.
This builtin does not return a value.
This builtin can be used in constant expressions.
Query for this feature with __has_builtin(__builtin_dump_struct).
__builtin_shufflevector
¶
__builtin_shufflevector is used to express generic vector permutation/shuffle/swizzle operations. This builtin is also very important for the implementation of various target-specific header files like <xmmintrin.h>. This builtin can be used within constant expressions.
Syntax:
__builtin_shufflevector(vec1, vec2, index1, index2, ...)
Examples:
// Identity operation - return 4-element vector V1.
__builtin_shufflevector(V1, V1, 0, 1, 2, 3)

// "Splat" element 0 of V1 into a 4-element result.
__builtin_shufflevector(V1, V1, 0, 0, 0, 0)

// Reverse 4-element vector V1.
__builtin_shufflevector(V1, V1, 3, 2, 1, 0)

// Concatenate every other element of 4-element vectors V1 and V2.
__builtin_shufflevector(V1, V2, 0, 2, 4, 6)

// Concatenate every other element of 8-element vectors V1 and V2.
__builtin_shufflevector(V1, V2, 0, 2, 4, 6, 8, 10, 12, 14)

// Shuffle V1 with some elements being undefined. Not allowed in constexpr.
__builtin_shufflevector(V1, V1, 3, -1, 1, -1)
Description:
The first two arguments to __builtin_shufflevector are vectors that have the same element type. The remaining arguments are a list of integers that specify the element indices of the first two vectors that should be extracted and returned in a new vector. These element indices are numbered sequentially starting with the first vector, continuing into the second vector. Thus, if vec1 is a 4-element vector, index 5 would refer to the second element of vec2. An index of -1 can be used to indicate that the corresponding element in the returned vector is a don’t care and can be optimized by the backend. Values of -1 are not supported in constant expressions.
The result of __builtin_shufflevector is a vector with the same element type as vec1/vec2 but that has an element count equal to the number of indices specified.
Query for this feature with __has_builtin(__builtin_shufflevector).
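To make the index numbering concrete, here is a small sketch that interleaves the low halves of two 4-element vectors. The type name v4si and the function interleave_low are hypothetical, the example assumes a 32-bit int, and the element-wise fallback is only an assumption about how one might cope with a compiler that lacks the builtin.

```c
#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Four ints, assuming a 32-bit int (16 bytes in total). */
typedef int v4si __attribute__((__vector_size__(16)));

/* Interleave the low halves of a and b: {a[0], b[0], a[1], b[1]}. */
static v4si interleave_low(v4si a, v4si b) {
#if __has_builtin(__builtin_shufflevector)
    /* Indices 0..3 select from a, indices 4..7 select from b. */
    return __builtin_shufflevector(a, b, 0, 4, 1, 5);
#else
    v4si r = { a[0], b[0], a[1], b[1] }; /* Element-wise fallback. */
    return r;
#endif
}
```

Index 4 names the first element of the second vector and index 5 the second, matching the sequential numbering described above.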
__builtin_convertvector
¶
__builtin_convertvector is used to express generic vector type-conversion operations. The input vector and the output vector type must have the same number of elements. This builtin can be used within constant expressions.
Syntax:
__builtin_convertvector(src_vec, dst_vec_type)
Examples:
typedef double vector4double __attribute__((__vector_size__(32)));
typedef float  vector4float  __attribute__((__vector_size__(16)));
typedef short  vector4short  __attribute__((__vector_size__(8)));
vector4float vf;
vector4short vs;

// convert from a vector of 4 floats to a vector of 4 doubles.
__builtin_convertvector(vf, vector4double)
// equivalent to:
(vector4double) { (double) vf[0], (double) vf[1], (double) vf[2], (double) vf[3] }

// convert from a vector of 4 shorts to a vector of 4 floats.
__builtin_convertvector(vs, vector4float)
// equivalent to:
(vector4float) { (float) vs[0], (float) vs[1], (float) vs[2], (float) vs[3] }
Description:
The first argument to __builtin_convertvector is a vector, and the second argument is a vector type with the same number of elements as the first argument.
The result of __builtin_convertvector is a vector with the same element type as the second argument, with a value defined in terms of the action of a C-style cast applied to each element of the first argument.
Query for this feature with __has_builtin(__builtin_convertvector).
__builtin_bitreverse
¶
__builtin_bitreverse8
__builtin_bitreverse16
__builtin_bitreverse32
__builtin_bitreverse64
Syntax:
__builtin_bitreverse32(x)
Examples:
uint8_t rev_w = __builtin_bitreverse8(w);
uint16_t rev_x = __builtin_bitreverse16(x);
uint32_t rev_y = __builtin_bitreverse32(y);
uint64_t rev_z = __builtin_bitreverse64(z);
Description:
The __builtin_bitreverse family of builtins is used to reverse the bit pattern of an integer value; for example 0b10110110 becomes 0b01101101. These builtins can be used within constant expressions.
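The following sketch checks the 8-bit example from the description. The wrapper name reverse_bits8 is hypothetical, and the shift-and-mask fallback is one possible portable substitute for compilers without the builtin, not something Clang itself specifies.

```c
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Reverse the order of the 8 bits in x. */
static uint8_t reverse_bits8(uint8_t x) {
#if __has_builtin(__builtin_bitreverse8)
    return __builtin_bitreverse8(x);
#else
    /* Portable fallback: swap nibbles, then bit pairs, then adjacent bits. */
    x = (uint8_t)((x >> 4) | (x << 4));
    x = (uint8_t)(((x & 0xCC) >> 2) | ((x & 0x33) << 2));
    x = (uint8_t)(((x & 0xAA) >> 1) | ((x & 0x55) << 1));
    return x;
#endif
}
```

Applying the function twice returns the original value, which is a convenient sanity check for either code path.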
__builtin_rotateleft
¶
__builtin_rotateleft8
__builtin_rotateleft16
__builtin_rotateleft32
__builtin_rotateleft64
Syntax:
__builtin_rotateleft32(x, y)
Examples:
uint8_t rot_w = __builtin_rotateleft8(w, y);
uint16_t rot_x = __builtin_rotateleft16(x, y);
uint32_t rot_y = __builtin_rotateleft32(v, y);
uint64_t rot_z = __builtin_rotateleft64(z, y);
Description:
The __builtin_rotateleft family of builtins is used to rotate the bits in the first argument by the amount in the second argument. For example, 0b10000110 rotated left by 11 becomes 0b00110100. The shift value is treated as an unsigned amount modulo the size of the arguments. Both arguments and the result have the bitwidth specified by the name of the builtin. These builtins can be used within constant expressions.
__builtin_rotateright
¶
__builtin_rotateright8
__builtin_rotateright16
__builtin_rotateright32
__builtin_rotateright64
Syntax:
__builtin_rotateright32(x, y)
Examples:
uint8_t rot_w = __builtin_rotateright8(w, y);
uint16_t rot_x = __builtin_rotateright16(x, y);
uint32_t rot_y = __builtin_rotateright32(v, y);
uint64_t rot_z = __builtin_rotateright64(z, y);
Description:
The __builtin_rotateright family of builtins is used to rotate the bits in the first argument by the amount in the second argument. For example, 0b10000110 rotated right by 3 becomes 0b11010000. The shift value is treated as an unsigned amount modulo the size of the arguments. Both arguments and the result have the bitwidth specified by the name of the builtin. These builtins can be used within constant expressions.
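The two worked examples for the rotate builtins (a left rotation by 11, which is 3 modulo 8, and a right rotation by 3) can be checked with a sketch like the one below. The wrapper names rotl8 and rotr8 are hypothetical, and the masked-shift fallback is an assumed portable substitute for compilers that lack these builtins.

```c
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Rotate the 8-bit value x left by n bits (n taken modulo 8). */
static uint8_t rotl8(uint8_t x, uint8_t n) {
#if __has_builtin(__builtin_rotateleft8)
    return __builtin_rotateleft8(x, n);
#else
    n &= 7; /* The rotation amount is taken modulo the 8-bit width. */
    return (uint8_t)((x << n) | (x >> ((8 - n) & 7)));
#endif
}

/* Rotate the 8-bit value x right by n bits (n taken modulo 8). */
static uint8_t rotr8(uint8_t x, uint8_t n) {
#if __has_builtin(__builtin_rotateright8)
    return __builtin_rotateright8(x, n);
#else
    n &= 7;
    return (uint8_t)((x >> n) | (x << ((8 - n) & 7)));
#endif
}
```

Masking the complementary shift with 7 keeps the fallback well defined when the rotation amount is a multiple of 8.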
__builtin_unreachable
¶
__builtin_unreachable is used to indicate that a specific point in the program cannot be reached, even if the compiler might otherwise think it can. This is useful to improve optimization and eliminate certain warnings. For example, without the __builtin_unreachable in the example below, the compiler assumes that the inline asm can fall through and prints a “function declared ‘noreturn’ should not return” warning.
Syntax:
__builtin_unreachable()
Example of use:
void myabort(void) __attribute__((noreturn));
void myabort(void) {
  asm("int3");
  __builtin_unreachable();
}
Description:
The __builtin_unreachable() builtin has completely undefined behavior. Since it has undefined behavior, it is a statement that it is never reached and the optimizer can take advantage of this to produce better code. This builtin takes no arguments and produces a void result.
Query for this feature with __has_builtin(__builtin_unreachable).
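Another frequent use is after a switch that handles every enumerator, where the compiler would otherwise assume control can fall off the end of the function. This is a minimal sketch; enum shape_kind and corner_count are hypothetical names used only for illustration.

```c
enum shape_kind { SHAPE_CIRCLE, SHAPE_SQUARE, SHAPE_TRIANGLE };

/* Number of corners; callers guarantee kind is a valid enumerator. */
static int corner_count(enum shape_kind kind) {
    switch (kind) {
    case SHAPE_CIRCLE:
        return 0;
    case SHAPE_SQUARE:
        return 4;
    case SHAPE_TRIANGLE:
        return 3;
    }
    /* Every enumerator returns above, so this point is never reached.
       Reaching it with an out-of-range value would be undefined behavior. */
    __builtin_unreachable();
}
```

Omitting a default label keeps the compiler's unhandled-enumerator warnings useful if a new enumerator is added later.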
__builtin_unpredictable
¶
__builtin_unpredictable is used to indicate that a branch condition is unpredictable by hardware mechanisms such as branch prediction logic.
Syntax:
__builtin_unpredictable(long long)
Example of use:
if (__builtin_unpredictable(x > 0)) {
  foo();
}
Description:
The __builtin_unpredictable() builtin is expected to be used with control flow conditions such as in if and switch statements.
Query for this feature with __has_builtin(__builtin_unpredictable).
__builtin_expect
¶
__builtin_expect is used to indicate that the value of an expression is anticipated to be the same as a statically known result.
Syntax:
long __builtin_expect(long expr, long val)
Example of use:
if (__builtin_expect(x, 0)) {
  bar();
}
Description:
The __builtin_expect() builtin is typically used with control flow conditions such as in if and switch statements to help branch prediction. It means that its first argument expr is expected to take the value of its second argument val. It always returns expr.
Query for this feature with __has_builtin(__builtin_expect).
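In practice the builtin is often wrapped in a pair of macros so that rare error paths read naturally. The sketch below follows that convention; LIKELY, UNLIKELY and checked_sum are hypothetical names and are not provided by Clang.

```c
#include <stddef.h>

/* Conventional wrapper names; !! normalizes any truthy value to 1, since
   the builtin compares its first argument against the expected long. */
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Sum an array, treating a NULL input as the rare error path. */
static long checked_sum(const int *data, size_t n, int *error) {
    if (UNLIKELY(data == NULL)) {
        *error = 1;
        return 0;
    }
    *error = 0;
    long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += data[i];
    return sum;
}
```

The hint affects only code layout and optimization decisions; the function returns the same results whether or not the expectation holds.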
__builtin_expect_with_probability
¶
__builtin_expect_with_probability is similar to __builtin_expect but it takes a probability as its third argument.
Syntax:
long __builtin_expect_with_probability(long expr, long val, double p)
Example of use:
if (__builtin_expect_with_probability(x, 0, .3)) {
  bar();
}
Description:
The __builtin_expect_with_probability() builtin is typically used with control flow conditions such as in if and switch statements to help branch prediction. It means that its first argument expr is expected to take the value of its second argument val with probability p. p must be within the range [0.0, 1.0]. This builtin always returns the value of expr.
Query for this feature with __has_builtin(__builtin_expect_with_probability).
__builtin_prefetch
¶
__builtin_prefetch is used to communicate with the cache handler to bring data into the cache before it gets used.
Syntax:
void __builtin_prefetch(const void *addr, int rw = 0, int locality = 3)
Example of use:
__builtin_prefetch(a + i);
Description:
The __builtin_prefetch(addr, rw, locality) builtin is expected to be used to avoid cache misses when the developer has a good understanding of which data are going to be used next. addr is the address that needs to be brought into the cache. rw indicates the expected access mode: 0 for read and 1 for write. For read-write access, 1 is to be used. locality indicates the expected persistence of data in cache, from 0, which means that data can be discarded from cache after its next use, to 3, which means that data is going to be reused a lot once in cache. 1 and 2 provide intermediate behavior between these two extremes.
Query for this feature with __has_builtin(__builtin_prefetch).
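The sketch below shows one way the three arguments might be used in a streaming loop. PREFETCH_DISTANCE and sum_with_prefetch are hypothetical names, and the distance of 16 elements is an arbitrary assumption; a useful value depends on the workload and hardware.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sum n ints while hinting that elements further ahead will be read soon. */
static long sum_with_prefetch(const int *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_DISTANCE < n) {
            /* rw = 0: read access; locality = 1: little reuse expected. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 1);
        }
        sum += a[i];
    }
    return sum;
}
```

The prefetch is only a hint, so the result is identical with or without it; whether it helps has to be measured.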
__sync_swap
¶
__sync_swap is used to atomically swap integers or pointers in memory.
Syntax:
type __sync_swap(type *ptr, type value, ...)
Example of Use:
int old_value = __sync_swap(&value, new_value);
Description:
The __sync_swap() builtin extends the existing __sync_*() family of atomic intrinsics to allow code to atomically swap the current value with the new value. More importantly, it helps developers write more efficient and correct code by avoiding expensive loops around __sync_bool_compare_and_swap() or relying on the platform specific implementation details of __sync_lock_test_and_set(). The __sync_swap() builtin is a full barrier.
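As a sketch of how code might use the builtin while staying portable, the wrapper below falls back to a sequentially consistent exchange from the __atomic family when __sync_swap is not reported as available. The function name exchange_int is hypothetical, and treating the __atomic exchange as an acceptable substitute is an assumption of this example.

```c
#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Atomically store new_value into *slot and return the previous value. */
static int exchange_int(int *slot, int new_value) {
#if __has_builtin(__sync_swap)
    return __sync_swap(slot, new_value);
#else
    /* Assumed substitute: a sequentially consistent atomic exchange. */
    return __atomic_exchange_n(slot, new_value, __ATOMIC_SEQ_CST);
#endif
}
```

Either path replaces the value in a single atomic step, which is what removes the need for a compare-and-swap retry loop.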
__builtin_addressof
¶
__builtin_addressof performs the functionality of the built-in & operator, ignoring any operator& overload. This is useful in constant expressions in C++11, where there is no other way to take the address of an object that overloads operator&. Clang automatically adds [[clang::lifetimebound]] to the parameter of __builtin_addressof.
Example of use:
template<typename T> constexpr T *addressof(T &value) {
  return __builtin_addressof(value);
}
__builtin_function_start
¶
__builtin_function_start returns the address of a function body.
Syntax:
void *__builtin_function_start(function)
Example of use:
void a() {}
void *p = __builtin_function_start(a);

class A {
public:
  void a(int n);
  void a();
};

void A::a(int n) {}
void A::a() {}

void *pa1 = __builtin_function_start((void(A::*)(int)) &A::a);
void *pa2 = __builtin_function_start((void(A::*)()) &A::a);
Description:
The __builtin_function_start builtin accepts an argument that can be constant-evaluated to a function, and returns the address of the function body. This builtin is not supported on all targets.
The returned pointer may differ from the normally taken function address and is not safe to call. For example, with -fsanitize=cfi, taking a function address produces a callable pointer to a CFI jump table, while __builtin_function_start returns an address that fails cfi-icall checks.
__builtin_operator_new and __builtin_operator_delete
¶
A call to __builtin_operator_new(args) is exactly the same as a call to ::operator new(args), except that it allows certain optimizations that the C++ standard does not permit for a direct function call to ::operator new (in particular, removing new/delete pairs and merging allocations), and that the call is required to resolve to a replaceable global allocation function.
Likewise, __builtin_operator_delete is exactly the same as a call to ::operator delete(args), except that it permits optimizations and that the call is required to resolve to a replaceable global deallocation function.
These builtins are intended for use in the implementation of std::allocator and other similar allocation libraries, and are only available in C++.
Query for this feature with __has_builtin(__builtin_operator_new) or __has_builtin(__builtin_operator_delete):
If the value is at least 201802L, the builtins behave as described above.
If the value is non-zero, the builtins may not support calling arbitrary replaceable global (de)allocation functions, but do support calling at least ::operator new(size_t) and ::operator delete(void*).
__builtin_trivially_relocate
¶
Syntax:
T *__builtin_trivially_relocate(T *dest, T *src, size_t count)
Trivially relocates count objects of relocatable, complete type T from src to dest and returns dest. This builtin is used to implement std::trivially_relocate.
__builtin_invoke
¶
Syntax:
template <class Callee, class... Args>
decltype(auto) __builtin_invoke(Callee&& callee, Args&&... args);
__builtin_invoke is equivalent to std::invoke.
__builtin_preserve_access_index
¶
__builtin_preserve_access_index specifies a code section where array subscript access and structure/union member access are relocatable under the BPF compile-once run-everywhere framework. Debuginfo (typically with -g) is needed; otherwise, the compiler will exit with an error. The return type for the intrinsic is the same as the type of the argument.
Syntax:
type __builtin_preserve_access_index(type arg)
Example of Use:
struct t {
  int i;
  int j;
  union {
    int a;
    int b;
  } c[4];
};
struct t *v = ...;
int *pb = __builtin_preserve_access_index(&v->c[3].b);
__builtin_preserve_access_index(v->j);
__builtin_debugtrap
¶
__builtin_debugtrap causes the program to stop its execution in such a way that a debugger can catch it.
Syntax:
__builtin_debugtrap()
Description:
__builtin_debugtrap is lowered to the llvm.debugtrap builtin (https://llvm.org/docs/LangRef.html#llvm-debugtrap-intrinsic). It should have the same effect as setting a breakpoint on the line where the builtin is called.
Query for this feature with __has_builtin(__builtin_debugtrap).
__builtin_trap
¶
__builtin_trap causes the program to stop its execution abnormally.
Syntax:
__builtin_trap()
Description:
__builtin_trap is lowered to the llvm.trap builtin (https://llvm.org/docs/LangRef.html#llvm-trap-intrinsic).
Query for this feature with __has_builtin(__builtin_trap).
__builtin_arm_trap
¶
__builtin_arm_trap is an AArch64 extension to __builtin_trap which also accepts a compile-time constant value, encoded directly into the trap instruction for later inspection.
Syntax:
__builtin_arm_trap(const unsigned short payload)
Description:
__builtin_arm_trap is lowered to the llvm.aarch64.break builtin, and then to brk #payload.
__builtin_verbose_trap
¶
__builtin_verbose_trap causes the program to stop its execution abnormally and shows a human-readable description of the reason for the termination when a debugger is attached or in a symbolicated crash log.
Syntax:
__builtin_verbose_trap(const char *category, const char *reason)
Description:
__builtin_verbose_trap is lowered to the llvm.trap builtin (https://llvm.org/docs/LangRef.html#llvm-trap-intrinsic). Additionally, clang emits debugging information that represents an artificial inline frame whose name encodes the category and reason strings passed to the builtin, prefixed by a “magic” prefix.
For example, consider the following code:
void foo(int *p) {
  if (p == nullptr)
    __builtin_verbose_trap("check null", "Argument must not be null!");
}
The debugging information would look as if it were produced for the following code:
__attribute__((always_inline))
inline void "__clang_trap_msg$check null$Argument must not be null!"() {
  __builtin_trap();
}

void foo(int *p) {
  if (p == nullptr)
    "__clang_trap_msg$check null$Argument must not be null!"();
}
However, the generated code would not actually contain a call to the artificial function; it only exists in the debugging information.
Query for this feature with __has_builtin(__builtin_verbose_trap). Note that users need to enable debug information to enable this feature. A call to this builtin is equivalent to a call to __builtin_trap if debug information isn’t enabled.
The optimizer can merge calls to trap with different messages, which degrades the debugging experience.
__builtin_allow_runtime_check
¶
__builtin_allow_runtime_check returns true if the check at the current program location should be executed. It is expected to be used to implement assert-like checks which can be safely removed by the optimizer.
Syntax:
bool __builtin_allow_runtime_check(const char *kind)
Example of use:
if (__builtin_allow_runtime_check("mycheck") && !ExpensiveCheck()) {
  abort();
}
Description:
__builtin_allow_runtime_check is lowered to the llvm.allow.runtime.check intrinsic.
The __builtin_allow_runtime_check() builtin can be used within control structures like if to guard expensive runtime checks. The return value is determined by the following compiler options and may differ per call site:
-mllvm -lower-allow-check-percentile-cutoff-hot=N: Disable checks in hot code marked by the profile summary with a hotness cutoff in the range [0, 999999] (a larger N disables more checks).
-mllvm -lower-allow-check-random-rate=P: Keep a check with probability P, a floating point number in the range [0.0, 1.0].
If both options are specified, a check is disabled if either condition is satisfied.
If neither is specified, all checks are allowed.
Parameter kind, currently unused, is a string literal specifying the check kind. Future compiler versions may use this to allow for more granular control, such as applying different hotness cutoffs to different check kinds.
Query for this feature with __has_builtin(__builtin_allow_runtime_check).
__builtin_nondeterministic_value
¶
__builtin_nondeterministic_value returns a valid nondeterministic value of the same type as the provided argument.
Syntax:
type __builtin_nondeterministic_value(type x)
Examples:
int x = __builtin_nondeterministic_value(x);
float y = __builtin_nondeterministic_value(y);
__m256i a = __builtin_nondeterministic_value(a);
Description:
Each call to __builtin_nondeterministic_value returns a valid value of the type given by the argument.
The types currently supported are: integer types, floating-point types, vector types.
Query for this feature with __has_builtin(__builtin_nondeterministic_value).
__builtin_sycl_unique_stable_name
¶
__builtin_sycl_unique_stable_name() is a builtin that takes a type and produces a string literal containing a unique name for the type that is stable across split compilations, mainly to support the SYCL/Data Parallel C++ language.
In cases where the split compilation needs to share a unique token for a type across the boundary (such as in an offloading situation), this name can be used for lookup purposes, such as in the SYCL Integration Header.
The value of this builtin is computed entirely at compile time, so it can be used in constant expressions. This value encodes lambda functions based on a stable numbering order in which they appear in their local declaration contexts. Once this builtin is evaluated in a constexpr context, it is erroneous to use it in an instantiation which changes its value.
In order to produce the unique name, the current implementation of the builtin uses Itanium mangling even if the host compilation uses a different name mangling scheme at runtime. The mangler marks all the lambdas required to name the SYCL kernel and emits a stable local ordering of the respective lambdas. The resulting pattern is demanglable. When non-lambda types are passed to the builtin, the mangler emits their usual pattern without any special treatment.
Syntax:
// Computes a unique stable name for the given type.
constexpr const char *__builtin_sycl_unique_stable_name(type-id);
__builtin_popcountg
¶
__builtin_popcountg returns the number of 1 bits in the argument. The argument can be of any unsigned integer type.
Syntax:
int __builtin_popcountg(type x)
Examples:
unsigned int x = 1;
int x_pop = __builtin_popcountg(x);
unsigned long y = 3;
int y_pop = __builtin_popcountg(y);
unsigned _BitInt(128) z = 7;
int z_pop = __builtin_popcountg(z);
Description:
__builtin_popcountg is meant to be a type-generic alternative to the __builtin_popcount{,l,ll} builtins, with support for other integer types, such as unsigned __int128 and C23 unsigned _BitInt(N).
__builtin_clzg and __builtin_ctzg
¶
__builtin_clzg (respectively __builtin_ctzg) returns the number of leading (respectively trailing) 0 bits in the first argument. The first argument can be of any unsigned integer type.
If the first argument is 0 and an optional second argument of int type is provided, then the second argument is returned. If the first argument is 0, but only one argument is provided, then the behavior is undefined.
Syntax:
int __builtin_clzg(type x[, int fallback])
int __builtin_ctzg(type x[, int fallback])
Examples:
unsigned int x = 1;
int x_lz = __builtin_clzg(x);
int x_tz = __builtin_ctzg(x);
unsigned long y = 2;
int y_lz = __builtin_clzg(y);
int y_tz = __builtin_ctzg(y);
unsigned _BitInt(128) z = 4;
int z_lz = __builtin_clzg(z);
int z_tz = __builtin_ctzg(z);
Description:
__builtin_clzg (respectively __builtin_ctzg) is meant to be a type-generic alternative to the __builtin_clz{,l,ll} (respectively __builtin_ctz{,l,ll}) builtins, with support for other integer types, such as unsigned __int128 and C23 unsigned _BitInt(N).
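The optional second argument is what makes a zero input safe. The sketch below uses it to compute the number of bits needed to represent a value; bit_width_u is a hypothetical helper name, and the fallback branch using __builtin_clz is an assumption for compilers that do not provide the type-generic builtin.

```c
#include <limits.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Number of bits needed to represent x; 0 when x == 0. */
static int bit_width_u(unsigned int x) {
    const int total_bits = (int)(sizeof(unsigned int) * CHAR_BIT);
#if __has_builtin(__builtin_clzg)
    /* The second argument is returned when x == 0, so no special case is needed. */
    return total_bits - __builtin_clzg(x, total_bits);
#else
    /* __builtin_clz is undefined for 0, so test for it explicitly. */
    return x ? total_bits - __builtin_clz(x) : 0;
#endif
}
```

Passing the full bit width as the fallback makes the subtraction yield 0 for a zero input, which matches the explicit test in the fallback branch.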
__builtin_counted_by_ref
¶
__builtin_counted_by_ref returns a pointer to the count field from the counted_by attribute.
The argument must be a flexible array member. If the argument isn’t a flexible array member or doesn’t have the counted_by attribute, the builtin returns (void *)0.
Syntax:
T *__builtin_counted_by_ref(void *array)
Examples:
#define alloc(P, FAM, COUNT) ({                               \
  size_t __ignored_assignment;                                \
  typeof(P) __p = NULL;                                       \
  __p = malloc(MAX(sizeof(*__p),                              \
                   sizeof(*__p) + sizeof(*__p->FAM) * COUNT)); \
                                                              \
  *_Generic(                                                  \
    __builtin_counted_by_ref(__p->FAM),                       \
      void *: &__ignored_assignment,                          \
      default: __builtin_counted_by_ref(__p->FAM)) = COUNT;   \
                                                              \
  __p;                                                        \
})
Description:
The __builtin_counted_by_ref builtin allows the programmer to prevent a common error associated with the counted_by attribute. When using the counted_by attribute, the count field must be set before the flexible array member can be accessed. Otherwise, the sanitizers may report false positives for such accesses. For instance, it’s not uncommon for programmers to initialize the flexible array before setting the count field:
struct s {
  int dummy;
  short count;
  long array[] __attribute__((counted_by(count)));
};

struct s *ptr = malloc(sizeof(struct s) + sizeof(long) * COUNT);

for (int i = 0; i < COUNT; ++i)
  ptr->array[i] = i;

ptr->count = COUNT;
Enforcing the rule that ptr->count = COUNT; must occur after every allocation of a struct with a flexible array member with the counted_by attribute is prone to failure in large code bases. This builtin mitigates this for allocators (like in Linux) that are implemented in a way where the counter assignment can happen automatically.
Note: The value returned by __builtin_counted_by_ref cannot be assigned to a variable, have its address taken, or passed into or returned from a function, because doing so violates bounds safety conventions.
Multiprecision Arithmetic Builtins¶
Clang provides a set of builtins which expose multiprecision arithmetic in a manner amenable to C. They all have the following form:
unsigned x = ..., y = ..., carryin = ..., carryout;
unsigned sum = __builtin_addc(x, y, carryin, &carryout);
Thus one can form a multiprecision addition chain in the following manner:
unsigned *x, *y, *z, carryin = 0, carryout;
z[0] = __builtin_addc(x[0], y[0], carryin, &carryout);
carryin = carryout;
z[1] = __builtin_addc(x[1], y[1], carryin, &carryout);
carryin = carryout;
z[2] = __builtin_addc(x[2], y[2], carryin, &carryout);
carryin = carryout;
z[3] = __builtin_addc(x[3], y[3], carryin, &carryout);
The complete list of builtins is:
unsigned char      __builtin_addcb (unsigned char x, unsigned char y, unsigned char carryin, unsigned char *carryout);
unsigned short     __builtin_addcs (unsigned short x, unsigned short y, unsigned short carryin, unsigned short *carryout);
unsigned           __builtin_addc  (unsigned x, unsigned y, unsigned carryin, unsigned *carryout);
unsigned long      __builtin_addcl (unsigned long x, unsigned long y, unsigned long carryin, unsigned long *carryout);
unsigned long long __builtin_addcll(unsigned long long x, unsigned long long y, unsigned long long carryin, unsigned long long *carryout);
unsigned char      __builtin_subcb (unsigned char x, unsigned char y, unsigned char carryin, unsigned char *carryout);
unsigned short     __builtin_subcs (unsigned short x, unsigned short y, unsigned short carryin, unsigned short *carryout);
unsigned           __builtin_subc  (unsigned x, unsigned y, unsigned carryin, unsigned *carryout);
unsigned long      __builtin_subcl (unsigned long x, unsigned long y, unsigned long carryin, unsigned long *carryout);
unsigned long long __builtin_subcll(unsigned long long x, unsigned long long y, unsigned long long carryin, unsigned long long *carryout);
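The unrolled addition chain above generalizes to a loop over limbs. The following is a minimal sketch, not part of Clang: the names add_with_carry and mp_add are hypothetical, limbs are assumed to be stored least significant first, and the portable fallback is an assumption for compilers where __has_builtin(__builtin_addc) is false.

```c
#include <stddef.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Hypothetical helper: x + y + carryin, with the carry-out stored as 0 or 1. */
static unsigned add_with_carry(unsigned x, unsigned y, unsigned carryin,
                               unsigned *carryout) {
#if __has_builtin(__builtin_addc)
  return __builtin_addc(x, y, carryin, carryout);
#else
  /* Assumed fallback: unsigned arithmetic wraps, so a wrapped sum is
     smaller than the operand it was added to. */
  unsigned partial = x + y;
  unsigned sum = partial + carryin;
  *carryout = (unsigned)((partial < x) | (sum < partial));
  return sum;
#endif
}

/* Hypothetical helper: z = x + y over n limbs, least significant limb first.
   Returns the carry out of the most significant limb. */
unsigned mp_add(unsigned *z, const unsigned *x, const unsigned *y, size_t n) {
  unsigned carry = 0;
  for (size_t i = 0; i < n; ++i) {
    unsigned carryout;
    z[i] = add_with_carry(x[i], y[i], carry, &carryout);
    carry = carryout;
  }
  return carry;
}
```

A caller would pass three arrays of equal length and treat a nonzero return value as overflow of the n-limb result.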
Checked Arithmetic Builtins¶
Clang provides a set of builtins that implement checked arithmetic for security critical applications in a manner that is fast and easily expressible in C. As an example of their usage:
errorcode_t security_critical_application(...) {
  unsigned x, y, result;
  ...
  if (__builtin_mul_overflow(x, y, &result))
    return kErrorCodeHackers;
  ...
  use_multiply(result);
  ...
}
Clang provides the following checked arithmetic builtins:
bool __builtin_add_overflow   (type1 x, type2 y, type3 *sum);
bool __builtin_sub_overflow   (type1 x, type2 y, type3 *diff);
bool __builtin_mul_overflow   (type1 x, type2 y, type3 *prod);
bool __builtin_uadd_overflow  (unsigned x, unsigned y, unsigned *sum);
bool __builtin_uaddl_overflow (unsigned long x, unsigned long y, unsigned long *sum);
bool __builtin_uaddll_overflow(unsigned long long x, unsigned long long y, unsigned long long *sum);
bool __builtin_usub_overflow  (unsigned x, unsigned y, unsigned *diff);
bool __builtin_usubl_overflow (unsigned long x, unsigned long y, unsigned long *diff);
bool __builtin_usubll_overflow(unsigned long long x, unsigned long long y, unsigned long long *diff);
bool __builtin_umul_overflow  (unsigned x, unsigned y, unsigned *prod);
bool __builtin_umull_overflow (unsigned long x, unsigned long y, unsigned long *prod);
bool __builtin_umulll_overflow(unsigned long long x, unsigned long long y, unsigned long long *prod);
bool __builtin_sadd_overflow  (int x, int y, int *sum);
bool __builtin_saddl_overflow (long x, long y, long *sum);
bool __builtin_saddll_overflow(long long x, long long y, long long *sum);
bool __builtin_ssub_overflow  (int x, int y, int *diff);
bool __builtin_ssubl_overflow (long x, long y, long *diff);
bool __builtin_ssubll_overflow(long long x, long long y, long long *diff);
bool __builtin_smul_overflow  (int x, int y, int *prod);
bool __builtin_smull_overflow (long x, long y, long *prod);
bool __builtin_smulll_overflow(long long x, long long y, long long *prod);
Each builtin performs the specified mathematical operation on the first two arguments and stores the result in the third argument. If possible, the result will be equal to the mathematically-correct result and the builtin will return 0. Otherwise, the builtin will return 1 and the result will be equal to the unique value that is equivalent to the mathematically-correct result modulo two raised to the power k, where k is the number of bits in the result type. The behavior of these builtins is well-defined for all argument values.
The first three builtins work generically for operands of any integer type, including boolean types. The operands need not have the same type as each other, or as the result. The other builtins may implicitly promote or convert their operands before performing the operation.
Query for this feature with __has_builtin(__builtin_add_overflow), etc.
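As an illustration of the rules above, here is a minimal sketch of two typical uses: guarding an allocation size computation, and using the generic form with a result type narrower than the operands. The function names checked_array_bytes and narrow_sum are hypothetical, and the sketch assumes a compiler that provides the generic builtins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: computes header + count * elem_size into *total.
   Returns false, leaving *total unspecified, if the size would overflow. */
bool checked_array_bytes(size_t count, size_t elem_size, size_t header,
                         size_t *total) {
  size_t payload;
  if (__builtin_mul_overflow(count, elem_size, &payload))
    return false;
  if (__builtin_add_overflow(payload, header, total))
    return false;
  return true;
}

/* Hypothetical helper: the generic builtin accepts operands and a result of
   different integer types. Returns true only if a + b fits in an int. */
bool narrow_sum(long long a, long long b, int *out) {
  return !__builtin_add_overflow(a, b, out);
}
```

Because the builtins report overflow through the return value, the caller never has to reason about whether the intermediate arithmetic itself was well-defined.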
Floating point builtins¶
__builtin_isfpclass
¶
__builtin_isfpclass is used to test if the specified floating-point values fall into one of the specified floating-point classes.
Syntax:
int __builtin_isfpclass(fp_type expr, int mask)
int_vector __builtin_isfpclass(fp_vector expr, int mask)
Example of use:
if (__builtin_isfpclass(x, 448)) {
  // `x` is positive finite value
  ...
}
Description:
The __builtin_isfpclass() builtin is a generalization of functions isnan, isinf, isfinite and some others defined by the C standard. It tests if the floating-point value, specified by the first argument, falls into any of the data classes specified by the second argument. The latter is an integer constant bitmask expression, in which each data class is represented by a bit using the encoding:
Mask value | Data class | Macro |
---|---|---|
0x0001 | Signaling NaN | __FPCLASS_SNAN |
0x0002 | Quiet NaN | __FPCLASS_QNAN |
0x0004 | Negative infinity | __FPCLASS_NEGINF |
0x0008 | Negative normal | __FPCLASS_NEGNORMAL |
0x0010 | Negative subnormal | __FPCLASS_NEGSUBNORMAL |
0x0020 | Negative zero | __FPCLASS_NEGZERO |
0x0040 | Positive zero | __FPCLASS_POSZERO |
0x0080 | Positive subnormal | __FPCLASS_POSSUBNORMAL |
0x0100 | Positive normal | __FPCLASS_POSNORMAL |
0x0200 | Positive infinity | __FPCLASS_POSINF |
For convenience, the preprocessor defines macros for these values. The function returns 1 if expr falls into one of the specified data classes, 0 otherwise.
In the example above the mask value 448 (0x1C0) contains the bits selecting positive zero, positive subnormal and positive normal classes. __builtin_isfpclass(x, 448) would return true only if x is in any of these data classes. Using a suitable mask value, the function can implement any of the standard classification functions; for example, __builtin_isfpclass(x, 3) is identical to isnan, __builtin_isfpclass(x, 504) to isfinite, and so on.
If the first argument is a vector, the function is equivalent to the set of scalar calls of __builtin_isfpclass applied to the input elementwise.
The result of __builtin_isfpclass is a boolean value if the first argument is a scalar, or an integer vector with the same element count as the first argument. The element type in this vector has the same bit length as the element of the first argument type.
This function never raises floating-point exceptions and does not canonicalize its input. The floating-point argument is not promoted; its data class is determined based on its representation in its actual semantic type.
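The mask encoding can be wrapped in a small helper. The following is a sketch under stated assumptions: is_positive_finite and FPC_POSITIVE_FINITE are hypothetical names, the mask bits are copied from the table above rather than taken from the predefined __FPCLASS_* macros, and the <math.h> fallback is an assumption for compilers where __has_builtin(__builtin_isfpclass) is false.

```c
#include <math.h>
#include <stdbool.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Positive zero | positive subnormal | positive normal, i.e. 448 (0x1C0). */
#define FPC_POSITIVE_FINITE (0x0040 | 0x0080 | 0x0100)

/* Hypothetical helper: true for +0.0 and every positive finite value. */
bool is_positive_finite(double x) {
#if __has_builtin(__builtin_isfpclass)
  /* The mask must be an integer constant expression. */
  return __builtin_isfpclass(x, FPC_POSITIVE_FINITE);
#else
  /* Assumed fallback built from the standard classification macros. */
  return isfinite(x) && !signbit(x);
#endif
}
```

Note that negative zero is rejected in both branches, because its class bit (0x0020) is not part of the mask.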
__builtin_canonicalize
¶
double __builtin_canonicalize(double);
float __builtin_canonicalizef(float);
long double __builtin_canonicalizel(long double);
Returns the platform specific canonical encoding of a floating point number. This canonicalization is useful for implementing certain numeric primitives such as frexp. See LLVM canonicalize intrinsic for more information on the semantics.
__builtin_flt_rounds
and__builtin_set_flt_rounds
¶
int __builtin_flt_rounds();
void __builtin_set_flt_rounds(int);
Return and set the current floating point rounding mode. The encoding of returned values and input parameters is the same as the result of FLT_ROUNDS, specified by the C standard:
0 - toward zero
1 - to nearest, ties to even
2 - toward positive infinity
3 - toward negative infinity
4 - to nearest, ties away from zero
The effect of passing some other value to __builtin_set_flt_rounds is implementation-defined. __builtin_set_flt_rounds is currently only supported to work on x86, x86_64, powerpc, powerpc64, Arm and AArch64 targets. These builtins read and modify the floating-point environment, which is not always allowed and may have unexpected behavior. Please see the section on Accessing the floating point environment for more information.
String builtins¶
Clang provides constant expression evaluation support for builtin forms of the following functions from the C standard library headers <string.h> and <wchar.h>:
memchr
memcmp (and its deprecated BSD / POSIX alias bcmp)
strchr
strcmp
strlen
strncmp
wcschr
wcscmp
wcslen
wcsncmp
wmemchr
wmemcmp
In each case, the builtin form has the name of the C library function prefixed by __builtin_. Example:
void *p = __builtin_memchr("foobar", 'b', 5);
In addition to the above, one further builtin is provided:
char *__builtin_char_memchr(const char *haystack, int needle, size_t size);
__builtin_char_memchr(a, b, c) is identical to (char*)__builtin_memchr(a, b, c) except that its use is permitted within constant expressions in C++11 onwards (where a cast from void* to char* is disallowed in general).
Constant evaluation support for the __builtin_mem* functions is provided only for arrays of char, signed char, unsigned char, or char8_t, despite these functions accepting an argument of type const void*.
Support for constant expression evaluation for the above builtins can be detected with __has_feature(cxx_constexpr_string_builtins).
Variadic function builtins¶
Clang provides several builtins for working with variadic functions from the C standard library <stdarg.h> header:
__builtin_va_list
A predefined typedef for the target-specific va_list type. It is undefined behavior to use a byte-wise copy of this type produced by calling memcpy, memmove, or similar. Valid explicit copies are only produced by calling va_copy or __builtin_va_copy.
void __builtin_va_start(__builtin_va_list list, <parameter-name>)
A builtin function for the target-specific va_start function-like macro. The parameter-name argument is the name of the parameter preceding the ellipsis (...) in the function signature. Alternatively, in C23 mode or later, it may be the integer literal 0 if there is no parameter preceding the ellipsis. This function initializes the given __builtin_va_list object. It is undefined behavior to call this function on an already initialized __builtin_va_list object.
void __builtin_c23_va_start(__builtin_va_list list, ...)
A builtin function for the target-specific va_start function-like macro, available only in C23 and later. The builtin accepts zero or one argument for the ellipsis (...). If such an argument is provided, it should be the name of the parameter preceding the ellipsis, which is used for compatibility with C versions before C23. It is an error to provide two or more variadic arguments. This function initializes the given __builtin_va_list object. It is undefined behavior to call this function on an already initialized __builtin_va_list object.
void __builtin_va_end(__builtin_va_list list)
A builtin function for the target-specific va_end function-like macro. This function finalizes the given __builtin_va_list object such that it is no longer usable unless re-initialized with a call to __builtin_va_start or __builtin_va_copy. It is undefined behavior to call this function with a list that has not been initialized by either __builtin_va_start or __builtin_va_copy.
<type-name> __builtin_va_arg(__builtin_va_list list, <type-name>)
A builtin function for the target-specific va_arg function-like macro. This function returns the value of the next variadic argument to the call. It is undefined behavior to call this builtin when there is no next variadic argument to retrieve or if the next variadic argument does not have a type compatible with the given type-name. The return type of the function is the type-name given as the second argument. It is undefined behavior to call this function with a list that has not been initialized by either __builtin_va_start or __builtin_va_copy.
void __builtin_va_copy(__builtin_va_list dest, __builtin_va_list src)
A builtin function for the target-specific va_copy function-like macro. This function initializes dest as a copy of src. It is undefined behavior to call this function with an already initialized dest argument.
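The lifecycle described above (start, arg, copy, end) can be sketched as follows. The functions sum_ints and count_above_mean are hypothetical examples that call the builtins directly instead of going through the <stdarg.h> macros; both assume that every variadic argument really is an int and that exactly count of them are passed.

```c
/* Hypothetical helper: sums `count` int arguments. */
int sum_ints(int count, ...) {
  __builtin_va_list args;
  __builtin_va_start(args, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += __builtin_va_arg(args, int);
  __builtin_va_end(args);
  return total;
}

/* Hypothetical helper: counts how many of the `count` int arguments are
   strictly greater than their mean. The list is traversed twice, so the
   first pass runs on a copy made with __builtin_va_copy. */
int count_above_mean(int count, ...) {
  __builtin_va_list args, scan;
  __builtin_va_start(args, count);
  __builtin_va_copy(scan, args);

  long long sum = 0;
  for (int i = 0; i < count; ++i)
    sum += __builtin_va_arg(scan, int);
  __builtin_va_end(scan); /* Every initialized list is finalized exactly once. */

  int above = 0;
  for (int i = 0; i < count; ++i) {
    long long value = __builtin_va_arg(args, int);
    if (value * count > sum) /* value > sum / count, without dividing */
      ++above;
  }
  __builtin_va_end(args);
  return above;
}
```

The second function shows why a byte-wise copy is not needed: __builtin_va_copy produces an independently usable list that is finalized separately from the original.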
Memory builtins¶
Clang provides constant expression evaluation support for builtin forms of the following functions from the C standard library headers <string.h> and <wchar.h>:
memcpy
memmove
wmemcpy
wmemmove
In each case, the builtin form has the name of the C library function prefixed by __builtin_.
Constant evaluation support is only provided when the source and destination are pointers to arrays with the same trivially copyable element type, and the given size is an exact multiple of the element size that is no greater than the number of elements accessible through the source and destination operands.
Guaranteed inlined copy¶
void __builtin_memcpy_inline(void *dst, const void *src, size_t size);
__builtin_memcpy_inline has been designed as a building block for efficient memcpy implementations. It is identical to __builtin_memcpy but also guarantees not to call any external functions. See LLVM IR llvm.memcpy.inline intrinsic for more information.
This is useful to implement a custom version of memcpy, implement a libc memcpy or work around the absence of a libc.
Note that the size argument must be a compile time constant.
Note that this intrinsic cannot yet be called in a constexpr context.
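A minimal sketch of a fixed-size copy that relies on this guarantee is shown below. The names copy_header and HEADER_BYTES are hypothetical; the fallback to __builtin_memcpy is an assumption for compilers where __has_builtin(__builtin_memcpy_inline) is false, and that fallback may still emit a call to an external memcpy.

```c
#include <stddef.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* The size passed to __builtin_memcpy_inline must be a compile time constant. */
enum { HEADER_BYTES = 16 };

/* Hypothetical helper: copies exactly HEADER_BYTES bytes from src to dst. */
void copy_header(void *dst, const void *src) {
#if __has_builtin(__builtin_memcpy_inline)
  __builtin_memcpy_inline(dst, src, HEADER_BYTES);
#else
  /* Assumed fallback; not guaranteed to stay free of external calls. */
  __builtin_memcpy(dst, src, HEADER_BYTES);
#endif
}
```

As with memcpy, the source and destination regions are assumed not to overlap.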
Guaranteed inlined memset¶
void __builtin_memset_inline(void *dst, int value, size_t size);
__builtin_memset_inline has been designed as a building block for efficient memset implementations. It is identical to __builtin_memset but also guarantees not to call any external functions. See LLVM IR llvm.memset.inline intrinsic for more information.
This is useful to implement a custom version of memset, implement a libc memset or work around the absence of a libc.
Note that the size argument must be a compile time constant.
Note that this intrinsic cannot yet be called in a constexpr context.
__is_bitwise_cloneable
¶
A type trait used to check whether a type can be safely copied by memcpy.
Syntax:
bool __is_bitwise_cloneable(Type)
Description:
Objects of bitwise cloneable types can be bitwise copied by memcpy/memmove. The Clang compiler warrants that this behavior is well defined, and won’t be broken by compiler optimizations and sanitizers.
For implicit-lifetime types, the lifetime of the new object is implicitly started after the copy. For other types (e.g., classes with virtual methods), the lifetime isn’t started, and using the object results in undefined behavior according to the C++ Standard.
This builtin can be used in constant expressions.
Atomic Min/Max builtins with memory ordering¶
There are two atomic builtins with min/max in-memory comparison and swap. The syntax and semantics are similar to GCC-compatible __atomic_* builtins.
__atomic_fetch_min
__atomic_fetch_max
The builtins work with signed and unsigned integers and require a memory ordering to be specified. The return value is the original value that was stored in memory before comparison.
Example:
unsigned int val = __atomic_fetch_min(unsigned int *pi, unsigned int ui, __ATOMIC_RELAXED);
The third argument is one of the memory ordering specifiers __ATOMIC_RELAXED, __ATOMIC_CONSUME, __ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, or __ATOMIC_SEQ_CST following C++11 memory model semantics.
In terms of acquire-release ordering barriers these two operations are always considered as operations with load-store semantics, even when the original value is not actually modified after comparison.
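A minimal sketch of tracking a running minimum is shown below. The name atomic_min_relaxed is hypothetical, and the compare-exchange loop is an assumed fallback for compilers where __has_builtin(__atomic_fetch_min) is false; unlike the builtin, that fallback performs no store when the stored value is already small enough.

```c
#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Hypothetical helper: atomically lowers *slot to `sample` if `sample` is
   smaller, and returns the value observed in *slot before the operation. */
unsigned atomic_min_relaxed(unsigned *slot, unsigned sample) {
#if __has_builtin(__atomic_fetch_min)
  return __atomic_fetch_min(slot, sample, __ATOMIC_RELAXED);
#else
  /* Assumed fallback built on the GCC-compatible __atomic_* builtins. */
  unsigned seen = __atomic_load_n(slot, __ATOMIC_RELAXED);
  while (sample < seen &&
         !__atomic_compare_exchange_n(slot, &seen, sample, 1 /* weak */,
                                      __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
    /* A failed exchange refreshed `seen`; the loop condition re-checks it. */
  }
  return seen;
#endif
}
```

Relaxed ordering is used here only because the sketch treats the slot as an isolated statistic; a stronger ordering would be needed if other data were published through it.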
__c11_atomic builtins¶
Clang provides a set of builtins which are intended to be used to implement C11’s <stdatomic.h> header. These builtins provide the semantics of the _explicit form of the corresponding C11 operation, and are named with a __c11_ prefix. The supported operations, and the differences from the corresponding C11 operations, are:
__c11_atomic_init
__c11_atomic_thread_fence
__c11_atomic_signal_fence
__c11_atomic_is_lock_free (The argument is the size of the _Atomic(...) object, instead of its address)
__c11_atomic_store
__c11_atomic_load
__c11_atomic_exchange
__c11_atomic_compare_exchange_strong
__c11_atomic_compare_exchange_weak
__c11_atomic_fetch_add
__c11_atomic_fetch_sub
__c11_atomic_fetch_and
__c11_atomic_fetch_or
__c11_atomic_fetch_xor
__c11_atomic_fetch_nand (Nand is not present in <stdatomic.h>)
__c11_atomic_fetch_max
__c11_atomic_fetch_min
The macros __ATOMIC_RELAXED, __ATOMIC_CONSUME, __ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, and __ATOMIC_SEQ_CST are provided, with values corresponding to the enumerators of C11’s memory_order enumeration.
(Note that Clang additionally provides GCC-compatible __atomic_* builtins and OpenCL 2.0 __opencl_atomic_* builtins. The OpenCL 2.0 atomic builtins are an explicit form of the corresponding OpenCL 2.0 builtin function, and are named with a __opencl_ prefix. The macros __OPENCL_MEMORY_SCOPE_WORK_ITEM, __OPENCL_MEMORY_SCOPE_WORK_GROUP, __OPENCL_MEMORY_SCOPE_DEVICE, __OPENCL_MEMORY_SCOPE_ALL_SVM_DEVICES, and __OPENCL_MEMORY_SCOPE_SUB_GROUP are provided, with values corresponding to the enumerators of OpenCL’s memory_scope enumeration.)
__scoped_atomic builtins¶
Clang provides a set of atomics taking a memory scope argument. These atomics are identical to the standard GNU / GCC atomic builtins but taking an extra memory scope argument. These are designed to be a generic alternative to the __opencl_atomic_* builtin functions for targets that support atomic memory scopes.
Atomic memory scopes are designed to assist optimizations for systems with several levels of memory hierarchy like GPUs. The following memory scopes are currently supported:
__MEMORY_SCOPE_SYSTEM
__MEMORY_SCOPE_DEVICE
__MEMORY_SCOPE_WRKGRP
__MEMORY_SCOPE_WVFRNT
__MEMORY_SCOPE_SINGLE
This controls whether or not the atomic operation is ordered with respect to the whole system, the current device, an OpenCL workgroup, wavefront, or just a single thread. If these are used on a target that does not support atomic scopes, then they will behave exactly as the standard GNU atomic builtins.
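As a rough sketch of how the extra argument is passed, the hypothetical helper bump_counter below increments a counter with device scope. It assumes that the scope is the trailing argument, after the memory ordering, and it falls back to the GNU __atomic_fetch_add builtin where __has_builtin(__scoped_atomic_fetch_add) is false.

```c
#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Hypothetical helper: atomically increments *counter and returns the
   previous value. */
unsigned long bump_counter(unsigned long *counter) {
#if __has_builtin(__scoped_atomic_fetch_add)
  /* Assumed argument order: pointer, operand, memory ordering, memory scope. */
  return __scoped_atomic_fetch_add(counter, 1UL, __ATOMIC_RELAXED,
                                   __MEMORY_SCOPE_DEVICE);
#else
  /* Assumed fallback: the unscoped GNU builtin. */
  return __atomic_fetch_add(counter, 1UL, __ATOMIC_RELAXED);
#endif
}
```

On a target without atomic scopes, both branches are expected to behave the same way, as described above.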
Low-level ARM exclusive memory builtins¶
Clang provides overloaded builtins giving direct access to the three key ARM instructions for implementing atomic operations.
T __builtin_arm_ldrex(const volatile T *addr);
T __builtin_arm_ldaex(const volatile T *addr);
int __builtin_arm_strex(T val, volatile T *addr);
int __builtin_arm_stlex(T val, volatile T *addr);
void __builtin_arm_clrex(void);
The types T currently supported are:
Integer types with width at most 64 bits (or 128 bits on AArch64).
Floating-point types.
Pointer types.
Note that the compiler does not guarantee it will not insert stores which clear the exclusive monitor in between an ldrex type operation and its paired strex. In practice this is usually only a risk when the extra store is on the same cache line as the variable being modified and Clang will only insert stack stores on its own, so it is best not to use these operations on variables with automatic storage duration.
Also, loads and stores may be implicit in code written between the ldrex and strex. Clang will not necessarily mitigate the effects of these either, so care should be exercised.
For these reasons the higher level atomic primitives should be preferred where possible.
Non-temporal load/store builtins¶
Clang provides overloaded builtins allowing generation of non-temporal memory accesses.
T __builtin_nontemporal_load(T *addr);
void __builtin_nontemporal_store(T value, T *addr);
The types T currently supported are:
Integer types.
Floating-point types.
Vector types.
Note that the compiler does not guarantee that non-temporal loads or stores will be used.
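A minimal sketch of a streaming fill is shown below. The name fill_streaming is hypothetical, and the plain store is an assumed fallback for compilers where __has_builtin(__builtin_nontemporal_store) is false. In either case the observable result is the same; only the cache behavior may differ.

```c
#include <stddef.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Hypothetical helper: writes `value` to dst[0..n), hinting that the data
   will not be read again soon. */
void fill_streaming(int *dst, size_t n, int value) {
  for (size_t i = 0; i < n; ++i) {
#if __has_builtin(__builtin_nontemporal_store)
    __builtin_nontemporal_store(value, &dst[i]);
#else
    dst[i] = value; /* Assumed fallback: an ordinary store. */
#endif
  }
}
```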
C++ Coroutines support builtins¶
Warning
This is a work in progress. Compatibility across Clang/LLVM releases is not guaranteed.
Clang provides experimental builtins to support C++ Coroutines as defined by https://wg21.link/P0057. The following four are intended to be used by the standard library to implement the std::coroutine_handle type.
Syntax:
void  __builtin_coro_resume(void *addr);
void  __builtin_coro_destroy(void *addr);
bool  __builtin_coro_done(void *addr);
void *__builtin_coro_promise(void *addr, int alignment, bool from_promise);
Example of use:
template <> struct coroutine_handle<void> {
  void resume() const { __builtin_coro_resume(ptr); }
  void destroy() const { __builtin_coro_destroy(ptr); }
  bool done() const { return __builtin_coro_done(ptr); }
  // ...
protected:
  void *ptr;
};

template <typename Promise> struct coroutine_handle : coroutine_handle<> {
  // ...
  Promise &promise() const {
    return *reinterpret_cast<Promise *>(
      __builtin_coro_promise(ptr, alignof(Promise), /*from-promise=*/false));
  }
  static coroutine_handle from_promise(Promise &promise) {
    coroutine_handle p;
    p.ptr = __builtin_coro_promise(&promise, alignof(Promise),
                                   /*from-promise=*/true);
    return p;
  }
};
Other coroutine builtins are either for internal clang use or for use during development of the coroutine feature. See Coroutines in LLVM for more information on their semantics. Note that builtins matching the intrinsics that take token as the first parameter (llvm.coro.begin, llvm.coro.alloc, llvm.coro.free and llvm.coro.suspend) omit the token parameter and fill it to an appropriate value during the emission.
Syntax:
size_t __builtin_coro_size()
void  *__builtin_coro_frame()
void  *__builtin_coro_free(void *coro_frame)
void  *__builtin_coro_id(int align, void *promise, void *fnaddr, void *parts)
bool   __builtin_coro_alloc()
void  *__builtin_coro_begin(void *memory)
void   __builtin_coro_end(void *coro_frame, bool unwind)
char   __builtin_coro_suspend(bool final)
Note that there is no builtin matching the llvm.coro.save intrinsic. LLVM automatically will insert one if the first argument to llvm.coro.suspend is token none. If a user calls __builtin_coro_suspend, clang will insert token none as the first argument to the intrinsic.
Source location builtins¶
Clang provides builtins to support C++ standard library implementation of std::source_location as specified in C++20. With the exception of __builtin_COLUMN, __builtin_FILE_NAME and __builtin_FUNCSIG, these builtins are also implemented by GCC.
Syntax:
const char *__builtin_FILE();
const char *__builtin_FILE_NAME(); // Clang only
const char *__builtin_FUNCTION();
const char *__builtin_FUNCSIG(); // Microsoft
unsigned    __builtin_LINE();
unsigned    __builtin_COLUMN(); // Clang only
const std::source_location::__impl *__builtin_source_location();
Example of use:
void my_assert(bool pred, int line = __builtin_LINE(), // Captures line of caller
               const char *file = __builtin_FILE(),
               const char *function = __builtin_FUNCTION()) {
  if (pred) return;
  printf("%s:%d assertion failed in function %s\n", file, line, function);
  std::abort();
}

struct MyAggregateType {
  int x;
  int line = __builtin_LINE(); // captures line where aggregate initialization occurs
};
static_assert(MyAggregateType{42}.line == __LINE__);

struct MyClassType {
  int line = __builtin_LINE(); // captures line of the constructor used during initialization
  constexpr MyClassType(int) { assert(line == __LINE__); }
};
Description:
The builtins __builtin_LINE, __builtin_FUNCTION, __builtin_FUNCSIG, __builtin_FILE and __builtin_FILE_NAME return the values, at the “invocation point”, for __LINE__, __FUNCTION__, __FUNCSIG__, __FILE__ and __FILE_NAME__ respectively. __builtin_COLUMN similarly returns the column, though there is no corresponding macro. These builtins are constant expressions.
When the builtins appear as part of a default function argument the invocation point is the location of the caller. When the builtins appear as part of a default member initializer, the invocation point is the location of the constructor or aggregate initialization used to create the object. Otherwise the invocation point is the same as the location of the builtin.
When the invocation point of __builtin_FUNCTION is not a function scope, the empty string is returned.
The builtin __builtin_COLUMN returns the offset from the start of the line, beginning from column 1. This may differ from other implementations.
The builtin __builtin_source_location returns a pointer to constant static data of type std::source_location::__impl. This type must have already been defined, and must contain exactly four fields: const char *_M_file_name, const char *_M_function_name, <any-integral-type> _M_line, and <any-integral-type> _M_column. The fields will be populated in the same manner as the above four builtins, except that _M_function_name is populated with __PRETTY_FUNCTION__ rather than __FUNCTION__.
Alignment builtins¶
Clang provides builtins to support checking and adjusting alignment of pointers and integers. These builtins can be used to avoid relying on implementation-defined behavior of arithmetic on integers derived from pointers. Additionally, these builtins retain type information and, unlike bitwise arithmetic, they can perform semantic checking on the alignment value.
Syntax:
Type __builtin_align_up(Type value, size_t alignment);
Type __builtin_align_down(Type value, size_t alignment);
bool __builtin_is_aligned(Type value, size_t alignment);
Example of use:
char* global_alloc_buffer;
void* my_aligned_allocator(size_t alloc_size, size_t alignment) {
  char* result = __builtin_align_up(global_alloc_buffer, alignment);
  // result now contains the value of global_alloc_buffer rounded up to the
  // next multiple of alignment.
  global_alloc_buffer = result + alloc_size;
  return result;
}

void* get_start_of_page(void* ptr) {
  return __builtin_align_down(ptr, PAGE_SIZE);
}

void example(char* buffer) {
  if (__builtin_is_aligned(buffer, 64)) {
    do_fast_aligned_copy(buffer);
  } else {
    do_unaligned_copy(buffer);
  }
}

// In addition to pointers, the builtins can also be used on integer types
// and are evaluatable inside constant expressions.
static_assert(__builtin_align_up(123, 64) == 128, "");
static_assert(__builtin_align_down(123u, 64) == 64u, "");
static_assert(!__builtin_is_aligned(123, 64), "");
Description:
The builtins __builtin_align_up and __builtin_align_down return their first argument aligned up/down to the next multiple of the second argument. If the value is already sufficiently aligned, it is returned unchanged. The builtin __builtin_is_aligned returns whether the first argument is aligned to a multiple of the second argument. All of these builtins expect the alignment to be expressed as a number of bytes.
These builtins can be used for all integer types as well as (non-function) pointer types. For pointer types, these builtins operate in terms of the integer address of the pointer and return a new pointer of the same type (including qualifiers such as const) with an adjusted address. When aligning pointers up or down, the resulting value must be within the same underlying allocation or one past the end (see C17 6.5.6p8, C++ [expr.add]). This means that arbitrary integer values stored in pointer-type variables must not be passed to these builtins. For those use cases, the builtins can still be used, but the operation must be performed on the pointer cast to uintptr_t.
If Clang can determine that the alignment is not a power of two at compile time, it will result in a compilation failure. If the alignment argument is not a power of two at run time, the behavior of these builtins is undefined.
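For code that must also build with compilers lacking these builtins, one option is to wrap them. The following is a sketch only: align_up_size, align_down_size and pointer_is_aligned are hypothetical names, and the bitwise fallbacks are assumptions that, like the builtins, are only meaningful when the alignment is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking __has_builtin. */
#endif

/* Hypothetical helper: rounds `value` up to a multiple of `alignment`. */
size_t align_up_size(size_t value, size_t alignment) {
#if __has_builtin(__builtin_align_up)
  return __builtin_align_up(value, alignment);
#else
  return (value + alignment - 1) & ~(alignment - 1); /* Assumed fallback. */
#endif
}

/* Hypothetical helper: rounds `value` down to a multiple of `alignment`. */
size_t align_down_size(size_t value, size_t alignment) {
#if __has_builtin(__builtin_align_down)
  return __builtin_align_down(value, alignment);
#else
  return value & ~(alignment - 1); /* Assumed fallback. */
#endif
}

/* Hypothetical helper: tests whether `ptr` is aligned to `alignment` bytes. */
bool pointer_is_aligned(const void *ptr, size_t alignment) {
#if __has_builtin(__builtin_is_aligned)
  return __builtin_is_aligned(ptr, alignment);
#else
  /* Assumed fallback: the check is done on the address cast to uintptr_t. */
  return ((uintptr_t)ptr & (alignment - 1)) == 0;
#endif
}
```

The integer results match the static_assert examples above: 123 aligned up to 64 is 128, and aligned down is 64.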
Non-standard C++11 Attributes¶
Clang’s non-standard C++11 attributes live in the clang attribute namespace.
Clang supports GCC’s gnu attribute namespace. All GCC attributes which are accepted with the __attribute__((foo)) syntax are also accepted as [[gnu::foo]]. This only extends to attributes which are specified by GCC (see the list of GCC function attributes, GCC variable attributes, and GCC type attributes). As with the GCC implementation, these attributes must appertain to the declarator-id in a declaration, which means they must go either at the start of the declaration or immediately after the name being declared.
For example, this applies the GNU unused attribute to a and f, and also applies the GNU noreturn attribute to f.
Examples:
[[gnu::unused]] int a, f [[gnu::noreturn]] ();
Target-Specific Extensions¶
Clang supports some language features conditionally on some targets.
AMDGPU Language Extensions¶
__builtin_amdgcn_fence¶
__builtin_amdgcn_fence emits a fence. It takes the following arguments:
unsigned atomic ordering, e.g. __ATOMIC_ACQUIRE
const char * synchronization scope, e.g. workgroup
Zero or more const char * address space names.
The address space arguments must be one of the following string literals:
"local"
"global"
If one or more address space names are provided, the code generator will attempt to emit potentially faster instructions that order access to at least those address spaces. Emitting such instructions may not always be possible and the compiler is free to fence more aggressively.
If no address space names are provided, all address spaces are fenced.
// Fence all address spaces.
__builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup");
__builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent");

// Fence only requested address spaces.
__builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local");
__builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global");
ARM/AArch64 Language Extensions¶
Memory Barrier Intrinsics¶
Clang implements the __dmb, __dsb and __isb intrinsics as defined in the Arm C Language Extensions. Note that these intrinsics are implemented as motion barriers that block reordering of memory accesses and side effect instructions. Other instructions like simple arithmetic may be reordered around the intrinsic. If you expect to have no reordering at all, use inline assembly instead.
Pointer Authentication¶
X86/X86-64 Language Extensions¶
The X86 backend has these language extensions:
Memory references to specified segments¶
Annotating a pointer with address space #256 causes it to be code generated relative to the X86 GS segment register, address space #257 causes it to be relative to the X86 FS segment, and address space #258 causes it to be relative to the X86 SS segment. Note that this is a very low-level feature that should only be used if you know what you’re doing (for example in an OS kernel).
Here is an example:
#define GS_RELATIVE __attribute__((address_space(256)))
int foo(int GS_RELATIVE *P) {
  return *P;
}
Which compiles to (on X86-32):
_foo:
        movl    4(%esp), %eax
        movl    %gs:(%eax), %eax
        ret
You can also use the GCC compatibility macros __seg_fs and __seg_gs for the same purpose. The preprocessor symbols __SEG_FS and __SEG_GS indicate their support.
PowerPC Language Extensions¶
Set the Floating Point Rounding Mode¶
PowerPC64/PowerPC64le supports the builtin function __builtin_setrnd to set the floating point rounding mode. This function will use the least significant two bits of its integer argument to set the floating point rounding mode.
double __builtin_setrnd(int mode);
The effective values for mode are:
0 - round to nearest
1 - round to zero
2 - round to +infinity
3 - round to -infinity
Note that the mode argument is taken modulo 4, so if the integer argument is greater than 3, only the least significant two bits of the mode are used. Namely, __builtin_setrnd(102) is equal to __builtin_setrnd(2).
PowerPC cache builtins¶
The PowerPC architecture specifies instructions implementing cache operations. Clang provides builtins that give direct programmer access to these cache instructions.
Currently the following builtins are implemented in clang:
__builtin_dcbf copies the contents of a modified block from the data cache to main memory and flushes the copy from the data cache.
Syntax:
void __dcbf(const void* addr); /* Data Cache Block Flush */
Example of Use:
int a = 1;
__builtin_dcbf (&a);
Extensions for Static Analysis¶
Clang supports additional attributes that are useful for documenting program invariants and rules for static analysis tools, such as the Clang Static Analyzer. These attributes are documented in the analyzer’s list of annotations for analysis.
Extensions for Dynamic Analysis¶
Use __has_feature(address_sanitizer) to check if the code is being built with AddressSanitizer.
Use __has_feature(thread_sanitizer) to check if the code is being built with ThreadSanitizer.
Use __has_feature(memory_sanitizer) to check if the code is being built with MemorySanitizer.
Use __has_feature(dataflow_sanitizer) to check if the code is being built with DataFlowSanitizer.
Use __has_feature(safe_stack) to check if the code is being built with SafeStack.
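These checks are typically wrapped so that the code still compiles where __has_feature is unavailable. The sketch below is illustrative only: BUILT_WITH_ASAN and scratch_stack_bytes are hypothetical names, and the two sizes are arbitrary values chosen for the example, not recommendations.

```c
#include <stddef.h>

#ifndef __has_feature
#define __has_feature(x) 0 /* Compatibility with compilers lacking __has_feature. */
#endif

#if __has_feature(address_sanitizer)
#define BUILT_WITH_ASAN 1
#else
#define BUILT_WITH_ASAN 0
#endif

/* Hypothetical policy: reserve more scratch stack in instrumented builds,
   where stack frames tend to be larger. */
size_t scratch_stack_bytes(void) {
  return BUILT_WITH_ASAN ? (size_t)256 * 1024 : (size_t)64 * 1024;
}
```

The fallback definition has to come before the first use of __has_feature, because the preprocessor parses the whole #if expression even on compilers that do not know the macro.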
Extensions for selectively disabling optimization¶
Clang provides a mechanism for selectively disabling optimizations in functions and methods.
To disable optimizations in a single function definition, the GNU-style or C++11 non-standard attribute optnone can be used.
// The following functions will not be optimized.
// GNU-style attribute
__attribute__((optnone)) int foo() {
  // ... code
}
// C++11 attribute
[[clang::optnone]] int bar() {
  // ... code
}
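When the same source must build with compilers that do not recognize optnone, the attribute can be hidden behind a macro. This is a sketch under stated assumptions: DEBUG_NO_OPT and debug_checksum are hypothetical names, and the macro simply expands to nothing where __has_attribute(optnone) is false.

```c
#include <stddef.h>

#ifndef __has_attribute
#define __has_attribute(x) 0 /* Compatibility with compilers lacking __has_attribute. */
#endif

#if __has_attribute(optnone)
#define DEBUG_NO_OPT __attribute__((optnone))
#else
#define DEBUG_NO_OPT /* Assumed fallback: no attribute at all. */
#endif

/* Hypothetical function kept unoptimized so that it is easy to step through
   in a debugger. The attribute does not change its result. */
DEBUG_NO_OPT
unsigned debug_checksum(const unsigned char *data, size_t len) {
  unsigned hash = 0;
  for (size_t i = 0; i < len; ++i)
    hash = hash * 31u + data[i];
  return hash;
}
```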
To facilitate disabling optimization for a range of function definitions, a range-based pragma is provided. Its syntax is #pragma clang optimize followed by off or on.
All function definitions in the region between an off and the following on will be decorated with the optnone attribute unless doing so would conflict with explicit attributes already present on the function (e.g. the ones that control inlining).
#pragma clang optimize off
// This function will be decorated with optnone.
int foo() {
  // ... code
}
// optnone conflicts with always_inline, so bar() will not be decorated.
__attribute__((always_inline)) int bar() {
  // ... code
}
#pragma clang optimize on
If no on is found to close an off region, the end of the region is the end of the compilation unit.
Note that a stray #pragma clang optimize on does not selectively enable additional optimizations when compiling at low optimization levels. This feature can only be used to selectively disable optimizations.
The pragma has an effect on functions only at the point of their definition; for function templates, this means that the state of the pragma at the point of an instantiation is not necessarily relevant. Consider the following example:
template<typename T> T twice(T t) {
  return 2 * t;
}
#pragma clang optimize off
template<typename T> T thrice(T t) {
  return 3 * t;
}
int container(int a, int b) {
  return twice(a) + thrice(b);
}
#pragma clang optimize on
In this example, the definition of the template function twice is outside the pragma region, whereas the definition of thrice is inside the region. The container function is also in the region and will not be optimized, but it causes the instantiation of twice and thrice with an int type; of these two instantiations, twice will be optimized (because its definition was outside the region) and thrice will not be optimized.
Clang also implements MSVC’s range-based pragma, #pragma optimize("[optimization-list]", on | off). At the moment, Clang only supports an empty optimization list, whereas MSVC supports the arguments s, g, t, and y. Currently, the implementation of pragma optimize behaves the same as #pragma clang optimize. All functions between off and on will be decorated with the optnone attribute.
#pragma optimize("", off)
// This function will be decorated with optnone.
void f1() {}
#pragma optimize("", on)
// This function will be optimized with whatever was specified on
// the commandline.
void f2() {}
// This will warn with Clang's current implementation.
#pragma optimize("g", on)
void f3() {}
For MSVC, an empty optimization list and off parameter will turn off all optimizations, s, g, t, and y. An empty optimization list and on parameter will reset the optimizations to the ones specified on the commandline.
Parameter | Type of optimization |
g | Deprecated |
s or t | Short or fast sequences of machine code |
y | Enable frame pointers |
Extensions for loop hint optimizations¶
The #pragma clang loop directive is used to specify hints for optimizing the subsequent for, while, do-while, or C++11 range-based for loop. The directive provides options for vectorization, interleaving, predication, unrolling and distribution. Loop hints can be specified before any loop and will be ignored if the optimization is not safe to apply.
There are loop hints that control transformations (e.g. vectorization, loop unrolling) and there are loop hints that set transformation options (e.g. vectorize_width, unroll_count). Pragmas setting transformation options imply the transformation is enabled, as if it was enabled via the corresponding transformation pragma (e.g. vectorize(enable)). If the transformation is disabled (e.g. vectorize(disable)), that takes precedence over transformation option pragmas implying that transformation.
Vectorization, Interleaving, and Predication¶
A vectorized loop performs multiple iterations of the original loop in parallel using vector instructions. The instruction set of the target processor determines which vector instructions are available and their vector widths. This restricts the types of loops that can be vectorized. The vectorizer automatically determines if the loop is safe and profitable to vectorize. A vector instruction cost model is used to select the vector width.
Interleaving multiple loop iterations allows modern processors to further improve instruction-level parallelism (ILP) using advanced hardware features, such as multiple execution units and out-of-order execution. The vectorizer uses a cost model that depends on the register pressure and generated code size to select the interleaving count.
Vectorization is enabled by vectorize(enable) and interleaving is enabled by interleave(enable). This is useful when compiling with -Os to manually enable vectorization or interleaving.
#pragma clang loop vectorize(enable)
#pragma clang loop interleave(enable)
for(...) {
  ...
}
The vector width is specified by vectorize_width(_value_[, fixed|scalable]), where _value_ is a positive integer and the type of vectorization can be specified with an optional second parameter. The default for the second parameter is ‘fixed’ and refers to fixed width vectorization, whereas ‘scalable’ indicates the compiler should use scalable vectors instead. Another use of vectorize_width is vectorize_width(fixed|scalable) where the user can hint at the type of vectorization to use without specifying the exact width. In both variants of the pragma the vectorizer may decide to fall back on fixed width vectorization if the target does not support scalable vectors.
The interleave count is specified by interleave_count(_value_), where _value_ is a positive integer. This is useful for specifying the optimal width/count of the set of target architectures supported by your application.
#pragma clang loop vectorize_width(2)
#pragma clang loop interleave_count(2)
for(...) {
  ...
}
Specifying a width/count of 1 disables the optimization, and is equivalent to vectorize(disable) or interleave(disable).
Vector predication is enabled by vectorize_predicate(enable), for example:
#pragma clang loop vectorize(enable)
#pragma clang loop vectorize_predicate(enable)
for(...) {
  ...
}
This predicates (masks) all instructions in the loop, which allows the scalar remainder loop (the tail) to be folded into the main vectorized loop. This might be more efficient when vector predication is efficiently supported by the target platform.
Loop Unrolling¶
Unrolling a loop reduces the loop control overhead and exposes more opportunities for ILP. Loops can be fully or partially unrolled. Full unrolling eliminates the loop and replaces it with an enumerated sequence of loop iterations. Full unrolling is only possible if the loop trip count is known at compile time. Partial unrolling replicates the loop body within the loop and reduces the trip count.
If unroll(enable) is specified the unroller will attempt to fully unroll the loop if the trip count is known at compile time. If the fully unrolled code size is greater than an internal limit the loop will be partially unrolled up to this limit. If the trip count is not known at compile time the loop will be partially unrolled with a heuristically chosen unroll factor.
#pragma clang loop unroll(enable)
for(...) {
  ...
}
If unroll(full) is specified the unroller will attempt to fully unroll the loop if the trip count is known at compile time, identically to unroll(enable). However, with unroll(full) the loop will not be unrolled if the loop count is not known at compile time.
#pragma clang loop unroll(full)
for(...) {
  ...
}
The unroll count can be specified explicitly with unroll_count(_value_) where _value_ is a positive integer. If this value is greater than the trip count the loop will be fully unrolled. Otherwise the loop is partially unrolled subject to the same code size limit as with unroll(enable).
#pragma clang loop unroll_count(8)
for(...) {
  ...
}
Unrolling of a loop can be prevented by specifying unroll(disable).
Loop unroll parameters can be controlled by options -mllvm -unroll-count=n and -mllvm -pragma-unroll-threshold=n.
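The difference between a compile-time and a run-time trip count can be illustrated with a small C sketch. The function names dot8 and sum_n and the constant TAPS are hypothetical; the pragmas are only hints, and a compiler that does not recognize them will typically ignore them with a warning.

```c
#include <stddef.h>

enum { TAPS = 8 }; /* hypothetical compile-time trip count */

/* The trip count is a compile-time constant, so full unrolling
   can be requested. */
static float dot8(const float *a, const float *b) {
  float sum = 0.0f;
#pragma clang loop unroll(full)
  for (int i = 0; i < TAPS; ++i)
    sum += a[i] * b[i];
  return sum;
}

/* The trip count is only known at run time, so an explicit
   partial unroll factor of 4 is requested instead. */
static long sum_n(const int *v, size_t n) {
  long total = 0;
#pragma clang loop unroll_count(4)
  for (size_t i = 0; i < n; ++i)
    total += v[i];
  return total;
}
```

In both functions the result is the same with or without the hint; only the generated code is expected to differ.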
Loop Distribution¶
Loop Distribution allows splitting a loop into multiple loops. This is beneficial, for example, when the entire loop cannot be vectorized but some of the resulting loops can.
If distribute(enable) is specified and the loop has memory dependencies that inhibit vectorization, the compiler will attempt to isolate the offending operations into a new loop. This optimization is not enabled by default; only loops marked with the pragma are considered.
#pragma clang loop distribute(enable)
for (i = 0; i < N; ++i) {
  S1: A[i + 1] = A[i] + B[i];
  S2: C[i] = D[i] * E[i];
}
This loop will be split into two loops between statements S1 and S2. The second loop containing S2 will be vectorized.
Loop Distribution is currently not enabled by default in the optimizer because it can hurt performance in some cases. For example, instruction-level parallelism could be reduced by sequentializing the execution of the statements S1 and S2 above.
If Loop Distribution is turned on globally with -mllvm -enable-loop-distribution, specifying distribute(disable) can be used to disable it on a per-loop basis.
Additional Information¶
For convenience multiple loop hints can be specified on a single line.
#pragma clang loop vectorize_width(4) interleave_count(8)
for(...) {
  ...
}
If an optimization cannot be applied any hints that apply to it will be ignored. For example, the hint vectorize_width(4) is ignored if the loop is not proven safe to vectorize. To identify and diagnose optimization issues use -Rpass, -Rpass-missed, and -Rpass-analysis command line options. See the user guide for details.
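A concrete loop makes the hints easier to experiment with. The following C sketch (the function name saxpy is an assumption, not part of Clang) combines two hints on one line and uses restrict so that the compiler can more easily prove the loop safe to vectorize.

```c
#include <stddef.h>

/* Computes y[i] += a * x[i]. The restrict qualifiers state that x and y
   do not overlap, which removes a common obstacle to vectorization. */
static void saxpy(size_t n, float a, const float *restrict x,
                  float *restrict y) {
#pragma clang loop vectorize(enable) interleave_count(2)
  for (size_t i = 0; i < n; ++i)
    y[i] += a * x[i];
}
```

Compiling with, for example, -O2 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize should show whether the hint was applied to this loop.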
Extensions to specify floating-point flags¶
The #pragma clang fp pragma allows floating-point options to be specified for a section of the source code. This pragma can only appear at file scope or at the start of a compound statement (excluding comments). When used within a compound statement, the pragma is active within the scope of the compound statement.
Currently, the following settings can be controlled with this pragma:
#pragma clang fp reassociate allows control over the reassociation of floating point expressions. When enabled, this pragma allows the expression x + (y + z) to be reassociated as (x + y) + z. Reassociation can also occur across multiple statements. This pragma can be used to disable reassociation when it is otherwise enabled for the translation unit with the -fassociative-math flag. The pragma can take two values: on and off.
float f(float x, float y, float z)
{
  // Enable floating point reassociation across statements
  #pragma clang fp reassociate(on)
  float t = x + y;
  float v = t + z;
  return v;
}
#pragma clang fp reciprocal allows control over using reciprocal approximations in floating point expressions. When enabled, this pragma allows the expression x / y to be approximated as x * (1.0 / y). This pragma can be used to disable reciprocal approximation when it is otherwise enabled for the translation unit with the -freciprocal-math flag or other fast-math options. The pragma can take two values: on and off.
float f(float x, float y)
{
  // Enable floating point reciprocal approximation
  #pragma clang fp reciprocal(on)
  return x / y;
}
#pragma clang fp contract specifies whether the compiler should contract a multiply and an addition (or subtraction) into a fused FMA operation when supported by the target.
The pragma can take three values: on, fast and off. The on option is identical to using #pragma STDC FP_CONTRACT(ON) and it allows fusion as specified by the language standard. The fast option allows fusion in cases when the language standard does not make this possible (e.g. across statements in C).
for(...) {
  #pragma clang fp contract(fast)
  a = b[i] * c[i];
  d[i] += a;
}
The pragma can also be used with off which turns FP contraction off for a section of the code. This can be useful when fast contraction is otherwise enabled for the translation unit with the -ffp-contract=fast-honor-pragmas flag. Note that -ffp-contract=fast will override pragmas to fuse multiply and addition across statements regardless of any controlling pragmas.
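As a rough illustration of the off and fast values, the two C functions below compute the same expression under different contraction settings. The function names are hypothetical; whether an FMA instruction is actually emitted depends on the target and on the command-line flags discussed above.

```c
/* Contraction is disabled in this function body, so the product is
   expected to be rounded before the addition when the pragma is honored. */
static double mul_add_unfused(double a, double b, double c) {
#pragma clang fp contract(off)
  return a * b + c;
}

/* With fast, the multiply and the add may be fused even though they
   appear in separate statements. */
static double mul_add_fast(double a, double b, double c) {
#pragma clang fp contract(fast)
  double p = a * b;
  return p + c;
}
```

For inputs whose product is exactly representable the two functions return identical results; they can differ in the last bit only when the intermediate product needs rounding.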
#pragma clang fp exceptions specifies floating point exception behavior. It may take one of the values: ignore, maytrap or strict. The meaning of these values is the same as for constrained floating point intrinsics.
{
  // Preserve floating point exceptions
  #pragma clang fp exceptions(strict)
  z = x + y;
  if (fetestexcept(FE_OVERFLOW))
    ...
}
A #pragma clang fp pragma may contain any number of options:
void func(float *dest, float a, float b) {
  #pragma clang fp exceptions(maytrap) contract(fast) reassociate(on)
  ...
}
#pragma clang fp eval_method allows floating-point behavior to be specified for a section of the source code. This pragma can appear at file or namespace scope, or at the start of a compound statement (excluding comments). The pragma is active within the scope of the compound statement.
When pragma clang fp eval_method(source) is enabled, the section of code governed by the pragma behaves as though the command-line option -ffp-eval-method=source is enabled. It rounds intermediate results to source-defined precision.
When pragma clang fp eval_method(double) is enabled, the section of code governed by the pragma behaves as though the command-line option -ffp-eval-method=double is enabled. It rounds intermediate results to double precision.
When pragma clang fp eval_method(extended) is enabled, the section of code governed by the pragma behaves as though the command-line option -ffp-eval-method=extended is enabled. It rounds intermediate results to target-dependent long double precision. In Win32 programming, for instance, the long double data type maps to the double, 64-bit precision data type.
The full syntax this pragma supports is #pragma clang fp eval_method(source|double|extended).
for(...) {
  // The compiler will use long double as the floating-point evaluation
  // method.
  #pragma clang fp eval_method(extended)
  a = b[i] * c[i] + e;
}
Note: math.h defines the typedefs float_t and double_t based on the active evaluation method at the point where the header is included, not where the typedefs are used. Because of this, it is unwise to combine these typedefs with #pragma clang fp eval_method. To catch obvious bugs, Clang will emit an error for any references to these typedefs within the scope of this pragma; however, this is not a fool-proof protection, and programmers must take care.
The #pragma float_control pragma allows precise floating-point semantics and floating-point exception behavior to be specified for a section of the source code. This pragma can only appear at file or namespace scope, within a language linkage specification or at the start of a compound statement (excluding comments). When used within a compound statement, the pragma is active within the scope of the compound statement. This pragma is modeled after a Microsoft pragma with the same spelling and syntax. For pragmas specified at file or namespace scope, or within a language linkage specification, a stack is supported so that the pragma float_control settings can be pushed or popped.
When pragma float_control(precise, on) is enabled, the section of code governed by the pragma uses precise floating point semantics: effectively, -ffast-math is disabled and -ffp-contract=on (fused multiply add) is enabled. This pragma enables -fmath-errno.
When pragma float_control(precise, off) is enabled, unsafe floating-point optimizations are enabled in the section of code governed by the pragma. Effectively, -ffast-math and -ffp-contract=fast are enabled. This pragma disables -fmath-errno.
When pragma float_control(except, on) is enabled, the section of code governed by the pragma behaves as though the command-line option -ffp-exception-behavior=strict is enabled. When pragma float_control(except, off) is enabled, the section of code governed by the pragma behaves as though the command-line option -ffp-exception-behavior=ignore is enabled.
The full syntax this pragma supports is float_control(except|precise, on|off [, push]) and float_control(push|pop). The push and pop forms, including using push as the optional third argument, can only occur at file scope.
for(...) {
  // This block will be compiled with -fno-fast-math and -ffp-contract=on
  #pragma float_control(precise, on)
  a = b[i] * c[i] + e;
}
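The push and pop forms can be sketched at file scope as follows. This is an illustrative C fragment with hypothetical function names (fast_sum, precise_sum); it assumes the settings in effect before the push are the command-line defaults.

```c
/* Save the current floating-point settings, then relax them for the
   definitions that follow. */
#pragma float_control(push)
#pragma float_control(precise, off)

/* Compiled with unsafe floating-point optimizations allowed, so the
   additions may be reassociated. */
static float fast_sum(const float *v, int n) {
  float s = 0.0f;
  for (int i = 0; i < n; ++i)
    s += v[i];
  return s;
}

/* Restore the settings that were in effect before the push. */
#pragma float_control(pop)

/* Compiled with the original settings again. */
static float precise_sum(const float *v, int n) {
  float s = 0.0f;
  for (int i = 0; i < n; ++i)
    s += v[i];
  return s;
}
```

Keeping the relaxed region between a matching push and pop limits its effect to the intended functions, even if the file is later included into a larger translation unit.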
Extensions for controlling atomic code generation¶
The [[clang::atomic]] statement attribute enables users to control how atomic operations are lowered in LLVM IR by conveying additional metadata to the backend. The primary goal is to allow users to specify certain options, like whether the affected atomic operations might be used with specific types of memory or whether to ignore denormal mode correctness in floating-point operations, without affecting the correctness of code that does not rely on these properties.
In LLVM, lowering of atomic operations (e.g., atomicrmw) can differ based on the target’s capabilities. Some backends support native atomic instructions only for certain operation types or alignments, or only in specific memory regions. Likewise, floating-point atomic instructions may or may not respect IEEE denormal requirements. When the user is unconcerned about denormal-mode compliance (for performance reasons) or knows that certain atomic operations will not be performed on a particular type of memory, extra hints are needed to tell the backend how to proceed.
A classic example is an architecture where floating-point atomic add does not fully conform to IEEE denormal-mode handling. If the user does not mind ignoring that aspect, they would prefer to emit a faster hardware atomic instruction, rather than a fallback or CAS loop. Conversely, on certain GPUs (e.g., AMDGPU), memory accessed via PCIe may only support a subset of atomic operations. To ensure correct and efficient lowering, the compiler must know whether the user needs the atomic operations to work with that type of memory.
The allowed atomic attribute values are remote_memory, fine_grained_memory, and ignore_denormal_mode, each optionally prefixed with no_. The meanings are as follows:
remote_memory means atomic operations may be performed on remote memory, i.e. memory accessed through off-chip interconnects (e.g., PCIe). On ROCm platforms using HIP, remote memory refers to memory accessed via PCIe and is subject to specific atomic operation support. See ROCm PCIe Atomics for further details. Prefixing with no_ (no_remote_memory) indicates that atomic operations should not be performed on remote memory.
fine_grained_memory means atomic operations may be performed on fine-grained memory, i.e. memory regions that support fine-grained coherence, where updates to memory are visible to other parts of the system even while modifications are ongoing. For example, in HIP, fine-grained coherence ensures that host and device share up-to-date data without explicit synchronization (see HIP Definition). Similarly, OpenCL 2.0 provides fine-grained synchronization in shared virtual memory allocations, allowing concurrent modifications by host and device (see OpenCL 2.0 Overview). Prefixing with no_ (no_fine_grained_memory) indicates that atomic operations should not be performed on fine-grained memory.
ignore_denormal_mode means that atomic operations are allowed to ignore correctness for denormal mode in floating-point operations, potentially improving performance on architectures that handle denormals inefficiently. The negated form, no_ignore_denormal_mode, enforces strict denormal mode correctness.
Any unspecified option is inherited from the global defaults, which can be set by a compiler flag or the target’s built-in defaults.
Within the same atomic attribute, duplicate and conflicting values are accepted, and the last of any conflicting values wins. Multiple atomic attributes are allowed for the same compound statement, and the last atomic attribute wins.
Without any atomic metadata, LLVM IR defaults to conservative settings for correctness: atomic operations enforce denormal mode correctness and are assumed to potentially use remote and fine-grained memory (i.e., the equivalent of remote_memory, fine_grained_memory, and no_ignore_denormal_mode).
The attribute may be applied only to a compound statement and looks like:
[[clang::atomic(remote_memory, fine_grained_memory, ignore_denormal_mode)]]
{
  // Atomic instructions in this block carry extra metadata reflecting
  // these user-specified options.
}
Compiler options set the defaults for these atomic-lowering options globally. The command-line format is:
$ clang -fatomic-remote-memory -fno-atomic-fine-grained-memory -fatomic-ignore-denormal-mode file.cpp
Each option has a corresponding flag: -fatomic-remote-memory / -fno-atomic-remote-memory, -fatomic-fine-grained-memory / -fno-atomic-fine-grained-memory, and -fatomic-ignore-denormal-mode / -fno-atomic-ignore-denormal-mode.
Code using the [[clang::atomic]] attribute can then selectively override the command-line defaults on a per-block basis. For instance:
// Suppose the global defaults assume:
//   remote_memory, fine_grained_memory, and no_ignore_denormal_mode
//   (for conservative correctness)
void example() {
  // Locally override the settings: disable remote_memory and enable
  // fine_grained_memory.
  [[clang::atomic(no_remote_memory, fine_grained_memory)]]
  {
    // In this block:
    // - Atomic operations are not performed on remote memory.
    // - Atomic operations are performed on fine-grained memory.
    // - The setting for denormal mode remains as the global default
    //   (typically no_ignore_denormal_mode, enforcing strict denormal mode correctness).
    // ...
  }
}
Function bodies do not accept statement attributes, so this will not work:
void func() [[clang::atomic(remote_memory)]] { // Wrong: applies to function type
}
Use the attribute on a compound statement within the function:
void func() {
  [[clang::atomic(remote_memory)]]
  {
    // Atomic operations in this block carry the specified metadata.
  }
}
The [[clang::atomic]] attribute affects only the code generation of atomic instructions within the annotated compound statement. Clang attaches target-specific metadata to those atomic instructions in the emitted LLVM IR to guide backend lowering. This metadata is fixed at the Clang code generation phase and is not modified by later LLVM passes (such as function inlining).
For example, consider:
inline void func() {
  [[clang::atomic(remote_memory)]]
  {
    // Atomic instructions lowered with metadata.
  }
}
void foo() {
  [[clang::atomic(no_remote_memory)]]
  {
    func(); // Inlined by LLVM, but the metadata from 'func()' remains unchanged.
  }
}
Although current usage focuses on AMDGPU, the mechanism is general. Other backends can ignore or implement their own responses to these flags if desired. If a target does not understand or enforce these hints, the IR remains valid, and the resulting program is still correct (although potentially less optimized for that user’s needs).
Specifying an attribute for multiple declarations (#pragma clang attribute)¶
The #pragma clang attribute directive can be used to apply an attribute to multiple declarations. The #pragma clang attribute push variation of the directive pushes a new “scope” of #pragma clang attribute that attributes can be added to. The #pragma clang attribute (...) variation adds an attribute to that scope, and the #pragma clang attribute pop variation pops the scope. You can also use #pragma clang attribute push (...), which is a shorthand for when you want to add one attribute to a new scope. Multiple push directives can be nested inside each other.
The attributes that are used in the #pragma clang attribute directives can be written using the GNU-style syntax:
#pragma clang attribute push (__attribute__((annotate("custom"))), apply_to = function)
void function(); // The function now has the annotate("custom") attribute
#pragma clang attribute pop
The attributes can also be written using the C++11 style syntax:
#pragma clang attribute push ([[noreturn]], apply_to = function)
void function(); // The function now has the [[noreturn]] attribute
#pragma clang attribute pop
The __declspec style syntax is also supported:
#pragma clang attribute push (__declspec(dllexport), apply_to = function)
void function(); // The function now has the __declspec(dllexport) attribute
#pragma clang attribute pop
A single push directive can contain multiple attributes; however, only one syntax style can be used within a single directive:
#pragma clang attribute push ([[noreturn, noinline]], apply_to = function)
void function1(); // The function now has the [[noreturn]] and [[noinline]] attributes
#pragma clang attribute pop
#pragma clang attribute push (__attribute((noreturn, noinline)), apply_to = function)
void function2(); // The function now has the __attribute((noreturn)) and __attribute((noinline)) attributes
#pragma clang attribute pop
Because multiple push directives can be nested, if you’re writing a macro that expands to _Pragma("clang attribute") it’s good hygiene (though not required) to add a namespace to your push/pop directives. A pop directive with a namespace will pop the innermost push that has that same namespace. This will ensure that another macro’s pop won’t inadvertently pop your attribute. Note that a pop without a namespace will pop the innermost push without a namespace. A push with a namespace can only be popped by a pop with the same namespace. For instance:
#define ASSUME_NORETURN_BEGIN _Pragma("clang attribute AssumeNoreturn.push ([[noreturn]], apply_to = function)")
#define ASSUME_NORETURN_END _Pragma("clang attribute AssumeNoreturn.pop")
#define ASSUME_UNAVAILABLE_BEGIN _Pragma("clang attribute Unavailable.push (__attribute__((unavailable)), apply_to=function)")
#define ASSUME_UNAVAILABLE_END _Pragma("clang attribute Unavailable.pop")
ASSUME_NORETURN_BEGIN
ASSUME_UNAVAILABLE_BEGIN
void function(); // function has [[noreturn]] and __attribute__((unavailable))
ASSUME_NORETURN_END
void other_function(); // function has __attribute__((unavailable))
ASSUME_UNAVAILABLE_END
Without the namespaces on the macros, other_function will be annotated with [[noreturn]] instead of __attribute__((unavailable)). This may seem like a contrived example, but it’s very possible for this kind of situation to appear in real code if the pragmas are spread out across a large file. You can test if your version of clang supports namespaces on #pragma clang attribute with __has_extension(pragma_clang_attribute_namespaces).
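One way to use that feature test is to define the namespaced macros only when the extension is available and fall back to empty macros otherwise. The sketch below is an assumption-laden illustration: the namespace MyLib, the macro names, the annotation string and the function mylib_answer are all hypothetical.

```c
/* Fallback for compilers that do not provide __has_extension. */
#ifndef __has_extension
#define __has_extension(x) 0
#endif

#if __has_extension(pragma_clang_attribute_namespaces)
/* Namespaced push/pop, so another macro's pop cannot close this region. */
#define MYLIB_ANNOTATE_BEGIN \
  _Pragma("clang attribute MyLib.push (__attribute__((annotate(\"mylib\"))), apply_to = function)")
#define MYLIB_ANNOTATE_END _Pragma("clang attribute MyLib.pop")
#else
/* Without the extension the macros expand to nothing. */
#define MYLIB_ANNOTATE_BEGIN
#define MYLIB_ANNOTATE_END
#endif

MYLIB_ANNOTATE_BEGIN
int mylib_answer(void); /* receives annotate("mylib") when supported */
MYLIB_ANNOTATE_END

int mylib_answer(void) { return 42; }
```

The empty fallback keeps the code portable at the cost of silently dropping the attribute, which is acceptable for purely informational attributes such as annotate but may not be for attributes that change semantics.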
Subject Match Rules¶
The set of declarations that receive a single attribute from the attribute stack depends on the subject match rules that were specified in the pragma. Subject match rules are specified after the attribute. The compiler expects an identifier that corresponds to the subject set specifier. The apply_to specifier is currently the only supported subject set specifier. It allows you to specify match rules that form a subset of the attribute’s allowed subject set, i.e. the compiler doesn’t require all of the attribute’s subjects. For example, an attribute like [[nodiscard]] whose subject set includes enum, record and hasType(functionType), requires the presence of at least one of these rules after apply_to:
#pragma clang attribute push([[nodiscard]], apply_to = enum)
enum Enum1 { A1, B1 }; // The enum will receive [[nodiscard]]
struct Record1 { }; // The struct will *not* receive [[nodiscard]]
#pragma clang attribute pop
#pragma clang attribute push([[nodiscard]], apply_to = any(record, enum))
enum Enum2 { A2, B2 }; // The enum will receive [[nodiscard]]
struct Record2 { }; // The struct *will* receive [[nodiscard]]
#pragma clang attribute pop
// This is an error, since [[nodiscard]] can't be applied to namespaces:
#pragma clang attribute push([[nodiscard]], apply_to = any(record, namespace))
#pragma clang attribute pop
Multiple match rules can be specified using the any match rule, as shown in the example above. The any rule applies attributes to all declarations that are matched by at least one of the rules in the any. It doesn’t nest and can’t be used inside the other match rules. Redundant match rules or rules that conflict with one another should not be used inside of any. Failing to specify a rule within the any rule results in an error.
Clang supports the following match rules:
function: Can be used to apply attributes to functions. This includes C++ member functions, static functions, operators, and constructors/destructors.
function(is_member): Can be used to apply attributes to C++ member functions. This includes members like static functions, operators, and constructors/destructors.
hasType(functionType): Can be used to apply attributes to functions, C++ member functions, and variables/fields whose type is a function pointer. It does not apply attributes to Objective-C methods or blocks.
type_alias: Can be used to apply attributes to typedef declarations and C++11 type aliases.
record: Can be used to apply attributes to struct, class, and union declarations.
record(unless(is_union)): Can be used to apply attributes only to struct and class declarations.
enum: Can be used to apply attributes to enumeration declarations.
enum_constant: Can be used to apply attributes to enumerators.
variable: Can be used to apply attributes to variables, including local variables, parameters, global variables, and static member variables. It does not apply attributes to instance member variables or Objective-C ivars.
variable(is_thread_local): Can be used to apply attributes to thread-local variables only.
variable(is_global): Can be used to apply attributes to global variables only.
variable(is_local): Can be used to apply attributes to local variables only.
variable(is_parameter): Can be used to apply attributes to parameters only.
variable(unless(is_parameter)): Can be used to apply attributes to all the variables that are not parameters.
field: Can be used to apply attributes to non-static member variables in a record. This includes Objective-C ivars.
namespace: Can be used to apply attributes to namespace declarations.
objc_interface: Can be used to apply attributes to @interface declarations.
objc_protocol: Can be used to apply attributes to @protocol declarations.
objc_category: Can be used to apply attributes to category declarations, including class extensions.
objc_method: Can be used to apply attributes to Objective-C methods, including instance and class methods. Implicit methods like implicit property getters and setters do not receive the attribute.
objc_method(is_instance): Can be used to apply attributes to Objective-C instance methods.
objc_property: Can be used to apply attributes to @property declarations.
block: Can be used to apply attributes to block declarations. This does not include variables/fields of block pointer type.
The use of unless in match rules is currently restricted to a strict set of sub-rules that are used by the supported attributes. That means that even though variable(unless(is_parameter)) is a valid match rule, variable(unless(is_thread_local)) is not.
Supported Attributes¶
Not all attributes can be used with the #pragma clang attribute directive. Notably, statement attributes like [[fallthrough]] or type attributes like address_space aren’t supported by this directive. You can determine whether or not an attribute is supported by the pragma by referring to the individual documentation for that attribute.
The attributes are applied to all matching declarations individually, even when the attribute is semantically incorrect. The attributes that aren’t applied to any declaration are not verified semantically.
Specifying section names for global objects (#pragma clang section)¶
The #pragma clang section directive provides a means to assign section names to global variables, functions and static variables.
The section names can be specified as:
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" relro="myRelro" text="myText"
The section names can be reverted back to the default name by supplying an empty string to the section kind, for example:
#pragma clang section bss="" data="" text="" rodata="" relro=""
The #pragma clang section directive obeys the following rules:
The pragma applies to all global variable, static variable and function declarations from the pragma to the end of the translation unit.
The pragma clang section is enabled automatically, without need of any flags.
This feature is only defined to work sensibly for ELF, Mach-O and COFF targets.
If a section name is specified through __attribute__((section("myname"))), then the attribute name takes precedence.
Global variables that are initialized to zero will be placed in the named bss section, if one is present.
The #pragma clang section directive does not try to infer section-kind from the name. For example, naming a section ".bss.mySec" does NOT mean it will be a bss section name.
The decision about which section-kind applies to each global is taken in the back-end. Once the section-kind is known, the appropriate section name, as specified by the user using the #pragma clang section directive, is applied to that global.
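The rules above can be illustrated with a minimal sketch. It assumes an ELF target; the section names are the ones used earlier in this section, while the variable and function names are hypothetical. Compilers that do not implement the pragma are expected to ignore it, usually with an unknown-pragma warning.

```c
/* Assumption: an ELF target. The identifiers below are illustrative only. */
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"

int call_count;        /* zero-initialized: expected in "myBSS" */
int start_value = 42;  /* initialized data: expected in "myData" */
const int step = 1;    /* read-only data: expected in "myRodata" */

/* Function code is expected in "myText". */
int next_value(void) {
  int result = start_value + call_count;
  call_count += step;
  return result;
}

/* Empty strings restore the default section names for what follows. */
#pragma clang section bss="" data="" rodata="" text=""

int default_placed = 7; /* back in the default data section */
```

The pragma changes only placement, not behavior, so the program runs the same with or without it. One way to check the placement is to inspect the section headers of the resulting object file, for example with readelf -S.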
Specifying Linker Options on ELF Targets¶
The #pragma comment(lib, ...) directive is supported on all ELF targets. The second parameter is the library name (without the traditional Unix prefix of lib). This allows you to provide an implicit link of dependent libraries.
Evaluating Object Size¶
Clang supports the builtins __builtin_object_size and __builtin_dynamic_object_size. The semantics are compatible with GCC's builtins of the same names, but the details are slightly different.
size_t __builtin_[dynamic_]object_size(const void *ptr, int type)
Returns the number of accessible bytes n past ptr. The value returned depends on type, which is required to be an integer constant between 0 and 3:
If type & 2 == 0, the least n is returned such that accesses to (const char*)ptr + n and beyond are known to be out of bounds. This is (size_t)-1 if no better bound is known.
If type & 2 == 2, the greatest n is returned such that accesses to (const char*)ptr + i are known to be in bounds, for 0 <= i < n. This is (size_t)0 if no better bound is known.
char small[10], large[100];
bool cond;
// Returns 100: writes of more than 100 bytes are known to be out of bounds.
int n100 = __builtin_object_size(cond ? small : large, 0);
// Returns 10: writes of 10 or fewer bytes are known to be in bounds.
int n10 = __builtin_object_size(cond ? small : large, 2);
If type & 1 == 0, pointers are considered to be in bounds if they point into the same storage as ptr (that is, the same stack object, global variable, or heap allocation).
If type & 1 == 1, pointers are considered to be in bounds if they point to the same subobject that ptr points to. If ptr points to an array element, other elements of the same array, but not of enclosing arrays, are considered in bounds.
struct X { char a, b, c; } x;
static_assert(__builtin_object_size(&x, 0) == 3);
static_assert(__builtin_object_size(&x.b, 0) == 2);
static_assert(__builtin_object_size(&x.b, 1) == 1);

char a[10][10][10];
static_assert(__builtin_object_size(&a, 1) == 1000);
static_assert(__builtin_object_size(&a[1], 1) == 900);
static_assert(__builtin_object_size(&a[1][1], 1) == 90);
static_assert(__builtin_object_size(&a[1][1][1], 1) == 9);
The values returned by this builtin are a best effort conservative approximation of the correct answers. When type & 2 == 0, the true value is less than or equal to the value returned by the builtin, and when type & 2 == 2, the true value is greater than or equal to the value returned by the builtin.
For __builtin_object_size, the value is determined entirely at compile time. With optimization enabled, better results will be produced, especially when the call to __builtin_object_size is in a different function from the formation of the pointer. Unlike in GCC, enabling optimization in Clang does not allow more information about subobjects to be determined, so the type & 1 == 1 case will often give imprecise results when used across a function call boundary even when optimization is enabled.
The pass_object_size and pass_dynamic_object_size attributes can be used to invisibly pass the object size for a pointer parameter alongside the pointer in a function call. This allows more precise object sizes to be determined both when building without optimizations and in the type & 1 == 1 case.
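As a hedged illustration of the attribute described above, the sketch below guards its use with __has_attribute so that it still compiles where the attribute is missing. PASS_OBJECT_SIZE, visible_size and size_of_local_buffer are hypothetical names introduced only for this example.

```c
#include <stddef.h>

#ifndef __has_attribute
#define __has_attribute(x) 0 /* Compatibility with compilers lacking it. */
#endif

#if __has_attribute(pass_object_size)
#define PASS_OBJECT_SIZE(type) __attribute__((pass_object_size(type)))
#else
#define PASS_OBJECT_SIZE(type) /* Attribute unavailable: nothing is passed. */
#endif

/* The attribute requires the pointer parameter itself to be const. */
size_t visible_size(const char *const data PASS_OBJECT_SIZE(0)) {
  /* With the attribute, this evaluates to the size seen at the call site. */
  return __builtin_object_size(data, 0);
}

size_t size_of_local_buffer(void) {
  char buffer[32] = {0};
  return visible_size(buffer);
}
```

Where the attribute is available, the call is expected to report the 32 bytes of the caller's buffer even without optimization. Where it is not, the builtin inside visible_size may report (size_t)-1, because the callee alone cannot see where the pointer came from.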
For __builtin_dynamic_object_size, the result is not limited to being a compile time constant. Instead, a small amount of runtime evaluation is permitted to determine the size of the object, in order to give a more precise result. __builtin_dynamic_object_size is meant to be used as a drop-in replacement for __builtin_object_size in libraries that support it. For instance, here is a program that __builtin_dynamic_object_size will make safer:
void copy_into_buffer(size_t size) {
  char* buffer = malloc(size);
  strlcpy(buffer, "some string", strlen("some string"));
  // Previous line preprocesses to:
  // __builtin___strlcpy_chk(buffer, "some string", strlen("some string"), __builtin_object_size(buffer, 0))
}
Since the size of buffer can't be known at compile time, Clang will fold __builtin_object_size(buffer, 0) into -1. However, if this was written as __builtin_dynamic_object_size(buffer, 0), Clang will fold it into size, providing some extra runtime safety.
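The following sketch shows one way a library might use these builtins for a bounds check. It is an illustration under stated assumptions, not how any particular library implements its checks: OBJECT_SIZE, fill_fixed and fill_checked are hypothetical names, and the dynamic builtin is selected through __has_builtin with a fallback to the compile-time one.

```c
#include <stddef.h>
#include <string.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* Compatibility with compilers lacking it. */
#endif

/* Prefer the dynamic builtin where it exists; otherwise fall back. */
#if __has_builtin(__builtin_dynamic_object_size)
#define OBJECT_SIZE(ptr, type) __builtin_dynamic_object_size(ptr, type)
#else
#define OBJECT_SIZE(ptr, type) __builtin_object_size(ptr, type)
#endif

static char fixed_buffer[16];

/* The array is named directly, so the bound (16) is known at compile time.
   Returns 0 on success, -1 if the fill would overrun the buffer. */
int fill_fixed(size_t count) {
  size_t bound = __builtin_object_size(fixed_buffer, 0);
  if (count > bound)
    return -1;
  memset(fixed_buffer, 'x', count);
  return 0;
}

/* For an arbitrary pointer the bound may be unknown, in which case the
   builtin yields (size_t)-1 and the comparison below never rejects. */
int fill_checked(char *dest, size_t count) {
  size_t bound = OBJECT_SIZE(dest, 0);
  if (count > bound)
    return -1;
  memset(dest, 'x', count);
  return 0;
}
```

Using type 0 keeps the check conservative: an unknown size maps to the largest possible value, so the check only fires when the compiler can prove an overrun.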
Deprecating Macros¶
Clang supports the pragma #pragma clang deprecated, which can be used to provide deprecation warnings for macro uses. For example:
#define MIN(x, y) x < y ? x : y
#pragma clang deprecated(MIN, "use std::min instead")

int min(int a, int b) {
  return MIN(a, b); // warning: MIN is deprecated: use std::min instead
}
#pragma clang deprecated should be preferred for this purpose over #pragma GCC warning because the warning can be controlled with -Wdeprecated.
Restricted Expansion Macros¶
Clang supports the pragma #pragma clang restrict_expansion, which can be used to restrict macro expansion in headers. This can be valuable when providing headers with ABI stability requirements. Any expansion of the annotated macro processed by the preprocessor after the #pragma annotation will log a warning. Redefining the macro or undefining the macro will not be diagnosed, nor will expansion of the macro within the main source file. For example:
#define TARGET_ARM 1
#pragma clang restrict_expansion(TARGET_ARM, "<reason>")

/// Foo.h
struct Foo {
#if TARGET_ARM // warning: TARGET_ARM is marked unsafe in headers: <reason>
  uint32_t X;
#else
  uint64_t X;
#endif
};

/// main.c
#include "foo.h"
#if TARGET_ARM // No warning in main source file
X_TYPE uint32_t
#else
X_TYPE uint64_t
#endif
This warning is controlled by -Wpedantic-macros.
Final Macros¶
Clang supports the pragma #pragma clang final, which can be used to mark macros as final, meaning they cannot be undef'd or re-defined. For example:
#define FINAL_MACRO 1
#pragma clang final(FINAL_MACRO)

#define FINAL_MACRO // warning: FINAL_MACRO is marked final and should not be redefined
#undef FINAL_MACRO  // warning: FINAL_MACRO is marked final and should not be undefined
This is useful for enforcing system-provided macros that should not be altered in user headers or code. This is controlled by -Wpedantic-macros. Final macros will always warn on redefinition, including situations with identical bodies and in system headers.
Line Control¶
Clang supports an extension for source line control, which takes the form of a preprocessor directive starting with an unsigned integral constant. In addition to the standard #line directive, this form allows control of an include stack and header file type, which is used in issuing diagnostics. These lines are emitted in preprocessed output.
# <line:number> <filename:string> <header-type:numbers>
The filename is optional, and if unspecified indicates no change in source filename. The header-type is an optional, whitespace-delimited, sequence of magic numbers as follows.
1: Push the current source file name onto the include stack and enter a new file.
2: Pop the include stack and return to the specified file. If the filename is "", the name popped from the include stack is used. Otherwise there is no requirement that the specified filename matches the current source when originally pushed.
3: Enter a system-header region. System headers often contain implementation-specific source that would normally emit a diagnostic.
4: Enter an implicit extern "C" region. This is not required on modern systems where system headers are C++-aware.
At most a single 1 or 2 can be present, and values must be in ascending order.
Examples are:
# 57 // Advance (or return) to line 57 of the current source file
# 57 "frob" // Set to line 57 of "frob"
# 1 "foo.h" 1 // Enter "foo.h" at line 1
# 59 "main.c" 2 // Leave current include and return to "main.c"
# 1 "/usr/include/stdio.h" 1 3 // Enter a system header
# 60 "" 2 // return to "main.c"
# 1 "/usr/ancient/header.h" 1 4 // Enter an implicit extern "C" header
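The effect of the simplest form (a line number plus a filename, with no header-type flags) can be observed from ordinary code through __LINE__ and __FILE__. The sketch below is a minimal illustration; "generated.c" and the function names are hypothetical.

```c
#include <string.h>

/* The directive below sets the presumed position: the line that follows it
   is treated as line 100 of "generated.c" (a hypothetical file name). */
# 100 "generated.c"
int line_after_marker(void) { return __LINE__; }
int marker_file_matches(void) { return strcmp(__FILE__, "generated.c") == 0; }
```

Diagnostics issued for the code after the directive are reported against the presumed file and line as well, which is why this form is typical of preprocessed output rather than hand-written source.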
Intrinsics Support within Constant Expressions¶
The following builtin intrinsics can be used in constant expressions:
__builtin_addcb
__builtin_addcs
__builtin_addc
__builtin_addcl
__builtin_addcll
__builtin_bitreverse8
__builtin_bitreverse16
__builtin_bitreverse32
__builtin_bitreverse64
__builtin_bswap16
__builtin_bswap32
__builtin_bswap64
__builtin_clrsb
__builtin_clrsbl
__builtin_clrsbll
__builtin_clz
__builtin_clzl
__builtin_clzll
__builtin_clzs
__builtin_clzg
__builtin_ctz
__builtin_ctzl
__builtin_ctzll
__builtin_ctzs
__builtin_ctzg
__builtin_ffs
__builtin_ffsl
__builtin_ffsll
__builtin_fmax
__builtin_fmin
__builtin_fpclassify
__builtin_inf
__builtin_isinf
__builtin_isinf_sign
__builtin_isfinite
__builtin_isnan
__builtin_isnormal
__builtin_nan
__builtin_nans
__builtin_parity
__builtin_parityl
__builtin_parityll
__builtin_popcount
__builtin_popcountl
__builtin_popcountll
__builtin_popcountg
__builtin_rotateleft8
__builtin_rotateleft16
__builtin_rotateleft32
__builtin_rotateleft64
__builtin_rotateright8
__builtin_rotateright16
__builtin_rotateright32
__builtin_rotateright64
__builtin_subcb
__builtin_subcs
__builtin_subc
__builtin_subcl
__builtin_subcll
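As a brief illustration, the sketch below uses a few of the listed builtins where C requires a constant expression: in static assertions and in the initializer of an object with static storage duration. The object and function names are hypothetical, and the argument values are chosen so that the results do not depend on the width of int.

```c
/* Each call is evaluated at compile time inside the static assertion. */
_Static_assert(__builtin_popcount(0xF0u) == 4, "four bits are set in 0xF0");
_Static_assert(__builtin_ctz(0x8u) == 3, "0x8 has three trailing zero bits");
_Static_assert(__builtin_ffs(0x8) == 4, "lowest set bit of 0x8, counting from 1");
_Static_assert(__builtin_parity(0x7u) == 1, "0x7 has an odd number of set bits");
_Static_assert(__builtin_bswap32(0x11223344u) == 0x44332211u, "bytes reversed");

/* Static initializers must be constant expressions, so this also relies on
   the builtin being evaluated during compilation. */
static const unsigned byte_swapped = __builtin_bswap32(0x11223344u);

unsigned swapped_bytes(void) { return byte_swapped; }
int set_bits_in_mask(void) { return __builtin_popcount(0xF0u); }
```

Builtins from the list that are less widely available, such as the rotate and bit-reverse families, can be guarded with __has_builtin or __has_constexpr_builtin before being used this way.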
The following x86-specific intrinsics can be used in constant expressions:
_addcarry_u32
_addcarry_u64
_bit_scan_forward
_bit_scan_reverse
__bsfd
__bsfq
__bsrd
__bsrq
__bswap
__bswapd
__bswap64
__bswapq
_castf32_u32
_castf64_u64
_castu32_f32
_castu64_f64
__lzcnt16
__lzcnt
__lzcnt64
_mm_popcnt_u32
_mm_popcnt_u64
_popcnt32
_popcnt64
__popcntd
__popcntq
__popcnt16
__popcnt
__popcnt64
__rolb
__rolw
__rold
__rolq
__rorb
__rorw
__rord
__rorq
_rotl
_rotr
_rotwl
_rotwr
_lrotl
_lrotr
_subborrow_u32
_subborrow_u64
Debugging the Compiler¶
Clang supports a number of pragma directives that help with debugging the compiler itself. The syntax is the following: #pragma clang __debug <command> <arguments>. Note that all debugging pragmas are subject to change.
dump¶
Accepts either a single identifier or an expression. When a single identifier is passed, the lookup results for the identifier are printed to stderr. When an expression is passed, the AST for the expression is printed to stderr. The expression is an unevaluated operand, so things like overload resolution and template instantiations are performed, but the expression has no runtime effects. Type- and value-dependent expressions are not supported yet.
This facility is designed to aid with testing name lookup machinery.
Predefined Macros¶
__GCC_DESTRUCTIVE_SIZE and __GCC_CONSTRUCTIVE_SIZE¶
Specify the minimum offset between two objects to avoid false sharing and the maximum size of contiguous memory to promote true sharing, respectively. These macros are predefined in all C and C++ language modes, but can be redefined on the command line with -D to specify different values as needed or can be undefined on the command line with -U to disable support for the feature.
Note: the values the macros expand to are not guaranteed to be stable. They are affected by architectures and CPU tuning flags, can change between releases of Clang and will not match the values defined by other compilers such as GCC.
Compiling different TUs depending on these flags (including use of std::hardware_constructive_interference_size or std::hardware_destructive_interference_size) with different compilers, macro definitions, or architecture flags will lead to ODR violations and should be avoided.
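A minimal sketch of the intended use follows: two counters that different threads would update are aligned so that they sit at least __GCC_DESTRUCTIVE_SIZE bytes apart. FALSE_SHARING_PAD, ring_counters and counter_spacing are hypothetical names, and the fallback value of 64 is an assumption for compilers that do not predefine the macro, not a value taken from any compiler.

```c
#include <stddef.h>

#ifdef __GCC_DESTRUCTIVE_SIZE
#define FALSE_SHARING_PAD __GCC_DESTRUCTIVE_SIZE
#else
#define FALSE_SHARING_PAD 64 /* Assumed fallback; not provided by the compiler. */
#endif

/* Each counter starts on its own FALSE_SHARING_PAD-aligned boundary. */
struct ring_counters {
  _Alignas(FALSE_SHARING_PAD) unsigned long produced;
  _Alignas(FALSE_SHARING_PAD) unsigned long consumed;
};

_Static_assert(offsetof(struct ring_counters, consumed) -
                       offsetof(struct ring_counters, produced) >=
                   FALSE_SHARING_PAD,
               "counters must not share a destructive-interference region");

size_t counter_spacing(void) {
  return offsetof(struct ring_counters, consumed) -
         offsetof(struct ring_counters, produced);
}
```

Because the layout of ring_counters depends on the macro value, a type like this is best kept out of interfaces shared between separately compiled components, in line with the warning above.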
#embed Parameters¶
clang::offset¶
The clang::offset embed parameter may appear zero or one time in the embed parameter sequence. Its preprocessor argument clause shall be present and have the form:
( constant-expression )
and shall be an integer constant expression. The integer constant expression shall not evaluate to a value less than 0. The token defined shall not appear within the constant expression.
The offset will be used when reading the contents of the embedded resource to specify the starting offset to begin embedding from. The resource is treated as being empty if the specified offset is larger than the number of bytes in the resource. The offset will be applied before any limit parameters are applied.
Union and aggregate initialization in C¶
In C23 (N2900), when an object is initialized from initializer = {}, all elements of arrays, all members of structs, and the first members of unions are empty-initialized recursively. In addition, all padding bits are initialized to zero.
Clang guarantees the following behaviors:
1: Clang supports initializer = {} mentioned above in all C standards.
2: When unions are initialized from initializer = {}, bytes outside of the first members of unions are also initialized to zero.
3: When unions, structures and arrays are initialized from initializer = { initializer-list }, all members not explicitly initialized in the initializer list are empty-initialized recursively. In addition, all padding bits are initialized to zero.
Currently, the above extension only applies to C source code, not C++.
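The guarantees above can be observed with a short sketch. The type and function names are hypothetical. The check on padding inside a partially initialized structure is deliberately left out, since only the documented Clang behavior covers it.

```c
#include <string.h>

union number { char tag; long long wide; };
struct record { char kind; int id; long long total; };

/* Returns 1 if every byte of an empty-initialized union is zero, including
   the bytes that lie outside the first member. */
int empty_union_is_all_zero(void) {
  union number value = {};
  unsigned char zeros[sizeof(union number)] = {0};
  return memcmp(&value, zeros, sizeof value) == 0;
}

/* Members without an explicit initializer are empty-initialized. */
int remaining_members_are_zero(void) {
  struct record r = {.kind = 'k'};
  return r.kind == 'k' && r.id == 0 && r.total == 0;
}
```

Objects with automatic storage duration are used on purpose: objects with static storage duration are zero-initialized anyway, so they would not show the effect of the initializer.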
Empty Objects in C¶
The declaration of a structure or union type which has no named members is undefined behavior (C23 and earlier) or implementation-defined behavior (C2y). Clang allows the declaration of a structure or union type with no named members in all C language modes. sizeof for such a type returns 0, which is different behavior than in C++ (where the size of such an object is typically 1).
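A minimal sketch of the behavior described above, using hypothetical type and function names. It is meant to be compiled as C, since the same declarations have a different size in C++.

```c
#include <stddef.h>

/* Accepted as an extension in C; neither type has any named member. */
struct empty_struct {};
union empty_union {};

size_t empty_struct_size(void) { return sizeof(struct empty_struct); }
size_t empty_union_size(void) { return sizeof(union empty_union); }
```

Code shared between C and C++ should not rely on the size of such types, because the two languages disagree on it.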
Qualified function types in C¶
Declaring a function with a qualified type in C is undefined behavior (C23 and earlier) or implementation-defined behavior (C2y). Clang allows a function type to be specified with the const and volatile qualifiers, but ignores the qualifications.
typedef int f(void);
const volatile f func; // Qualifier on function type has no effect.
Note, Clang does not allow an _Atomic function type because of explicit constraints against atomically qualified (arrays and) function types.
Underspecified Object Declarations in C¶
C23 introduced the notion of underspecified object declarations (note, the final standards text is different from WG14 N3006 due to changes during national body comment review). When an object is declared with the constexpr storage class specifier or has a deduced type (with the auto specifier), it is said to be "underspecified". Underspecified declarations have different requirements than non-underspecified declarations. In particular, the identifier being declared cannot be used in its initialization. e.g.,
auto x = x;          // Invalid
constexpr int y = y; // Invalid
The standard leaves it implementation-defined whether an underspecified declaration may introduce additional identifiers as part of the declaration.
Clang allows additional identifiers to be declared in the following cases:
A compound literal may introduce a new type. e.g.,
auto x = (struct S { int x, y; }){ 1, 2 };      // Accepted by Clang
constexpr int i = (struct T { int x; }){ 1 }.x; // Accepted by Clang
The type specifier for a constexpr declaration may define a new type. e.g.,
constexpr struct S { int x; } s = { 1 }; // Accepted by Clang
A function declarator may be declared with parameters, including parameters which introduce a new type. e.g.,
constexpr int (*fp)(int x) = nullptr;              // Accepted by Clang
auto f = (void (*)(struct S { int x; } s))nullptr; // Accepted by Clang
The initializer may contain a GNU statement expression which defines new types or objects. e.g.,
constexpr int i = ({ // Accepted by Clang
  constexpr int x = 12;
  constexpr struct S { int x; } s = { x };
  s.x;
});
auto x = ({ struct S { int x; } s = { 0 }; s; }); // Accepted by Clang
Clang intentionally does not implement the changed scoping rules from C23 for underspecified declarations. Doing so would significantly complicate the implementation in order to get reasonable diagnostic behavior. As a consequence, Clang fails to reject some code that should be rejected. e.g.,
// This should be rejected because 'x' is not in scope within the initializer
// of an underspecified declaration. Clang accepts because it treats the scope
// of the identifier as beginning immediately after the declarator, same as with
// a non-underspecified declaration.
constexpr int x = sizeof(x);

// Clang rejects this code with a diagnostic about using the variable within its
// own initializer rather than rejecting the code with an undeclared identifier
// diagnostic.
auto x = x;