

6.49 Using Vector Instructions through Built-in Functions

On some targets, the instruction set contains SIMD vector instructions which operate on multiple values contained in one large register at the same time. For example, on the i386 the MMX, 3DNow! and SSE extensions can be used this way.

The first step in using these extensions is to provide the necessary data types. This should be done using an appropriate typedef:

     typedef int v4si __attribute__ ((vector_size (16)));

The int type specifies the base type, while the attribute specifies the vector size for the variable, measured in bytes. For example, the declaration above causes the compiler to set the mode for the v4si type to be 16 bytes wide and divided into int sized units. For a 32-bit int this means a vector of 4 units of 4 bytes, and the corresponding mode is V4SI.

The vector_size attribute is only applicable to integral and float scalars, although arrays, pointers, and function return values are allowed in conjunction with this construct. Only sizes that are a power of two are currently allowed.

All the basic integer types can be used as base types, both as signed and as unsigned: char, short, int, long, long long. In addition, float and double can be used to build floating-point vector types.
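
As a minimal sketch of these rules, the declarations below build 16-byte vector types over several base types and check their element counts at compile time. The type names (v16qi, v8hu, v4sf, v2df) and the VEC_LEN macro are illustrative choices, not names that GCC defines, and the expected counts assume the common sizes of 2 bytes for short, 4 bytes for float and 8 bytes for double.

```c
/* Illustrative type names; GCC does not predefine them.  */
typedef signed char    v16qi __attribute__ ((vector_size (16)));
typedef unsigned short v8hu  __attribute__ ((vector_size (16)));
typedef float          v4sf  __attribute__ ((vector_size (16)));
typedef double         v2df  __attribute__ ((vector_size (16)));

/* Hypothetical helper macro: number of elements in the vector type
   VTYPE whose base type is BASE.  */
#define VEC_LEN(vtype, base) (sizeof (vtype) / sizeof (base))

/* Compile-time checks (C11).  They assume a 2-byte short, a 4-byte
   float and an 8-byte double.  */
_Static_assert (VEC_LEN (v16qi, signed char) == 16, "16 chars");
_Static_assert (VEC_LEN (v8hu, unsigned short) == 8, "8 shorts");
_Static_assert (VEC_LEN (v4sf, float) == 4, "4 floats");
_Static_assert (VEC_LEN (v2df, double) == 2, "2 doubles");
```

The size given to vector_size is always in bytes, so the element count follows from the size of the base type.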

Specifying a combination that is not valid for the current architecture causes GCC to synthesize the instructions using a narrower mode. For example, if you specify a variable of type V4SI and your architecture does not allow for this specific SIMD type, GCC produces code that uses 4 SIs.

The types defined in this manner can be used with a subset of normal C operations. Currently, GCC allows using the following operators on these types: +, -, *, /, unary minus, ^, |, &, ~, %.

The operations behave like C++ valarrays. Addition is defined as the addition of the corresponding elements of the operands. For example, in the code below, each of the 4 elements in a is added to the corresponding element in b and the resulting vector is stored in c.

     typedef int v4si __attribute__ ((vector_size (16)));

     v4si a, b, c;

     c = a + b;

Subtraction, multiplication, division, and the logical operations operate in a similar manner. Likewise, the result of using the unary minus or complement operators on a vector type is a vector whose elements are the negative or complemented values of the corresponding elements in the operand.
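
The following sketch combines several of these operators. The function names sub_mul and neg_then_not are hypothetical helpers introduced only for illustration.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: computes (a - b) * c element by element.  */
v4si
sub_mul (v4si a, v4si b, v4si c)
{
  return (a - b) * c;
}

/* Hypothetical helper: negates each element, then complements the
   result, so element x becomes ~(-x).  */
v4si
neg_then_not (v4si a)
{
  return ~(-a);
}
```

For instance, sub_mul applied to {10,20,30,40}, {1,2,3,4} and {2,2,2,2} yields {18,36,54,72}.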

It is possible to use shifting operators <<, >> on integer-type vectors. The operation is defined as follows: {a0, a1, ..., an} >> {b0, b1, ..., bn} == {a0 >> b0, a1 >> b1, ..., an >> bn}. Vector operands must have the same number of elements.
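
A short sketch of an element-wise shift follows. The type name v4su and the function shift_right_each are illustrative, and the example assumes a 32-bit unsigned int so that the vector holds 4 elements. As with scalar shifts, it is safest to keep every count below the width of the element type.

```c
typedef unsigned int v4su __attribute__ ((vector_size (16)));

/* Hypothetical helper: shifts each element of VALUES right by the
   count stored at the same position in COUNTS.  */
v4su
shift_right_each (v4su values, v4su counts)
{
  return values >> counts;
}
```

With values {16,16,16,16} and counts {0,1,2,3} the result is {16,8,4,2}.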

For convenience, it is allowed to use a binary vector operation where one operand is a scalar. In that case the compiler transforms the scalar operand into a vector where each element is the scalar from the operation. The transformation happens only if the scalar could be safely converted to the vector-element type. Consider the following code.

     typedef int v4si __attribute__ ((vector_size (16)));

     v4si a, b, c;
     long l;

     a = b + 1;    /* a = b + {1,1,1,1}; */
     a = 2 * b;    /* a = {2,2,2,2} * b; */

     a = l + a;    /* Error, cannot convert long to int. */

Vectors can be subscripted as if the vector were an array with the same number of elements and base type. Out of bound accesses invoke undefined behavior at run time. Warnings for out of bound accesses for vector subscription can be enabled with -Warray-bounds.
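
As a brief sketch of subscripting, the helpers below read and write individual elements. The names horizontal_sum and set_element are hypothetical, and both assume a 4-element vector of int.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: adds up the four elements of V by subscripting
   it like an array.  */
int
horizontal_sum (v4si v)
{
  int sum = 0;
  for (int i = 0; i < 4; i++)
    sum += v[i];
  return sum;
}

/* Hypothetical helper: returns a copy of V with element INDEX replaced
   by VALUE.  INDEX must be in the range 0..3; any other value is an
   out of bound access and therefore undefined.  */
v4si
set_element (v4si v, int index, int value)
{
  v[index] = value;
  return v;
}
```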

Vector comparison is supported with standard comparison operators: ==, !=, <, <=, >, >=. Comparison operands can be vector expressions of integer-type or real-type. Comparison between integer-type vectors and real-type vectors is not supported. The result of the comparison is a vector of the same width and number of elements as the comparison operands with a signed integral element type.

Vectors are compared element-wise producing 0 when comparison is false and -1 (constant of the appropriate type where all bits are set) otherwise. Consider the following example.

     typedef int v4si __attribute__ ((vector_size (16)));

     v4si a = {1,2,3,4};
     v4si b = {3,2,1,4};
     v4si c;

     c = a >  b;     /* The result would be {0, 0,-1, 0}  */
     c = a == b;     /* The result would be {0,-1, 0,-1}  */
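
Because a true comparison yields an element with all bits set, the result can serve directly as a bit mask. The sketch below uses that property to select the larger element from each pair; max_each is a hypothetical helper name, and this is one possible idiom rather than a prescribed one.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: element-wise maximum built from a comparison
   mask.  */
v4si
max_each (v4si a, v4si b)
{
  v4si mask = a > b;   /* -1 where a[i] > b[i], 0 elsewhere */

  /* Keep a[i] where the mask is set and b[i] where it is clear.  */
  return (a & mask) | (b & ~mask);
}
```

With a = {1,2,3,4} and b = {3,2,1,4}, the mask is {0,0,-1,0} and the result is {3,2,3,4}.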

Vector shuffling is available using functions __builtin_shuffle (vec, mask) and __builtin_shuffle (vec0, vec1, mask). Both functions construct a permutation of elements from one or two vectors and return a vector of the same type as the input vector(s). The mask is an integral vector with the same width (W) and element count (N) as the output vector.

The elements of the input vectors are numbered in memory ordering of vec0 beginning at 0 and vec1 beginning at N. The elements of mask are considered modulo N in the single-operand case and modulo 2*N in the two-operand case.

Consider the following example.

     typedef int v4si __attribute__ ((vector_size (16)));

     v4si a = {1,2,3,4};
     v4si b = {5,6,7,8};
     v4si mask1 = {0,1,1,3};
     v4si mask2 = {0,4,2,5};
     v4si res;

     res = __builtin_shuffle (a, mask1);       /* res is {1,2,2,4}  */
     res = __builtin_shuffle (a, b, mask2);    /* res is {1,5,3,6}  */
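
Two further permutations can be sketched with the same built-in. The helper names reverse_v4si and interleave_low are hypothetical. In the two-operand call, mask indices 0 to 3 select from the first vector and 4 to 7 from the second, following the numbering rule described earlier.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: reverses the element order of V.  */
v4si
reverse_v4si (v4si v)
{
  v4si mask = {3, 2, 1, 0};
  return __builtin_shuffle (v, mask);
}

/* Hypothetical helper: interleaves the first two elements of A and B,
   giving {a[0], b[0], a[1], b[1]}.  */
v4si
interleave_low (v4si a, v4si b)
{
  v4si mask = {0, 4, 1, 5};
  return __builtin_shuffle (a, b, mask);
}
```

With a = {1,2,3,4} and b = {5,6,7,8}, reverse_v4si (a) gives {4,3,2,1} and interleave_low (a, b) gives {1,5,2,6}.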

Note that __builtin_shuffle is intentionally semantically compatible with the OpenCL shuffle and shuffle2 functions.

You can declare variables and use them in function calls and returns, as well as in assignments and some casts. You can specify a vector type as a return type for a function. Vector types can also be used as function arguments. It is possible to cast from one vector type to another, provided they are of the same size (in fact, you can also cast vectors to and from other datatypes of the same size).

You cannot operate between vectors of different lengths or different signedness without a cast.
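
The sketch below illustrates both uses of a cast. The type names v4su and v4sf and the helpers add_mixed and float_bits are illustrative. A cast between vector types of the same size reinterprets the bits of the operand and does not convert element values; the bit pattern mentioned for 1.0f assumes IEEE 754 single-precision floats.

```c
typedef int          v4si __attribute__ ((vector_size (16)));
typedef unsigned int v4su __attribute__ ((vector_size (16)));
typedef float        v4sf __attribute__ ((vector_size (16)));

/* Hypothetical helper: adds a signed and an unsigned vector.  The cast
   is needed because the operands differ in signedness.  */
v4su
add_mixed (v4si a, v4su b)
{
  return (v4su) a + b;
}

/* Hypothetical helper: reinterprets the bits of a float vector as
   integers.  This is not a value conversion: an element holding 1.0f
   becomes 0x3f800000 (assuming IEEE 754), not 1.  */
v4si
float_bits (v4sf v)
{
  return (v4si) v;
}
```

To convert element values rather than bits, one option is to convert the elements one at a time through subscripting.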
