6.12.2 Extended Asm - Assembler Instructions with C Expression Operands

With extended asm you can read and write C variables from assembler and perform jumps from assembler code to C labels. Extended asm syntax uses colons (‘:’) to delimit the operand parameters after the assembler template:

    asm asm-qualifiers ( AssemblerTemplate
                     : OutputOperands
                     [ : InputOperands
                     [ : Clobbers ] ])

    asm asm-qualifiers ( AssemblerTemplate
                          : OutputOperands
                          : InputOperands
                          : Clobbers
                          : GotoLabels)

where in the last form, asm-qualifiers contains goto (and in the first form, not).

The asm keyword is a GNU extension. When writing code that can be compiled with -ansi and the various -std options, use __asm__ instead of asm (see Alternate Keywords).

Qualifiers

volatile

The typical use of extended asm statements is to manipulate input values to produce output values. However, your asm statements may also produce side effects. If so, you may need to use the volatile qualifier to disable certain optimizations. See Volatile.

inline

If you use the inline qualifier, then for inlining purposes the size of the asm statement is taken as the smallest size possible (see Size of an asm).

goto

This qualifier informs the compiler that the asm statement may perform a jump to one of the labels listed in the GotoLabels. See GotoLabels.

Parameters

AssemblerTemplate

This is a literal string that is the template for the assembler code. It is a combination of fixed text and tokens that refer to the input, output, and goto parameters. See AssemblerTemplate.

OutputOperands

A comma-separated list describing the C variables modified by the instructions in the AssemblerTemplate. An empty list is permitted. See OutputOperands.

InputOperands

A comma-separated list describing the C expressions read by the instructions in the AssemblerTemplate. An empty list is permitted. See InputOperands.

Clobbers

A comma-separated list of registers or other values changed by the AssemblerTemplate, beyond those listed as outputs. An empty list is permitted. See Clobbers and Scratch Registers.

GotoLabels

When you are using the goto form of asm, this section contains the list of all C labels to which the code in the AssemblerTemplate may jump. See GotoLabels.

asm statements may not perform jumps into other asm statements, only to the listed GotoLabels. GCC’s optimizers do not know about other jumps; therefore they cannot take account of them when deciding how to optimize.

The total number of input + output + goto operands is limited to 30.

Remarks

The asm statement allows you to include assembly instructions directly within C code. This may help you to maximize performance in time-sensitive code or to access assembly instructions that are not readily available to C programs.

Similarly to basic asm, extended asm statements may be used either inside a C function or at file scope (“top-level”), where you can use this technique to emit assembler directives, define assembly language macros that can be invoked elsewhere in the file, or write entire functions in assembly language. Extended asm statements outside of functions may not use any qualifiers, may not specify clobbers, may not use %, + or & modifiers in constraints and can only use constraints which don’t allow using any register.

Functions declared with the naked attribute require basic asm (see Declaring Attributes of Functions).

While the uses of asm are many and varied, it may help to think of an asm statement as a series of low-level instructions that convert input parameters to output parameters. So a simple (if not particularly useful) example for i386 using asm might look like this:

    int src = 1;
    int dst;

    asm ("mov %1, %0\n\t"
        "add $1, %0"
        : "=r" (dst)
        : "r" (src));

    printf("%d\n", dst);

This code copies src to dst and adds 1 to dst.

6.12.2.1 Volatile

GCC’s optimizers sometimes discard asm statements if they determine there is no need for the output variables. Also, the optimizers may move code out of loops if they believe that the code will always return the same result (i.e. none of its input values change between calls). Using the volatile qualifier disables these optimizations. asm statements that have no output operands and asm goto statements are implicitly volatile.

This i386 code demonstrates a case that does not use (or require) the volatile qualifier. If it is performing assertion checking, this code uses asm to perform the validation. Otherwise, dwRes is unreferenced by any code. As a result, the optimizers can discard the asm statement, which in turn removes the need for the entire DoCheck routine. By omitting the volatile qualifier when it isn’t needed you allow the optimizers to produce the most efficient code possible.

    void DoCheck(uint32_t dwSomeValue)
    {
       uint32_t dwRes;

       // Assumes dwSomeValue is not zero.
       asm ("bsfl %1,%0"
         : "=r" (dwRes)
         : "r" (dwSomeValue)
         : "cc");

       assert(dwRes > 3);
    }

The next example shows a case where the optimizers can recognize that the input (dwSomeValue) never changes during the execution of the function and can therefore move the asm outside the loop to produce more efficient code. Again, using the volatile qualifier disables this type of optimization.

    void do_print(uint32_t dwSomeValue)
    {
       uint32_t dwRes;

       for (uint32_t x=0; x < 5; x++)
       {
          // Assumes dwSomeValue is not zero.
          asm ("bsfl %1,%0"
            : "=r" (dwRes)
            : "r" (dwSomeValue)
            : "cc");

          printf("%u: %u %u\n", x, dwSomeValue, dwRes);
       }
    }

The following example demonstrates a case where you need to use the volatile qualifier. It uses the x86 rdtsc instruction, which reads the computer’s time-stamp counter. Without the volatile qualifier, the optimizers might assume that the asm block will always return the same value and therefore optimize away the second call.

    uint64_t msr;

    asm volatile ( "rdtsc\n\t"    // Returns the time in EDX:EAX.
            "shl $32, %%rdx\n\t"  // Shift the upper bits left.
            "or %%rdx, %0"        // 'Or' in the lower bits.
            : "=a" (msr)
            :
            : "rdx");

    printf("msr: %llx\n", msr);

    // Do other work...

    // Reprint the timestamp
    asm volatile ( "rdtsc\n\t"    // Returns the time in EDX:EAX.
            "shl $32, %%rdx\n\t"  // Shift the upper bits left.
            "or %%rdx, %0"        // 'Or' in the lower bits.
            : "=a" (msr)
            :
            : "rdx");

    printf("msr: %llx\n", msr);

GCC’s optimizers do not treat this code like the non-volatile code in the earlier examples. They do not move it out of loops or omit it on the assumption that the result from a previous call is still valid.

Note that the compiler can move even volatile asm instructions relative to other code, including across jump instructions. For example, on many targets there is a system register that controls the rounding mode of floating-point operations. Setting it with a volatile asm statement, as in the following PowerPC example, does not work reliably.

    asm volatile("mtfsf 255, %0" : : "f" (fpenv));
    sum = x + y;

The compiler may move the addition back before the volatile asm statement. To make it work as expected, add an artificial dependency to the asm by referencing a variable in the subsequent code, for example:

    asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
    sum = x + y;

Under certain circumstances, GCC may duplicate (or remove duplicates of) your assembly code when optimizing. This can lead to unexpected duplicate symbol errors during compilation if your asm code defines symbols or labels. Using ‘%=’ (see AssemblerTemplate) may help resolve this problem.

6.12.2.2 Assembler Template

An assembler template is a literal string containing assembler instructions. In C++ with -std=gnu++11 or later, the assembler template can also be a constant expression inside parentheses (see C++11 Constant Expressions instead of String Literals).

The compiler replaces tokens in the template that refer to inputs, outputs, and goto labels, and then outputs the resulting string to the assembler. The string can contain any instructions recognized by the assembler, including directives. GCC does not parse the assembler instructions themselves and does not know what they mean or even whether they are valid assembler input. However, it does count the statements (see Size of an asm).

You may place multiple assembler instructions together in a single asm string, separated by the characters normally used in assembly code for the system. A combination that works in most places is a newline to break the line, plus a tab character to move to the instruction field (written as ‘\n\t’). Some assemblers allow semicolons as a line separator. However, note that some assembler dialects use semicolons to start a comment.

Do not expect a sequence of asm statements to remain perfectly consecutive after compilation, even when you are using the volatile qualifier. If certain instructions need to remain consecutive in the output, put them in a single multi-instruction asm statement.

Accessing data from C programs without using input/output operands (such as by using global symbols directly from the assembler template) may not work as expected. Similarly, calling functions directly from an assembler template requires a detailed understanding of the target assembler and ABI.

Since GCC does not parse the assembler template, it has no visibility of any symbols it references. This may result in GCC discarding those symbols as unreferenced unless they are also listed as input, output, or goto operands.

Special format strings

In addition to the tokens described by the input, output, and goto operands, these tokens have special meanings in the assembler template:

%%

Outputs a single ‘%’ into the assembler code.

%=

Outputs a number that is unique to each instance of the asm statement in the entire compilation. This option is useful when creating local labels and referring to them multiple times in a single template that generates multiple assembler instructions.

%{
%|
%}

Outputs ‘{’, ‘|’, and ‘}’ characters (respectively) into the assembler code. When unescaped, these characters have special meaning to indicate multiple assembler dialects, as described below.
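As a rough illustration of ‘%=’, the sketch below clamps a negative value to zero by jumping over one instruction to a local label. It assumes an x86 target and the default AT&T dialect; the function name clamp_nonneg and the label stem .Ldone are invented for this example, and other targets take a plain C fallback.

```c
/* Hypothetical helper: returns v, or 0 when v is negative. */
static inline int clamp_nonneg(int v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("test %0, %0\n\t"      /* Set the sign flag from v. */
             "jns .Ldone%=\n\t"     /* Non-negative: skip the reset. */
             "xor %0, %0\n"         /* Negative: v = 0. */
             ".Ldone%=:"            /* %= makes this label unique per asm instance. */
             : "+r" (v)
             :
             : "cc");
#else
    if (v < 0)                      /* Portable fallback for other targets. */
        v = 0;
#endif
    return v;
}
```

Because the label name contains ‘%=’, the compiler can inline or duplicate this function several times in one translation unit without producing duplicate-label errors.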

Multiple assembler dialects in asm templates

On targets such as x86, GCC supports multiple assembler dialects. The -masm option controls which dialect GCC uses as its default for inline assembler. The target-specific documentation for the -masm option contains the list of supported dialects, as well as the default dialect if the option is not specified. This information may be important to understand, since assembler code that works correctly when compiled using one dialect will likely fail if compiled using another. See x86 Options.

If your code needs to support multiple assembler dialects (for example, if you are writing public headers that need to support a variety of compilation options), use constructs of this form:

    { dialect0 | dialect1 | dialect2... }

This construct outputs dialect0 when using dialect #0 to compile the code, dialect1 for dialect #1, etc. If there are fewer alternatives within the braces than the number of dialects the compiler supports, the construct outputs nothing.

For example, if an x86 compiler supports two dialects (‘att’, ‘intel’), an assembler template such as this:

    "bt{l %[Offset],%[Base] | %[Base],%[Offset]}; jc %l2"

is equivalent to one of

    "btl %[Offset],%[Base] ; jc %l2"   /* att dialect */
    "bt %[Base],%[Offset]; jc %l2"     /* intel dialect */

Using that same compiler, this code:

    "xchg{l}\t{%%}ebx, %1"

corresponds to either

    "xchgl\t%%ebx, %1"                 /* att dialect */
    "xchg\tebx, %1"                    /* intel dialect */

There is no support for nesting dialect alternatives.
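A complete function built on the dialect construct might look like the following sketch. It assumes an x86 target whose compiler supports the ‘att’ and ‘intel’ dialects; the name add_any_dialect is hypothetical, and other targets use ordinary C addition.

```c
/* Hypothetical helper: adds b to a with a template that is valid
   under both -masm=att and -masm=intel. */
static inline int add_any_dialect(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* att dialect:   "addl %1, %0"  (source first)
       intel dialect: "add %0, %1"   (destination first) */
    __asm__ ("add{l %1, %0| %0, %1}"
             : "+r" (a)
             : "r" (b)
             : "cc");
#else
    a += b;                         /* Portable fallback for other targets. */
#endif
    return a;
}
```

Only the parts that differ between the dialects (the size suffix and the operand order) sit inside the braces; the shared mnemonic stays outside.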

6.12.2.3 Output Operands

An asm statement has zero or more output operands indicating the names of C variables modified by the assembler code.

In this i386 example, old (referred to in the template string as %0) and *Base (as %1) are outputs and Offset (%2) is an input:

    bool old;

    __asm__ ("btsl %2,%1\n\t" // Turn on zero-based bit #Offset in Base.
             "sbb %0,%0"      // Use the CF to calculate old.
       : "=r" (old), "+rm" (*Base)
       : "Ir" (Offset)
       : "cc");

    return old;

Operands are separated by commas. Each operand has this format:

    [ [asmSymbolicName] ] constraint (cvariablename)

asmSymbolicName

Specifies an optional symbolic name for the operand. The literal square brackets ‘[]’ around the asmSymbolicName are required both in the operand specification and references to the operand in the assembler template, i.e. ‘%[Value]’. The scope of the name is the asm statement that contains the definition. Any valid C identifier is acceptable, including names already defined in the surrounding code. No two operands within the same asm statement can use the same symbolic name.

When not using an asmSymbolicName, use the (zero-based) position of the operand in the list of operands in the assembler template. For example if there are three output operands, use ‘%0’ in the template to refer to the first, ‘%1’ for the second, and ‘%2’ for the third.

constraint

A string constant specifying constraints on the placement of the operand; see Constraints for asm Operands for details. In C++ with -std=gnu++11 or later, the constraint can also be a constant expression inside parentheses (see C++11 Constant Expressions instead of String Literals).

Output constraints must begin with either ‘=’ (a variable overwriting an existing value) or ‘+’ (when reading and writing). When using ‘=’, do not assume the location contains the existing value on entry to the asm, except when the operand is tied to an input; see Input Operands.

After the prefix, there must be one or more additional constraints (see Constraints for asm Operands) that describe where the value resides. Common constraints include ‘r’ for register and ‘m’ for memory. When you list more than one possible location (for example, "=rm"), the compiler chooses the most efficient one based on the current context. If you list as many alternates as the asm statement allows, you permit the optimizers to produce the best possible code. If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, local register variables may provide a solution (see Specifying Registers for Local Variables).

cvariablename

Specifies a C lvalue expression to hold the output, typically a variable name. The enclosing parentheses are a required part of the syntax.

When the compiler selects the registers to use to represent the output operands, it does not use any of the clobbered registers (see Clobbers and Scratch Registers).

Output operand expressions must be lvalues. The compiler cannot check whether the operands have data types that are reasonable for the instruction being executed. For output expressions that are not directly addressable (for example a bit-field), the constraint must allow a register. In that case, GCC uses the register as the output of the asm, and then stores that register into the output.

Operands using the ‘+’ constraint modifier count as two operands (that is, both as input and output) towards the total maximum of 30 operands per asm statement.
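To make the ‘+’ modifier concrete, here is a minimal sketch of a read-write operand. It assumes an x86 target (the bswap instruction); the name swap_bytes32 is invented for this example, and other targets use shifts and masks instead.

```c
#include <stdint.h>

/* Hypothetical helper: reverses the byte order of x. */
static inline uint32_t swap_bytes32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+r": x is read by the instruction and then overwritten in place.
       bswap leaves the flags alone, so no "cc" clobber is listed. */
    __asm__ ("bswap %0" : "+r" (x));
#else
    /* Portable fallback for other targets. */
    x = (x >> 24) | ((x >> 8) & 0x0000ff00u)
      | ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
    return x;
}
```

With ‘=’ in place of ‘+’ the compiler would be free to leave the register uninitialized on entry, since it would treat x as write-only.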

Use the ‘&’ constraint modifier (see Constraint Modifier Characters) on all output operands that must not overlap an input. Otherwise, GCC may allocate the output operand in the same register as an unrelated input operand, on the assumption that the assembler code consumes its inputs before producing outputs. This assumption may be false if the assembler code actually consists of more than one instruction.

The same problem can occur if one output parameter (a) allows a register constraint and another output parameter (b) allows a memory constraint. The code generated by GCC to access the memory address in b can contain registers which might be shared by a, and GCC considers those registers to be inputs to the asm. As above, GCC assumes that such input registers are consumed before any outputs are written. This assumption may result in incorrect behavior if the asm statement writes to a before using b. Combining the ‘&’ modifier with the register constraint on a ensures that modifying a does not affect the address referenced by b. Otherwise, the location of b is undefined if a is modified before using b.
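The following sketch shows a two-instruction template where the early-clobber modifier matters. It assumes an x86 target and AT&T syntax; the name add_early is hypothetical, and other targets use ordinary C addition.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b using two instructions. */
static inline uint32_t add_early(uint32_t a, uint32_t b)
{
    uint32_t sum;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("mov %1, %0\n\t"   /* sum is written here ...            */
             "add %2, %0"       /* ... but b is only read afterwards. */
             : "=&r" (sum)      /* '&': never share a register with an input. */
             : "r" (a), "r" (b)
             : "cc");
#else
    sum = a + b;                /* Portable fallback for other targets. */
#endif
    return sum;
}
```

Without the ‘&’, the compiler would be allowed to place sum in the register that holds b, in which case the first instruction would destroy b before the second one reads it.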

asm supports operand modifiers on operands (for example ‘%k2’ instead of simply ‘%2’). Generic Operand modifiers lists the modifiers that are available on all targets. Other modifiers are hardware dependent. For example, the list of supported modifiers for x86 is found at x86 Operand modifiers.
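As a small illustration of an operand modifier, the sketch below uses the x86 ‘b’ modifier to print the 8-bit name of a register operand. It assumes an x86 target and AT&T syntax; the name low_byte is invented, and the ‘q’ constraint is used so that the chosen register has a byte-sized form.

```c
#include <stdint.h>

/* Hypothetical helper: zero-extends the lowest byte of v. */
static inline uint32_t low_byte(uint32_t v)
{
    uint32_t out;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* %b1 prints operand 1 as a byte register (for example %al when the
       operand lives in %eax), while %0 prints the full 32-bit name. */
    __asm__ ("movzbl %b1, %0"
             : "=r" (out)
             : "q" (v));
#else
    out = v & 0xffu;            /* Portable fallback for other targets. */
#endif
    return out;
}
```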

If the C code that follows the asm makes no use of any of the output operands, use volatile for the asm statement to prevent the optimizers from discarding the asm statement as unneeded (see Volatile).

This code makes no use of the optional asmSymbolicName. Therefore it references the first output operand as %0 (were there a second, it would be %1, etc). The number of the first input operand is one greater than that of the last output operand. In this i386 example, that makes Mask referenced as %1:

    uint32_t Mask = 1234;
    uint32_t Index;

      asm ("bsfl %1, %0"
         : "=r" (Index)
         : "r" (Mask)
         : "cc");

That code overwrites the variable Index (‘=’), placing the value in a register (‘r’). Using the generic ‘r’ constraint instead of a constraint for a specific register allows the compiler to pick the register to use, which can result in more efficient code. This may not be possible if an assembler instruction requires a specific register.

The following i386 example uses the asmSymbolicName syntax. It produces the same result as the code above, but some may consider it more readable or more maintainable since reordering index numbers is not necessary when adding or removing operands. The names aIndex and aMask are only used in this example to emphasize which names get used where. It is acceptable to reuse the names Index and Mask.

    uint32_t Mask = 1234;
    uint32_t Index;

      asm ("bsfl %[aMask], %[aIndex]"
         : [aIndex] "=r" (Index)
         : [aMask] "r" (Mask)
         : "cc");

Here are some more examples of output operands.

    uint32_t c = 1;
    uint32_t d;
    uint32_t *e = &c;

    asm ("mov %[e], %[d]"
       : [d] "=rm" (d)
       : [e] "rm" (*e));

Here, d may either be in a register or in memory. Since the compiler might already have the current value of the uint32_t location pointed to by e in a register, you can enable it to choose the best location for d by specifying both constraints.

6.12.2.4 Flag Output Operands

Some targets have a special register that holds the “flags” for the result of an operation or comparison. Normally, the contents of that register are either unmodified by the asm, or the asm statement is considered to clobber the contents.

On some targets, a special form of output operand exists by which conditions in the flags register may be outputs of the asm. The set of conditions supported are target specific, but the general rule is that the output variable must be a scalar integer, and the value is boolean. When supported, the target defines the preprocessor symbol __GCC_ASM_FLAG_OUTPUTS__.

Because of the special nature of the flag output operands, the constraint may not include alternatives.

Most often, the target has only one flags register, and thus is an implied operand of many instructions. In this case, the operand should not be referenced within the assembler template via %0 etc, as there’s no corresponding text in the assembly language.

ARM
AArch64

The flag output constraints for the ARM family are of the form ‘=@cccond’, where cond is one of the standard conditions defined in the ARM ARM for ConditionHolds.

eq

Z flag set, or equal

ne

Z flag clear or not equal

cs
hs

C flag set or unsigned greater than equal

cc
lo

C flag clear or unsigned less than

mi

N flag set or “minus”

pl

N flag clear or “plus”

vs

V flag set or signed overflow

vc

V flag clear

hi

unsigned greater than

ls

unsigned less than equal

ge

signed greater than equal

lt

signed less than

gt

signed greater than

le

signed less than equal

The flag output constraints are not supported in thumb1 mode.

x86 family

The flag output constraints for the x86 family are of the form ‘=@cccond’, where cond is one of the standard conditions defined in the ISA manual for jcc or setcc.

a

“above” or unsigned greater than

ae

“above or equal” or unsigned greater than or equal

b

“below” or unsigned less than

be

“below or equal” or unsigned less than or equal

c

carry flag set

e
z

“equal” or zero flag set

g

signed greater than

ge

signed greater than or equal

l

signed less than

le

signed less than or equal

o

overflow flag set

p

parity flag set

s

sign flag set

na
nae
nb
nbe
nc
ne
ng
nge
nl
nle
no
np
ns
nz

“not” flag, or inverted versions of those above
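As a sketch of how an x86 flag output might be used, the function below returns the carry flag produced by a 32-bit addition through the ‘=@ccc’ constraint. It assumes an x86 target, AT&T syntax, and a compiler that defines __GCC_ASM_FLAG_OUTPUTS__; the name add_carries is invented, and the #else branch is a plain C fallback for everything else.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b in *sum and reports unsigned overflow. */
static inline bool add_carries(uint32_t a, uint32_t b, uint32_t *sum)
{
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && \
    (defined(__i386__) || defined(__x86_64__))
    int carry;
    /* The flag output is operand 1, but it is never named in the
       template: the carry flag is an implied result of add. */
    __asm__ ("add %2, %0"
             : "+r" (a), "=@ccc" (carry)
             : "r" (b));
    *sum = a;
    return carry != 0;
#else
    *sum = a + b;               /* Unsigned wraparound is well defined in C. */
    return *sum < a;
#endif
}
```

Compared with emitting a setc instruction inside the template, the flag output lets the compiler branch on the condition directly when that is cheaper.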

s390

The flag output constraint for s390 is ‘=@cc’. Only one such constraint is allowed. The value must be stored in an ‘int’ variable.

6.12.2.5 Input Operands

Input operands make values from C variables and expressions available to the assembly code.

Operands are separated by commas. Each operand has this format:

    [ [asmSymbolicName] ] constraint (cexpression)

asmSymbolicName

Specifies an optional symbolic name for the operand. The literal square brackets ‘[]’ around the asmSymbolicName are required both in the operand specification and references to the operand in the assembler template, i.e. ‘%[Value]’. The scope of the name is the asm statement that contains the definition. Any valid C identifier is acceptable, including names already defined in the surrounding code. No two operands within the same asm statement can use the same symbolic name.

When not using an asmSymbolicName, use the (zero-based) position of the operand in the list of operands in the assembler template. For example if there are two output operands and three inputs, use ‘%2’ in the template to refer to the first input operand, ‘%3’ for the second, and ‘%4’ for the third.

constraint

A string constant specifying constraints on the placement of the operand; see Constraints for asm Operands for details. In C++ with -std=gnu++11 or later, the constraint can also be a constant expression inside parentheses (see C++11 Constant Expressions instead of String Literals).

Input constraint strings may not begin with either ‘=’ or ‘+’. When you list more than one possible location (for example, ‘"irm"’), the compiler chooses the most efficient one based on the current context. If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, local register variables may provide a solution (see Specifying Registers for Local Variables).

Input constraints can also be digits (for example, "0"). This indicates that the specified input must be in the same place as the output constraint at the (zero-based) index in the output constraint list. When using asmSymbolicName syntax for the output operands, you may use these names (enclosed in brackets ‘[]’) instead of digits.

cexpression

This is the C variable or expression being passed to the asm statement as input. The enclosing parentheses are a required part of the syntax.

When the compiler selects the registers to use to represent the input operands, it does not use any of the clobbered registers (see Clobbers and Scratch Registers).

If there are no output operands but there are input operands, place two consecutive colons where the output operands would go:

    __asm__ ("some instructions"
       : /* No outputs. */
       : "r" (Offset / 8));

Warning: Do not modify the contents of input-only operands (except for inputs tied to outputs). The compiler assumes that on exit from the asm statement these operands contain the same values as they had before executing the statement. It is not possible to use clobbers to inform the compiler that the values in these inputs are changing. One common work-around is to tie the changing input variable to an output variable that never gets used. Note, however, that if the code that follows the asm statement makes no use of any of the output operands, the GCC optimizers may discard the asm statement as unneeded (see Volatile).

asm supports operand modifiers on operands (for example ‘%k2’ instead of simply ‘%2’). Generic Operand modifiers lists the modifiers that are available on all targets. Other modifiers are hardware dependent. For example, the list of supported modifiers for x86 is found at x86 Operand modifiers.

In this example using the fictitious combine instruction, the constraint "0" for input operand 1 says that it must occupy the same location as output operand 0. Only input operands may use numbers in constraints, and they must each refer to an output operand. Only a number (or the symbolic assembler name) in the constraint can guarantee that one operand is in the same place as another. The mere fact that foo is the value of both operands is not enough to guarantee that they are in the same place in the generated assembler code.

    asm ("combine %2, %0"
        : "=r" (foo)
        : "0" (foo), "g" (bar));

Here is an example using symbolic names.

    asm ("cmoveq %1, %2, %[result]"
        : [result] "=r"(result)
        : "r" (test), "r" (new), "[result]" (old));
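The two examples above use fictitious or target-specific instructions. For comparison, here is a sketch with a real x86 instruction in which an input is tied to an output by symbolic name. It assumes an x86 target and AT&T syntax; the name rotate_left is invented, and other targets use shifts.

```c
#include <stdint.h>

/* Hypothetical helper: rotates value left by count bits (modulo 32). */
static inline uint32_t rotate_left(uint32_t value, uint8_t count)
{
    uint32_t result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("rol %%cl, %[result]"
             : [result] "=r" (result)
             : "[result]" (value),   /* Same location as the output. */
               "c" (count)           /* rol takes its count in %cl.   */
             : "cc");
#else
    count &= 31;                     /* Portable fallback for other targets. */
    result = (value << count) | (value >> ((32 - count) & 31));
#endif
    return result;
}
```

The tied input is what loads value into the register before the instruction runs; the ‘=’ output alone would leave that register's initial contents unspecified.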

6.12.2.6 Clobbers and Scratch Registers

While the compiler is aware of changes to entries listed in the output operands, the inline asm code may modify more than just the outputs. For example, calculations may require additional registers, or the processor may overwrite a register as a side effect of a particular assembler instruction. In order to inform the compiler of these changes, list them in the clobber list. Clobber list items are either register names or the special clobbers (listed below). Each clobber list item is a string constant enclosed in double quotes and separated by commas. In C++ with -std=gnu++11 or later, a clobber list item can also be a constant expression inside parentheses (see C++11 Constant Expressions instead of String Literals).

Clobber descriptions may not in any way overlap with an input or output operand. For example, you may not have an operand describing a register class with one member when listing that register in the clobber list. Variables declared to live in specific registers (see Variables in Specified Registers) and used as asm input or output operands must have no part mentioned in the clobber description. In particular, there is no way to specify that input operands get modified without also specifying them as output operands.

When the compiler selects which registers to use to represent input and output operands, it does not use any of the clobbered registers. As a result, clobbered registers are available for any use in the assembler code.

Another restriction is that the clobber list should not contain the stack pointer register. This is because the compiler requires the value of the stack pointer to be the same after an asm statement as it was on entry to the statement. However, previous versions of GCC did not enforce this rule and allowed the stack pointer to appear in the list, with unclear semantics. This behavior is deprecated and listing the stack pointer may become an error in future versions of GCC.

Here is a realistic example for the VAX showing the use of clobbered registers:

    asm volatile ("movc3 %0, %1, %2"
                       : /* No outputs. */
                       : "g" (from), "g" (to), "g" (count)
                       : "r0", "r1", "r2", "r3", "r4", "r5", "memory");
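For readers without a VAX at hand, a smaller sketch of the same idea on x86 follows. It assumes an x86 target and AT&T syntax; the name shift_left_scratch is invented. The template uses %ecx as a fixed scratch register and says so in the clobber list.

```c
#include <stdint.h>

/* Hypothetical helper: returns v << (n mod 32). */
static inline uint32_t shift_left_scratch(uint32_t v, uint32_t n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("mov %1, %%ecx\n\t"    /* Copy n into the scratch register. */
             "shl %%cl, %0"         /* shl takes a variable count in %cl. */
             : "+r" (v)
             : "r" (n)
             : "ecx", "cc");        /* %ecx and the flags are overwritten. */
#else
    v <<= (n & 31);                 /* Portable fallback for other targets. */
#endif
    return v;
}
```

Because %ecx appears in the clobber list, the compiler will not place v or n there, so the mov cannot overwrite an operand.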

Also, there are three special clobber arguments:

"cc"

The "cc" clobber indicates that the assembler code modifies the flags register. On some machines, GCC represents the condition codes as a specific hardware register; "cc" serves to name this register. On other machines, condition code handling is different, and specifying "cc" has no effect. But it is valid no matter what the target.

"memory"

The "memory" clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters). To ensure memory contains correct values, GCC may need to flush specific register values to memory before executing the asm. Further, the compiler does not assume that any values read from memory before an asm remain unchanged after that asm; it reloads them as needed. Using the "memory" clobber effectively forms a read/write memory barrier for the compiler.

Note that this clobber does not prevent the processor from doing speculative reads past the asm statement. To prevent that, you need processor-specific fence instructions.

"redzone"

The "redzone" clobber tells the compiler that the assembly code may write to the stack red zone, the area below the stack pointer that, on some architectures and calling conventions, is guaranteed not to be changed by signal handlers, interrupts, or exceptions, so the compiler can store temporaries there in leaf functions. On targets which have no concept of the stack red zone, the clobber is ignored. It should be used, for example, when the assembly code uses call instructions or pushes something to the stack without taking the red zone into account by subtracting the red zone size from the stack pointer first and restoring it afterwards.
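A common use of the "memory" clobber is an empty asm that acts purely as a compiler barrier. The sketch below assumes only a GNU-compatible compiler; the names compiler_barrier, publish, payload, and ready are invented for this example.

```c
static int payload;   /* Hypothetical data being published. */
static int ready;     /* Hypothetical "data is valid" flag.  */

/* Emits no instructions, but the compiler may not move memory
   accesses across it or keep memory values cached in registers. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__ ("" : : : "memory");
}

/* Stores the payload, then raises the flag, in that program order. */
static int publish(int value)
{
    payload = value;
    compiler_barrier();   /* The store to payload is emitted before this point. */
    ready = 1;
    return ready;
}
```

This constrains the compiler only. If another processor has to observe the two stores in order, a hardware fence or an atomic operation is still needed.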

Flushing registers to memory has performance implications and may be an issue for time-sensitive code. You can provide better information to GCC to avoid this, as shown in the following examples. At a minimum, aliasing rules allow GCC to know what memory doesn’t need to be flushed.

Here is a fictitious sum of squares instruction, that takes two pointers to floating point values in memory and produces a floating point register output. Notice that x, and y both appear twice in the asm parameters, once to specify memory accessed, and once to specify a base register used by the asm. You won’t normally be wasting a register by doing this as GCC can use the same register for both purposes. However, it would be foolish to use both %1 and %3 for x in this asm and expect them to be the same. In fact, %3 may well not be a register. It might be a symbolic memory reference to the object pointed to by x.

    asm ("sumsq %0, %1, %2"
         : "+f" (result)
         : "r" (x), "r" (y), "m" (*x), "m" (*y));

Here is a fictitious *z++ = *x++ * *y++ instruction. Notice that the x, y and z pointer registers must be specified as input/output because the asm modifies them.

    asm ("vecmul %0, %1, %2"
         : "+r" (z), "+r" (x), "+r" (y), "=m" (*z)
         : "m" (*x), "m" (*y));

An x86 example where the string memory argument is of unknown length.

    asm("repne scasb"
        : "=c" (count), "+D" (p)
        : "m" (*(const char (*)[]) p), "0" (-1), "a" (0));

If you know the above will only be reading a ten byte array then you could instead use a memory input like: "m" (*(const char (*)[10]) p).

Here is an example of a PowerPC vector scale implemented in assembly, complete with vector and condition code clobbers, and some initialized offset registers that are unchanged by the asm.

    void
    dscal (size_t n, double *x, double alpha)
    {
      asm ("/* lots of asm here */"
           : "+m" (*(double (*)[n]) x), "+&r" (n), "+b" (x)
           : "d" (alpha), "b" (32), "b" (48), "b" (64),
             "b" (80), "b" (96), "b" (112)
           : "cr0",
             "vs32","vs33","vs34","vs35","vs36","vs37","vs38","vs39",
             "vs40","vs41","vs42","vs43","vs44","vs45","vs46","vs47");
    }

Rather than allocating fixed registers via clobbers to provide scratchregisters for anasm statement, an alternative is to define avariable and make it an early-clobber output as witha2 anda3 in the example below. This gives the compiler registerallocator more freedom. You can also define a variable and make it anoutput tied to an input as witha0 anda1, tiedrespectively toap andlda. Of course, with tiedoutputs yourasm can’t use the input value after modifying theoutput register since they are one and the same register. What’smore, if you omit the early-clobber on the output, it is possible thatGCC might allocate the same register to another of the inputs if GCCcould prove they had the same value on entry to theasm. Thisis whya1 has an early-clobber. Its tied input,ldamight conceivably be known to have the value 16 and without anearly-clobber share the same register as%11. On the otherhand,ap can’t be the same as any of the other inputs, so anearly-clobber ona0 is not needed. It is also not desirable inthis case. An early-clobber ona0 would cause GCC to allocatea separate register for the"m" (*(const double (*)[]) ap)input. Note that tying an input to an output is the way to set up aninitialized temporary register modified by anasm statement.An input not tied to an output is assumed by GCC to be unchanged, forexample"b" (16) below sets up%11 to 16, and GCC mightuse that register in following code if the value 16 happened to beneeded. You can even use a normalasm output for a scratch ifall inputs that might share the same register are consumed before thescratch is used. The VSX registers clobbered by theasmstatement could have used this technique except for GCC’s limit on thenumber ofasm parameters.

static void
dgemv_kernel_4x4 (long n, const double *ap, long lda,
                  const double *x, double *y, double alpha)
{
  double *a0;
  double *a1;
  double *a2;
  double *a3;

  __asm__
    (
     /* lots of asm here */
     "#n=%1 ap=%8=%12 lda=%13 x=%7=%10 y=%0=%2 alpha=%9 o16=%11\n"
     "#a0=%3 a1=%4 a2=%5 a3=%6"
     :
       "+m" (*(double (*)[n]) y),
       "+&r" (n),       // 1
       "+b" (y),        // 2
       "=b" (a0),       // 3
       "=&b" (a1),      // 4
       "=&b" (a2),      // 5
       "=&b" (a3)       // 6
     :
       "m" (*(const double (*)[n]) x),
       "m" (*(const double (*)[]) ap),
       "d" (alpha),     // 9
       "r" (x),         // 10
       "b" (16),        // 11
       "3" (ap),        // 12
       "4" (lda)        // 13
     :
       "cr0",
       "vs32","vs33","vs34","vs35","vs36","vs37",
       "vs40","vs41","vs42","vs43","vs44","vs45","vs46","vs47"
     );
}
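The two techniques described above, an early-clobber scratch output and an output tied to an input, can be sketched on a much smaller scale. The following is only an illustration, assuming an x86 target and AT&T assembler syntax; the function name mul_add_sq and the arithmetic it performs are hypothetical and not part of the GCC example, and other targets fall back to plain C.

```c
/* Hypothetical helper: returns a * b + c * c.
   Assumes x86 with AT&T syntax; other targets use plain C.  */
long
mul_add_sq (long a, long b, long c)
{
#if defined(__x86_64__) || defined(__i386__)
  long result;  /* tied to a, so it starts out holding a */
  long tmp;     /* scratch register picked by the register allocator */

  __asm__ ("mov %4, %1\n\t"    /* tmp = c (written while a and b are still live) */
           "imul %1, %1\n\t"   /* tmp = c * c */
           "imul %3, %0\n\t"   /* result = a * b */
           "add %1, %0"        /* result += tmp */
           : "=r" (result),    /* %0 */
             "=&r" (tmp)       /* %1: early-clobber, never shares an input register */
           : "0" (a),          /* %2: tied to %0 */
             "r" (b),          /* %3 */
             "r" (c)           /* %4 */
           : "cc");
  return result;
#else
  return a * b + c * c;
#endif
}
```

Here tmp needs the early-clobber because it is written before b has been read, while result does not: the only instruction that reads b is the one that overwrites result. A call such as mul_add_sq (3, 4, 5) evaluates 3 * 4 + 5 * 5.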

6.12.2.7 Goto Labels

asm goto allows assembly code to jump to one or more C labels. The GotoLabels section in an asm goto statement contains a comma-separated list of all C labels to which the assembler code may jump. GCC assumes that asm execution falls through to the next statement (if this is not the case, consider using the __builtin_unreachable intrinsic after the asm statement). Optimization of asm goto may be improved by using the hot and cold label attributes (see Label Attributes).

If the assembler code does modify anything, use the "memory" clobber to force the optimizers to flush all register values to memory and reload them if necessary after the asm statement.

Also note that an asm goto statement is always implicitly considered volatile.

Be careful when you set output operands inside asm goto only on some possible control flow paths. If you don’t set up the output on a given path and never use it on that path, it is okay. Otherwise, you should use the ‘+’ constraint modifier, meaning that the operand is both an input and an output. With this modifier you will have the correct values on all possible paths from the asm goto.

To reference a label in the assembler template, prefix it with ‘%l’ (lowercase ‘L’) followed by its (zero-based) position in GotoLabels plus the number of input and output operands. An output operand with constraint modifier ‘+’ is counted as two operands because it is considered as one output and one input operand. For example, if the asm has three inputs, one output operand with constraint modifier ‘+’ and one output operand with constraint modifier ‘=’ and references two labels, the operands count as 3 + 2 + 1 = 6, so refer to the first label as ‘%l6’ and the second as ‘%l7’.

Alternately, you can reference labels using the actual C label name enclosed in brackets. For example, to reference a label named carry, you can use ‘%l[carry]’. The label must still be listed in the GotoLabels section when using this approach. It is better to use the named references for labels as in this case you can avoid counting input and output operands and special treatment of output operands with constraint modifier ‘+’.

Here is an example of asm goto for i386:

asm goto (
    "btl %1, %0\n\t"
    "jc %l2"
    : /* No outputs. */
    : "r" (p1), "r" (p2)
    : "cc"
    : carry);

return 0;

carry:
return 1;
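As a complement to the fragment above, the named-label form can be sketched as a complete function with two jump targets. This is a minimal illustration, assuming an x86 target and AT&T syntax; the function name sign_of and the label names are hypothetical, and other targets fall back to plain C.

```c
/* Hypothetical helper: returns -1, 0 or 1 according to the sign of x.
   Assumes x86 with AT&T syntax; other targets use plain C.  */
int
sign_of (int x)
{
#if defined(__x86_64__) || defined(__i386__)
  /* There is one input and no output, so the same labels could be
     written numerically as %l1 (negative) and %l2 (zero).  */
  __asm__ goto ("testl %0, %0\n\t"
                "js %l[negative]\n\t"
                "jz %l[zero]"
                : /* No outputs. */
                : "r" (x)
                : "cc"
                : negative, zero);
  return 1;         /* fall-through path: x is positive */
negative:
  return -1;
zero:
  return 0;
#else
  return (x > 0) - (x < 0);
#endif
}
```

Using %l[negative] and %l[zero] means the template does not have to change if an operand is added later, which is the advantage of named references noted above.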

The following example shows an asm goto that uses a memory clobber.

int frob(int x)
{
  int y;
  asm goto ("frob %%r5, %1; jc %l[error]; mov (%2), %%r5"
            : /* No outputs. */
            : "r"(x), "r"(&y)
            : "r5", "memory"
            : error);
  return y;
error:
  return -1;
}

The following example shows an asm goto that uses an output.

int foo(int count)
{
  asm goto ("dec %0; jb %l[stop]"
            : "+r" (count)
            :
            :
            : stop);
  return count;
stop:
  return 0;
}

The following artificial example shows an asm goto that sets up an output only on one path inside the asm goto. Usage of constraint modifier ‘=’ instead of ‘+’ would be wrong as factor is used on all paths from the asm goto.

int foo(int inp)
{
  int factor = 0;
  asm goto ("cmp %1, 10; jb %l[lab]; mov 2, %0"
            : "+r" (factor)
            : "r" (inp)
            :
            : lab);
lab:
  return inp * factor; /* return 2 * inp or 0 if inp < 10 */
}
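The artificial example above is not written for any particular assembler. A variant that should assemble on x86 might look like the sketch below. It assumes AT&T syntax and a compiler that accepts output operands in asm goto (taken here to be GCC 11 or later); the macro USE_ASM_GOTO_OUTPUT, the function name scale_if_large and the label small are hypothetical, and every other configuration uses the plain C path.

```c
/* Assumption: output operands in asm goto need GCC 11 or later.  */
#if (defined(__x86_64__) || defined(__i386__)) \
    && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
#define USE_ASM_GOTO_OUTPUT 1
#else
#define USE_ASM_GOTO_OUTPUT 0
#endif

/* Hypothetical helper: returns 2 * inp, or 0 if inp < 10.  */
int
scale_if_large (int inp)
{
  int factor = 0;
#if USE_ASM_GOTO_OUTPUT
  /* factor is written only on the fall-through path, so it must use
     '+' to keep its initial value of 0 on the path that jumps.  */
  __asm__ goto ("cmpl $10, %1\n\t"
                "jl %l[small]\n\t"
                "movl $2, %0"
                : "+r" (factor)
                : "r" (inp)
                : "cc"
                : small);
#else
  if (inp < 10)
    goto small;
  factor = 2;
#endif
small:
  return inp * factor;
}
```

With ‘=’ in place of ‘+’, the compiler would be free to treat the initial value of factor as dead, and the result on the jumping path would be undefined.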

6.12.2.8 Generic Operand Modifiers

The following table shows the modifiers supported by all targets and their effects:

Modifier | Description | Example
c | Require a constant operand and print the constant expression with no punctuation. | %c0
cc | Like ‘%c’ except try harder to print it with no punctuation. ‘%c’ can e.g. fail to print constant addresses in position independent code on some architectures. | %cc0
n | Like ‘%c’ except that the value of the constant is negated before printing. | %n0
a | Substitute a memory reference, with the actual operand treated as the address. This may be useful when outputting a “load address” instruction, because often the assembler syntax for such an instruction requires you to write the operand as if it were a memory reference. | %a0
l | Print the label name with no punctuation. | %l0
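As an illustration of the ‘c’ and ‘n’ modifiers, the sketch below prints the same immediate operand once as written and once negated. It assumes an x86 target with AT&T syntax, where an "i" operand would normally be printed with a ‘$’ prefix, so the template supplies the ‘$’ itself; the function name load_constant is hypothetical, and other targets fall back to plain C.

```c
/* Hypothetical helper: returns 42 and stores -42 through negated.
   Assumes x86 with AT&T syntax; other targets use plain C.  */
int
load_constant (int *negated)
{
#if defined(__x86_64__) || defined(__i386__)
  int plain, neg;

  __asm__ ("movl $%c2, %0\n\t"   /* %c2 prints 42 with no punctuation */
           "movl $%n2, %1"       /* %n2 prints the negated constant, -42 */
           : "=r" (plain), "=r" (neg)
           : "i" (42));
  *negated = neg;
  return plain;
#else
  *negated = -42;
  return 42;
#endif
}
```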

6.12.2.9 AArch64 Operand Modifiers

The following table shows the modifiers supported by AArch64 and their effects:

Modifier | Description
w | Print a 32-bit general-purpose register name or, given a constant zero operand, the 32-bit zero register (wzr).
x | Print a 64-bit general-purpose register name or, given a constant zero operand, the 64-bit zero register (xzr).
b | Print an FP/SIMD register name with a b (byte, 8-bit) prefix.
h | Print an FP/SIMD register name with an h (halfword, 16-bit) prefix.
s | Print an FP/SIMD register name with an s (single word, 32-bit) prefix.
d | Print an FP/SIMD register name with a d (doubleword, 64-bit) prefix.
q | Print an FP/SIMD register name with a q (quadword, 128-bit) prefix.
Z | Print an FP/SIMD register name as an SVE register (i.e. with a z prefix). This is a no-op for SVE register operands.
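A short sketch of the ‘w’ modifier follows. It is only active when compiling for AArch64; the function name add32 is hypothetical, and every other target takes the plain C path.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit addition using the w register names.
   Only AArch64 uses the asm; other targets use plain C.  */
uint32_t
add32 (uint32_t a, uint32_t b)
{
#if defined(__aarch64__)
  uint32_t sum;

  /* %w0, %w1 and %w2 print the operands as w registers, giving a
     32-bit add.  */
  __asm__ ("add %w0, %w1, %w2" : "=r" (sum) : "r" (a), "r" (b));
  return sum;
#else
  return a + b;
#endif
}
```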

6.12.2.10 x86 Operand Modifiers

References to input, output, and goto operands in the assembler template of extended asm statements can use modifiers to affect the way the operands are formatted in the code output to the assembler. For example, the following code uses the ‘h’ and ‘b’ modifiers for x86:

uint16_t  num;
asm volatile ("xchg %h0, %b0" : "+a" (num) );

These modifiers generate this assembler code:

xchg %ah, %al
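Wrapped in a function, the same statement can be sketched as below. The function name swap_bytes16 is hypothetical; the asm is used only on x86, and other targets fall back to shifts in plain C.

```c
#include <stdint.h>

/* Hypothetical helper: swaps the two bytes of a 16-bit value.
   Assumes x86 for the asm path; other targets use plain C.  */
uint16_t
swap_bytes16 (uint16_t num)
{
#if defined(__x86_64__) || defined(__i386__)
  /* "+a" places num in the a register; %h0 and %b0 then name its
     high and low bytes (%ah and %al in AT&T syntax).  */
  __asm__ ("xchg %h0, %b0" : "+a" (num));
  return num;
#else
  return (uint16_t) ((num << 8) | (num >> 8));
#endif
}
```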

The rest of this discussion uses the following code for illustrative purposes.

int main()
{
   int iInt = 1;

top:

   asm volatile goto ("some assembler instructions here"
   : /* No outputs. */
   : "q" (iInt), "X" (sizeof(unsigned char) + 1), "i" (42)
   : /* No clobbers. */
   : top);
}

With no modifiers, this is what the output from the operands would be for the ‘att’ and ‘intel’ dialects of assembler:

Operand | att | intel
%0 | %eax | eax
%1 | $2 | 2
%3 | $.L3 | OFFSET FLAT:.L3

The table below shows the list of supported modifiers and their effects.

Modifier | Description | Operand | att | intel
A | Print an absolute memory reference. | %A0 | *%rax | rax
b | Print the QImode name of the register. | %b0 | %al | al
c | Require a constant operand and print the constant expression with no punctuation. | %c1 | 2 | 2
E | Print the address in Double Integer (DImode) mode (8 bytes) when the target is 64-bit. Otherwise mode is unspecified (VOIDmode). | %E1 | %(rax) | [rax]
h | Print the QImode name for a “high” register. | %h0 | %ah | ah
H | Add 8 bytes to an offsettable memory reference. Useful when accessing the high 8 bytes of SSE values. For a memref in (%rax), it generates: | %H0 | 8(%rax) | 8[rax]
k | Print the SImode name of the register. | %k0 | %eax | eax
l | Print the label name with no punctuation. | %l3 | .L3 | .L3
p | Print raw symbol name (without syntax-specific prefixes). | %p2 | 42 | 42
P | If used for a function, print the PLT suffix and generate PIC code. For example, emit foo@PLT instead of ‘foo’ for the function foo(). If used for a constant, drop all syntax-specific prefixes and issue the bare constant. See p above. | | |
q | Print the DImode name of the register. | %q0 | %rax | rax
w | Print the HImode name of the register. | %w0 | %ax | ax
z | Print the opcode suffix for the size of the current integer operand (one of b/w/l/q). | %z0 | l |

V is a special modifier which prints the name of the full integer register without %.
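The register-size modifiers in the table can be tried with zero-extending moves, as in the sketch below. It assumes an x86 target and AT&T syntax; the function names low_byte and low_word are hypothetical, and other targets fall back to masking in plain C.

```c
/* Hypothetical helpers: extract the low 8 or 16 bits of v.
   Assumes x86 with AT&T syntax; other targets use plain C.  */
unsigned int
low_byte (unsigned int v)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int out;

  /* %b1 is the 8-bit (QImode) name of the input register and %k0 the
     32-bit (SImode) name of the output register.  The "q" constraint
     keeps the input in a register that has an 8-bit name.  */
  __asm__ ("movzbl %b1, %k0" : "=r" (out) : "q" (v));
  return out;
#else
  return v & 0xffu;
#endif
}

unsigned int
low_word (unsigned int v)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int out;

  /* %w1 is the 16-bit (HImode) name of the input register.  */
  __asm__ ("movzwl %w1, %k0" : "=r" (out) : "r" (v));
  return out;
#else
  return v & 0xffffu;
#endif
}
```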

6.12.2.11 x86 Floating-Point asm Operands

On x86 targets, there are several rules on the usage of stack-like registers in the operands of an asm. These rules apply only to the operands that are stack-like registers:

  1. Given a set of input registers that die in an asm, it is necessary to know which are implicitly popped by the asm, and which must be explicitly popped by GCC.

    An input register that is implicitly popped by the asm must be explicitly clobbered, unless it is constrained to match an output operand.

  2. For any input register that is implicitly popped by an asm, it is necessary to know how to adjust the stack to compensate for the pop. If any non-popped input is closer to the top of the reg-stack than the implicitly popped register, it would not be possible to know what the stack looked like; it’s not clear how the rest of the stack “slides up”.

    All implicitly popped input registers must be closer to the top of the reg-stack than any input that is not implicitly popped.

    It is possible that if an input dies in an asm, the compiler might use the input register for an output reload. Consider this example:

    asm ("foo" : "=t" (a) : "f" (b));

    This code says that input b is not popped by the asm, and that the asm pushes a result onto the reg-stack, i.e., the stack is one deeper after the asm than it was before. But, it is possible that reload may think that it can use the same register for both the input and the output.

    To prevent this from happening, if any input operand uses the ‘f’ constraint, all output register constraints must use the ‘&’ early-clobber modifier.

    The example above is correctly written as:

    asm ("foo" : "=&t" (a) : "f" (b));

  3. Some operands need to be in particular places on the stack. All output operands fall in this category: GCC has no other way to know which registers the outputs appear in unless you indicate this in the constraints.

    Output operands must specifically indicate which register an output appears in after an asm. ‘=f’ is not allowed: the operand constraints must select a class with a single register.

  4. Output operands may not be “inserted” between existing stack registers. Since no 387 opcode uses a read/write operand, all output operands are dead before the asm, and are pushed by the asm. It makes no sense to push anywhere but the top of the reg-stack.

    Output operands must start at the top of the reg-stack: output operands may not “skip” a register.

  5. Some asm statements may need extra stack space for internal calculations. This can be guaranteed by clobbering stack registers unrelated to the inputs and outputs.

This asm takes one input, which is internally popped, and produces two outputs.

asm ("fsincos" : "=t" (cos), "=u" (sin) : "0" (inp));

This asm takes two inputs, which are popped by the fyl2xp1 opcode, and replaces them with one output. The st(1) clobber is necessary for the compiler to know that fyl2xp1 pops both inputs.

asm ("fyl2xp1" : "=t" (result) : "0" (x), "u" (y) : "st(1)");
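The same operand pattern (two popped inputs, one output, and an st(1) clobber) can be tried with a simpler x87 instruction. The sketch below uses faddp, which adds st(0) to st(1) and pops the stack; this substitution, and the function name x87_add, are illustrative choices and not taken from the GCC examples. It assumes an x86 target with x87 support, and other targets fall back to plain C.

```c
/* Hypothetical helper: adds two doubles on the x87 register stack.
   Assumes x86 with x87 support; other targets use plain C.  */
double
x87_add (double x, double y)
{
#if defined(__x86_64__) || defined(__i386__)
  double result;

  /* x is in st(0) and tied to the output; y is in st(1).  faddp
     consumes both and leaves the sum in st(0), so st(1) is listed
     as a clobber to tell the compiler that it is popped too.  */
  __asm__ ("faddp" : "=t" (result) : "0" (x), "u" (y) : "st(1)");
  return result;
#else
  return x + y;
#endif
}
```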

6.12.2.12 MSP430 Operand Modifiers

The list below describes the supported modifiers and their effects for MSP430.

Modifier | Description
A | Select low 16-bits of the constant/register/memory operand.
B | Select high 16-bits of the constant/register/memory operand.
C | Select bits 32-47 of the constant/register/memory operand.
D | Select bits 48-63 of the constant/register/memory operand.
H | Equivalent to B (for backwards compatibility).
I | Print the inverse (logical NOT) of the constant value.
J | Print an integer without a # prefix.
L | Equivalent to A (for backwards compatibility).
O | Offset of the current frame from the top of the stack.
Q | Use the A instruction postfix.
R | Inverse of condition code, for unsigned comparisons.
W | Subtract 16 from the constant value.
X | Use the X instruction postfix.
Y | Subtract 4 from the constant value.
Z | Subtract 1 from the constant value.
b | Append .B, .W or .A to the instruction, depending on the mode.
d | Offset 1 byte of a memory reference or constant value.
e | Offset 3 bytes of a memory reference or constant value.
f | Offset 5 bytes of a memory reference or constant value.
g | Offset 7 bytes of a memory reference or constant value.
p | Print the value of 2, raised to the power of the given constant. Used to select the specified bit position.
r | Inverse of condition code, for signed comparisons.
x | Equivalent to X, but only for pointers.

6.12.2.13 LoongArch Operand Modifiers

The list below describes the supported modifiers and their effects for LoongArch.

Modifier | Description
d | Same as c.
i | Print the character “i” if the operand is not a register.
m | Same as c, but the printed value is operand - 1.
u | Print a LASX register.
w | Print a LSX register.
X | Print a constant integer operand in hexadecimal.
z | Print the operand in its unmodified form, followed by a comma.

References to input and output operands in the assembler template of extended asm statements can use modifiers to affect the way the operands are formatted in the code output to the assembler. For example, the following code uses the ‘w’ modifier for LoongArch:

test-asm.c:

#include <lsxintrin.h>

__m128i foo (void)
{
__m128i  a,b,c;
__asm__ ("vadd.d %w0,%w1,%w2\n\t"
   :"=f" (c)
   :"f" (a),"f" (b));

return c;
}

The compile command for the test case is as follows:

gcc test-asm.c -mlsx -S -o test-asm.s

The assembly statement produces the following assembly code:

vadd.d $vr0,$vr0,$vr1

This is a 128-bit vector addition instruction: c (referred to in the template string as %0) is the output, and a (%1) and b (%2) are the inputs. __m128i is a vector data type defined in the file lsxintrin.h (see LoongArch SX Vector Intrinsics). The ‘=f’ constraint on the output operand requests a floating-point register for the output, and the ‘f’ constraint on each input operand requests a floating-point register for that input. For the definitions of these constraints, see Constraints for asm Operands.

6.12.2.14 RISC-V Operand Modifiers

The list below describes the supported modifiers and their effects for RISC-V.

Modifier | Description
z | Print “zero” instead of 0 if the operand is an immediate with a value of zero.
i | Print the character “i” if the operand is an immediate.
N | Print the register encoding as integer (0 - 31).
H | Print the name of the next register for integer.

6.12.2.15 SH Operand Modifiers

The list below describes the supported modifiers and their effects for the SH family of processors.

Modifier | Description
. | Print “.s” if the instruction needs a delay slot.
, | Print “LOCAL_LABEL_PREFIX”.
@ | Print “trap”, “rte” or “rts” depending on the interrupt pragma used.
# | Print “nop” if there is nothing to put in the delay slot.
' | Print likelihood suffix (“/u” for unlikely).
> | Print branch target if “-fverbose-asm”.
O | Require a constant operand and print the constant expression with no punctuation.
R | Print the “LSW” of a dp value - changes if in little endian.
S | Print the “MSW” of a dp value - changes if in little endian.
T | Print the next word of a dp value - same as “R” in big endian mode.
M | Print “.b”, “.w”, “.l”, “.s”, “.d”, suffix if operand is a MEM.
N | Print “r63” if the operand is “const_int 0”.
d | Print a “V2SF” as “dN” instead of “fpN”.
m | Print the pair “base,offset” or “base,index” for LD and ST.
U | Like “%m” for “LD” and “ST”, “HI” and “LO”.
V | Print the position of a single bit set.
W | Print the position of a single bit cleared.
t | Print a memory address which is a register.
u | Print the lowest 16 bits of “CONST_INT”, as an unsigned value.
o | Print an operator.
