
D, being a systems programming language, provides an inline assembler. The inline assembler is standardized for D implementations across the same CPU family. For example, the Intel Pentium inline assembler for a Win32 D compiler is syntax compatible with the inline assembler for Linux running on an Intel Pentium.
Implementations of D on different architectures, however, are free to innovate upon the memory model, function call/return conventions, argument passing conventions, etc.
This document describes the x86 and x86_64 implementations of the inline assembler. The inline assembler platform support that a compiler provides is indicated by the D_InlineAsm_X86 and D_InlineAsm_X86_64 version identifiers, respectively.
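As a minimal sketch of how these version identifiers might be used, the code below selects an inline assembler implementation when one is available and falls back to plain D otherwise. The names inlineAsmDialect and twice are hypothetical and exist only for this illustration.

```d
// Hypothetical helper (not part of the specification): names the inline
// assembler dialect offered by the current compiler and target.
string inlineAsmDialect()
{
    version (D_InlineAsm_X86_64)
        return "x86_64";
    else version (D_InlineAsm_X86)
        return "x86";
    else
        return "none";
}

// Hypothetical example: doubles x with inline assembly when it is available
// and with ordinary D code otherwise.
int twice(int x)
{
    int result;
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov EAX, x;
            add EAX, EAX;
            mov result, EAX;
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, x;
            add EAX, EAX;
            mov result, EAX;
        }
    }
    else
    {
        result = x * 2;
    }
    return result;
}

unittest
{
    assert(twice(21) == 42);
}
```

Compiling with the -unittest and -main switches should run the check. Guarding every asm block this way keeps the module buildable on targets that provide neither dialect.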
    AsmStatement:
        asm FunctionAttributes_opt { AsmInstructionList_opt }

    AsmInstructionList:
        AsmInstruction ;
        AsmInstruction ; AsmInstructionList
Assembler instructions must be located inside an asm block. Like functions, asm statements must be annotated with adequate function attributes to be compatible with the caller. Asm statement attributes must be explicitly defined; they are not inferred.
@safe is not allowed as an attribute, as the compiler does no safety checking of assembly statements; use @trusted instead.
    void ok() pure nothrow @safe @nogc
    {
        asm pure nothrow @trusted @nogc
        {}
    }

    void error() @safe @nogc
    {
        asm @nogc // Error: asm statement is assumed to be @system - mark it with '@trusted' if it is not
        {}

        asm @safe @nogc // Deprecation: asm statement cannot be @safe, use @trusted instead
        {}
    }
    AsmInstruction:
        Identifier : AsmInstruction
        align IntegerExpression
        even
        naked
        db Operands
        ds Operands
        di Operands
        dl Operands
        df Operands
        dd Operands
        de Operands
        db StringLiteral
        ds StringLiteral
        di StringLiteral
        dl StringLiteral
        dw StringLiteral
        dq StringLiteral
        Opcode
        Opcode Operands

    Opcode:
        Identifier
        int
        in
        out

    Operands:
        Operand
        Operand , Operands
Assembler instructions can be labeled just like other statements. They can be the target of goto statements. For example:
    void *pc;
    asm
    {
        call L1          ;
     L1:                 ;
        pop  EBX         ;
        mov  pc[EBP],EBX ; // pc now points to code at L1
    }
    IntegerExpression:
        IntegerLiteral
        Identifier

align IntegerExpression causes the assembler to emit NOP instructions to align the next assembler instruction on an IntegerExpression boundary. IntegerExpression must evaluate at compile time to an integer that is a power of 2.
Aligning the start of a loop body can sometimes have a dramatic effect on the execution speed.
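As a rough illustration of labels and align used together, the sketch below sums the integers from 1 to n in a loop whose first instruction is aligned on a 16-byte boundary. The function name sumTo, the labels, the version identifier HaveX86Asm and the choice of 16 are assumptions made for this example; whether the alignment helps at all depends on the CPU and should be measured.

```d
// Hypothetical version identifier, set when either x86 dialect is available.
version (D_InlineAsm_X86)    version = HaveX86Asm;
version (D_InlineAsm_X86_64) version = HaveX86Asm;

// Sums 1 + 2 + ... + n, wrapping around on overflow.
uint sumTo(uint n)
{
    uint result;
    version (HaveX86Asm)
    {
        asm
        {
            mov  ECX, n;
            xor  EAX, EAX;   // running total
            test ECX, ECX;
            jz   Ldone;      // nothing to add when n == 0
            align 16;        // pad with NOPs so the loop body starts on a 16-byte boundary
        Lloop:
            add  EAX, ECX;
            dec  ECX;
            jnz  Lloop;
        Ldone:
            mov  result, EAX;
        }
    }
    else
    {
        // Plain D fallback for targets without the x86 inline assembler.
        for (uint i = n; i != 0; --i)
            result += i;
    }
    return result;
}

unittest
{
    assert(sumTo(4) == 10);
}
```

Only 32-bit registers and stack variables are used, so the same asm body should serve both the x86 and x86_64 dialects.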
even causes the assembler to emit NOP instructions to align the next assembler instruction on an even boundary.
naked causes the compiler not to generate the function prolog and epilog sequences. Providing them then becomes the responsibility of the inline assembly programmer; naked is normally used when the entire function is written in assembler.
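A minimal sketch of a naked function follows, assuming a function that takes no arguments and returns an int in EAX, as the usual x86 and x86_64 calling conventions do. The names answer and HaveX86Asm are hypothetical. Because the compiler emits no prolog or epilog, the asm block has to issue the ret itself.

```d
// Hypothetical version identifier, set when either x86 dialect is available.
version (D_InlineAsm_X86)    version = HaveX86Asm;
version (D_InlineAsm_X86_64) version = HaveX86Asm;

// The whole body is assembler, so naked suppresses the generated
// prolog and epilog sequences.
int answer()
{
    version (HaveX86Asm)
    {
        asm
        {
            naked;
            mov EAX, 42;  // assumed convention: int results are returned in EAX
            ret;          // no epilog is generated, so return explicitly
        }
    }
    else
    {
        return 42;
    }
}

unittest
{
    assert(answer() == 42);
}
```

The function deliberately touches no parameters or locals, since a naked function has no stack frame set up for them.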
These pseudo ops are for inserting raw data directly into the code. db is for bytes, ds is for 16 bit words, di is for 32 bit words, dl is for 64 bit words, df is for 32 bit floats, dd is for 64 bit doubles, and de is for 80 bit extended reals. Each can have multiple operands. If an operand is a string literal, it is as if there were length operands, where length is the number of characters in the string. One character is used per operand. For example:
    asm
    {
        db 5,6,0x83;   // insert bytes 0x05, 0x06, and 0x83 into code
        ds 0x1234;     // insert bytes 0x34, 0x12
        di 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00
        dl 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
        df 1.234;      // insert float 1.234
        dd 1.234;      // insert double 1.234
        de 1.234;      // insert real 1.234
        db "abc";      // insert bytes 0x61, 0x62, and 0x63
        ds "abc";      // insert bytes 0x61, 0x00, 0x62, 0x00, 0x63, 0x00
    }
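One possible use of these pseudo ops is to emit an instruction by its raw encoding. The sketch below assumes the standard encoding of mov EAX, imm32 (opcode byte 0xB8 followed by a 32-bit little-endian immediate) and spells it out with db and di. The names fortyTwo and HaveX86Asm are hypothetical, and the leading xor is there only so that the compiler can see EAX being written, since it cannot look inside raw data.

```d
// Hypothetical version identifier, set when either x86 dialect is available.
version (D_InlineAsm_X86)    version = HaveX86Asm;
version (D_InlineAsm_X86_64) version = HaveX86Asm;

// Loads 42 into EAX through a hand-encoded instruction.
int fortyTwo()
{
    int result;
    version (HaveX86Asm)
    {
        asm
        {
            xor EAX, EAX;     // visible write to EAX for the compiler's benefit
            db  0xB8;         // opcode byte of: mov EAX, imm32
            di  42;           // the 32-bit immediate: 0x2A, 0x00, 0x00, 0x00
            mov result, EAX;
        }
    }
    else
    {
        result = 42;
    }
    return result;
}

unittest
{
    assert(fortyTwo() == 42);
}
```

Writing the mnemonic is preferable whenever the assembler supports it; raw data is mainly of interest for opcodes that are missing from the supported list.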
A list of supported opcodes is at the end.
The following registers are supported. Register names are always in upper case.
    Register:
        AL AH AX EAX
        BL BH BX EBX
        CL CH CX ECX
        DL DH DX EDX
        BP EBP
        SP ESP
        DI EDI
        SI ESI
        ES CS SS DS GS FS
        CR0 CR2 CR3 CR4
        DR0 DR1 DR2 DR3 DR6 DR7
        TR3 TR4 TR5 TR6 TR7
        ST
        ST(0) ST(1) ST(2) ST(3) ST(4) ST(5) ST(6) ST(7)
        MM0 MM1 MM2 MM3 MM4 MM5 MM6 MM7
        XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7
x86_64 adds these additional registers.
    Register64:
        RAX RBX RCX RDX
        BPL RBP
        SPL RSP
        DIL RDI
        SIL RSI
        R8B R8W R8D R8
        R9B R9W R9D R9
        R10B R10W R10D R10
        R11B R11W R11D R11
        R12B R12W R12D R12
        R13B R13W R13D R13
        R14B R14W R14D R14
        R15B R15W R15D R15
        XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
        YMM0 YMM1 YMM2 YMM3 YMM4 YMM5 YMM6 YMM7
        YMM8 YMM9 YMM10 YMM11 YMM12 YMM13 YMM14 YMM15
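Prefix instructions such as rep do not appear in the same statement as the instruction they prefix; they appear in their own statement:

    asm
    {
        rep   ;
        movsb ;
    }

In place of the pause opcode, use

    asm
    {
        rep ;
        nop ;
    }

which produces the same result.

Floating point instructions use the two operand form:

    fdiv ST(1);     // wrong
    fmul ST;        // wrong
    fdiv ST,ST(1);  // right
    fmul ST,ST(0);  // right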
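To show the separate-statement prefix in a complete function, the sketch below copies n bytes with rep followed by movsb. The function name copyBytes is hypothetical, and the code assumes a flat memory model and non-overlapping buffers. The string instruction takes its source, destination and count from (R|E)SI, (R|E)DI and (R|E)CX.

```d
// Hypothetical example: copies n bytes from src to dst.
void copyBytes(void* dst, const(void)* src, size_t n)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RSI, src;
            mov RDI, dst;
            mov RCX, n;
            cld;        // copy upwards in memory
            rep;        // the prefix is a statement of its own
            movsb;
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov ESI, src;
            mov EDI, dst;
            mov ECX, n;
            cld;
            rep;
            movsb;
        }
    }
    else
    {
        // Fallback for targets without the x86 inline assembler.
        import core.stdc.string : memcpy;
        memcpy(dst, src, n);
    }
}

unittest
{
    ubyte[5] from = [1, 2, 3, 4, 5];
    ubyte[5] to;
    copyBytes(to.ptr, from.ptr, from.length);
    assert(to == from);
}
```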
    Operand:
        AsmExp

    AsmExp:
        AsmLogOrExp
        AsmLogOrExp ? AsmExp : AsmExp

    AsmLogOrExp:
        AsmLogAndExp
        AsmLogOrExp || AsmLogAndExp

    AsmLogAndExp:
        AsmOrExp
        AsmLogAndExp && AsmOrExp

    AsmOrExp:
        AsmXorExp
        AsmOrExp | AsmXorExp

    AsmXorExp:
        AsmAndExp
        AsmXorExp ^ AsmAndExp

    AsmAndExp:
        AsmEqualExp
        AsmAndExp & AsmEqualExp

    AsmEqualExp:
        AsmRelExp
        AsmEqualExp == AsmRelExp
        AsmEqualExp != AsmRelExp

    AsmRelExp:
        AsmShiftExp
        AsmRelExp < AsmShiftExp
        AsmRelExp <= AsmShiftExp
        AsmRelExp > AsmShiftExp
        AsmRelExp >= AsmShiftExp

    AsmShiftExp:
        AsmAddExp
        AsmShiftExp << AsmAddExp
        AsmShiftExp >> AsmAddExp
        AsmShiftExp >>> AsmAddExp

    AsmAddExp:
        AsmMulExp
        AsmAddExp + AsmMulExp
        AsmAddExp - AsmMulExp

    AsmMulExp:
        AsmBrExp
        AsmMulExp * AsmBrExp
        AsmMulExp / AsmBrExp
        AsmMulExp % AsmBrExp

    AsmBrExp:
        AsmUnaExp
        AsmBrExp [ AsmExp ]

    AsmUnaExp:
        AsmTypePrefix AsmExp
        offsetof AsmExp
        seg AsmExp
        + AsmUnaExp
        - AsmUnaExp
        ! AsmUnaExp
        ~ AsmUnaExp
        AsmPrimaryExp

    AsmPrimaryExp:
        IntegerLiteral
        FloatLiteral
        __LOCAL_SIZE
        $
        Register
        Register : AsmExp
        Register64
        Register64 : AsmExp
        DotIdentifier
        this

    DotIdentifier:
        Identifier
        Identifier . DotIdentifier
        FundamentalType . Identifier
The operand syntax more or less follows the Intel CPU documentation conventions. In particular, for two-operand instructions the source is the right operand and the destination is the left operand. The syntax differs from Intel's in order to be compatible with the D language tokenizer and to simplify parsing.
The seg operator means load the segment number that the symbol is in. This is not relevant for flat model code. Instead, do a move from the relevant segment register.
A dotted expression is evaluated at compile time and must then either yield a constant or refer to a higher-level variable that fits in the target register or variable.
    AsmTypePrefix:
        near ptr
        far ptr
        word ptr
        dword ptr
        qword ptr
        FundamentalType ptr
In cases where the operand size is ambiguous, as in:

    add [EAX],3 ;

it can be disambiguated by using an AsmTypePrefix:

    add byte ptr [EAX],3 ;
    add int  ptr [EAX],7 ;
far ptr is not relevant for flat model code.
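The effect of the type prefix can be observed with a small experiment. In the sketch below, both functions add 1 to the memory that p points to, but one treats the operand as a byte and the other as a 32-bit int. The function names are hypothetical. Since x86 is little-endian, the byte at p is the low byte of the uint, so the byte-sized add wraps within that byte without carrying into the next one.

```d
// Hypothetical example: adds 1 to the single byte at p.
void addOneAsByte(uint* p)
{
    version (D_InlineAsm_X86_64)
    {
        asm { mov RAX, p; add byte ptr [RAX], 1; }
    }
    else version (D_InlineAsm_X86)
    {
        asm { mov EAX, p; add byte ptr [EAX], 1; }
    }
    else
    {
        // Fallback: operate on the first byte in memory.
        auto b = cast(ubyte*) p;
        *b = cast(ubyte)(*b + 1);
    }
}

// Hypothetical example: adds 1 to the 32-bit value at p.
void addOneAsInt(uint* p)
{
    version (D_InlineAsm_X86_64)
    {
        asm { mov RAX, p; add int ptr [RAX], 1; }
    }
    else version (D_InlineAsm_X86)
    {
        asm { mov EAX, p; add int ptr [EAX], 1; }
    }
    else
    {
        *p += 1;
    }
}

unittest
{
    uint viaInt = 0xFF;
    addOneAsInt(&viaInt);
    assert(viaInt == 0x100);   // the carry propagates into the second byte
}
```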
To access members of an aggregate when a pointer to the aggregate is in a register, use the .offsetof property of the qualified name of the member:

    struct Foo { int a,b,c; }

    int bar(Foo *f)
    {
        asm
        {
            mov EBX,f                   ;
            mov EAX,Foo.b.offsetof[EBX] ;
        }
    }

    void main()
    {
        Foo f = Foo(0, 2, 0);
        assert(bar(&f) == 2);
    }
Alternatively, inside the scope of an aggregate, only the member name is needed:
    struct Foo   // or class
    {
        int a,b,c;

        int bar()
        {
            asm
            {
                mov EBX,this   ;
                mov EAX,b[EBX] ;
            }
        }
    }

    void main()
    {
        Foo f = Foo(0, 2, 0);
        assert(f.bar() == 2);
    }
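The examples above hold the pointer in a 32-bit register. On x86_64 a pointer needs a 64-bit register, so a variant along the following lines may be needed. This is a sketch under assumptions: the names Pair and getB are hypothetical, and the result is stored in a local variable and returned through ordinary D code rather than left in EAX.

```d
// Hypothetical aggregate used only for this illustration.
struct Pair
{
    int a, b;
}

// Reads the member b through a pointer held in a register.
int getB(Pair* p)
{
    int result;
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, p;                      // 64-bit pointer
            mov EAX, Pair.b.offsetof[RAX];   // 32-bit member
            mov result, EAX;
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, p;
            mov EAX, Pair.b.offsetof[EAX];
            mov result, EAX;
        }
    }
    else
    {
        result = p.b;
    }
    return result;
}

unittest
{
    auto pair = Pair(1, 2);
    assert(getB(&pair) == 2);
}
```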
Stack variables (variables local to a function and allocated on the stack) are accessed via the name of the variable indexed by EBP:
    int foo(int x)
    {
        asm
        {
            mov EAX,x[EBP] ; // loads value of parameter x into EAX
            mov EAX,x      ; // does the same thing
        }
    }

If the [EBP] is omitted, it is assumed for local variables. If naked is used, this no longer holds.
$ represents the program counter of the start of the next instruction, so

    jmp $ ;

branches to the instruction following the jmp instruction. The $ can only appear as the target of a jmp or call instruction.
| aaa | aad | aam | aas | adc |
| add | addpd | addps | addsd | addss |
| and | andnpd | andnps | andpd | andps |
| arpl | bound | bsf | bsr | bswap |
| bt | btc | btr | bts | call |
| cbw | cdq | clc | cld | clflush |
| cli | clts | cmc | cmova | cmovae |
| cmovb | cmovbe | cmovc | cmove | cmovg |
| cmovge | cmovl | cmovle | cmovna | cmovnae |
| cmovnb | cmovnbe | cmovnc | cmovne | cmovng |
| cmovnge | cmovnl | cmovnle | cmovno | cmovnp |
| cmovns | cmovnz | cmovo | cmovp | cmovpe |
| cmovpo | cmovs | cmovz | cmp | cmppd |
| cmpps | cmps | cmpsb | cmpsd | cmpss |
| cmpsw | cmpxchg | cmpxchg8b | cmpxchg16b | |
| comisd | comiss | | | |
| cpuid | cvtdq2pd | cvtdq2ps | cvtpd2dq | cvtpd2pi |
| cvtpd2ps | cvtpi2pd | cvtpi2ps | cvtps2dq | cvtps2pd |
| cvtps2pi | cvtsd2si | cvtsd2ss | cvtsi2sd | cvtsi2ss |
| cvtss2sd | cvtss2si | cvttpd2dq | cvttpd2pi | cvttps2dq |
| cvttps2pi | cvttsd2si | cvttss2si | cwd | cwde |
| da | daa | das | db | dd |
| de | dec | df | di | div |
| divpd | divps | divsd | divss | dl |
| dq | ds | dt | dw | emms |
| enter | f2xm1 | fabs | fadd | faddp |
| fbld | fbstp | fchs | fclex | fcmovb |
| fcmovbe | fcmove | fcmovnb | fcmovnbe | fcmovne |
| fcmovnu | fcmovu | fcom | fcomi | fcomip |
| fcomp | fcompp | fcos | fdecstp | fdisi |
| fdiv | fdivp | fdivr | fdivrp | feni |
| ffree | fiadd | ficom | ficomp | fidiv |
| fidivr | fild | fimul | fincstp | finit |
| fist | fistp | fisub | fisubr | fld |
| fld1 | fldcw | fldenv | fldl2e | fldl2t |
| fldlg2 | fldln2 | fldpi | fldz | fmul |
| fmulp | fnclex | fndisi | fneni | fninit |
| fnop | fnsave | fnstcw | fnstenv | fnstsw |
| fpatan | fprem | fprem1 | fptan | frndint |
| frstor | fsave | fscale | fsetpm | fsin |
| fsincos | fsqrt | fst | fstcw | fstenv |
| fstp | fstsw | fsub | fsubp | fsubr |
| fsubrp | ftst | fucom | fucomi | fucomip |
| fucomp | fucompp | fwait | fxam | fxch |
| fxrstor | fxsave | fxtract | fyl2x | fyl2xp1 |
| hlt | idiv | imul | in | inc |
| ins | insb | insd | insw | int |
| into | invd | invlpg | iret | iretd |
| iretq | ja | jae | jb | jbe |
| jc | jcxz | je | jecxz | jg |
| jge | jl | jle | jmp | jna |
| jnae | jnb | jnbe | jnc | jne |
| jng | jnge | jnl | jnle | jno |
| jnp | jns | jnz | jo | jp |
| jpe | jpo | js | jz | lahf |
| lar | ldmxcsr | lds | lea | leave |
| les | lfence | lfs | lgdt | lgs |
| lidt | lldt | lmsw | lock | lods |
| lodsb | lodsd | lodsw | loop | loope |
| loopne | loopnz | loopz | lsl | lss |
| ltr | maskmovdqu | maskmovq | maxpd | maxps |
| maxsd | maxss | mfence | minpd | minps |
| minsd | minss | mov | movapd | movaps |
| movd | movdq2q | movdqa | movdqu | movhlps |
| movhpd | movhps | movlhps | movlpd | movlps |
| movmskpd | movmskps | movntdq | movnti | movntpd |
| movntps | movntq | movq | movq2dq | movs |
| movsb | movsd | movss | movsw | movsx |
| movupd | movups | movzx | mul | mulpd |
| mulps | mulsd | mulss | neg | nop |
| not | or | orpd | orps | out |
| outs | outsb | outsd | outsw | packssdw |
| packsswb | packuswb | paddb | paddd | paddq |
| paddsb | paddsw | paddusb | paddusw | paddw |
| pand | pandn | pavgb | pavgw | pcmpeqb |
| pcmpeqd | pcmpeqw | pcmpgtb | pcmpgtd | pcmpgtw |
| pextrw | pinsrw | pmaddwd | pmaxsw | pmaxub |
| pminsw | pminub | pmovmskb | pmulhuw | pmulhw |
| pmullw | pmuludq | pop | popa | popad |
| popf | popfd | por | prefetchnta | prefetcht0 |
| prefetcht1 | prefetcht2 | psadbw | pshufd | pshufhw |
| pshuflw | pshufw | pslld | pslldq | psllq |
| psllw | psrad | psraw | psrld | psrldq |
| psrlq | psrlw | psubb | psubd | psubq |
| psubsb | psubsw | psubusb | psubusw | psubw |
| punpckhbw | punpckhdq | punpckhqdq | punpckhwd | punpcklbw |
| punpckldq | punpcklqdq | punpcklwd | push | pusha |
| pushad | pushf | pushfd | pxor | rcl |
| rcpps | rcpss | rcr | rdmsr | rdpmc |
| rdtsc | rep | repe | repne | repnz |
| repz | ret | retf | rol | ror |
| rsm | rsqrtps | rsqrtss | sahf | sal |
| sar | sbb | scas | scasb | scasd |
| scasw | seta | setae | setb | setbe |
| setc | sete | setg | setge | setl |
| setle | setna | setnae | setnb | setnbe |
| setnc | setne | setng | setnge | setnl |
| setnle | setno | setnp | setns | setnz |
| seto | setp | setpe | setpo | sets |
| setz | sfence | sgdt | shl | shld |
| shr | shrd | shufpd | shufps | sidt |
| sldt | smsw | sqrtpd | sqrtps | sqrtsd |
| sqrtss | stc | std | sti | stmxcsr |
| stos | stosb | stosd | stosw | str |
| sub | subpd | subps | subsd | subss |
| syscall | sysenter | sysexit | sysret | test |
| ucomisd | ucomiss | ud2 | unpckhpd | unpckhps |
| unpcklpd | unpcklps | verr | verw | wait |
| wbinvd | wrmsr | xadd | xchg | xlat |
| xlatb | xor | xorpd | xorps | |

SSE3 (Pentium 4 Prescott) opcodes supported:

| addsubpd | addsubps | fisttp | haddpd | haddps |
| hsubpd | hsubps | lddqu | monitor | movddup |
| movshdup | movsldup | mwait | | |

AMD 3DNow! opcodes supported:

| pavgusb | pf2id | pfacc | pfadd | pfcmpeq |
| pfcmpge | pfcmpgt | pfmax | pfmin | pfmul |
| pfnacc | pfpnacc | pfrcp | pfrcpit1 | pfrcpit2 |
| pfrsqit1 | pfrsqrt | pfsub | pfsubr | pi2fd |
| pmulhrw | pswapd | | | |

SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2 and AVX are supported.
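As a small sketch of SSE instructions in an asm block, the function below adds two groups of four floats with addps. The name addVec4 is hypothetical. It uses the unaligned moves (movups) so that the caller's arrays need no particular alignment, and it assumes each pointer refers to at least four floats.

```d
// Hypothetical example: r[i] = a[i] + b[i] for i in 0 .. 4.
void addVec4(const(float)* a, const(float)* b, float* r)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, a;
            mov RCX, b;
            mov RDX, r;
            movups XMM0, [RAX];   // load four floats from a
            movups XMM1, [RCX];   // load four floats from b
            addps  XMM0, XMM1;    // four additions at once
            movups [RDX], XMM0;   // store the four sums
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, a;
            mov ECX, b;
            mov EDX, r;
            movups XMM0, [EAX];
            movups XMM1, [ECX];
            addps  XMM0, XMM1;
            movups [EDX], XMM0;
        }
    }
    else
    {
        // Fallback for targets without the x86 inline assembler.
        foreach (i; 0 .. 4)
            r[i] = a[i] + b[i];
    }
}

unittest
{
    float[4] a = [1, 2, 3, 4];
    float[4] b = [10, 20, 30, 40];
    float[4] r;
    addVec4(a.ptr, b.ptr, r.ptr);
    assert(r == [11f, 22f, 33f, 44f]);
}
```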
The GNU D Compiler uses an alternative, GCC-based syntax for inline assembler:
    GccAsmStatement:
        asm FunctionAttributes_opt { GccAsmInstructionList }

    GccAsmInstructionList:
        GccAsmInstruction ;
        GccAsmInstruction ; GccAsmInstructionList

    GccAsmInstruction:
        GccBasicAsmInstruction
        GccExtAsmInstruction
        GccGotoAsmInstruction

    GccBasicAsmInstruction:
        GccAsmStringExpression

    GccExtAsmInstruction:
        GccAsmStringExpression : GccAsmOperands_opt
        GccAsmStringExpression : GccAsmOperands_opt : GccAsmOperands_opt
        GccAsmStringExpression : GccAsmOperands_opt : GccAsmOperands_opt : GccAsmClobbers_opt

    GccGotoAsmInstruction:
        GccAsmStringExpression : : GccAsmOperands_opt : GccAsmClobbers_opt : GccAsmGotoLabels_opt

    GccAsmStringExpression:
        StringLiteral
        ( ConditionalExpression )

    GccAsmOperands:
        GccSymbolicName_opt GccAsmStringExpression ( AssignExpression )
        GccSymbolicName_opt GccAsmStringExpression ( AssignExpression ) , GccAsmOperands

    GccSymbolicName:
        [ Identifier ]

    GccAsmClobbers:
        GccAsmStringExpression
        GccAsmStringExpression , GccAsmClobbers

    GccAsmGotoLabels:
        Identifier
        Identifier , GccAsmGotoLabels
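To relate this grammar to concrete code, here is a minimal sketch of a GccExtAsmInstruction with one output operand and one input operand. It assumes GDC targeting x86 or x86_64, where the instruction string is written in AT&T syntax (source first, destination second). The names addGcc and GccX86Asm are hypothetical, and other compilers take the plain D fallback.

```d
// Hypothetical version identifier: GDC on an x86-family target.
version (GNU)
{
    version (X86)    version = GccX86Asm;
    version (X86_64) version = GccX86Asm;
}

// Returns a + b, computed by a single add instruction under GDC.
int addGcc(int a, int b)
{
    version (GccX86Asm)
    {
        asm
        {
            "addl %1, %0"   // AT&T order: adds operand 1 into operand 0
            : "+r" (a)      // operand 0: read and written, in any register
            : "r" (b);      // operand 1: read only, in any register
        }
        return a;
    }
    else
    {
        return a + b;
    }
}

unittest
{
    assert(addGcc(2, 3) == 5);
}
```

The constraint strings follow GCC's extended asm conventions, so GCC's own documentation is the place to check what each constraint letter means for a given target.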