Next:Including Patterns in Machine Descriptions., Previous:Defining RTL Sequences for Code Generation, Up:Machine Descriptions [Contents][Index]
There are two cases where you should specify how to split a pattern into multiple insns. On machines that have instructions requiring delay slots (see Delay Slot Scheduling) or that have instructions whose output is not available for multiple cycles (see Specifying processor pipeline description), the compiler phases that optimize these cases need to be able to move insns into one-instruction delay slots. However, some insns may generate more than one machine instruction. These insns cannot be placed into a delay slot.
Often you can rewrite the single insn as a list of individual insns, each corresponding to one machine instruction. The disadvantage of doing so is that it will cause the compilation to be slower and require more space. If the resulting insns are too complex, it may also suppress some optimizations. The compiler splits the insn if there is a reason to believe that it might improve instruction or delay slot scheduling.
The insn combiner phase also splits putative insns. If three insns are merged into one insn with a complex expression that cannot be matched by some define_insn pattern, the combiner phase attempts to split the complex pattern into two insns that are recognized. Usually it can break the complex pattern into two patterns by splitting out some subexpression. However, in some other cases, such as performing an addition of a large constant in two insns on a RISC machine, the way to split the addition into two insns is machine-dependent.
The define_split definition tells the compiler how to split a complex insn into several simpler insns. It looks like this:
    (define_split
      [insn-pattern]
      "condition"
      [new-insn-pattern-1
       new-insn-pattern-2
       …]
      "preparation-statements")
insn-pattern is a pattern that needs to be split and condition is the final condition to be tested, as in a define_insn. When an insn matching insn-pattern and satisfying condition is found, it is replaced in the insn list with the insns given by new-insn-pattern-1, new-insn-pattern-2, etc.
The preparation-statements are similar to those statements that are specified for define_expand (see Defining RTL Sequences for Code Generation) and are executed before the new RTL is generated to prepare for the generated code or emit some insns whose pattern is not fixed. Unlike those in define_expand, however, these statements must not generate any new pseudo-registers. Once reload has completed, they also must not allocate any space in the stack frame.
There are two special macros defined for use in the preparation statements: DONE and FAIL. Use them with a following semicolon, as a statement.
DONE
    Use the DONE macro to end RTL generation for the splitter. The only RTL insns generated as replacement for the matched input insn will be those already emitted by explicit calls to emit_insn within the preparation statements; the replacement pattern is not used.
FAIL
    Make the define_split fail on this occasion. When a define_split fails, it means that the splitter was not truly available for the inputs it was given, and the input insn will not be split.
If the preparation falls through (invokes neither DONE nor FAIL), then the define_split uses the replacement template.
Patterns are matched against insn-pattern in two different circumstances. If an insn needs to be split for delay slot scheduling or insn scheduling, the insn is already known to be valid, which means that it must have been matched by some define_insn and, if reload_completed is nonzero, is known to satisfy the constraints of that define_insn. In that case, the new insn patterns must also be insns that are matched by some define_insn and, if reload_completed is nonzero, must also satisfy the constraints of those definitions.
As an example of this usage of define_split, consider the following example from a29k.md, which splits a sign_extend from HImode to SImode into a pair of shift insns:
    (define_split
      [(set (match_operand:SI 0 "gen_reg_operand" "")
            (sign_extend:SI (match_operand:HI 1 "gen_reg_operand" "")))]
      ""
      [(set (match_dup 0)
            (ashift:SI (match_dup 1)
                       (const_int 16)))
       (set (match_dup 0)
            (ashiftrt:SI (match_dup 0)
                         (const_int 16)))]
      "
    { operands[1] = gen_lowpart (SImode, operands[1]); }")

When the combiner phase tries to split an insn pattern, it is always the case that the pattern is not matched by any define_insn. The combiner pass first tries to split a single set expression and then the same set expression inside a parallel, but followed by a clobber of a pseudo-reg to use as a scratch register. In these cases, the combiner expects exactly one or two new insn patterns to be generated. It will verify that these patterns match some define_insn definitions, so you need not do this test in the define_split (of course, there is no point in writing a define_split that will never produce insns that match).
Here is an example of this use of define_split, taken from rs6000.md:
    (define_split
      [(set (match_operand:SI 0 "gen_reg_operand" "")
            (plus:SI (match_operand:SI 1 "gen_reg_operand" "")
                     (match_operand:SI 2 "non_add_cint_operand" "")))]
      ""
      [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3)))
       (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))]
    "
    {
      int low = INTVAL (operands[2]) & 0xffff;
      int high = (unsigned) INTVAL (operands[2]) >> 16;

      if (low & 0x8000)
        high++, low |= 0xffff0000;

      operands[3] = GEN_INT (high << 16);
      operands[4] = GEN_INT (low);
    }")

Here the predicate non_add_cint_operand matches any const_int that is not a valid operand of a single add insn. The add with the smaller displacement is written so that it can be substituted into the address of a subsequent operation.
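The arithmetic in the preparation statements of the rs6000.md example above can be tried outside the compiler. The following is a minimal standalone sketch in C, assuming 32-bit operands; the names add_parts and split_add_constant are hypothetical and are not part of GCC. It mirrors the high/low computation and checks that the two addends sum (with 32-bit wraparound) to the original constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type, not part of GCC: the two addends that would
   become operands[3] and operands[4] in the rs6000.md example.  */
struct add_parts {
  int32_t high;  /* a multiple of 0x10000 */
  int32_t low;   /* a sign-extended 16-bit value */
};

/* Hypothetical helper: split VALUE the same way as the example's
   preparation statements do.  */
struct add_parts split_add_constant(int32_t value)
{
  uint32_t u = (uint32_t) value;
  uint32_t low = u & 0xffff;
  uint32_t high = u >> 16;
  struct add_parts parts;

  /* If the low half would be negative once sign-extended, bump the
     high half by one to compensate for the borrow.  */
  if (low & 0x8000) {
    high++;
    low |= 0xffff0000u;
  }

  /* The conversions back to int32_t assume the usual two's-complement
     behavior of common compilers.  */
  parts.high = (int32_t) (high << 16);
  parts.low = (int32_t) low;

  /* The two addends reproduce the original constant modulo 2^32.  */
  assert((uint32_t) parts.high + (uint32_t) parts.low == u);
  return parts;
}
```

For instance, under these assumptions the constant 0x12348000 splits into 0x12350000 and -32768: the low half has its top bit set, so the high half is rounded up and the low half becomes negative.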
An example that uses a scratch register, from the same file, generates an equality comparison of a register and a large constant:
    (define_split
      [(set (match_operand:CC 0 "cc_reg_operand" "")
            (compare:CC (match_operand:SI 1 "gen_reg_operand" "")
                        (match_operand:SI 2 "non_short_cint_operand" "")))
       (clobber (match_operand:SI 3 "gen_reg_operand" ""))]
      "find_single_use (operands[0], insn, 0)
       && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ
           || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)"
      [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4)))
       (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))]
      "
    {
      /* Get the constant we are comparing against, C, and see what it
         looks like sign-extended to 16 bits.  Then see what constant
         could be XOR'ed with C to get the sign-extended value.  */

      int c = INTVAL (operands[2]);
      int sextc = (c << 16) >> 16;
      int xorv = c ^ sextc;

      operands[4] = GEN_INT (xorv);
      operands[5] = GEN_INT (sextc);
    }")

To avoid confusion, don’t write a single define_split that accepts some insns that match some define_insn as well as some insns that don’t. Instead, write two separate define_split definitions, one for the insns that are valid and one for the insns that are not valid.
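Returning to the scratch-register example above: the reason the XOR-then-compare sequence preserves equality can be checked with a small standalone C sketch. It assumes 32-bit values, and the names cmp_parts, split_compare_constant and equal_via_split are hypothetical, not part of GCC. Since xorv is defined as c ^ sextc, a register equals c exactly when the register XORed with xorv equals sextc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type, not part of GCC: the constants that would become
   operands[4] and operands[5] in the rs6000.md example.  */
struct cmp_parts {
  int32_t xorv;   /* constant XORed into the scratch register */
  int32_t sextc;  /* low 16 bits of C, sign-extended */
};

/* Hypothetical helper mirroring the example's preparation statements.
   The sign extension is written without shifting a signed value, to
   stay within well-defined C.  */
struct cmp_parts split_compare_constant(int32_t c)
{
  struct cmp_parts parts;
  int32_t low = c & 0xffff;

  parts.sextc = (low ^ 0x8000) - 0x8000;
  parts.xorv = c ^ parts.sextc;

  /* C and SEXTC agree in their low 16 bits, so XORV has none set.  */
  assert((parts.xorv & 0xffff) == 0);
  return parts;
}

/* Evaluate "reg == c" the way the split sequence would: XOR into a
   scratch value, then compare against the short constant.  */
int equal_via_split(int32_t reg, int32_t c)
{
  struct cmp_parts parts = split_compare_constant(c);
  int32_t scratch = reg ^ parts.xorv;

  return scratch == parts.sextc;
}
```

This equivalence holds only for equality and inequality tests, which is consistent with the split condition in the example restricting the single use of the comparison result to EQ or NE.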
The splitter is allowed to split jump instructions into a sequence of jumps or create new jumps while splitting non-jump instructions. As the control flow graph and branch prediction information need to be updated after the splitter runs, several restrictions apply.
Splitting of a jump instruction into a sequence that has another jump instruction to the same label is always valid, as the compiler expects identical behavior of the new jump. When the new sequence contains multiple jump instructions or new labels, more assistance is needed. The splitter is permitted to create only unconditional jumps, or simple conditional jump instructions. Additionally it must attach a REG_BR_PROB note to each conditional jump. A global variable split_branch_probability holds the probability of the original branch in case it was a simple conditional jump, −1 otherwise. To simplify recomputing of edge frequencies, the new sequence is permitted to have only forward jumps to the newly-created labels.
For the common case where the pattern of a define_split exactly matches the pattern of a define_insn, use define_insn_and_split. It looks like this:
    (define_insn_and_split
      [insn-pattern]
      "condition"
      "output-template"
      "split-condition"
      [new-insn-pattern-1
       new-insn-pattern-2
       …]
      "preparation-statements"
      [insn-attributes])
insn-pattern, condition, output-template, and insn-attributes are used as in define_insn. The new-insn-pattern vector and the preparation-statements are used as in a define_split. The split-condition is also used as in define_split, with the additional behavior that if the condition starts with ‘&&’, the condition used for the split will be constructed as the logical “and” of the insn condition with the split condition. For example, from i386.md:
    (define_insn_and_split "zero_extendhisi2_and"
      [(set (match_operand:SI 0 "register_operand" "=r")
            (zero_extend:SI (match_operand:HI 1 "register_operand" "0")))
       (clobber (reg:CC 17))]
      "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size"
      "#"
      "&& reload_completed"
      [(parallel [(set (match_dup 0)
                       (and:SI (match_dup 0) (const_int 65535)))
                  (clobber (reg:CC 17))])]
      ""
      [(set_attr "type" "alu1")])
In this case, the actual split condition will be ‘TARGET_ZERO_EXTEND_WITH_AND && !optimize_size && reload_completed’.
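The ‘&&’ rule can be pictured as simple string handling. The following C sketch is only an illustration of the rule as described here, not GCC's implementation; the function name effective_split_condition is hypothetical. It joins the two condition strings when the split condition starts with ‘&&’ and otherwise uses the split condition alone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: write the condition that
   would govern the split into OUT.  Returns 1 on success and 0 if
   OUT is too small.

   Plain concatenation is adequate for illustration only.  If the
   insn condition contained a lower-precedence operator such as ||,
   a real tool would have to parenthesize both parts.  */
int effective_split_condition(const char *insn_cond,
                              const char *split_cond,
                              char *out, size_t out_size)
{
  int n;

  assert(insn_cond != NULL && split_cond != NULL && out != NULL);

  if (strncmp(split_cond, "&&", 2) == 0)
    /* Split condition applies on top of the insn condition.  */
    n = snprintf(out, out_size, "%s %s", insn_cond, split_cond);
  else
    /* Split condition stands alone.  */
    n = snprintf(out, out_size, "%s", split_cond);

  return n >= 0 && (size_t) n < out_size;
}
```

Given the two condition strings from the i386.md example, this sketch produces the combined condition quoted above.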
The define_insn_and_split construction provides exactly the same functionality as two separate define_insn and define_split patterns. It exists for compactness, and as a maintenance tool to prevent having to ensure the two patterns’ templates match.
It is sometimes useful to have a define_insn_and_split that replaces specific operands of an instruction but leaves the rest of the instruction pattern unchanged. You can do this directly with a define_insn_and_split, but it requires a new-insn-pattern-1 that repeats most of the original insn-pattern. There is also the complication that an implicit parallel in insn-pattern must become an explicit parallel in new-insn-pattern-1, which is easy to overlook. A simpler alternative is to use define_insn_and_rewrite, which is a form of define_insn_and_split that automatically generates new-insn-pattern-1 by replacing each match_operand in insn-pattern with a corresponding match_dup, and each match_operator in the pattern with a corresponding match_op_dup. The arguments are otherwise identical to define_insn_and_split:
    (define_insn_and_rewrite
      [insn-pattern]
      "condition"
      "output-template"
      "split-condition"
      "preparation-statements"
      [insn-attributes])
The match_dups and match_op_dups in the new instruction pattern use any new operand values that the preparation-statements store in the operands array, as for a normal define_insn_and_split. preparation-statements can also emit additional instructions before the new instruction. They can even emit an entirely different sequence of instructions and use DONE to avoid emitting a new form of the original instruction.
The split in a define_insn_and_rewrite is only intended to apply to existing instructions that match insn-pattern. split-condition must therefore start with &&, so that the split condition applies on top of condition.
Here is an example from the AArch64 SVE port, in which operand 1 is known to be equivalent to an all-true constant and isn’t used by the output template:
    (define_insn_and_rewrite "*while_ult<GPI:mode><PRED_ALL:mode>_cc"
      [(set (reg:CC CC_REGNUM)
            (compare:CC
              (unspec:SI [(match_operand:PRED_ALL 1)
                          (unspec:PRED_ALL
                            [(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ")
                             (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")]
                            UNSPEC_WHILE_LO)]
                         UNSPEC_PTEST_PTRUE)
              (const_int 0)))
       (set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
            (unspec:PRED_ALL [(match_dup 2)
                              (match_dup 3)]
                             UNSPEC_WHILE_LO))]
      "TARGET_SVE"
      "whilelo\t%0.<PRED_ALL:Vetype>, %<w>2, %<w>3"
      ;; Force the compiler to drop the unused predicate operand, so that we
      ;; don't have an unnecessary PTRUE.
      "&& !CONSTANT_P (operands[1])"
      {
        operands[1] = CONSTM1_RTX (<MODE>mode);
      }
    )

The splitter in this case simply replaces operand 1 with the constant value that it is known to have. The equivalent define_insn_and_split would be:
    (define_insn_and_split "*while_ult<GPI:mode><PRED_ALL:mode>_cc"
      [(set (reg:CC CC_REGNUM)
            (compare:CC
              (unspec:SI [(match_operand:PRED_ALL 1)
                          (unspec:PRED_ALL
                            [(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ")
                             (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")]
                            UNSPEC_WHILE_LO)]
                         UNSPEC_PTEST_PTRUE)
              (const_int 0)))
       (set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
            (unspec:PRED_ALL [(match_dup 2)
                              (match_dup 3)]
                             UNSPEC_WHILE_LO))]
      "TARGET_SVE"
      "whilelo\t%0.<PRED_ALL:Vetype>, %<w>2, %<w>3"
      ;; Force the compiler to drop the unused predicate operand, so that we
      ;; don't have an unnecessary PTRUE.
      "&& !CONSTANT_P (operands[1])"
      [(parallel
         [(set (reg:CC CC_REGNUM)
               (compare:CC
                 (unspec:SI [(match_dup 1)
                             (unspec:PRED_ALL [(match_dup 2)
                                               (match_dup 3)]
                                              UNSPEC_WHILE_LO)]
                            UNSPEC_PTEST_PTRUE)
                 (const_int 0)))
          (set (match_dup 0)
               (unspec:PRED_ALL [(match_dup 2)
                                 (match_dup 3)]
                                UNSPEC_WHILE_LO))])]
      {
        operands[1] = CONSTM1_RTX (<MODE>mode);
      }
    )