The define_peephole2 definition tells the compiler how to substitute one sequence of instructions for another sequence, what additional scratch registers may be needed and what their lifetimes must be.
(define_peephole2
  [insn-pattern-1
   insn-pattern-2
   …]
  "condition"
  [new-insn-pattern-1
   new-insn-pattern-2
   …]
  "preparation-statements")
The definition is almost identical to define_split (see Defining How to Split Instructions) except that the pattern to match is not a single instruction, but a sequence of instructions.
It is possible to request additional scratch registers for use in the output template. If appropriate registers are not free, the pattern will simply not match.
Scratch registers are requested with a match_scratch pattern at the top level of the input pattern. The allocated register will (initially) be dead at the point requested within the original sequence. If the scratch is used at more than a single point, a match_dup pattern at the top level of the input pattern marks the last position in the input sequence at which the register must be available.
Here is an example from the IA-32 machine description:
(define_peephole2
  [(match_scratch:SI 2 "r")
   (parallel [(set (match_operand:SI 0 "register_operand" "")
                   (match_operator:SI 3 "arith_or_logical_operator"
                     [(match_dup 0)
                      (match_operand:SI 1 "memory_operand" "")]))
              (clobber (reg:CC 17))])]
  "! optimize_size && ! TARGET_READ_MODIFY"
  [(set (match_dup 2) (match_dup 1))
   (parallel [(set (match_dup 0)
                   (match_op_dup 3 [(match_dup 0) (match_dup 2)]))
              (clobber (reg:CC 17))])]
  "")

This pattern tries to split a load from its use in the hope that we’ll be able to schedule around the memory load latency. It allocates a single SImode register of class GENERAL_REGS ("r") that needs to be live only at the point just before the arithmetic.
A real example requiring extended scratch lifetimes is harder to come by,so here’s a silly made-up example:
(define_peephole2
  [(match_scratch:SI 4 "r")
   (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" ""))
   (set (match_operand:SI 2 "" "") (match_dup 1))
   (match_dup 4)
   (set (match_operand:SI 3 "" "") (match_dup 1))]
  "/* determine 1 does not overlap 0 and 2 */"
  [(set (match_dup 4) (match_dup 1))
   (set (match_dup 0) (match_dup 4))
   (set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (match_dup 4))]
  "")

If we had not added the (match_dup 4) in the middle of the input sequence, it might have been the case that the register we chose at the beginning of the sequence is killed by the first or second set.
There are two special macros defined for use in the preparation statements: DONE and FAIL. Use them with a following semicolon, as a statement.
DONE
    Use the DONE macro to end RTL generation for the peephole. The only RTL insns generated as replacement for the matched input insn will be those already emitted by explicit calls to emit_insn within the preparation statements; the replacement pattern is not used.

FAIL
    Make the define_peephole2 fail on this occasion. When a define_peephole2 fails, it means that the replacement was not truly available for the particular inputs it was given. In that case, GCC may still apply a later define_peephole2 that also matches the given insn pattern. (Note that this is different from define_split, where FAIL prevents the input insn from being split at all.)
If the preparation falls through (invokes neither DONE nor FAIL), then the define_peephole2 uses the replacement template.
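As a rough illustration of how the two macros might be used, here is a hypothetical variant of the made-up example above. It is a sketch only and is not taken from any real port: the choice to test for overlap with reg_overlap_mentioned_p in the preparation statements rather than in the condition, and the use of emit_move_insn to generate the replacement insns, are assumptions made for the sake of the example.

```lisp
;; Hypothetical sketch; not from any real machine description.
(define_peephole2
  [(match_scratch:SI 4 "r")
   (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" ""))
   (set (match_operand:SI 2 "" "") (match_dup 1))
   (match_dup 4)
   (set (match_operand:SI 3 "" "") (match_dup 1))]
  ""
  [(set (match_dup 4) (match_dup 1))
   (set (match_dup 0) (match_dup 4))
   (set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (match_dup 4))]
  "
  /* Assumed check: give up on this occasion if writing operand 0 or
     operand 2 would change the value of operand 1.  A later
     define_peephole2 may still match the same insns.  */
  if (reg_overlap_mentioned_p (operands[0], operands[1])
      || reg_overlap_mentioned_p (operands[2], operands[1]))
    FAIL;

  /* Emit the replacement insns explicitly.  Because DONE follows,
     the replacement template above is not used.  */
  emit_move_insn (operands[4], operands[1]);
  emit_move_insn (operands[0], operands[4]);
  emit_move_insn (operands[2], operands[4]);
  emit_move_insn (operands[3], operands[4]);
  DONE;
  ")
```

If the four emit_move_insn calls and the DONE statement were removed, the preparation would fall through after the check and the replacement template would be used instead, which in this sketch describes the same four moves.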
Insns are scanned in forward order from beginning to end for each basic block. Matches are attempted in order of define_peephole2 appearance in the md file. After a successful replacement, scanning for further define_peephole2 opportunities resumes with the first generated replacement insn as the first insn to be matched against all define_peephole2 patterns. For the example above, after its successful replacement, the first insn that can be matched by a define_peephole2 is (set (match_dup 4) (match_dup 1)).