Note: Ideally most of what is written in this document would be directly in the Javadocs of the relevant classes. This is not the case yet.
This page covers the specifics of writing a rule in Java. The basic development process is very similar to the process for XPath rules, which is described in Your First Rule.
Basically, you open the designer, look at the structure of the AST, and refine your rule as you add test cases.
On this page we’ll talk about rules for the Java language, but the process is very similar for other languages.
To write a rule in Java you’ll have to:
- Write a Java class that implements the interface Rule. Each language implementation provides a base rule class to ease your pain, e.g. AbstractJavaRule.
Most base rule classes use a Visitor pattern to explore the AST.
When a rule is applied to a file, it’s handed the root of the AST and told to traverse the whole tree to look for violations. Each rule defines a specific visit method for each type of node of the language, which by default just visits the children.
So the following rule would traverse the whole tree and do nothing:

    import net.sourceforge.pmd.lang.java.rule.AbstractJavaRule;

    public class MyRule extends AbstractJavaRule {
        // all methods are default implementations!
    }

Generally, a rule wants to check only some node types. In our XPath example in Your First Rule, we wanted to check for some VariableId nodes. That’s the XPath name, but in Java, you’ll get access to the full ASTVariableId API.
If you want to check for some specific node types, you can override the corresponding visit method:

    import net.sourceforge.pmd.lang.java.ast.ASTVariableId;
    import net.sourceforge.pmd.lang.java.rule.AbstractJavaRule;

    public class MyRule extends AbstractJavaRule {
        @Override
        public Object visit(ASTVariableId node, Object data) {
            // This method is called on each node of type ASTVariableId in the AST
            if (node.getType() == short.class) {
                // reports a violation at the position of the node
                // the "data" parameter is a context object handed to your rule
                // the message for the violation is the message defined in the
                // rule declaration XML element
                asCtx(data).addViolation(node);
            }
            // this calls back to the default implementation, which recurses
            // further down the subtree
            return super.visit(node, data);
        }
    }

The super.visit(node, data) call is very common in rule implementations, because it makes the traversal continue by visiting all the descendants of the current node.
Sometimes you have checked all you needed and you’re sure that the descendants of a node cannot contain violations. In that case, you can avoid calling the super implementation and the traversal will not continue further down. This means that your callbacks (visit implementations) won’t be called on the rest of the subtree. The siblings of the current node may still be visited recursively.
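To make the effect of skipping the super call concrete, here is a minimal, self-contained model of the traversal. It does not use the PMD API: ToyNode, ToyVisitor and PruningVisitor are hypothetical stand-ins for AST nodes, a base rule class and a rule, assumed purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PruningDemo {
    /** Hypothetical stand-in for an AST node: a kind label plus children. */
    static class ToyNode {
        final String kind;
        final List<ToyNode> children;

        ToyNode(String kind, ToyNode... children) {
            this.kind = kind;
            this.children = List.of(children);
        }
    }

    /** Models the default behaviour of a base rule class: visit only recurses. */
    static class ToyVisitor {
        void visit(ToyNode node) {
            for (ToyNode child : node.children) {
                visit(child);
            }
        }
    }

    /** Records every node it is called on, but does not descend into "Lambda" nodes. */
    static class PruningVisitor extends ToyVisitor {
        final List<String> seen = new ArrayList<>();

        @Override
        void visit(ToyNode node) {
            seen.add(node.kind);
            if (node.kind.equals("Lambda")) {
                return; // no super.visit: the subtree below this node is skipped
            }
            super.visit(node); // continue with the descendants
        }
    }

    /** Visits Method(Lambda(Variable), Variable) and returns the visited kinds. */
    static List<String> run() {
        ToyNode root = new ToyNode("Method",
                new ToyNode("Lambda", new ToyNode("Variable")),
                new ToyNode("Variable"));
        PruningVisitor visitor = new PruningVisitor();
        visitor.visit(root);
        return visitor.seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [Method, Lambda, Variable]
    }
}
```

The Variable nested inside the Lambda is never reported to the visitor, while the sibling Variable still is, which mirrors the behaviour described above.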
If you don’t care about the order in which the nodes are traversed (e.g. your rule doesn’t maintain any state between visits), then you can considerably speed up your rule by using the rulechain.
That mechanism doesn’t recurse over the whole tree. Instead, your rule is only passed the nodes it is interested in. To use the rulechain correctly:
- Override the method buildTargetSelector. This method should return a target selector that selects all the node types you are interested in. E.g. the factory method forTypes can be used to create such a selector.
- For Java rules, you can alternatively extend AbstractJavaRulechainRule. You’ll need to call the super constructor and provide the node types you are interested in.
- Don’t call super.visit in the visit methods, since the rulechain already passes every selected node to your rule.
In Java rule implementations, you often need to navigate the AST to find the interesting nodes. In your visit implementation, you can start navigating the AST from the given node.
The Node interface provides a couple of useful methods that return a NodeStream and can be used to query the AST.
The returned NodeStream API provides easy-to-use methods that follow the Java Stream API (java.util.stream).
Example:

    NodeStream.of(someNode)                             // the stream here is empty if the node is null
              .filterIs(ASTVariableDeclaratorId.class)  // the stream here is empty if the node was not a variable declarator id
              .followingSiblings()                      // the stream here contains only the siblings, not the original node
              .filterIs(ASTVariableInitializer.class)
              .children(ASTExpression.class)
              .children(ASTPrimaryExpression.class)
              .children(ASTPrimaryPrefix.class)
              .children(ASTLiteral.class)
              .filterMatching(Node::getImage, "0")
              .filterNot(ASTLiteral::isStringLiteral)
              .nonEmpty();                              // if the stream is non-empty here, then the whole pipeline matched

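Because NodeStream is modelled on java.util.stream, a pipeline like the one above can be read much like an ordinary stream pipeline. The sketch below is only an analogy written against the standard library: Tree, Initializer and Literal are hypothetical types, not PMD classes, and the comments describe a conceptual correspondence rather than how NodeStream is implemented.

```java
import java.util.List;
import java.util.stream.Stream;

class StreamAnalogy {
    /** Hypothetical tree node; not a PMD class. */
    static class Tree {
        final List<Tree> children;

        Tree(Tree... children) {
            this.children = List.of(children);
        }
    }

    /** Hypothetical leaf carrying the literal's text. */
    static class Literal extends Tree {
        final String image;

        Literal(String image) {
            super();
            this.image = image;
        }
    }

    /** Hypothetical node wrapping an initializer expression. */
    static class Initializer extends Tree {
        Initializer(Tree... children) {
            super(children);
        }
    }

    /** True if node is an Initializer with a direct Literal child whose image is "0". */
    static boolean isInitializedToZero(Tree node) {
        return Stream.ofNullable(node)                   // empty if node is null, compare NodeStream.of
                .filter(Initializer.class::isInstance)   // compare filterIs(Initializer.class)
                .flatMap(n -> n.children.stream())       // these three steps together
                .filter(Literal.class::isInstance)       // play the role of
                .map(Literal.class::cast)                // children(Literal.class)
                .anyMatch(lit -> lit.image.equals("0")); // compare filterMatching(...).nonEmpty()
    }

    public static void main(String[] args) {
        System.out.println(isInitializedToZero(new Initializer(new Literal("0")))); // prints true
        System.out.println(isInitializedToZero(new Initializer(new Literal("1")))); // prints false
    }
}
```

As with NodeStream, a failed step simply leaves the stream empty, so no explicit null or type checks are needed between the steps.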
The Node interface also provides an alternative way to navigate the AST for convenience.
Depending on the AST of the language, there might also be more specific methods that can be used to navigate. E.g. in Java there exists the method ASTIfStatement#getCondition to get the condition of an if statement.
In your visit method, you have access to the RuleContext, which is the entry point for reporting back during the analysis.
- addViolation reports a rule violation at the position of the given node with the message defined in the rule declaration XML element.
- The message defined in the rule declaration XML element may contain placeholders such as {0}. In that case, you need to call addViolation and provide the values for the placeholders. The message is actually processed as a java.text.MessageFormat.
- To use a different message for a particular violation, call one of the addViolationWithMessage overloads. Using these methods, the message defined in the rule declaration XML element is not used.
- The message may also reference a rule property with the syntax ${propertyName}.
- Some rules provide extra variables, e.g. ${methodName} to insert the name of the method in which the violation occurred. See Java-specific features and guidance.
When starting execution, PMD will instantiate a new instance of your rule. If PMD is executed in multiple threads, then each thread uses its own instance of the rule. This means that the rule implementation does not need to care about threading issues, as PMD makes sure that a single instance is not used concurrently by multiple threads.
However, for performance reasons, the rule instances are reused for multiple files. This means that the constructor of the rule is only executed once (per thread) and the rule instance is reused. If you rely on proper initialization of instance fields, you can do the initialization in the start method of the rule (you need to override this method). The start method is called exactly once per file.
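The consequence of instance reuse can be shown with a small model. This is not PMD code: ToyRule and its start, visitVariable and end methods are hypothetical stand-ins, assuming only what the text states, namely that one instance processes several files and that start runs once per file.

```java
import java.util.ArrayList;
import java.util.List;

class RuleReuseDemo {
    /** Hypothetical rule that counts variables per file; not a PMD class. */
    static class ToyRule {
        private final boolean resetInStart;
        private int count; // per-file state kept in an instance field
        final List<Integer> reported = new ArrayList<>();

        ToyRule(boolean resetInStart) {
            this.resetInStart = resetInStart;
        }

        /** Called once per file, before any node is visited. */
        void start() {
            if (resetInStart) {
                count = 0;
            }
        }

        void visitVariable() {
            count++;
        }

        /** Called once per file, after the traversal. */
        void end() {
            reported.add(count);
        }
    }

    /** Runs ONE rule instance over several files with the given variable counts. */
    static List<Integer> analyse(ToyRule rule, int... variablesPerFile) {
        for (int variables : variablesPerFile) {
            rule.start();
            for (int i = 0; i < variables; i++) {
                rule.visitVariable();
            }
            rule.end();
        }
        return rule.reported;
    }

    public static void main(String[] args) {
        System.out.println(analyse(new ToyRule(true), 2, 3));  // prints [2, 3]
        System.out.println(analyse(new ToyRule(false), 2, 3)); // prints [2, 5]
    }
}
```

Without the reset in start, the count from the first file leaks into the second one, which is the kind of bug that initializing state only in the constructor would cause.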
Some languages might support metrics.
Some languages might support a symbol table.
Some languages might support type resolution.
Exactly once (per thread):
For each thread, a deep copy of the rule is created. Each thread is given a different set of files to analyse. Then, for each such file and for each rule copy:
1. start is called once, before parsing.
2. apply is called with the root of the AST. That method performs the AST traversal that ultimately calls visit methods. It’s not called for RuleChain rules.
3. end is called when the rule is done processing the file.
See https://github.com/pmd/pmd-examples for a couple of example projects that create custom PMD rules for different languages.