Important: This document might be incomplete and might not answer all questions. If so, please reach out to us by opening a discussion so that we can improve this guide.
Before you update
Before updating to PMD 7, you should first update to the latest PMD 6 version 6.55.0 and try to fix all deprecation warnings.
There are a couple of deprecated things in PMD 6 that you might encounter:
Properties: In order to define property descriptors, you should use PropertyFactory now. This factory can create properties of any type. E.g. instead of StringProperty.named(...) use PropertyFactory.stringProperty(...).
Also note that uiOrder is gone. You can just remove it.
When reporting a violation, you might see a deprecation of the addViolation methods. These methods have been moved to RuleContext. E.g. instead of addViolation(data, node, ...) use asCtx(data).addViolation(node, ...).
When you are calling PMD from CLI, you need to stop using deprecated CLI params, e.g.
-no-cache ➡️ --no-cache
-failOnViolation ➡️ --fail-on-violation
-reportfile ➡️ --report-file
-language ➡️ --use-version
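If PMD is launched from a wrapper script or build helper, the renames above can be applied mechanically. The following is a minimal sketch, assuming a hypothetical helper class DeprecatedCliOptions that is not part of PMD. It covers only the four renames listed here and rewrites the option name only; the value that follows an option may need adjusting as well, which this sketch does not attempt.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of PMD): rewrites deprecated PMD 6 CLI option names. */
class DeprecatedCliOptions {
    // Only the four renames listed in this guide; other deprecated options are not covered.
    private static final Map<String, String> RENAMES = new LinkedHashMap<>();

    static {
        RENAMES.put("-no-cache", "--no-cache");
        RENAMES.put("-failOnViolation", "--fail-on-violation");
        RENAMES.put("-reportfile", "--report-file");
        RENAMES.put("-language", "--use-version");
    }

    /** Returns the new spelling of an option, or the argument unchanged if it is not in the table. */
    static String migrate(String arg) {
        return RENAMES.getOrDefault(arg, arg);
    }

    /** Applies migrate() to every argument; option values pass through untouched. */
    static String[] migrateAll(String... args) {
        String[] result = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            result[i] = migrate(args[i]);
        }
        return result;
    }
}
```

Arguments that are not in the table, including file paths and option values, are returned as they are.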
If you have written a custom XPath rule, look out for warnings about deprecated XPath attributes. These warnings might look like
WARNING: Use of deprecated attribute 'VariableId/@Image' by XPath rule 'VariableNaming' (in ruleset 'VariableNamingRule'), please use @Name instead
and often already suggest an alternative.
If you still reference rulesets or rules the old way, which has been deprecated since 6.46.0:
<lang-name>-<ruleset-name>, e.g. java-basic, which resolves to rulesets/java/basic.xml
the internal release number, e.g. 600, which resolves to rulesets/releases/600.xml
Such usages produce deprecation warnings that should be easy to spot, e.g.
Ruleset reference 'java-basic' uses a deprecated form, use 'rulesets/java/basic.xml' instead
Use the explicit forms of these references to be compatible with PMD 7.
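The two deprecated shorthand forms follow a simple pattern, so the explicit form can be derived mechanically. The sketch below is an illustration only, assuming a hypothetical helper class RulesetReferences that is not part of PMD. It also assumes that the language name contains no hyphen, so the first hyphen separates language and ruleset name.

```java
/** Hypothetical helper (not part of PMD): expands deprecated ruleset shorthand references. */
class RulesetReferences {

    /** Returns the explicit path for a shorthand reference, or the input unchanged. */
    static String toExplicitForm(String ref) {
        // Anything that already looks like a path is left alone.
        if (ref.contains("/") || ref.endsWith(".xml")) {
            return ref;
        }
        // Internal release number, e.g. "600".
        if (ref.matches("\\d+")) {
            return "rulesets/releases/" + ref + ".xml";
        }
        // <lang-name>-<ruleset-name>, e.g. "java-basic" (assumption: no hyphen in the language name).
        int dash = ref.indexOf('-');
        if (dash > 0 && dash < ref.length() - 1) {
            return "rulesets/" + ref.substring(0, dash) + "/" + ref.substring(dash + 1) + ".xml";
        }
        return ref;
    }
}
```

The result matches the path suggested in the deprecation warning. It does not tell you whether that ruleset still exists in PMD 7.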
Note: Since PMD 6, all rules are sorted into categories (such as “Best Practices”, “Design”, “Error Prone”) and the old rulesets like basic.xml have been deprecated and are removed with PMD 7. It is about time to create a custom ruleset.
Approaching 7.0.0
After that, migrate to the release candidates and fix any problems you encounter. Start with 7.0.0-rc1 and continue via 7.0.0-rc2, 7.0.0-rc3 and 7.0.0-rc4 until you finally use 7.0.0.
You might additionally encounter the following types of problems:
If you use any programmatic API of PMD, first avoid any usage of deprecated or internal classes/methods. These are marked with one of these annotations: @Deprecated, @DeprecatedUntil700, @InternalApi.
Some of these classes are available until 7.0.0-rc4 but are finally removed with 7.0.0.
If you use Visualforce, then you need to change “vf” to “visualforce”, e.g. category/vf/security.xml ➡️ category/visualforce/security.xml
If you use Velocity, then you need to change “vm” to “velocity”, e.g. category/vm/... ➡️ category/velocity/...
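The two language id renames can be applied to ruleset references with a simple prefix replacement. This is a minimal sketch, assuming a hypothetical helper class LanguageIdRenames that is not part of PMD. It only handles references that start with category/vf/ or category/vm/ and leaves everything else untouched.

```java
/** Hypothetical helper (not part of PMD): applies the vf/vm language id renames to category paths. */
class LanguageIdRenames {
    private static final String OLD_VF = "category/vf/";
    private static final String OLD_VM = "category/vm/";

    /** Rewrites a PMD 6 category path for Visualforce or Velocity to its PMD 7 form. */
    static String migrateCategoryPath(String path) {
        if (path.startsWith(OLD_VF)) {
            return "category/visualforce/" + path.substring(OLD_VF.length());
        }
        if (path.startsWith(OLD_VM)) {
            return "category/velocity/" + path.substring(OLD_VM.length());
        }
        return path; // other languages are not affected by this rename
    }
}
```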
The following topics describe well known migration challenges in more detail.
Use cases
I’m using only built-in rules
When you are using only built-in rules, you should check whether you use any deprecated rule. With PMD 7, many deprecated rules are finally removed. You can see a complete list of the removed rules in the release notes for PMD 7. The release notes also mention the replacement rule that should be used instead. For some rules, there is no replacement.
Many other rules have been changed or improved. New properties have been added to make them more versatile, or properties have been removed if they are not necessary anymore. See changed rules in the release notes for PMD 7.
A handful of rules are new to PMD 7. You might want to check these out: new rules.
Once you have reviewed your ruleset(s), you can switch to PMD 7.
I’m using custom rules
Testing
Ideally, you have written good tests already for your custom rules - see Testing your rules. This helps to identify problems early on.
The base test classes PmdRuleTst and SimpleAggregatorTst have been moved out of package net.sourceforge.pmd.testframework. You’ll need to adjust your imports.
Ruleset XML
The <rule> tag that defines your custom rule is required to have a language attribute now. This was always the case for XPath rules, but is now a requirement for Java rules as well.
XPath rules
If you have XPath based rules, the first step will be to migrate to XPath 2.0 and then to XPath 3.1. XPath 2.0 is available in PMD 6 already and can be used right away. PMD 7 uses XPath 3.1 by default and won’t support XPath 1.0 anymore. The difference between XPath 2.0 and XPath 3.1 is not big, so your XPath 2.0 rules can be expected to work in PMD 7 without any further changes. So the migration path is to simply migrate to XPath 2.0.
After you have migrated your XPath rules to XPath 2.0, remove the “version” property, since that has been removed with PMD 7. PMD 7 by default uses XPath 3.1. See XPath below for details.
Then change the class attribute of your rule to net.sourceforge.pmd.lang.rule.xpath.XPathRule, because the class XPathRule has been moved into the subpackage net.sourceforge.pmd.lang.rule.xpath.
The custom XPath function typeOf has been removed (deprecated since 6.4.0). Use the function pmd-java:typeIs or pmd-java:typeIsExactly instead. See PMD extension functions for available functions.
Java rules
If you have Java based rules and you are using the rulechain, this works a bit differently now. The RuleChain API has changed, see [core] Simplify the rulechain (#2490) for the full details. But in short, you don’t call addRuleChainVisit(...) in the rule’s constructor anymore. Instead, you override the method buildTargetSelector.
Consider using the new NodeStream API to navigate with null-safety. This is optional.
Additionally, if you have created rules for Java, regardless of whether they are XPath based or Java based, you might need to adjust your queries or visitor methods. The Java AST has been refactored substantially. The easiest way is to use the PMD Rule Designer to see the structure of the AST. See the section Java AST below for details.
There is a shared ant script that wraps the calls to javacc: javacc-wrapper.xml. This should be used now.
PMD’s parser adapter for JavaCC generated parsers is now called JjtreeParserAdapter. This is the class that needs to be implemented now.
There is no need anymore to write a custom TokenManager - we now have a common base class for JavaCC generated token managers. This base class is AbstractTokenManager.
A rule violation factory is not needed anymore. For language specific information on rule violations, there is now a ViolationDecorator that a language can implement. These ViolationDecorators are called when a violation is reported and they can provide the additional information. This information can be used by renderers via RuleViolation#getAdditionalInfo.
A parser visitor adapter is not needed anymore. The visitor interface now provides a default implementation. Instead, a base visitor for the language should be created, which extends AstVisitorBase.
A rule chain visitor is not needed anymore. PMD provides a common implementation that fits all languages.
I’ve extended PMD with a custom feature…
In that case we can’t provide a general guide unless we know the specific custom feature. If you are having difficulties finding your way around the PMD source code and javadocs and you don’t see the aspect of PMD documented that you are using, we are probably missing documentation. Please reach out to us by opening a discussion. We can then enhance the documentation and/or the PMD API.
Special topics
Release downloads
The asset filenames of PMD on GitHub Releases are now pmd-dist-<version>-bin.zip, pmd-dist-<version>-src.zip and pmd-dist-<version>-doc.zip. Keep that in mind if you have an automated download script.
The structure inside the ZIP files stays the same, e.g. the binary distribution ZIP file still contains the base directory pmd-bin-<version>.
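For an automated download script, the new names can be assembled from the version number. The following is a minimal sketch, assuming a hypothetical helper class ReleaseAssets that is not part of PMD. It only builds the file and directory names given above and deliberately does not construct a download URL.

```java
/** Hypothetical helper (not part of PMD): builds PMD 7 release asset names from a version. */
class ReleaseAssets {

    /** kind must be one of "bin", "src" or "doc"; returns e.g. pmd-dist-7.0.0-bin.zip. */
    static String assetName(String version, String kind) {
        switch (kind) {
            case "bin":
            case "src":
            case "doc":
                return "pmd-dist-" + version + "-" + kind + ".zip";
            default:
                throw new IllegalArgumentException("unknown asset kind: " + kind);
        }
    }

    /** The base directory inside the binary ZIP keeps its old name. */
    static String binaryBaseDirectory(String version) {
        return "pmd-bin-" + version;
    }
}
```

Note the mismatch that a script has to account for: the archive is named pmd-dist-..., while the directory it unpacks to is still pmd-bin-....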
Unified start script on all platforms for all commands (PMD, CPD, Designer). Instead of run.sh and pmd.bat, we now have pmd only (technically on Windows, there is still a pmd.bat, but it behaves the same).
Executing PMD from CLI now means: run.sh pmd / pmd.bat ➡️ pmd check
If you don’t replace this argument, then “false” will be interpreted as a file to analyze. You might then see an error message such as [main] ERROR net.sourceforge.pmd.cli.commands.internal.PmdCommand - No such file false.
PMD tries to display a progress bar. If you don’t want this (e.g. on a CI build server), you can disable this with --no-progress.
--no-ruleset-compatibility has been removed without replacement.
--stress (or-stress) has been removed without replacement.
Custom distribution packages
When creating a custom distribution which only integrates the languages you need, there are some changes to apply:
In addition to the language dependencies you want, you also need to add a dependency on net.sourceforge.pmd:pmd-cli in order to get the CLI classes.
When fetching the scripts for the CLI with “maven-dependency-plugin”, you need to additionally fetch the logging configuration. That means the line <includes>scripts/**,LICENSE</includes> needs to be changed to <includes>scripts/**,LICENSE,conf/**</includes>.
The assembly descriptor pmd-bin now optionally also includes a BOM (bill of materials). If you want to create this for your custom distribution, simply add the following plugin configuration:
When you have custom rules, and you have written rule tests according to the guide Testing your rules, you might want to consider upgrading your other tests to JUnit 5. The tests in PMD 7 have been migrated to JUnit5 - including the rule tests for the built-in rules.
When executing the rule tests, you need to make sure to have JUnit5 on the classpath - which you automatically get when you depend on net.sourceforge.pmd:pmd-test. If you also have JUnit4 tests, you need to make sure to have a junit-vintage-engine as well on the test classpath, so that all tests are executed. That means you might now need to add a dependency on JUnit4 explicitly.
CPD: Reported endcolumn is now exclusive
In PMD 6, the reported positions of the duplicated tokens in CPD were always inclusive, e.g. the following described a duplication of length 4 in PMD 6: beginLine=1, endLine=1, beginColumn=1, endColumn=4 - these are the first 4 characters in the first line. With PMD 7, the endColumn is now exclusive. The same duplication will be reported in PMD 7 as: beginLine=1, endLine=1, beginColumn=1, endColumn=5.
The reported positions in a file now follow the usual meaning: line numbering starts from 1, begin line and end line are inclusive, begin column is inclusive and end column is exclusive. This is the usual behavior of the most common text editors, and the PMD part has used that meaning in RuleViolations for a long time in PMD 6 already.
This only affects the XML report format as the others don’t provide column information.
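Tools that post-process the CPD XML report may need to adjust how they compute lengths or compare old and new reports. The arithmetic from the example above can be sketched as follows, assuming a hypothetical helper class CpdColumns that is not part of PMD.

```java
/** Hypothetical helper (not part of PMD): converts between PMD 6 and PMD 7 CPD end columns. */
class CpdColumns {

    /** PMD 6 reported an inclusive end column; PMD 7 reports the column after the last character. */
    static int toExclusiveEndColumn(int inclusiveEndColumn) {
        return inclusiveEndColumn + 1;
    }

    /** Length of a duplication on a single line, using PMD 7 semantics (end column exclusive). */
    static int lengthOnSingleLine(int beginColumn, int exclusiveEndColumn) {
        return exclusiveEndColumn - beginColumn;
    }
}
```

With the values from the example, the PMD 6 endColumn=4 becomes 5, and the length computed from beginColumn=1 and endColumn=5 is 4.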
Node API
Starting from one node in the AST, you can navigate to children or parents with the following methods. This is the “traditional” way for simple cases. For more complex cases, consider using the new NodeStream API.
Many methods available in PMD 6 have been deprecated and removed for a slicker API with consistent naming that also integrates tightly with the NodeStream API.
Tip: First use PMD 7.0.0-rc3, which still has these methods. These methods are marked as deprecated, so you can then start to change them. The replacement method is usually provided in the javadocs. That way you avoid being confronted with just compile errors.
In Java rule implementations, you often need to navigate the AST to find the interesting nodes. In PMD 6, this was often done by calling jjtGetChild(int) or jjtGetParent(int) and then checking the node type with instanceof. There are also helper methods available, like getFirstChildOfType(Class) or findDescendantsOfType(Class). These methods might return null and you need to check this for every level.
The new NodeStream API provides easy to use methods that follow the Java Stream API (java.util.stream).
Many complex predicates about nodes can be expressed by testing the emptiness of a node stream. E.g. the following tests if the node is a variable declarator id initialized to the value 0:
Example:
NodeStream.of(someNode)                                  // the stream here is empty if the node is null
          .filterIs(ASTVariableId.class)                 // the stream here is empty if the node was not a variable id
          .followingSiblings()                           // the stream here contains only the siblings, not the original node
          .children(ASTNumericLiteral.class)
          .filter(ASTNumericLiteral::isIntLiteral)
          .filterMatching(ASTNumericLiteral::getValueAsInt, 0)
          .nonEmpty();                                   // If the stream is non empty here, then all the pipeline matched
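The same idea of testing emptiness instead of checking for null at every level can be tried out with plain java.util.stream, without PMD on the classpath. The sketch below is only an analogy: the Node class and the kind names are hypothetical stand-ins, the tree shape is simplified (the literal is a direct child here), and none of it is the NodeStream API itself. It needs Java 9 or later for Stream.ofNullable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Analogy only: the emptiness-test idiom on a hypothetical node type, using java.util.stream. */
class StreamEmptinessSketch {

    /** Hypothetical minimal tree node; PMD's real Node interface is much richer. */
    static final class Node {
        final String kind;
        final String value;
        final List<Node> children = new ArrayList<>();

        Node(String kind, String value) {
            this.kind = kind;
            this.value = value;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** True if node is a "VariableId" that has a "NumericLiteral" child with the value "0". */
    static boolean isInitializedToZero(Node node) {
        return Stream.ofNullable(node)                         // empty if the node is null
                .filter(n -> n.kind.equals("VariableId"))      // empty if it is not a variable id
                .flatMap(n -> n.children.stream())             // step down to the children
                .filter(c -> c.kind.equals("NumericLiteral"))
                .anyMatch(c -> "0".equals(c.value));           // non-empty means the whole pipeline matched
    }
}
```

Every step that fails to match simply leaves the stream empty, so no intermediate null checks are needed.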
XPath 1.0 and 2.0 have some incompatibilities. The XPath 2.0 specification describes them precisely. Those are however mostly corner cases and XPath rules usually don’t feature any of them.
The incompatibilities that are most relevant to migrating your rules are not caused by the specification, but by the different engines we use to run XPath 1.0 and 2.0 queries. Here’s a list of known incompatibilities:
The namespace prefixes fn: and string: should not be mentioned explicitly. In XPath 2.0 mode, the engine will complain about an undeclared namespace, but the functions are in the default namespace. Removing the namespace prefixes fixes it.
fn:substring("Foo", 1) → substring("Foo", 1)
Conversely, calls to custom PMD functions like typeIs must be prefixed with the namespace of the declaring module (pmd-java).
typeIs("Foo") → pmd-java:typeIs("Foo")
Boolean attribute values on our 1.0 engine are represented as the string values "true" and "false". In 2.0 mode though, boolean values are truly represented as boolean values, which in XPath may only be obtained through the functions true() and false(). If your XPath 1.0 rule tests an attribute like @Private="true", then it just needs to be changed to @Private=true() when migrating. A type error will warn you that you must update the comparison. More is explained on issue #1244.
"true", 'true' → true()
"false", 'false' → false()
In XPath 1.0, comparing a number to a string coerces the string to a number. In XPath 2.0, a type error occurs. Like for boolean values, numeric values are represented by our 1.0 implementation as strings, meaning that @BeginLine > "1" worked; that’s not the case in 2.0 mode.
@ArgumentCount > '1' → @ArgumentCount > 1
In XPath 1.0, the expression /Foo matches the children of the root named Foo. In XPath 2.0, that expression matches the root, if it is named Foo. Consider the following tree:
Foo
└─Foo
└─Foo
Then /Foo will match the root in XPath 2.0, and the other nodes (but not the root) in XPath 1.0. See e.g. an issue caused by this in Apex, with nested classes.
The custom function “pmd:matches” which checks a regular expression against a string has been removed, since there is a built-in function available since XPath 2.0 which can be used instead. If you use “pmd:matches”, simply remove the “pmd:” prefix.
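The boolean literal rewrite from the list above is mechanical enough to script for a first pass over many rules. The following is a rough sketch, assuming a hypothetical helper class XPathBooleanLiterals that is not part of PMD. It blindly rewrites every quoted true or false, so it will also change comparisons against attributes that really are strings; review the result by hand.

```java
/** Hypothetical helper (not part of PMD): first-pass rewrite of quoted booleans in XPath 1.0 rules. */
class XPathBooleanLiterals {

    /** Replaces "true", 'true', "false" and 'false' with true() and false(). */
    static String migrate(String xpath) {
        // Group 1 is the opening quote, \1 requires the same closing quote,
        // group 2 is the boolean word that is kept and turned into a function call.
        return xpath.replaceAll("([\"'])(true|false)\\1", "$2()");
    }
}
```

For example, //MethodDeclaration[@Private="true"] becomes //MethodDeclaration[@Private=true()]. Numeric comparisons such as @ArgumentCount > '1' are not touched by this sketch.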
General AST Changes to avoid @Image
An abstract syntax tree should be abstract, but at the same time should not be too abstract. One of the base interfaces for PMD’s AST for all languages is Node, which provides the methods getImage and hasImageEqualTo. However, these methods don’t necessarily make sense for all nodes in all contexts. That’s why getImage() often returns just null. Also, the name is not very descriptive. AST nodes should try to use more specific names, such as getValue() or getName().
For PMD 7, most languages have been adapted. When writing XPath rules, you need to replace @Image with whatever is appropriate now (e.g. @Name). See below for details.
Apex and Visualforce
There are many usages of @Image. These will be refactored after PMD 7 is released by deprecating the attribute and providing alternatives.
There are still many usages of @Image which are not refactored yet. This will be done after PMD 7 is released by deprecating the attribute and providing alternatives.
When using XPathRule, text of text nodes was exposed as @Image of normal element type nodes. Now the attribute is called @Text.
Note: In general, it is recommended to use DomXPathRule instead, which exposes text nodes as real XPath/XML text nodes, which conforms to the XPath spec. There is no difference, text of text nodes can be selected using text().
Java AST
The Java grammar has been refactored substantially in order to make it easier to maintain and more correct regarding the Java Language Specification.
Here you can see the most important changes as a comparison between the PMD 6 AST (“Old AST”) and PMD 7 AST (“New AST”), with some background info about the changes.
When in doubt, it is recommended to use the PMD Designer, which can also display the AST.
What: Annotations are consolidated into a single node. SingleMemberAnnotation, NormalAnnotation and MarkerAnnotation are removed in favour of ASTAnnotation. The Name node is removed, replaced by an ASTClassType.
Why: Those different node types implement a syntax-only distinction that only makes semantically equivalent annotations have different possible representations. For example, @A and @A() are semantically equivalent, yet they were parsed as MarkerAnnotation resp. NormalAnnotation. Similarly, @A("") and @A(value="") were parsed as SingleMemberAnnotation resp. NormalAnnotation. This also makes parsing much simpler. The nested ClassOrInterface type is used to share the disambiguation logic.
What: ASTAnnotations are now nested within the node to which they are applied. E.g. if a method is annotated, the Annotation node is now a child of an ASTModifierList, inside the ASTMethodDeclaration.
Why: Fixes a lot of inconsistencies, where sometimes the annotations were inside the node, and sometimes just somewhere in the parent, with no real structure.
Some syntactic contexts only allow reference types, others allow any kind of type. If you want to match all types of a program, then matching Type would be the intuitive solution. But in 6.0.x, it wouldn’t have sufficed, since in some contexts, no Type node was pushed, only a ReferenceType.
Regardless of the original syntactic context, any reference type is a type, and searching for ASTType should yield all the types in the tree.
Using interfaces allows to abstract behaviour and make a nicer and safer API.
Migrating
There is currently no way to match abstract types (or interfaces) with XPath, so Type and ReferenceType name tests won’t match anything anymore.
Note that in most cases you should check the type of a variable with e.g. VariableId[pmd-java:typeIs("java.lang.String[]")] because it considers the additional dimensions on declarations like String foo[];. The Java equivalent is TypeHelper.isA(id, String[].class);
Type and ReferenceType Examples
Code
Old AST (PMD 6)
New AST (PMD 7)
// in the context of a variable declaration
List<String> strs;
ASTTypeArgument is removed. Instead, the ASTTypeArguments node contains directly a sequence of ASTType nodes. To support this, the new node type ASTWildcardType captures the syntax previously parsed as a TypeArgument.
The ASTWildcardBounds node is removed. Instead, the bound is a direct child of the WildcardType.
Why: Because wildcard types are types in their own right, and having a node to represent them skims several levels of nesting off.
What: Remove the Name node in imports and package declaration nodes.
Why: Name is a TypeNode, but it’s equivalent to ASTAmbiguousName in that it describes nothing about what it represents. The name in an import may represent a method name, a type name, a field name… It’s too ambiguous to treat in the parser and could just be the image of the import, or package, or module.
What: ModifierOwner (formerly AccessNode) is now based on a node: ASTModifierList. That node represents modifiers occurring before a declaration. It provides a flexible API to query modifiers, both explicit and implicit. All declaration nodes now have such a modifier list, even if it’s implicit (no explicit modifiers).
Why: ModifierOwner (formerly AccessNode) gave a lot of irrelevant methods to its subtypes. E.g. ASTFieldDeclaration::isSynchronized makes no sense. Now, these irrelevant methods don’t clutter the API. The API of ModifierList is both more general and flexible.
What: Removes the generic Name node and uses instead ASTClassType where appropriate. Also uses specific node types for different directives (requires, exports, uses, provides).
What: Simplify and align the grammar used for method and constructor declarations. The methods in an annotationtype are now also method declarations.
Why: The method declaration had a nested node “MethodDeclarator”, which was not available for constructor declarations. This made it difficult to write rules that concern both methods and constructors without explicitly differentiating between the two.
What: A separate node typeASTReceiverParameter is introduced to differentiate it from formal parameters.
Why: A receiver parameter is not a formal parameter, even though it looks like one: it doesn’t declare a variable, and doesn’t affect the arity of the method or constructor. It’s so rarely used that giving it its own node avoids matching it by mistake and simplifies the API and grammar of the ubiquitous ASTFormalParameter and ASTVariableId.
What: Statements are flattened. There are no superfluous BlockStatement and Statement nodes anymore. All children of an ASTBlock are by definition ASTStatements, which is now an interface implemented by all statements.
Why: This simplifies the tree traversal. The removed nodes BlockStatement and Statement didn’t add any additional information. We only need a Statement abstraction. BlockStatement was used to enforce that no variable or local class declaration is found alone as the child of e.g. an unbraced if, else, for, etc. This is a parser-only distinction that’s not that useful for analysis later on.
What: New node for For-each statements:ASTForeachStatement instead of ForStatement.
Why: This makes it a lot easier to distinguish in the AST between For-loops and For-Each-loops. E.g. some rules only apply to one or the other, and it was complicated to write a rule that works with both different subtrees (for loops have additional children ForInit and ForUpdate).
Why: ExpressionStatement is now an ASTStatement that can be used as a child in a block. It itself has only one child, which is some kind of ASTExpression, which can be really any kind of expression (like assignment). In order to allow local class declarations as part of a block, we introduced ASTLocalClassStatement, which is a statement that carries a type declaration. Now blocks are just a list of statements. This allows us to have two distinct hierarchies for expressions and statements.
What: The AST representation of a try-with-resources statement has been simplified. It now uses ASTLocalVariableDeclaration unless it is a concise try-with-resources.
Why: Simpler integration of try-with-resources into the symbol table and type resolution.
ASTExpression and ASTPrimaryExpression have been turned into interfaces. These added no information to the AST and increased its depth unnecessarily. All expressions implement the first interface. Neither of those nodes can be found in ASTs anymore.
Migrating:
Basically, Expression/X or Expression/PrimaryExpression/X just becomes X.
There is currently no way to match abstract or interface types with XPath, so Expression or PrimaryExpression name tests won’t match anything anymore. However, the axis step *[@Expression=true()] matches any expression.
Why: The fact that ASTNullLiteral and ASTBooleanLiteral were nested within it but other literal types were all directly represented by it was inconsistent, and ultimately that level of nesting was unnecessary.
Why: It was extremely difficult to identify method calls in PMD 6 - these consisted of multiple nodes with primary prefix, suffix and expressions. This was too low level to be easy to use.
This makes the AST more regular and easier to navigate. Each node contains the other nodes that are relevant to it (e.g. arguments) instead of them being spread out over several siblings. The API of all nodes has been enriched with high-level accessors to query the AST in a semantic way, without bothering with the placement details.
The amount of changes in the grammar that this change entails is enormous, but hopefully firing up the designer to inspect the new structure should give you the information you need quickly.
Note: this also affects binary expressions like ASTInfixExpression. E.g. a+b+c is no longer parsed as
AdditiveExpression
 + (a)
 + (b)
 + (c)
But it is now (note: AdditiveExpression is now InfixExpression)
InfixExpression
 + InfixExpression
   + (a)
   + (b)
 + (c)
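The nested shape shown above is what a left-recursive parse of a chain of equal operators produces. The following sketch builds such a tree from a flat operand list to make the structure concrete. All names in it (InfixTreeSketch, Expr, Operand, Infix) are hypothetical stand-ins and not PMD's node classes.

```java
/** Illustration only: left-associative nesting of a flat operand list, not PMD's AST classes. */
class InfixTreeSketch {

    /** Hypothetical stand-in for an expression node. */
    interface Expr { }

    static final class Operand implements Expr {
        final String name;

        Operand(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    static final class Infix implements Expr {
        final Expr lhs;
        final String operator;
        final Expr rhs;

        Infix(Expr lhs, String operator, Expr rhs) {
            this.lhs = lhs;
            this.operator = operator;
            this.rhs = rhs;
        }

        @Override
        public String toString() {
            return "(" + lhs + " " + operator + " " + rhs + ")";
        }
    }

    /** Folds the operands from the left, so a+b+c becomes Infix(Infix(a, b), c). Needs at least one operand. */
    static Expr leftAssociate(String operator, String... operands) {
        Expr result = new Operand(operands[0]);
        for (int i = 1; i < operands.length; i++) {
            result = new Infix(result, operator, new Operand(operands[i]));
        }
        return result;
    }
}
```

Printing leftAssociate("+", "a", "b", "c") gives ((a + b) + c): the outer node's left operand is itself an infix node, which is the nesting a rule now has to walk instead of iterating over a flat list of children.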
Field access, array access, variable access
What: New nodes dedicated to accessing fields, variables and referencing arrays. Also provide info about the access type, like whether a variable is read or written.
What: Merge AST nodes for postfix and prefix expressions into the single ASTUnaryExpression node. The merged nodes are:
PreIncrementExpression
PreDecrementExpression
UnaryExpression
UnaryExpressionNotPlusMinus
Why: Those nodes were asymmetric, and inconsistently nested within UnaryExpression. By definition, they’re all unary, so using a single node is appropriate.
What: For each operator, there were separate AST nodes (like AdditiveExpression, AndExpression, …). These are now unified into an InfixExpression, which gives access to the operator via getOperator() and to the operands (getLhs(), getRhs()). Additionally, the resulting AST is not flat anymore, but a more structured tree.
Why: Having different AST node types doesn’t add information that the operator doesn’t already provide. The new structure, a result of expressions now being parsed left-recursively, makes the AST more JLS-like. This makes it easier for the type mapping algorithms. It also provides the information which operands are used with which operator. This information was lost in PMD 6 if more than 2 operands were used and the tree was flattened.
What: Parentheses are not modelled in the AST anymore, but can be checked with the attributes @Parenthesized and @ParenthesisDepth.
Why: This keeps the tree flat while still preserving the information. The tree is the same in case of unnecessary parentheses, which makes it harder to fool rules that look at the structure of the tree.
No attribute @Synthetic anymore. Unlike Jorje, Summit AST doesn’t generate synthetic methods anymore, so this attribute would have been always false and is of no use. Therefore it has been removed completely.
There will be no methods anymore with the name <clinit>, <init>.
There is no node BridgeMethodCreator anymore. This was an artificially generated node by Jorje. Since the new parser doesn’t generate synthetic methods anymore, this node is not needed anymore.
There is in general no attribute @Namespace anymore. The attribute has been removed, as it was never fully implemented. It always returned an empty string.
No attribute @Context anymore. It was not used and always returned null.
Language versions
Since all languages now have defined language versions, you could now write rules that apply only for specific versions (using minimumLanguageVersion and maximumLanguageVersion).
All languages have a default version. If no specific version is given on the CLI using --use-version, then this default version will be used. Usually the latest version is the default version.
The available versions for each language can be seen in the help message of the CLI: pmd check --help.
CPD Module discovery change. The service loader won’t load src/main/resources/META-INF/services/net.sourceforge.pmd.cpd.Language anymore but instead src/main/resources/META-INF/services/net.sourceforge.pmd.lang.Language. This is the unified language interface for both PMD and CPD capable languages. See also the subinterfaces CpdCapableLanguage and PmdCapableLanguage.
When you switch from PMD 6.x to PMD 7 in your build tools, you most likely need to review your ruleset(s) as well and check for removed rules. See the use case I’m using only built-in rules above.
Ant
The Ant tasks PMDTask and CPDTask have been moved from the module pmd-core into the new module pmd-ant.
You need to add this dependency/jar file onto the class path (net.sourceforge.pmd:pmd-ant) in order to import the tasks into your build file.
When using the guide Ant Task Usage, no change is needed, since the pmd-ant jar file is included in the binary distribution of PMD. It is part of PMD’s lib folder.
Maven
Since maven-pmd-plugin 3.22.0, PMD 7 is supported directly.