microsoft/tolerant-php-parser


An early-stage PHP parser designed for IDE usage scenarios.

License

MIT license


Tolerant PHP Parser

This is an early-stage PHP parser designed, from the beginning, for IDE usage scenarios (see Design Goals for more details). There is still a ton of work to be done, so at this point, this repo mostly serves as an experiment and the start of a conversation.

This is the v0.1 branch, which changes data structures to support syntax added after the initial 0.0.x release line.

Get Started

After you've configured your machine, you can use the parser to generate and work with the Abstract Syntax Tree (AST) via a friendly API.

```php
<?php
// Autoload required classes
require __DIR__ . "/vendor/autoload.php";

use Microsoft\PhpParser\{DiagnosticsProvider, Node, Parser, PositionUtilities};

// Instantiate new parser instance
$parser = new Parser();

// Return and print an AST from string contents
$astNode = $parser->parseSourceFile('<?php /* comment */ echo "hi!"');
var_dump($astNode);

// Gets and prints errors from AST Node. The parser handles errors gracefully,
// so it can be used in IDE usage scenarios (where code is often incomplete).
$errors = DiagnosticsProvider::getDiagnostics($astNode);
var_dump($errors);

// Traverse all Node descendants of $astNode
foreach ($astNode->getDescendantNodes() as $descendant) {
    if ($descendant instanceof Node\StringLiteral) {
        // Print the Node text (without whitespace or comments)
        var_dump($descendant->getText());

        // All Nodes link back to their parents, so it's easy to navigate the tree.
        $grandParent = $descendant->getParent()->getParent();
        var_dump($grandParent->getNodeKindName());

        // The AST is fully-representative, and round-trippable to the original source.
        // This enables consumers to build reliable formatting and refactoring tools.
        var_dump($grandParent->getLeadingCommentAndWhitespaceText());
    }

    // In addition to retrieving all children or descendants of a Node,
    // Nodes expose properties specific to the Node type.
    if ($descendant instanceof Node\Expression\EchoExpression) {
        $echoKeywordStartPosition = $descendant->echoKeyword->getStartPosition();
        // To cut down on memory consumption, positions are represented as a single integer
        // index into the document, but their line and character positions are easily retrieved.
        $lineCharacterPosition = PositionUtilities::getLineCharacterPositionFromPosition(
            $echoKeywordStartPosition,
            $descendant->getFileContents()
        );
        echo "line: $lineCharacterPosition->line, character: $lineCharacterPosition->character";
    }
}
```
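The example's comments note that positions are stored as a single integer index into the document and converted to line and character on demand. As a rough illustration of that idea only, here is a standalone sketch that needs no Composer setup. The function `offsetToLineCharacter` is hypothetical and is not the library's `PositionUtilities` implementation; it assumes 0-based lines and characters and counts bytes rather than Unicode code points.

```php
<?php
// Hypothetical helper, NOT part of Microsoft\PhpParser: converts a 0-based
// byte offset into a 0-based line/character pair by counting "\n" bytes.
function offsetToLineCharacter(int $offset, string $text): array
{
    // Clamp the offset so out-of-range values still yield a valid position.
    $offset = max(0, min($offset, strlen($text)));
    $before = substr($text, 0, $offset);

    // The line number is the count of newlines that precede the offset.
    $line = substr_count($before, "\n");

    // The character is the distance from the last newline (or document start).
    $lastNewline = strrpos($before, "\n");
    $character = $lastNewline === false ? $offset : $offset - $lastNewline - 1;

    return ['line' => $line, 'character' => $character];
}

$source = "<?php\n/* comment */ echo \"hi!\";\n";
$echoOffset = strpos($source, 'echo'); // a single integer, cheap to store per token
$position = offsetToLineCharacter($echoOffset, $source);
echo "line: {$position['line']}, character: {$position['character']}\n";
```

Storing one integer per position and deriving line and character lazily keeps per-node memory small. The trade-off is that each conversion rescans the text up to the offset.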

Note: the API is not yet finalized, so please file issues to let us know what functionality you want exposed, and we'll see what we can do! Also please file any bugs with unexpected behavior in the parse tree. We're still in our early stages, and any feedback you have is much appreciated 😃.

Design Goals

- Error tolerant design - in IDE scenarios, code is, by definition, incomplete. In the case that invalid code is entered, the parser should still be able to recover and produce a valid + complete tree, as well as relevant diagnostics.
- Fast and lightweight (should be able to parse several MB of source code per second, to leave room for other features).
  - Memory-efficient data structures
  - Allow for incremental parsing in the future
- Adheres to PHP language spec, supports both PHP5 and PHP7 grammars
- Generated AST provides properties (fully representative, etc.) necessary for semantic and transformational operations, which also need to be performant.
  - Fully representative and round-trippable back to the text it was parsed from (all whitespace and comment "trivia" are included in the parse tree)
  - Possible to easily traverse the tree through parent/child nodes
  - < 100 ms UI response time, so each language server operation should be < 50 ms to leave room for all the other stuff going on in parallel.
- Simple and maintainable over time - parsers have a tendency to get really confusing, really fast, so readability and debug-ability are high priority.
- Testable - the parser should produce provably valid parse trees. We achieve this by defining and continuously testing a set of invariants about the tree.
- Friendly and descriptive API to make it easy for others to build on.
- Written in PHP - make it as easy as possible for the PHP community to consume and contribute.
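To make the "fully representative and round-trippable" goal more concrete, here is a minimal standalone sketch of tokens that carry their leading whitespace and comment trivia. `SketchToken` and `sketchTokenize` are hypothetical names invented for this illustration, and the field layout is an assumption rather than the parser's actual data structure. The tokenizer understands no PHP grammar; it only splits on whitespace and `/* ... */` comments.

```php
<?php
// Hypothetical token shape for illustration only; field names are assumptions.
final class SketchToken
{
    public $fullStart; // offset where the leading trivia begins
    public $start;     // offset where the token text itself begins
    public $length;    // length from $fullStart to the end of the token text

    public function __construct(int $fullStart, int $start, int $length)
    {
        $this->fullStart = $fullStart;
        $this->start = $start;
        $this->length = $length;
    }

    // Token text without leading whitespace or comments.
    public function getText(string $source): string
    {
        return substr($source, $this->start, $this->length - ($this->start - $this->fullStart));
    }

    // Token text including its leading trivia.
    public function getFullText(string $source): string
    {
        return substr($source, $this->fullStart, $this->length);
    }
}

// Splits $source into tokens; every byte lands in exactly one token.
function sketchTokenize(string $source): array
{
    $tokens = [];
    $offset = 0;
    $end = strlen($source);
    while ($offset < $end) {
        // Leading trivia: any run of whitespace and /* ... */ comments.
        preg_match('~\G(?:\s+|/\*.*?\*/)*~s', $source, $trivia, 0, $offset);
        $start = $offset + strlen($trivia[0]);
        // Token text: a run of non-whitespace bytes (empty at end of input).
        preg_match('~\G\S*~', $source, $text, 0, $start);
        $length = ($start - $offset) + strlen($text[0]);
        $tokens[] = new SketchToken($offset, $start, $length);
        $offset += $length;
    }
    return $tokens;
}

$source = "<?php /* comment */ echo \"hi!\";\n";
$rebuilt = '';
foreach (sketchTokenize($source) as $token) {
    $rebuilt .= $token->getFullText($source);
}
var_dump($rebuilt === $source); // the concatenated full texts reproduce the input
```

Because the tokens store only offsets into the original text, nothing is copied or discarded, which is what makes the round trip exact in this sketch.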

Current Status and Approach

To ensure a sufficient level of correctness at every step of the way, the parser is being developed using the following incremental approach:

- Phase 1: Write lexer that does not support PHP grammar, but supports EOF and Unknown tokens. Write tests for all invariants.
- Phase 2: Support PHP lexical grammar, lots of tests
- Phase 3: Write a parser that does not support PHP grammar, but produces tree of Error Nodes. Write tests for all invariants.
- Phase 4: Support PHP syntactic grammar, lots of tests
- Phase 5 (in progress 🏃): Real-world validation and optimization
  - Correctness: validate that there are no errors produced on sample codebases, benchmark against other parsers (investigate any instance of disagreement), fuzz-testing
  - Performance: profile, benchmark against large PHP applications
- Phase 6: Finalize API to make it as easy as possible for people to consume.
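As a rough picture of what Phase 1 describes, the sketch below lexes arbitrary input into `Unknown` tokens followed by a single `EOF` token, then checks a few invariants. The functions `sketchLex` and `checkInvariants` are hypothetical, and the invariants shown (contiguous tokens, lengths summing to the document length, a trailing `EOF`) are assumed examples of the kind of property one might test, not the project's definitive list.

```php
<?php
// Hypothetical Phase 1-style lexer: it knows no PHP grammar, so every run of
// non-whitespace bytes becomes an "Unknown" token and the stream ends in "EOF".
function sketchLex(string $source): array
{
    $tokens = [];
    $offset = 0;
    $end = strlen($source);
    while (true) {
        $fullStart = $offset;
        $offset += strspn($source, " \t\r\n", $offset); // leading whitespace trivia
        $start = $offset;
        if ($offset >= $end) {
            // Trailing whitespace, if any, is attached to the EOF token.
            $tokens[] = ['kind' => 'EOF', 'fullStart' => $fullStart, 'start' => $start, 'length' => $start - $fullStart];
            return $tokens;
        }
        $offset += strcspn($source, " \t\r\n", $offset); // token text
        $tokens[] = ['kind' => 'Unknown', 'fullStart' => $fullStart, 'start' => $start, 'length' => $offset - $fullStart];
    }
}

// Returns a list of violated invariants; an empty list means all checks passed.
function checkInvariants(array $tokens, string $source): array
{
    $violations = [];
    $expectedFullStart = 0;
    foreach ($tokens as $i => $token) {
        if ($token['fullStart'] !== $expectedFullStart) {
            $violations[] = "token $i does not begin where the previous token ended";
        }
        if ($token['start'] < $token['fullStart']) {
            $violations[] = "token $i starts before its own leading trivia";
        }
        $expectedFullStart = $token['fullStart'] + $token['length'];
    }
    if ($expectedFullStart !== strlen($source)) {
        $violations[] = 'token lengths do not sum to the document length';
    }
    $last = end($tokens);
    if ($last === false || $last['kind'] !== 'EOF') {
        $violations[] = 'the last token is not EOF';
    }
    return $violations;
}

// The invariants should hold for any input, including empty or garbage text.
foreach (['', "  \n", '<?php echo "hi!"', "\x00\xFF not php at all "] as $input) {
    var_dump(checkInvariants(sketchLex($input), $input) === []);
}
```

Checking invariants like these on arbitrary input before any grammar exists is what lets each later phase add behavior without silently breaking tree validity.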

Additional notes

A few of the PHP grammatical constructs (namely yield-expression, and template strings) are not yet supported and there are also other miscellaneous bugs. However, because the parser is error-tolerant, these errors are handled gracefully, and the resulting tree is otherwise complete. To get a more holistic sense for where we are, you can run the "validation" test suite (see Contributing Guidelines for more info on running tests). Or simply, take a look at the current validation test results.

Even though we haven't yet begun the performance optimization stage, we have seen promising results so far, and have plenty more room for improvement. See How It Works for details on our current approach, and run the Performance Tests on your own machine to see for yourself.

Learn more

🎯 Design Goals - learn about the design goals of the project (features, performance metrics, and more).

📖 Documentation - learn how to reference the parser from your project, and how to perform operations on the AST to answer questions about your code.

👀 Syntax Visualizer Tool - get a more tangible feel for the AST. Get creative - see if you can break it!

📈 Current Status and Approach - how much of the grammar is supported? Performance? Memory? API stability?

🔧 How it works - learn about the architecture, design decisions, and tradeoffs.

💖 Contribute! - learn how to get involved, check out some pointers to educational commits that'll help you ramp up on the codebase (even if you've never worked on a parser before), and recommended workflows that make it easier to iterate.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
