Fast lexer to extract named exports via analysis from CommonJS modules
nodejs/cjs-module-lexer
A very fast JS CommonJS module syntax lexer used to detect the most likely list of named exports of a CommonJS module.
Outputs the list of named exports (exports.name = ...) and possible module reexports (module.exports = require('...')), including the common transpiler variations of these cases.
Forked from https://github.com/guybedford/es-module-lexer.
Comprehensively handles the JS language grammar while remaining small and fast: ~90ms per MB of JS cold and ~15ms per MB of JS warm; see benchmarks for more info.
This project is used in Node.js core for detecting the named exports available when importing a CJS module into ESM, and is maintained for this purpose.
PRs will be accepted and upstreamed for parser bugs, performance improvements or new syntax support only.
Detection patterns for this project are frozen. This is because adding any new export detection patterns would result in fragmented backwards-compatibility. Specifically, it would be very difficult to figure out why an ES module named export for CommonJS might work in newer Node.js versions but not older versions. This problem would only be discovered downstream of module authors, and the fix would require module authors to understand which patterns in this project provide full backwards-compatibility. Rather, by fully freezing the detected patterns, if it works in any Node.js version it will work in any other. Build tools can also reliably treat the supported syntax for this project as a part of their output target for ensuring syntax support.
    npm install cjs-module-lexer

For use in CommonJS:

    const { parse } = require('cjs-module-lexer');

    // `init` returns a promise for parity with the ESM API, but you do not have to call it

    const { exports, reexports } = parse(`
      // named exports detection
      module.exports.a = 'a';
      (function () {
        exports.b = 'b';
      })();
      Object.defineProperty(exports, 'c', { value: 'c' });
      /* exports.d = 'not detected'; */

      // reexports detection
      if (maybe) module.exports = require('./dep1.js');
      if (another) module.exports = require('./dep2.js');

      // literal exports assignments
      module.exports = { a, b: c, d, 'e': f }

      // __esModule detection
      Object.defineProperty(module.exports, '__esModule', { value: true })
    `);

    // exports === ['a', 'b', 'c', '__esModule']
    // reexports === ['./dep1.js', './dep2.js']
When using the ESM version, Wasm is supported instead:
    import { parse, init } from 'cjs-module-lexer';
    // init() needs to be called and waited upon, or use initSync() to compile
    // Wasm blockingly and synchronously.
    await init();
    const { exports, reexports } = parse(source);
The Wasm build is around 1.5x faster and without a cold start.
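The "around 1.5x" figure can be sanity-checked against the totals listed in the benchmark results of this README. The sketch below is only arithmetic on those published numbers; it assumes 1 MB means 1024 KiB, which the README does not state, and the variable names are illustrative.

```javascript
// Totals copied from the "Current results" benchmark listing in this README.
const totalKiB = 3635;                      // test/samples/*.js
const js = { coldMs: 299, warmMs: 40.28 };  // JS build
const wasm = { coldMs: 43, warmMs: 27.76 }; // Wasm build

// Assumption: 1 MB is taken as 1024 KiB.
const totalMiB = totalKiB / 1024;
const perMiB = (ms) => Number((ms / totalMiB).toFixed(1));

const summary = {
  jsColdMsPerMiB: perMiB(js.coldMs),     // 84.2
  jsWarmMsPerMiB: perMiB(js.warmMs),     // 11.3
  wasmColdMsPerMiB: perMiB(wasm.coldMs), // 12.1
  wasmWarmMsPerMiB: perMiB(wasm.warmMs), // 7.8
  warmSpeedup: Number((js.warmMs / wasm.warmMs).toFixed(2)), // 1.45
};

console.log(summary);
```

On these numbers the warm Wasm run is roughly 1.45x faster than the warm JS run, and the cold Wasm run is close to the warm JS run, which is in line with the statement above.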
CommonJS exports matches are run against the source token stream.
The token grammar is:
    IDENTIFIER: As defined by ECMA-262, without support for identifier `\` escapes, filtered to remove strict reserved words:
                "implements", "interface", "let", "package", "private", "protected", "public", "static", "yield", "enum"

    STRING_LITERAL: A `"` or `'` bounded ECMA-262 string literal.

    MODULE_EXPORTS: `module` `.` `exports`

    EXPORTS_IDENTIFIER: MODULE_EXPORTS | `exports`

    EXPORTS_DOT_ASSIGN: EXPORTS_IDENTIFIER `.` IDENTIFIER `=`

    EXPORTS_LITERAL_COMPUTED_ASSIGN: EXPORTS_IDENTIFIER `[` STRING_LITERAL `]` `=`

    EXPORTS_LITERAL_PROP: (IDENTIFIER (`:` IDENTIFIER)?) | (STRING_LITERAL `:` IDENTIFIER)

    EXPORTS_SPREAD: `...` (IDENTIFIER | REQUIRE)

    EXPORTS_MEMBER: EXPORTS_DOT_ASSIGN | EXPORTS_LITERAL_COMPUTED_ASSIGN

    EXPORTS_DEFINE: `Object` `.` `defineProperty` `(` EXPORTS_IDENTIFIER `,` STRING_LITERAL

    EXPORTS_DEFINE_VALUE: EXPORTS_DEFINE `, {`
      (`enumerable: true,`)?
        (
          `value:` |
          `get` (`: function` IDENTIFIER? )? `() {` `return` IDENTIFIER (`.` IDENTIFIER | `[` STRING_LITERAL `]`)? `;`? `}` `,`?
        )
      `})`

    EXPORTS_LITERAL: MODULE_EXPORTS `=` `{` ((EXPORTS_LITERAL_PROP | EXPORTS_SPREAD) `,`)+ `}`

    REQUIRE: `require` `(` STRING_LITERAL `)`

    EXPORTS_ASSIGN: (`var` | `const` | `let`) IDENTIFIER `=` (`_interopRequireWildcard (`)? REQUIRE

    MODULE_EXPORTS_ASSIGN: MODULE_EXPORTS `=` REQUIRE

    EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE

    EXPORT_STAR_LIB: `Object.keys(` IDENTIFIER$1 `).forEach(function (` IDENTIFIER$2 `) {`
      (
        (
          `if (` IDENTIFIER$2 `===` ( `'default'` | `"default"` ) `||` IDENTIFIER$2 `===` ( `'__esModule'` | `"__esModule"` ) `) return` `;`?
          (
            (`if (Object` `.prototype`? `.hasOwnProperty.call(` IDENTIFIER `, ` IDENTIFIER$2 `)) return` `;`?)?
            (`if (` IDENTIFIER$2 `in` EXPORTS_IDENTIFIER `&&` EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] ===` IDENTIFIER$1 `[` IDENTIFIER$2 `]) return` `;`)?
          )?
        ) |
        `if (` IDENTIFIER$2 `!==` ( `'default'` | `"default"` ) (`&& !` (`Object` `.prototype`? `.hasOwnProperty.call(` IDENTIFIER `, ` IDENTIFIER$2 `)` | IDENTIFIER `.hasOwnProperty(` IDENTIFIER$2 `)`))? `)`
      )
      (
        EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] =` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? |
        `Object.defineProperty(` EXPORTS_IDENTIFIER `, ` IDENTIFIER$2 `, { enumerable: true, get` (`: function` IDENTIFIER? )? `() { return ` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? `}` `,`? `})` `;`?
      )
      `})`

Spacing between tokens is taken to be any ECMA-262 whitespace, ECMA-262 block comment or ECMA-262 line comment.
- The returned export names are taken to be the combination of:
  - All IDENTIFIER and STRING_LITERAL slots for EXPORTS_MEMBER and EXPORTS_LITERAL matches.
  - The first STRING_LITERAL slot for all EXPORTS_DEFINE_VALUE matches where that same string is not an EXPORTS_DEFINE match that is not also an EXPORTS_DEFINE_VALUE match.
- The reexport specifiers are taken to be the combination of:
  - The REQUIRE matches of the last matched of either MODULE_EXPORTS_ASSIGN or EXPORTS_LITERAL.
  - All top-level EXPORT_STAR REQUIRE matches and EXPORTS_ASSIGN matches whose IDENTIFIER also matches the first IDENTIFIER in EXPORT_STAR_LIB.
The basic matching rules for named exports are exports.name, exports['name'] or Object.defineProperty(exports, 'name', ...). This matching is done without scope analysis and regardless of the expression position:

    // DETECTS EXPORTS: a, b
    (function (exports) {
      exports.a = 'a';
      exports['b'] = 'b';
    })(exports);
Because there is no scope analysis, the above detection may overclassify:
    // DETECTS EXPORTS: a, b, c
    (function (exports, Object) {
      exports.a = 'a';
      exports['b'] = 'b';
      if (false)
        exports.c = 'c';
    })(NOT_EXPORTS, NOT_OBJECT);
It will in turn underclassify in cases where the identifiers are renamed:
    // DETECTS: NO EXPORTS
    (function (e) {
      e.a = 'a';
      e['b'] = 'b';
    })(exports);
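To make the "no scope analysis" point concrete, here is a minimal regex-based sketch of the member-assignment rules. It is not the lexer's implementation: the real lexer walks a token stream and so skips comments and strings, which this toy does not, and the function name detectMemberExports is hypothetical.

```javascript
// Toy approximation of EXPORTS_DOT_ASSIGN and EXPORTS_LITERAL_COMPUTED_ASSIGN.
// It matches purely on the spelling `exports` / `module.exports`, with no
// knowledge of which binding the name refers to.
function detectMemberExports(source) {
  const names = new Set();
  // exports.name =   or   module.exports.name =   (but not == or ===)
  const dot = /\b(?:module\s*\.\s*)?exports\s*\.\s*([A-Za-z_$][\w$]*)\s*=(?!=)/g;
  // exports['name'] =   or   module.exports["name"] =
  const computed = /\b(?:module\s*\.\s*)?exports\s*\[\s*(['"])((?:(?!\1).)*)\1\s*\]\s*=(?!=)/g;
  for (const m of source.matchAll(dot)) names.add(m[1]);
  for (const m of source.matchAll(computed)) names.add(m[2]);
  return [...names];
}

// The parameter happens to be called `exports`, so both names are found.
console.log(detectMemberExports(
  "(function (exports) { exports.a = 'a'; exports['b'] = 'b'; })(exports);"
)); // -> ['a', 'b']

// The same code with the parameter renamed to `e` yields nothing.
console.log(detectMemberExports(
  "(function (e) { e.a = 'a'; e['b'] = 'b'; })(exports);"
)); // -> []
```

Because the match is textual, the sketch overclassifies and underclassifies in the same way as the examples above.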
Object.defineProperty is detected for specifically value and getter forms returning an identifier or member expression:
    // DETECTS: a, b, c, d, __esModule
    Object.defineProperty(exports, 'a', {
      enumerable: true,
      get: function () {
        return q.p;
      }
    });
    Object.defineProperty(exports, 'b', {
      enumerable: true,
      get: function () {
        return q['p'];
      }
    });
    Object.defineProperty(exports, 'c', {
      enumerable: true,
      get () {
        return b;
      }
    });
    Object.defineProperty(exports, 'd', { value: 'd' });
    Object.defineProperty(exports, '__esModule', { value: true });
Value properties are also detected specifically:
    Object.defineProperty(exports, 'a', { value: 'no problem' });
To avoid matching getters that have side effects, any getter for an export name that does not use one of the forms above will opt out of the getter matching:

    // DETECTS: NO EXPORTS
    Object.defineProperty(exports, 'a', {
      get () {
        return 'nope';
      }
    });

    if (false) {
      Object.defineProperty(module.exports, 'a', {
        get () {
          return dynamic();
        }
      })
    }
Alternative object definition structures or getter function bodies are not detected:
    // DETECTS: NO EXPORTS
    Object.defineProperty(exports, 'a', {
      enumerable: false,
      get () {
        return p;
      }
    });
    Object.defineProperty(exports, 'b', {
      configurable: true,
      get () {
        return p;
      }
    });
    Object.defineProperty(exports, 'c', {
      get: () => p
    });
    Object.defineProperty(exports, 'd', {
      enumerable: true,
      get: function () {
        return dynamic();
      }
    });
    Object.defineProperty(exports, 'e', {
      enumerable: true,
      get () {
        return 'str';
      }
    });
Object.defineProperties is also not supported.
A best effort is made to detect module.exports object assignments, but because this is not a full parser, arbitrary expressions are not handled in the object parsing process.
Simple object definitions are supported:
    // DETECTS EXPORTS: a, b, c
    module.exports = { a, 'b': b, c: c, ...d };
Object properties that are not identifiers or string expressions will bail out of the object detection, while spreads are ignored:
    // DETECTS EXPORTS: a, b
    module.exports = {
      a,
      ...d,
      b: require('c'),
      c: "not detected since require('c') above bails the object detection"
    }
Object.defineProperties is not currently supported either.
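The bail-out behaviour for object literals can be sketched as a small left-to-right scanner. This is a toy written to reproduce the two documented examples above, not the lexer's code: the function name literalExportNames is hypothetical, it takes only the text between the braces, it ignores comments, and it treats spreads of plain identifiers only.

```javascript
// Toy scanner over the body of `module.exports = { ... }`.
// A key is recorded when it is shorthand (`a`), `key: identifier`, or
// `'key': identifier`. Scanning stops at the first thing that does not fit.
function literalExportNames(body) {
  const IDENT = "[A-Za-z_$][\\w$]*";
  const STRING = "'[^']*'|\"[^\"]*\"";
  const prop = new RegExp(
    "\\s*(?:" +
      "\\.\\.\\.\\s*" + IDENT +                 // ...spread (ignored)
      "|(" + IDENT + ")(?![\\w$])(?:\\s*:\\s*" + IDENT + "|(?!\\s*:))" + // a, a: b
      "|(" + STRING + ")\\s*:\\s*" + IDENT +    // 'a': b
    ")\\s*",
    "y" // sticky: only match exactly at lastIndex
  );
  const names = [];
  let pos = 0;
  while (pos < body.length) {
    prop.lastIndex = pos;
    const m = prop.exec(body);
    if (!m) break;                      // unsupported property: bail out
    if (m[1]) names.push(m[1]);
    else if (m[2]) names.push(m[2].slice(1, -1));
    pos = prop.lastIndex;
    if (body[pos] !== ",") break;       // e.g. the `(` of require('c'): bail out
    pos++;
  }
  return names;
}

console.log(literalExportNames(" a, 'b': b, c: c, ...d "));
// -> ['a', 'b', 'c']
console.log(literalExportNames(" a, ...d, b: require('c'), c: \"not detected\" "));
// -> ['a', 'b']
```

In the second call the key b is recorded because its value starts with an identifier, and scanning then stops at the opening parenthesis, so c is never reached.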
Any module.exports = require('mod') assignment is detected as a reexport, but only the last one is returned:

    // DETECTS REEXPORTS: c
    module.exports = require('a');
    (module => module.exports = require('b'))(NOT_MODULE);
    if (false) module.exports = require('c');
This is to avoid over-classification in Webpack bundles with externals which include module.exports = require('external') in their source for every external dependency.
In exports object assignment, any spreads of require() are detected as multiple separate reexports:
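The "last one wins" rule can be illustrated with a short sketch. As with the grammar itself there is no scope or control-flow analysis here; the regex and the function name lastModuleExportsRequire are illustrative assumptions rather than the lexer's code, and comments and strings are not skipped.

```javascript
// Toy approximation of MODULE_EXPORTS_ASSIGN: every textual occurrence of
// `module.exports = require('...')` overwrites the previous one.
function lastModuleExportsRequire(source) {
  const re = /\bmodule\s*\.\s*exports\s*=\s*require\s*\(\s*(['"])((?:(?!\1).)*)\1\s*\)/g;
  let last = null;
  for (const m of source.matchAll(re)) last = m[2];
  return last === null ? [] : [last];
}

const bundleLike = [
  "module.exports = require('a');",
  "(module => module.exports = require('b'))(NOT_MODULE);",
  "if (false) module.exports = require('c');",
].join("\n");

console.log(lastModuleExportsRequire(bundleLike)); // -> ['c']
```

A bundle that contains one such assignment per external therefore reports a single reexport instead of one per external.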
    // DETECTS REEXPORTS: a, b
    module.exports = require('ignored');
    module.exports = {
      ...require('a'),
      ...require('b')
    };
For named exports, transpiler output works well with the rules described above.
But for star re-exports, special care is taken to support common patterns of transpiler outputs from Babel and TypeScript as well as bundlers like RollupJS. These reexport and star reexport patterns are restricted to only be detected at the top-level as provided by the direct output of these tools.
For example, export * from 'external' is output by Babel as:

    "use strict";

    exports.__esModule = true;

    var _external = require("external");

    Object.keys(_external).forEach(function (key) {
      if (key === "default" || key === "__esModule") return;
      exports[key] = _external[key];
    });
Where the var _external = require("external") is specifically detected as well as the Object.keys(_external) statement, down to the exact form of that entire expression including minor variations of the output. The _external and key identifiers are carefully matched in this detection.
Similarly for TypeScript, export * from 'external' is output as:

    "use strict";
    function __export(m) {
      for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
    }
    Object.defineProperty(exports, "__esModule", { value: true });
    __export(require("external"));
Where the __export(require("external")) statement is explicitly detected as a reexport, including the variations tslib.__export and __exportStar.
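As a rough illustration of the EXPORT_STAR rule, the sketch below picks out __export(require(...)) and __exportStar(require(...)) calls with a regex. It is an assumption-laden toy: the function name detectExportStar is hypothetical, and unlike the real lexer it does not restrict matches to the top level or skip comments and strings.

```javascript
// Toy approximation of EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE.
// A `tslib.` prefix is covered because the match starts at `__export`.
function detectExportStar(source) {
  const re = /\b(?:__exportStar|__export)\s*\(\s*require\s*\(\s*(['"])((?:(?!\1).)*)\1\s*\)/g;
  return [...source.matchAll(re)].map((m) => m[2]);
}

const tsOutput = `
"use strict";
function __export(m) {
  for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
}
Object.defineProperty(exports, "__esModule", { value: true });
__export(require("external"));
`;

console.log(detectExportStar(tsOutput)); // -> ['external']
console.log(detectExportStar('tslib.__exportStar(require("./a"), exports);')); // -> ['./a']
```

The helper definition function __export(m) is not matched because its argument is not a require call.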
Node.js 10+, and all browsers with WebAssembly support.
- Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
- Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
- Always correctly parses valid JS source, but may parse invalid JS source without errors.
Benchmarks can be run with npm run bench.
Current results:
JS Build:

    Module load time
    > 4ms
    Cold Run, All Samples
    test/samples/*.js (3635 KiB)
    > 299ms

    Warm Runs (average of 25 runs)
    test/samples/angular.js (1410 KiB)
    > 13.96ms
    test/samples/angular.min.js (303 KiB)
    > 4.72ms
    test/samples/d3.js (553 KiB)
    > 6.76ms
    test/samples/d3.min.js (250 KiB)
    > 4ms
    test/samples/magic-string.js (34 KiB)
    > 0.64ms
    test/samples/magic-string.min.js (20 KiB)
    > 0ms
    test/samples/rollup.js (698 KiB)
    > 8.48ms
    test/samples/rollup.min.js (367 KiB)
    > 5.36ms

    Warm Runs, All Samples (average of 25 runs)
    test/samples/*.js (3635 KiB)
    > 40.28ms

Wasm Build:

    Module load time
    > 10ms
    Cold Run, All Samples
    test/samples/*.js (3635 KiB)
    > 43ms

    Warm Runs (average of 25 runs)
    test/samples/angular.js (1410 KiB)
    > 9.32ms
    test/samples/angular.min.js (303 KiB)
    > 3.16ms
    test/samples/d3.js (553 KiB)
    > 5ms
    test/samples/d3.min.js (250 KiB)
    > 2.32ms
    test/samples/magic-string.js (34 KiB)
    > 0.16ms
    test/samples/magic-string.min.js (20 KiB)
    > 0ms
    test/samples/rollup.js (698 KiB)
    > 6.28ms
    test/samples/rollup.min.js (367 KiB)
    > 3.6ms

    Warm Runs, All Samples (average of 25 runs)
    test/samples/*.js (3635 KiB)
    > 27.76ms

The build uses docker and make; both must be installed first.
To build the lexer wasm, run npm run build-wasm.
Optimization passes are run with Binaryen prior to publish to reduce the WebAssembly footprint.
After building the lexer wasm, build the final distribution components (lexer.js and lexer.mjs) by running npm run build.
If you need to build lib/lexer.wat (optional) you must first install wabt as a sibling folder to this project. The wat file is then built by running make lib/lexer.wat.
These are the steps to create and publish a release. You will need docker installed, as well as wabt installed as outlined above:
- Figure out if the release should be semver patch, minor or major based on the changes since the last release and determine the new version.
- Update the package.json version, and run a full build and test
  - npm install
  - npm run build
  - npm run test
- Commit and tag the changes, pushing up to main and the tag
  - For example

        git tag -a 1.4.2 -m "1.4.2"
        git push origin tag 1.4.2
- Create the GitHub release
- Run npm publish from an account with access (asking somebody with access to the nodejs-foundation account is an option if you don't have access).
MIT