Performant taint analysis for Node.js
nuprl/augur
Augur is a dynamic taint analysis for Node.js implemented in TypeScript using NodeProf. Check out the paper!

Augur builds upon the technique described in Ichnaea. It is more performant, supports the latest version of JavaScript, and is highly configurable to support any type of dynamic data-flow analysis.

Taint analysis is a dynamic program analysis technique used to track data flows through programs. It's useful in many domains, but the most common application of taint analysis is for detecting injection vulnerabilities. Read this paper for more background on taint analysis.
Let's walk through how to install Augur and use it to analyze a Node.js project.
First, install Augur's dependencies if you don't already have them: `node`, `npm`, and `docker`.
Clone this project onto your machine, then build Augur:
```shell
git clone --recurse-submodules https://github.com/nuprl/augur
cd augur/ts
./docker-nodeprof/docker-pull.sh  # Pull NodeProf Docker image
npm install                       # Install Augur deps
npm run build                     # Build Augur
```
Try running a basic test to make sure your installation succeeded:
```shell
./node_modules/.bin/jest -t basic-assignment-tainted
# tainted value flowed into sink {"type":"variable","name":"z","location":{"fileName":"test.js"}}!
```
Your Augur installation is now set up!
Using Augur to analyze your own applications simply requires placing a file, `spec.json`, at the root of your Node.js project:

```json
{
  "main": "test.js",
  "sources": [
    {
      "type": "functionReturn",
      "name": "readFileSync"
    }
  ],
  "sinks": [
    {
      "type": "functionInvocation",
      "name": "exec"
    }
  ]
}
```
This file tells Augur the sources and sinks of the flows you want to track. The spec above tells Augur to alert you if any value returned from `readFileSync` flows into the function `exec`. It also tells Augur how to run your project: by executing the file `test.js`.

Here are all the options for `spec.json`.
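
To make the roles of `sources` and `sinks` concrete, here is a minimal sketch of how a spec like the one above could be read. The helpers `matches`, `isSource`, and `isSink` are hypothetical names for illustration, and matching on `type` and `name` alone is an assumption; Augur's own matching logic may differ.

```javascript
// The spec from above, written as a plain object.
const spec = {
  main: "test.js",
  sources: [{ type: "functionReturn", name: "readFileSync" }],
  sinks: [{ type: "functionInvocation", name: "exec" }],
};

// Hypothetical helper: a descriptor matches an event when type and name agree.
function matches(descriptor, event) {
  return descriptor.type === event.type && descriptor.name === event.name;
}

// An event is a source (or sink) if any descriptor in the spec matches it.
const isSource = (event) => spec.sources.some((d) => matches(d, event));
const isSink = (event) => spec.sinks.some((d) => matches(d, event));

console.log(isSource({ type: "functionReturn", name: "readFileSync" })); // true
console.log(isSink({ type: "functionInvocation", name: "exec" }));       // true
console.log(isSink({ type: "functionInvocation", name: "log" }));        // false
```

Under this reading, a flow is reported when a value produced by a matching source reaches a matching sink.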
Let's say we analyze the following program, `test.js`:
```javascript
const fs = require('fs');
const child_process = require('child_process');

// read in user's message from a file
let input = fs.readFileSync("message.txt");

// echo the user's message
child_process.exec("echo 'User\'s message: " + input + "'");
```
This exact flow is a classic example of an injection vulnerability: `exec` is a very powerful function, giving the command full control over your machine; and `readFileSync` returns arbitrary user input, meaning the user may have full control of your machine. This can cause massive security issues, as well as bugs with disastrous consequences.
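
As a rough illustration of why this flow is dangerous, the sketch below builds the same command string as `test.js` without executing anything. The file contents are a hypothetical attacker-chosen payload, not something taken from Augur's test suite.

```javascript
// Hypothetical contents of message.txt, chosen by an attacker.
const input = "; touch /tmp/pwned #";

// Same string concatenation as in test.js. Nothing is executed here;
// we only look at the string that would be handed to the shell.
const command = "echo 'User\'s message: " + input + "'";

console.log(command);
```

The attacker's text ends up in the command string unmodified, so any shell metacharacters it contains (here `;`, which a shell may treat as a command separator) are interpreted by the shell rather than echoed as data.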
Let's go ahead and use Augur to verify that there is indeed a taint flow between user input and `exec`.
Our project is structured like this:
```
project/
|
+-- test.js
|
+-- message.txt
|
+-- spec.json
```
This example is also a real test case in Augur.
To analyze this project with Augur, we run:
```shell
cd augur/ts
node ./runner/cli.js --projectDir ~/project --projectName project --outputDir .
#                    ^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^
#                    path to project        |                     |
#                                           |                     |
#                                           project name          |
#                                                                 |
#                                                                 directory to store temp files
```
Augur will alert us that the application does in fact have the flow we're expecting:

```
Flows found into the following sinks: [
  {
    "type": "functionInvocation",
    "name": "exec"
  }
]
```
You've now analyzed your first application using Augur!
- Support for any piece of JavaScript code to act as a taint source
- Support for any piece of JavaScript code to act as a taint sink
- Support for taint sanitizers
- Support for different forms of taint tracking, from simple boolean tracking to full dependency information between variables
- Support for tracking taint through native code (see below)
Augur supports three methods for tracking taint across your application:

- `Boolean`: the simplest (and fastest) tracker you can use. During your application's runtime, it simply determines whether a value came from any source. It doesn't keep track of which source it came from, or where the flow was introduced. This is not very useful in practice, because you will likely want to use...
- `SourcedBoolean`: a more practical tracker. For each value in your program, Augur determines if it came from a source, and if so, which source and on what line the taint was introduced.
- `Expression`: the most general tracker. In this mode, Augur will save all the information it finds during your application's runtime. For any given expression, its full set of dependent expressions is recorded. In other words, regardless of your specified sources and sinks, Augur will save every flow between every expression. Expect slowdowns and large output files (on the order of MBs).

The method you choose should be placed in your `spec.json`.
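
To give a feel for how the three trackers differ, here is a toy model of the label each one might attach to a value. The object shapes and names (`fromSource`, `untainted`, `join`) are assumptions made for illustration; they are not Augur's internal representation.

```javascript
// Toy label algebras; hypothetical, not Augur's internal representation.

// Boolean: a label only says whether some source was involved.
const booleanTracker = {
  fromSource: (_name, _line) => true,
  untainted: () => false,
  join: (a, b) => a || b,
};

// SourcedBoolean: a label remembers which source and on which line.
const sourcedBooleanTracker = {
  fromSource: (name, line) => [{ name, line }],
  untainted: () => [],
  join: (a, b) => [...a, ...b],
};

// Expression: every result remembers the expressions it was computed from,
// whether or not they are tainted.
const expressionTracker = {
  fromSource: (name, line) => ({ expr: `${name}@${line}`, deps: [] }),
  untainted: () => ({ expr: "literal", deps: [] }),
  join: (a, b) => ({ expr: "binary", deps: [a, b] }),
};

// Model `"echo " + input`, where `input` came from readFileSync on line 5.
const trackers = { booleanTracker, sourcedBooleanTracker, expressionTracker };
const results = {};
for (const [label, t] of Object.entries(trackers)) {
  results[label] = t.join(t.untainted(), t.fromSource("readFileSync", 5));
  console.log(label, JSON.stringify(results[label]));
}
```

The labels grow from a single bit, to a list of sources, to a full dependency tree, which mirrors the speed versus detail trade-off described above.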
Modern JavaScript relies on a wide variety of native functions to improve its usability and performance. Common operations on data structures and utilities are now natively implemented in the VM, including array reduce, promises, and regular expressions.

Because native functions are so pervasive, accurately tracking taint in modern JavaScript requires an understanding of these functions. Our taint analysis only analyzes JavaScript code, so we can't instrument the actual implementations of these native functions. Our taint analysis supports two ways of tracking taint through these native functions:
- implementing native models. A native model is an implementation of a native function that only tracks taint. It doesn't have to perform any logical calculations; it just has to inform the abstract machine where taint should flow as a result of the function call. These models are often much easier to implement than the functions themselves because data flows are often simpler than the logic in a function. For examples of native models, look at `src/native/native.ts`.
- using polyfills. Polyfills are implementations of native functions written in JavaScript itself. While polyfills are traditionally used for providing missing functionality to older web browsers, they can also help the taint analysis understand data-flow. Simply define the polyfill to use it. Note that Augur already defines a couple of its own polyfills in `src/native/polyfill.js`.
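
The two approaches can be contrasted with a small sketch for `Array.prototype.join`. Both functions are hypothetical and simplified: `joinPolyfill` ignores edge cases of the real method (such as `null` and `undefined` elements), and `joinModel` uses plain booleans as taint labels rather than the interface used in `src/native/native.ts`.

```javascript
// Polyfill-style: the behavior of join rewritten in plain JavaScript
// (simplified), so an analysis that instruments JavaScript can follow
// each concatenation.
function joinPolyfill(array, separator = ",") {
  let result = "";
  for (let i = 0; i < array.length; i++) {
    if (i > 0) result += separator;
    result += String(array[i]);
  }
  return result;
}

// Model-style: computes no string at all, only the taint of the result.
// The result is tainted if the separator or any element is tainted.
function joinModel(elementTaints, separatorTaint) {
  return separatorTaint || elementTaints.some(Boolean);
}

console.log(joinPolyfill(["a", "b", "c"], "-")); // a-b-c
console.log(joinModel([false, true, false], false)); // true
console.log(joinModel([false, false], false));       // false
```

The model is shorter than the polyfill because it only has to describe where taint goes, not what the function computes.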
Augur normally runs your Node.js project in Docker. This is because NodeProf is difficult to install and configure locally. If you want to avoid using Docker, you can install NodeProf locally and point Augur to the installation.

To install NodeProf locally, follow the advanced installation instructions.
When using a manual installation, you will have to set environment variables:

- `NODEPROF_HOME`: pointing to your NodeProf advanced installation
- `JAVA_HOME`: pointing to your JVM CI directory (not the `bin` subdirectory)
- `MX_HOME`: pointing to your `mx` installation
Example:
```shell
NODEPROF_HOME=/home/mwaldrich/workspace-nodeprof/nodeprof.js/
MX_HOME=/home/mwaldrich/mx
JAVA_HOME=/home/mwaldrich/openjdk1.8.0_172-jvmci-0.46
```
Augur will automatically use a local NodeProf installation if these environment variables are set; no flags or further configuration is needed.
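
If you want to check your shell environment before running Augur, a small script along these lines may help. The helper name `missingNodeProfVars` is hypothetical, and the assumption that all three variables must be present is inferred from the list above; this is not how Augur itself performs the check.

```javascript
// The three variables named above.
const REQUIRED_VARS = ["NODEPROF_HOME", "JAVA_HOME", "MX_HOME"];

// Hypothetical helper: returns the names of variables that are unset or empty.
function missingNodeProfVars(env) {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

const missing = missingNodeProfVars(process.env);
console.log(
  missing.length === 0
    ? "All NodeProf variables are set."
    : `Missing: ${missing.join(", ")}`
);
```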
If you're looking to dive into Augur's code, the structure and implementation of the analysis are documented with `README`s in folders inside `src`.
If you want to contribute to Augur, we recommend using JetBrains' WebStorm IDE. To get the project fully up and running, simply:

- follow the installation instructions in the Getting started section
- open the `augur` folder in WebStorm
- execute the Run Configuration named `unit tests`
Not Yet | In Progress | Done | Notes | |
---|---|---|---|---|
Variable read | x | |||
Variable write | x | |||
Property read | x | |||
Property write | x | |||
Unary expression | x | |||
Binary expression | x | |||
Implicit declaration of `this` | x | see #19 | ||
Function declaration | x | |||
Function arguments | x | |||
Variable assignment | x | see #20 | ||
Function call | x | see #21 | ||
Native functions | x | see #22 | ||
Async/await | X | see #29 | ||
Function returns | x | see #30 |
Ichnaea was evaluated against a set of 22 benchmarks. Here is a table showing how Augur performs on these benchmarks:
Correct Output | Incorrect Output | Old/Broken Code | |
---|---|---|---|
chook-growl-reporter-exec | x | ||
cocos-utils | x | ||
fish-exec | x | ||
git2json-exec | x | ||
gm-attack | x | ||
growl-exec | x | ||
libnotify-exec | x | ||
m-log-eval | x | ||
mixin-pro-eval | x | ||
modulify-eval | x | ||
mongo-parse-eval | x | ||
mongoosemask-eval | x | ||
mongoosify-eval | x | ||
node-os-utils | x | ||
node-wos | x | ||
office-converter | x | ||
os-uptime | x | ||
osenv | x | ||
pidusage-exec | x | ||
pomelo-monitor | x | ||
system-locale | x | ||
systeminformation | x |
The benchmark `pidusage-exec` crashes in modern JavaScript VMs.

The benchmark `m-log-eval` is not currently compatible with the tool due to a missing native function model.
Augur was written by Mark Aldrich, Emily Shi, Alexi Turcotte, and Frank Tip.

Augur sits on top of NodeProf, the dynamic analysis framework written by Haiyang Sun and others.

Augur also relies on Oracle's GraalVM and GraalJS.

Continuous integration for Augur was designed and implemented by Adison Trueblood.

This research was supported by the National Science Foundation under NSF grant CCF-1715153 and REU supplement CCF-1930604.

Copyright (c) 2019-2022 Programming Research Lab at Northeastern University. Augur is licensed under the UPL. See the `LICENSE` file for more information.