grf-labs/policytree

Policy learning via doubly robust empirical welfare maximization over trees

License

MIT license

policytree

A package for learning simple rule-based policies, where the rule takes the form of a shallow decision tree. Applications include settings that require interpretable predictions, such as medical treatment prescription. This package uses doubly robust reward estimates from grf to find a shallow but globally optimal decision tree.

Some helpful links for getting started:

The R package documentation contains usage examples and method references.
For community questions and answers around usage, see the GitHub issues page.
The packages fastpolicytree and sparsepolicytree implement modified solvers that may offer improved performance on larger datasets.

Installation

The latest release of the package can be installed through CRAN:

install.packages("policytree")

To install the latest development version from source:

devtools::install_github("grf-labs/policytree", subdir = "r-package/policytree")

Installing from source requires a C++11 compiler (on Windows, Rtools is required as well) together with the R packages Rcpp and BH.

Multi-action policy learning example

library(policytree)
n <- 250
p <- 10
X <- matrix(rnorm(n * p), n, p)
W <- as.factor(sample(c("A", "B", "C"), n, replace = TRUE))
Y <- X[, 1] + X[, 2] * (W == "B") + X[, 3] * (W == "C") + runif(n)
multi.forest <- grf::multi_arm_causal_forest(X, Y, W)

# Compute doubly robust reward estimates.
Gamma.matrix <- double_robust_scores(multi.forest)
head(Gamma.matrix)
#              A          B           C
# 1 -0.002612209 -0.1438422 -0.04243015
# 2  0.417066177  0.4212708  1.04000173
# 3  2.020414370  0.3963890  1.33038496
# 4  1.193587749  1.7862142 -0.05668051
# 5  0.808323778  0.5017521  1.52094053
# 6 -0.045844471 -0.1460745 -1.56055025

# Fit a depth 2 tree on a random training subset.
train <- sample(1:n, 200)
opt.tree <- policy_tree(X[train, ], Gamma.matrix[train, ], depth = 2)
opt.tree
# policy_tree object
# Tree depth:  2
# Actions:  1: A 2: B 3: C
# Variable splits:
# (1) split_variable: X3  split_value: 0.368037
#   (2) split_variable: X2  split_value: -0.098143
#     (4) * action: 1
#     (5) * action: 2
#   (3) split_variable: X2  split_value: 1.25697
#     (6) * action: 3
#     (7) * action: 2

# Predict treatment on held out data
head(predict(opt.tree, X[-train, ]))
#> [1] 2 3 1 2 3 3
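One way to gauge a learned policy is to average the doubly robust rewards of the actions it selects on held-out rows. The sketch below is self-contained base R and uses simulated stand-ins: `Gamma.test` and `pi.hat` are hypothetical placeholders for `Gamma.matrix[-train, ]` and `predict(opt.tree, X[-train, ])` from the example above, and the standard error is a naive one that treats the scores as fixed.

```r
set.seed(1)
n.test <- 50

# Hypothetical stand-in for held-out doubly robust scores
# (rows: units, columns: actions).
Gamma.test <- matrix(rnorm(n.test * 3), n.test, 3,
                     dimnames = list(NULL, c("A", "B", "C")))

# Hypothetical stand-in for predicted action ids in 1..3.
pi.hat <- sample(1:3, n.test, replace = TRUE)

# For each row i, pick the reward of the chosen action pi.hat[i]
# using matrix indexing by (row, column) pairs.
chosen.rewards <- Gamma.test[cbind(seq_len(n.test), pi.hat)]

# Estimated policy value and a naive standard error.
value.hat <- mean(chosen.rewards)
value.se <- sd(chosen.rewards) / sqrt(n.test)

# Baselines for comparison: always assigning a single action.
baseline.values <- colMeans(Gamma.test)

print(c(estimate = value.hat, std.err = value.se))
print(baseline.values)
```

Comparing `value.hat` against `baseline.values` indicates whether the tree's targeting appears to improve on assigning everyone the same action.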

Details

policy_tree(): fits a depth-k tree by exhaustive search ($n \times p$ features on $n \times d$ actions). The optimal tree maximizes the sum of rewards: let $\Gamma_i \in \mathbb R^d$ be a vector of unit-specific rewards for each action 1 to $d$ and $\pi(X_i) \in \{1, ..., d\}$ a mapping from covariates $X_i$ to action. policy_tree solves the following:

$$\pi^* = \operatorname{argmax}_{\pi \in \Pi} \left[\frac{1}{n} \sum_{i=1}^{n} \Gamma_i(\pi(X_i)) \right],$$

where $\Pi$ is the class of depth-k decision trees. (hybrid_policy_tree() employs a mix of optimal and greedy search and can be used to fit deeper trees.)
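To make the objective concrete, here is a minimal base R sketch of exhaustive search for the simplest case, a depth-1 tree (a single split). It is an illustration under stated assumptions, not the package's solver: `best_stump` is a hypothetical helper, the rewards are simulated, and sending rows with values at or below the threshold to the left is a convention chosen for this sketch.

```r
# Exhaustive search for the best depth-1 tree (hypothetical helper).
best_stump <- function(X, Gamma) {
  best <- list(reward = -Inf)
  for (j in seq_len(ncol(X))) {
    # Candidate thresholds: every observed value of feature j except the
    # largest, so both sides of the split are non-empty.
    cuts <- head(sort(unique(X[, j])), -1)
    for (cut in cuts) {
      left <- X[, j] <= cut
      # Best single action on each side: the column with the largest reward sum.
      left.sums <- colSums(Gamma[left, , drop = FALSE])
      right.sums <- colSums(Gamma[!left, , drop = FALSE])
      reward <- max(left.sums) + max(right.sums)
      if (reward > best$reward) {
        best <- list(reward = reward, variable = j, value = cut,
                     left.action = which.max(left.sums),
                     right.action = which.max(right.sums))
      }
    }
  }
  best
}

set.seed(42)
n <- 200
p <- 3
X <- matrix(rnorm(n * p), n, p)
# Simulated rewards: action 2 pays off when X[, 1] > 0, action 1 otherwise.
Gamma <- cbind(-X[, 1], X[, 1]) + matrix(rnorm(n * 2, sd = 0.1), n, 2)

stump <- best_stump(X, Gamma)
str(stump)
```

Every (feature, threshold) pair is scored by the total reward of the best action on each side, which is the sum in the objective above restricted to single-split policies. Deeper trees repeat this search recursively, which is why the cost grows quickly with depth.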

double_robust_scores(): computes doubly robust reward estimates for a subset of grf forest types.
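For intuition about what a doubly robust reward is, the following base R sketch builds augmented inverse-propensity weighted (AIPW) scores by hand. It is a simplified illustration and not the internals of double_robust_scores(): it assumes randomized assignment with known, equal propensities, uses a per-arm linear regression as the outcome model, and does no cross-fitting. All names are hypothetical.

```r
set.seed(123)
n <- 500
X <- matrix(rnorm(n * 2), n, 2)
actions <- c("A", "B", "C")

# Randomized assignment with known, equal propensities (assumption of this sketch).
W <- factor(sample(actions, n, replace = TRUE), levels = actions)
e <- rep(1 / 3, 3)
Y <- X[, 1] + X[, 2] * (W == "B") + rnorm(n)

df <- data.frame(Y = Y, X1 = X[, 1], X2 = X[, 2])

Gamma <- sapply(seq_along(actions), function(a) {
  treated <- W == actions[a]
  # Outcome model for arm a: linear regression fit on units that received a.
  fit <- lm(Y ~ X1 + X2, data = df[treated, ])
  mu.hat <- predict(fit, newdata = df)
  # AIPW score: model prediction plus inverse-propensity weighted residual,
  # where the residual term is non-zero only for units that received a.
  mu.hat + treated / e[a] * (Y - mu.hat)
})
colnames(Gamma) <- actions

head(Gamma)
colMeans(Gamma)
```

The resulting `Gamma` has one row per unit and one column per action, the same shape that policy_tree() expects as its reward matrix.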

Contributing

Contributions are welcome; please consult the development guide for details.

Funding

Development of policytree is supported by the National Science Foundation, the Sloan Foundation, the Office of Naval Research (Grant N00014-17-1-2131) and Schmidt Futures.

References

Susan Athey and Stefan Wager. Policy Learning With Observational Data. Econometrica 89.1 (2021): 133-161. [paper, arxiv]

Toru Kitagawa and Aleksey Tetenov. Who Should be Treated? Empirical Welfare Maximization Methods for Treatment Choice. Econometrica 86.2 (2018): 591-616. [paper]

Erik Sverdrup, Ayush Kanodia, Zhengyuan Zhou, Susan Athey, and Stefan Wager. policytree: Policy learning via doubly robust empirical welfare maximization over trees. Journal of Open Source Software, 5(50), 2020. [paper]

Zhengyuan Zhou, Susan Athey, and Stefan Wager. Offline Multi-Action Policy Learning: Generalization and Optimization. Operations Research 71.1 (2023). [paper, arxiv]

Project website: grf-labs.github.io/policytree/
