
A python library for decision tree visualization and model interpretation.


techaddicted/dtreeviz

 
 


Description

A python library for decision tree visualization and model interpretation. Decision trees are the fundamental building block of gradient boosting machines and Random Forests(tm), probably the two most popular machine learning models for structured data. Visualizing decision trees is a tremendous aid when learning how these models work and when interpreting models. The visualizations are inspired by an educational animation by R2D3, A visual introduction to machine learning. Please see How to visualize decision trees for a deeper discussion of our decision tree visualization library and the visual design decisions we made.

Currently dtreeviz supports: scikit-learn, XGBoost, Spark MLlib, LightGBM, and Tensorflow. See Installation instructions.

Authors

With major code and visualization clean up contributions done by Matthew Epland (@mepland).

Sample Visualizations

Tree visualizations

Prediction path explanations

Leaf information

Feature space exploration

Regression

Classification

Classification boundaries

As a utility function, dtreeviz provides dtreeviz.decision_boundaries(), which illustrates one- and two-dimensional feature space for classifiers, including colors that represent probabilities, decision boundaries, and misclassified entities. This function is not limited to tree models and should work with any model that provides a predict_proba() method. That means any model from scikit-learn should work (we also made it work with Keras models that define predict()). Because it does not work with trees specifically, the function does not use adaptors obtained from dtreeviz.model(). See classifier-decision-boundaries.ipynb.
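To make the predict_proba() requirement concrete, here is a minimal sketch of the kind of computation a decision-boundary plot rests on: evaluating class probabilities over a grid of the feature space and finding misclassified samples. It is not the dtreeviz implementation. ToyLogisticModel, probability_grid, and misclassified are hypothetical names introduced for illustration, and plain Python lists stand in for the arrays a real model would use.

```python
import math


class ToyLogisticModel:
    """Hypothetical stand-in for any classifier that exposes predict_proba()."""

    def __init__(self, w0, w1, bias):
        self.w0, self.w1, self.bias = w0, w1, bias

    def predict_proba(self, X):
        # One [P(class 0), P(class 1)] row per input row, as scikit-learn does.
        rows = []
        for x0, x1 in X:
            z = self.w0 * x0 + self.w1 * x1 + self.bias
            p1 = 1.0 / (1.0 + math.exp(-z))
            rows.append([1.0 - p1, p1])
        return rows


def probability_grid(model, x_range, y_range, steps):
    """Evaluate P(class 1) on a steps-by-steps grid over a 2D feature space."""
    if not hasattr(model, "predict_proba"):
        raise TypeError("model must provide predict_proba()")
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    xs = [x_lo + (x_hi - x_lo) * i / (steps - 1) for i in range(steps)]
    ys = [y_lo + (y_hi - y_lo) * i / (steps - 1) for i in range(steps)]
    points = [(x, y) for y in ys for x in xs]
    probs = [row[1] for row in model.predict_proba(points)]
    # Reshape the flat list into one row of probabilities per y value.
    return [probs[r * steps:(r + 1) * steps] for r in range(steps)]


def misclassified(model, X, y):
    """Indices of samples whose most probable class differs from the label."""
    proba = model.predict_proba(X)
    return [i for i, (row, label) in enumerate(zip(proba, y))
            if row.index(max(row)) != label]


model = ToyLogisticModel(1.0, 1.0, 0.0)
grid = probability_grid(model, (-1.0, 1.0), (-1.0, 1.0), steps=3)
wrong = misclassified(model, [(1.0, 1.0), (-1.0, -1.0), (1.0, 1.0)], [1, 0, 0])
print(grid[1][1], wrong)
```

A plot would color each grid cell by its probability and mark the samples listed in wrong. Any object with a compatible predict_proba() could replace the toy model here, which is the same duck-typing idea the paragraph above describes.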


Sometimes it's helpful to see animations that change some of the hyperparameters. If you look in notebook classifier-boundary-animations.ipynb, you will see code that generates animations such as the following (animated png files):

Quick start

See Installation instructions, then take a look at the specific notebooks for the supported ML library you're using:

To interoperate with these different libraries, dtreeviz uses an adaptor object, obtained from function dtreeviz.model(), to extract the model information necessary for visualization. Given such an adaptor object, all of the dtreeviz functionality is available to you through the same programming interface. The basic dtreeviz usage recipe is:

  1. Import dtreeviz and your decision tree library
  2. Acquire and load data into memory
  3. Train a classifier or regressor model using your decision tree library
  4. Obtain a dtreeviz adaptor model using
    viz_model = dtreeviz.model(your_trained_model,...)
  5. Call dtreeviz functions, such as
    viz_model.view() or viz_model.explain_prediction_path(sample_x)

Example

Here's a complete example Python file that displays the following tree in a popup window:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
import dtreeviz

iris = load_iris()
X = iris.data
y = iris.target

clf = DecisionTreeClassifier(max_depth=4)
clf.fit(X, y)

viz_model = dtreeviz.model(clf,
                           X_train=X, y_train=y,
                           feature_names=iris.feature_names,
                           target_name='iris',
                           class_names=iris.target_names)

v = viz_model.view()     # render as SVG into internal object
v.show()                 # pop up window
v.save("/tmp/iris.svg")  # optionally save as svg

In a notebook, you can render inline without calling show(). Just call view():

viz_model.view()  # in notebook, displays inline

Installation

Install anaconda3 on your system, if not already done.

Check that you do not have conda-installed graphviz-related packages, because dtreeviz needs the pip versions. You can remove them from conda space with:

conda uninstall python-graphviz
conda uninstall graphviz

To install (Python >=3.6 only), do this (from Anaconda Prompt on Windows!):

pip install dtreeviz                                # install dtreeviz for sklearn
pip install dtreeviz[xgboost]                       # install XGBoost related dependency
pip install dtreeviz[pyspark]                       # install pyspark related dependency
pip install dtreeviz[lightgbm]                      # install LightGBM related dependency
pip install dtreeviz[tensorflow_decision_forests]   # install tensorflow_decision_forests related dependency
pip install dtreeviz[all]                           # install all related dependencies

This should also pull in the graphviz Python library (>=0.9), which we are using for platform specific stuff.

Limitations. Only svg files can be generated at this time, which reduces dependencies and dramatically simplifies the install process.
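After installing, one way to see which of the optional dependencies ended up in the current environment is to ask Python whether each module can be located. This is a sketch rather than a dtreeviz feature: available_backends is a hypothetical helper, and the import names in BACKENDS are assumptions based on the install targets listed above (sklearn is the import name for scikit-learn, and graphviz here means the Python library, not the dot binary).

```python
import importlib.util

# Display label -> assumed import name for each install target.
BACKENDS = {
    "dtreeviz": "dtreeviz",
    "graphviz (Python library)": "graphviz",
    "scikit-learn": "sklearn",
    "XGBoost": "xgboost",
    "PySpark": "pyspark",
    "LightGBM": "lightgbm",
    "TensorFlow Decision Forests": "tensorflow_decision_forests",
}


def available_backends(backends=BACKENDS):
    """Map each label to True if its module can be located, without importing it."""
    return {label: importlib.util.find_spec(module) is not None
            for label, module in backends.items()}


if __name__ == "__main__":
    for label, found in available_backends().items():
        print(f"{label:30s} {'installed' if found else 'missing'}")
```

Using find_spec avoids actually importing heavy packages such as TensorFlow just to check that they are present.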

Please email Terence with any helpful notes on making dtreeviz work (better) on other platforms. Thanks!

For your specific platform, please see the following subsections.

Mac

Make sure you have the latest XCode and its command-line tools installed. You can run xcode-select --install from the command line to install the tools if XCode is already installed. You also have to accept the XCode license agreement, which you can do with sudo xcodebuild -license from the command line. The brew install shown next needs to build graphviz, so you need XCode set up properly.

You need the graphviz binary for dot. Make sure you have the latest version (verified on macOS 10.13 and 10.14):

brew reinstall graphviz

Just to be sure, remove dot from any anaconda installation, for example:

rm ~/anaconda3/bin/dot

From command line, this command

dot -Tsvg

should work, in the sense that it just stares at you without giving an error. You can hit control-C to escape back to the shell. Make sure that you are using the right dot as installed by brew:

$ which dot
/usr/local/bin/dot
$ ls -l $(which dot)
lrwxr-xr-x  1 parrt  wheel  33 May 26 11:04 /usr/local/bin/dot@ -> ../Cellar/graphviz/2.40.1/bin/dot
$

Limitations. Jupyter notebook has a bug where it does not show .svg files correctly, but Jupyter Lab has no problem.
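The same check can be done from Python, which is the environment dtreeviz actually runs in. The sketch below lists every dot found on the search path in lookup order, so a leftover anaconda copy that shadows the brew one is easy to spot. find_executables is a hypothetical helper written for this illustration. It relies only on shutil.which and os.path.realpath from the standard library.

```python
import os
import shutil


def find_executables(name, path=None):
    """Return every match for name on the search path, in lookup order."""
    path = os.environ.get("PATH", "") if path is None else path
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        # Restrict the lookup to one directory at a time to keep the order.
        found = shutil.which(name, path=directory)
        if found and found not in matches:
            matches.append(found)
    return matches


if __name__ == "__main__":
    candidates = find_executables("dot")
    if not candidates:
        print("dot is not on PATH")
    for i, candidate in enumerate(candidates):
        marker = "used" if i == 0 else "shadowed"
        # realpath follows symlinks, much like ls -l $(which dot) shows.
        print(f"{marker:8s} {candidate} -> {os.path.realpath(candidate)}")
```

The first entry is the one a plain dot call would run. Any later entries are shadowed copies.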

Linux (Ubuntu 18.04)

To get the dot binary do:

sudo apt install graphviz

Limitations. The view() method works to pop up a new window, and images appear inline in Jupyter notebook but not in Jupyter Lab (it gets an error parsing the SVG XML). The notebook images also have a font substitution from the Arial we use, so some text overlaps. Only .svg files can be generated on this platform.

Windows 10

(Make sure to pip install graphviz, which is common to all platforms, and make sure to do this from Anaconda Prompt on Windows!)

Download graphviz-2.38.msi and update your Path environment variable. Add C:\Program Files (x86)\Graphviz2.38\bin to User path and C:\Program Files (x86)\Graphviz2.38\bin\dot.exe to System Path. It's Windows, so you might need a reboot after updating that environment variable. You should see this from the Anaconda Prompt:

(base) C:\Users\Terence Parr>where dot
C:\Program Files (x86)\Graphviz2.38\bin\dot.exe

(Do not use conda install -c conda-forge python-graphviz as you get an old version of the graphviz python library.)

Verify from the Anaconda Prompt that this works (capital-V not lowercase-v):

dot -V

If it doesn't work, you have a Path problem. I found the following test programs useful. The first one checks whether Python can find dot:

import os
import subprocess

proc = subprocess.Popen(['dot', '-V'])  # fails here if dot is not on the Path
print(os.getenv('Path'))                # show the Path that Python sees

The following version does the same thing, except that it uses the graphviz Python library's backend support utilities, which is what we use in dtreeviz:

import graphviz.backend as be

cmd = ["dot", "-V"]
stdout, stderr = be.run(cmd, capture_output=True, check=True, quiet=False)
print(stderr)

If you are having issues with the run command, you can try copying the following files from: https://github.com/xflr6/graphviz/tree/master/graphviz.

Place them in the AppData\Local\Continuum\anaconda3\Lib\site-packages\graphviz folder.

Clean out the __pycache__ directory too.

For the graphviz Windows install 8.0.5 and Python interface v0.18+:

import graphviz.backend as be

cmd = ["dot", "-V"]
stdout = be.execute.run_check(cmd, capture_output=True, check=True, quiet=False)
print(stdout)
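Since the right snippet depends on which graphviz version is installed, it can help to read the version number programmatically. The sketch below calls dot -V directly with the standard library and parses the banner. parse_dot_version and installed_dot_version are hypothetical helpers, not part of dtreeviz or the graphviz Python library, and the sample banner is assumed to be in the shape that dot -V prints.

```python
import re
import subprocess


def parse_dot_version(banner):
    """Extract (major, minor, patch) from a dot -V banner, or None if absent."""
    match = re.search(r"graphviz version (\d+)\.(\d+)(?:\.(\d+))?", banner)
    if match is None:
        return None
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def installed_dot_version():
    """Run dot -V and parse its banner; None when dot cannot be run."""
    try:
        proc = subprocess.run(
            ["dot", "-V"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # dot writes its banner to stderr
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_dot_version(proc.stdout)


print(parse_dot_version("dot - graphviz version 2.38.0 (20140413.2041)"))  # (2, 38, 0)
```

Calling installed_dot_version() from the Anaconda Prompt environment returns None when dot is not reachable, which points back to the Path problem described above.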

Jupyter Lab and Jupyter notebook both show the inline .svg images well.

Verify graphviz installation

Try making a text file t.dot with content digraph T { A -> B } (paste that into a text editor, for example) and then running this from the command line:

dot -Tsvg -o t.svg t.dot

That should give a simple t.svg file that opens properly. If you get errors from dot, it will not work from the dtreeviz python code. If it can't find dot, then you didn't update your PATH environment variable or there is some other install issue with graphviz.
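The same smoke test can be scripted, which is handy when checking several machines or environments. This is a minimal sketch using only the standard library. render_test_graph is a hypothetical helper that writes the t.dot file described above into a directory and runs the same dot command on it.

```python
import pathlib
import shutil
import subprocess
import tempfile

DOT_SOURCE = "digraph T { A -> B }\n"


def render_test_graph(workdir):
    """Write t.dot into workdir and render t.svg.

    Returns the path to the SVG, or None if dot is not on the search path.
    """
    dot = shutil.which("dot")
    if dot is None:
        return None
    workdir = pathlib.Path(workdir)
    source = workdir / "t.dot"
    target = workdir / "t.svg"
    source.write_text(DOT_SOURCE)
    # Same command as above: dot -Tsvg -o t.svg t.dot
    subprocess.run([dot, "-Tsvg", "-o", str(target), str(source)], check=True)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        svg = render_test_graph(tmp)
        if svg is None:
            print("dot not found on the search path")
        else:
            print("rendered", svg.stat().st_size, "bytes of SVG")
```

With check=True, a failing dot raises subprocess.CalledProcessError, so errors from dot surface in Python much as they would when dtreeviz invokes it.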

Limitations

Finally, don't use IE to view .svg files. Use Edge, as they look much better there. I suspect that IE is displaying them as rasterized rather than vector images. Only .svg files can be generated on this platform.

Install dtreeviz locally

Make sure to follow the install guidelines above.

To push the dtreeviz library to your local egg cache (force updates) during development, do this (from Anaconda Prompt on Windows):

python setup.py install -f

E.g., on Terence's box, it adds /Users/parrt/anaconda3/lib/python3.6/site-packages/dtreeviz-2.2.2-py3.6.egg.

Feedback

We welcome info from users on how they use dtreeviz, what features they'd like, etc. via email (to parrt) or via an issue.

Useful Resources

License

This project is licensed under the terms of the MIT license, see LICENSE.
