Notebooks and code for the book "Introduction to Machine Learning with Python"
amueller/introduction_to_ml_with_python
This repository holds the code for the forthcoming book "Introduction to Machine Learning with Python" by Andreas Mueller and Sarah Guido. You can find details about the book on the O'Reilly website.
The book requires the current stable version of scikit-learn, that is 0.20.0. Most of the book can also be used with previous versions of scikit-learn, though you need to adjust the import for everything from the model_selection module, mostly cross_val_score, train_test_split and GridSearchCV.
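For older scikit-learn releases, the import adjustment described above could be sketched as a fallback lookup. This is a minimal sketch, not code from the book: `load_attr` and `LOCATIONS` are hypothetical names, and the sketch assumes the older module locations were `sklearn.cross_validation` and `sklearn.grid_search`.

```python
import importlib


def load_attr(name, module_candidates):
    """Return attribute `name` from the first candidate module that provides it."""
    for module_name in module_candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this installation, try the next one
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        "%s not found in any of: %s" % (name, ", ".join(module_candidates))
    )


# Current location first, then the assumed older locations.
LOCATIONS = {
    "cross_val_score": ["sklearn.model_selection", "sklearn.cross_validation"],
    "train_test_split": ["sklearn.model_selection", "sklearn.cross_validation"],
    "GridSearchCV": ["sklearn.model_selection", "sklearn.grid_search"],
}

try:
    cross_val_score = load_attr("cross_val_score", LOCATIONS["cross_val_score"])
    train_test_split = load_attr("train_test_split", LOCATIONS["train_test_split"])
    GridSearchCV = load_attr("GridSearchCV", LOCATIONS["GridSearchCV"])
except ImportError:
    # scikit-learn is not installed at all; leave the names unset.
    cross_val_score = train_test_split = GridSearchCV = None
```

With the current stable version, the first candidate always succeeds, so the fallback entries only matter on older installations.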
This repository provides the notebooks from which the book is created, together with the mglearn library of helper functions to create figures and datasets.
For the curious ones, the cover depicts a hellbender.
All datasets are included in the repository, with the exception of the aclImdb dataset, which you can download from the page of Andrew Maas. See the book for details.
If you get ImportError: No module named mglearn, you can try to install mglearn into your Python environment using the command pip install mglearn in your terminal or !pip install mglearn in Jupyter Notebook.
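Before running the notebooks, you could check whether mglearn is visible to the interpreter you are actually using. The following is a minimal sketch using only the standard library; `is_importable` is a hypothetical helper name, not part of mglearn.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_importable("mglearn"):
    # Same remedy as described above.
    print("mglearn not found; try: pip install mglearn")
```

Running this inside the notebook kernel, rather than in a separate terminal, helps when several Python environments are installed side by side.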
Please note that the first print of the book is missing the following line when listing the assumed imports:
from IPython.display import display
Please add this line if you see an error involving display.
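If you also run book code outside Jupyter, where IPython may not be installed, a guarded import is one option. This is only a sketch under that assumption: the fallback `display` defined here is a hypothetical stand-in that prints its arguments, not the IPython function.

```python
try:
    from IPython.display import display
except ImportError:
    # Assumption: outside IPython/Jupyter, plain printing is an acceptable stand-in.
    def display(*objects):
        for obj in objects:
            print(obj)


display("display is available")
```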
The first print of the book used a function called plot_group_kfold. This has been renamed to plot_label_kfold because of a rename in scikit-learn.
To run the code, you need the packages numpy, scipy, scikit-learn, matplotlib, pandas and pillow. Some of the visualizations of decision trees and neural network structures also require graphviz. The chapter on text processing also requires nltk and spacy.
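To see which of the packages listed above are already installed, and in which version, something like the following could be used. It is a minimal sketch that assumes Python 3.8 or later (for `importlib.metadata`) and that the distribution names match the package names given above; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata  # available in Python 3.8+

REQUIRED = ["numpy", "scipy", "scikit-learn", "matplotlib", "pandas", "pillow"]
OPTIONAL = ["graphviz", "nltk", "spacy"]


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for dist in distributions:
        try:
            versions[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            versions[dist] = None
    return versions


for dist, version in installed_versions(REQUIRED + OPTIONAL).items():
    print("%-14s %s" % (dist, version or "not installed"))
```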
The easiest way to set up an environment is by installing Anaconda.
If you already have a Python environment set up, and you are using the conda package manager, you can get all packages by running
conda install numpy scipy scikit-learn matplotlib pandas pillow graphviz python-graphviz
For the chapter on text processing you also need to install nltk and spacy:
conda install nltk spacy
If you already have a Python environment and are using pip to install packages, you need to run
pip install numpy scipy scikit-learn matplotlib pandas pillow graphviz
You also need to install the graphviz C-library, which is easiest using a package manager. If you are using OS X and homebrew, you can brew install graphviz. If you are on Ubuntu or debian, you can apt-get install graphviz. Installing graphviz on Windows can be tricky and using conda / anaconda is recommended.
For the chapter on text processing you also need to install nltk and spacy:
pip install nltk spacy
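The pip package for graphviz is only the Python binding, so it can help to confirm that the Graphviz C-library mentioned above is installed as well. The sketch below assumes the system installation places the `dot` executable on your PATH; `find_executable` is a hypothetical helper name.

```python
import shutil


def find_executable(name="dot"):
    """Return the full path of `name` on PATH, or None if it is missing."""
    return shutil.which(name)


dot_path = find_executable()
if dot_path is None:
    print("Graphviz 'dot' not found on PATH; install it with your package manager.")
else:
    print("Graphviz 'dot' found at", dot_path)
```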
For the text processing chapter, you need to download the English language model for spacy using
python -m spacy download en
If you have errata for the (e-)book, please submit them via the O'Reilly Website. You can submit fixes to the code as pull-requests here, but I'd appreciate it if you would also submit them there, as this repository doesn't hold the "master notebooks".