Python Data Science Handbook: full text in Jupyter Notebooks
GHpython/PythonDataScienceHandbook
This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks.
The book was written and tested with Python 3.5, though older Python versions (including Python 2.7) should work in nearly all cases.
The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A Whirlwind Tour of Python: it's a fast-paced introduction to the Python language aimed at researchers and scientists.
The following listing links to the notebooks in this repository, rendered through the nbviewer service:
- Help and Documentation in IPython
- Keyboard Shortcuts in the IPython Shell
- IPython Magic Commands
- Input and Output History
- IPython and Shell Commands
- Errors and Debugging
- Profiling and Timing Code
- More IPython Resources
- Understanding Data Types in Python
- The Basics of NumPy Arrays
- Computation on NumPy Arrays: Universal Functions
- Aggregations: Min, Max, and Everything In Between
- Computation on Arrays: Broadcasting
- Comparisons, Masks, and Boolean Logic
- Fancy Indexing
- Sorting Arrays
- Structured Data: NumPy's Structured Arrays
- Introducing Pandas Objects
- Data Indexing and Selection
- Operating on Data in Pandas
- Handling Missing Data
- Hierarchical Indexing
- Combining Datasets: Concat and Append
- Combining Datasets: Merge and Join
- Aggregation and Grouping
- Pivot Tables
- Vectorized String Operations
- Working with Time Series
- High-Performance Pandas: eval() and query()
- Further Resources
- Simple Line Plots
- Simple Scatter Plots
- Visualizing Errors
- Density and Contour Plots
- Histograms, Binnings, and Density
- Customizing Plot Legends
- Customizing Colorbars
- Multiple Subplots
- Text and Annotation
- Customizing Ticks
- Customizing Matplotlib: Configurations and Stylesheets
- Three-Dimensional Plotting in Matplotlib
- Geographic Data with Basemap
- Visualization with Seaborn
- Further Resources
- What Is Machine Learning?
- Introducing Scikit-Learn
- Hyperparameters and Model Validation
- Feature Engineering
- In-Depth: Naive Bayes Classification
- In-Depth: Linear Regression
- In-Depth: Support Vector Machines
- In-Depth: Decision Trees and Random Forests
- In-Depth: Principal Component Analysis
- In-Depth: Manifold Learning
- In-Depth: k-Means Clustering
- In-Depth: Gaussian Mixture Models
- In-Depth: Kernel Density Estimation
- Application: A Face Detection Pipeline
- Further Machine Learning Resources
The code in the book was tested with Python 3.5, though most (but not all) will also work correctly with Python 2.7 and other older Python versions.
The packages I used to run the code in the book are listed in requirements.txt. (Note that some of these exact version numbers may not be available on your platform: you may have to tweak them for your own use.)

To install the requirements using conda, run the following at the command-line:

    $ conda install --file requirements.txt

To create a stand-alone environment named PDSH with Python 3.5 and all the required package versions, run the following:

    $ conda create -n PDSH python=3.5 --file requirements.txt

You can read more about using conda environments in the Managing Environments section of the conda documentation.
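After installing, it can be handy to check which pinned versions actually ended up in your environment. The following is a minimal sketch, not part of the book's code. It assumes requirements.txt uses `name=version` (conda style) or `name==version` lines, and that each conda package name matches the name the package is installed under, which will not hold for every entry. The helper names (`parse_requirements`, `installed_version`, `compare`) and the sample text are hypothetical, and `importlib.metadata` requires Python 3.8 or newer.

```python
import re
from importlib import metadata

# Matches "name", "name=version", or "name==version". Anything after the
# version (for example a conda build string) is ignored.
_PIN = re.compile(r"^([A-Za-z0-9_.\-]+)\s*(?:==?\s*([^\s=]+))?")


def parse_requirements(text):
    """Return {package name: pinned version or None} from requirements text."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = _PIN.match(line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is not found."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare(text):
    """Return (name, pinned, installed) tuples for every parsed requirement."""
    return [(name, pinned, installed_version(name))
            for name, pinned in parse_requirements(text).items()]


if __name__ == "__main__":
    # Hypothetical sample text; replace with the contents of requirements.txt.
    sample = "# illustrative pins only\nexample-package=1.0\n"
    for name, pinned, found in compare(sample):
        print(f"{name}: pinned={pinned} installed={found}")
```

To run this against the real file, pass `open("requirements.txt").read()` to `compare` in place of the sample string. A result of `None` in the installed column may only mean the conda name and the installed name differ, so treat it as a prompt to check by hand.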
The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.
The text content of the book is released under the CC-BY-NC-ND license. Read more at Creative Commons.
