sklearn.datasets
Utilities to load popular datasets and artificial data generators.
User guide: see the Dataset loading utilities section for further details.
Loaders
| Function | Description |
| --- | --- |
| clear_data_home | Delete all the content of the data home cache. |
| dump_svmlight_file | Dump the dataset in svmlight / libsvm file format. |
| fetch_20newsgroups | Load the filenames and data from the 20 newsgroups dataset (classification). |
| fetch_20newsgroups_vectorized | Load and vectorize the 20 newsgroups dataset (classification). |
| fetch_california_housing | Load the California housing dataset (regression). |
| fetch_covtype | Load the covertype dataset (classification). |
| fetch_file | Fetch a file from the web if not already present in the local folder. |
| fetch_kddcup99 | Load the kddcup99 dataset (classification). |
| fetch_lfw_pairs | Load the Labeled Faces in the Wild (LFW) pairs dataset (classification). |
| fetch_lfw_people | Load the Labeled Faces in the Wild (LFW) people dataset (classification). |
| fetch_olivetti_faces | Load the Olivetti faces dataset from AT&T (classification). |
| fetch_openml | Fetch a dataset from OpenML by name or dataset ID. |
| fetch_rcv1 | Load the RCV1 multilabel dataset (classification). |
| fetch_species_distributions | Loader for the species distribution dataset from Phillips et al. |
| get_data_home | Return the path of the scikit-learn data directory. |
| load_breast_cancer | Load and return the breast cancer Wisconsin dataset (classification). |
| load_diabetes | Load and return the diabetes dataset (regression). |
| load_digits | Load and return the digits dataset (classification). |
| load_files | Load text files with categories as subfolder names. |
| load_iris | Load and return the iris dataset (classification). |
| load_linnerud | Load and return the physical exercise Linnerud dataset. |
| load_sample_image | Load the numpy array of a single sample image. |
| load_sample_images | Load sample images for image manipulation. |
| load_svmlight_file | Load datasets in the svmlight / libsvm format into a sparse CSR matrix. |
| load_svmlight_files | Load a dataset from multiple files in SVMlight format. |
| load_wine | Load and return the wine dataset (classification). |
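
As a rough illustration of how the loaders fit together, the sketch below loads the bundled iris dataset with load_iris, writes it out with dump_svmlight_file, and reads it back with load_svmlight_file. It deliberately avoids the fetch_* functions so that no network access is needed. The temporary file name is an arbitrary choice made for this example, and zero_based=True is passed explicitly on both sides as a precaution rather than relying on auto-detection.

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_iris, load_svmlight_file

# Bundled load_* datasets ship with scikit-learn, so nothing is downloaded.
iris = load_iris()
X, y = iris.data, iris.target
print(X.shape, y.shape)

# return_X_y=True returns the (data, target) arrays instead of a Bunch.
X_direct, y_direct = load_iris(return_X_y=True)

# Round-trip the data through the svmlight / libsvm text format.
with tempfile.TemporaryDirectory() as tmp_dir:
    # "iris.svmlight" is an arbitrary file name chosen for this example.
    path = os.path.join(tmp_dir, "iris.svmlight")
    dump_svmlight_file(X, y, path, zero_based=True)
    X_sparse, y_loaded = load_svmlight_file(
        path, n_features=X.shape[1], zero_based=True
    )

# load_svmlight_file returns a sparse CSR matrix; densify it to compare.
X_roundtrip = X_sparse.toarray()
print(np.allclose(X_roundtrip, X), np.array_equal(y_loaded, y))
```

Passing n_features when reading back keeps the column count stable even if a trailing feature happened to be zero in every row, since the text format only stores non-zero entries.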
Sample generators
| Function | Description |
| --- | --- |
| make_biclusters | Generate a constant block diagonal structure array for biclustering. |
| make_blobs | Generate isotropic Gaussian blobs for clustering. |
| make_checkerboard | Generate an array with block checkerboard structure for biclustering. |
| make_circles | Make a large circle containing a smaller circle in 2D. |
| make_classification | Generate a random n-class classification problem. |
| make_friedman1 | Generate the "Friedman #1" regression problem. |
| make_friedman2 | Generate the "Friedman #2" regression problem. |
| make_friedman3 | Generate the "Friedman #3" regression problem. |
| make_gaussian_quantiles | Generate isotropic Gaussian and label samples by quantile. |
| make_hastie_10_2 | Generate data for binary classification used in Hastie et al. 2009, Example 10.2. |
| make_low_rank_matrix | Generate a mostly low rank matrix with bell-shaped singular values. |
| make_moons | Make two interleaving half circles. |
| make_multilabel_classification | Generate a random multilabel classification problem. |
| make_regression | Generate a random regression problem. |
| make_s_curve | Generate an S curve dataset. |
| make_sparse_coded_signal | Generate a signal as a sparse combination of dictionary elements. |
| make_sparse_spd_matrix | Generate a sparse symmetric positive-definite matrix. |
| make_sparse_uncorrelated | Generate a random regression problem with sparse uncorrelated design. |
| make_spd_matrix | Generate a random symmetric, positive-definite matrix. |
| make_swiss_roll | Generate a swiss roll dataset. |
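
The generators share a similar calling pattern, which the following sketch tries to show with make_blobs, make_moons, and make_classification. The sample counts, noise level, and feature counts are arbitrary values picked for illustration, not recommended settings. Fixing random_state is assumed here so that repeated calls return identical arrays.

```python
import numpy as np
from sklearn.datasets import make_blobs, make_classification, make_moons

# Three isotropic Gaussian clusters in 2D, suitable for clustering demos.
X_blobs, y_blobs = make_blobs(
    n_samples=300, centers=3, n_features=2, random_state=0
)

# Two interleaving half circles with a little Gaussian noise added.
X_moons, y_moons = make_moons(n_samples=200, noise=0.1, random_state=0)

# A binary classification problem with informative and redundant features.
X_clf, y_clf = make_classification(
    n_samples=500,
    n_features=20,
    n_informative=5,
    n_redundant=2,
    random_state=0,
)

# The same random_state reproduces the same data on a second call.
X_again, y_again = make_blobs(
    n_samples=300, centers=3, n_features=2, random_state=0
)

print(X_blobs.shape, X_moons.shape, X_clf.shape)
print(np.unique(y_blobs), np.unique(y_moons), np.unique(y_clf))
```

Each call returns a feature array and a label array, so the output can be passed straight to an estimator's fit method.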