quantile_transform

sklearn.preprocessing.quantile_transform(X, *, axis=0, n_quantiles=1000, output_distribution='uniform', ignore_implicit_zeros=False, subsample=100000, random_state=None, copy=True)

Transform features using quantile information.

This method transforms the features to follow a uniform or a normal distribution. Therefore, for a given feature, this transformation tends to spread out the most frequent values. It also reduces the impact of (marginal) outliers: this is therefore a robust preprocessing scheme.

The transformation is applied to each feature independently. First, an estimate of the cumulative distribution function of a feature is used to map the original values to a uniform distribution. The obtained values are then mapped to the desired output distribution using the associated quantile function. Feature values of new/unseen data that fall below or above the fitted range will be mapped to the bounds of the output distribution. Note that this transform is non-linear. It may distort linear correlations between variables measured at the same scale but renders variables measured at different scales more directly comparable.
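The first step described above (mapping values through an estimated cumulative distribution function, with out-of-range values pinned to the bounds) can be illustrated with a minimal NumPy sketch. This is a simplified illustration and not scikit-learn's actual implementation; the helper name `sketch_uniform_map` is hypothetical, and the second step (applying the quantile function of the target distribution) is omitted.

```python
import numpy as np


def sketch_uniform_map(x_fit, x_new, n_quantiles=10):
    """Hypothetical helper: map x_new into [0, 1] using the empirical CDF of x_fit."""
    # Reference probabilities and the matching empirical quantiles ("landmarks").
    probs = np.linspace(0.0, 1.0, n_quantiles)
    landmarks = np.quantile(x_fit, probs)
    # Linear interpolation between landmarks; np.interp returns the end values
    # (0 and 1 here) for inputs outside the fitted range.
    return np.interp(x_new, landmarks, probs)


rng = np.random.RandomState(0)
x_fit = rng.normal(loc=0.5, scale=0.25, size=200)

# One value far below the fitted range, the median, and one value far above.
x_new = np.array([-10.0, np.median(x_fit), 10.0])
mapped = sketch_uniform_map(x_fit, x_new)
```

In this sketch, the two extreme inputs land on the bounds of the uniform output, and the median lands near the middle of the interval.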

See also

QuantileTransformer: Performs quantile-based scaling using the Transformer API (e.g. as part of a preprocessing Pipeline).
power_transform: Maps data to a normal distribution using a power transformation.
scale: Performs standardization that is faster, but less robust to outliers.
robust_scale: Performs robust standardization that removes the influence of outliers but does not put outliers and inliers on the same scale.

Notes

NaNs are treated as missing values: disregarded in fit, and maintained in transform.
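A small sketch of the NaN behaviour described above, assuming a single-column input with one missing entry (the array values are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import quantile_transform

# One feature with a missing value in the second row.
X = np.array([[1.0], [np.nan], [3.0], [2.0]])

# n_quantiles is kept no larger than the number of samples.
X_t = quantile_transform(X, n_quantiles=3, random_state=0, copy=True)

# The NaN stays in place; the remaining entries are mapped into [0, 1].
nan_mask = np.isnan(X_t)
```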

Warning

Risk of data leak

Do not use quantile_transform unless you know what you are doing. A common mistake is to apply it to the entire data before splitting into training and test sets. This will bias the model evaluation because information would have leaked from the test set to the training set. In general, we recommend using QuantileTransformer within a Pipeline in order to prevent most risks of data leaking: pipe = make_pipeline(QuantileTransformer(), LogisticRegression()).
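The recommended pattern might look like the following sketch. The synthetic dataset from `make_classification` and the choice of `n_quantiles=50` are assumptions made for illustration only; the point is that the quantiles are estimated from the training split alone when the pipeline is fitted.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import QuantileTransformer

# Synthetic data, used here only to make the example self-contained.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Split first, so the test set never influences the fitted quantiles.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = make_pipeline(
    QuantileTransformer(n_quantiles=50, random_state=0),
    LogisticRegression(),
)

# fit() learns the quantiles from X_train only; score() then applies
# those same quantiles to X_test before predicting.
pipe.fit(X_train, y_train)
test_accuracy = pipe.score(X_test, y_test)
```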

For a comparison of the different scalers, transformers, and normalizers, see: Compare the effect of different scalers on data with outliers.

Examples

>>> import numpy as np
>>> from sklearn.preprocessing import quantile_transform
>>> rng = np.random.RandomState(0)
>>> X = np.sort(rng.normal(loc=0.5, scale=0.25, size=(25, 1)), axis=0)
>>> quantile_transform(X, n_quantiles=10, random_state=0, copy=True)
array([...])

Gallery examples

Effect of transforming the targets in regression model