validate_data

sklearn.utils.validation.validate_data(_estimator, /, X='no_validation', y='no_validation', reset=True, validate_separately=False, skip_check_array=False, **check_params)

Validate input data and set or check feature names and counts of the input.

This helper function should be used in an estimator that requires input validation. It mutates the estimator and sets the n_features_in_ and feature_names_in_ attributes if reset=True.

Added in version 1.6.

Parameters:
_estimator : estimator instance

The estimator to validate the input for.

X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features), default='no_validation'

The input samples. If 'no_validation', no validation is performed on X. This is useful for meta-estimators, which can delegate input validation to their underlying estimator(s). In that case y must be passed and the only accepted check_params are multi_output and y_numeric.

y : array-like of shape (n_samples,), default='no_validation'

The targets.

  • If None, check_array is called on X. If the estimator’s requires_y tag is True, then an error will be raised.

  • If 'no_validation', check_array is called on X and the estimator’s requires_y tag is ignored. This is a default placeholder and is never meant to be explicitly set. In that case X must be passed.

  • Otherwise, only y is checked with _check_y, or both X and y are checked with either check_array or check_X_y, depending on validate_separately.

reset : bool, default=True

Whether to reset the n_features_in_ attribute. If False, the input will be checked for consistency with data provided when reset was last True.

Note

It is recommended to pass reset=True in fit and in the first call to partial_fit. All other methods that validate X should set reset=False.

validate_separately : False or tuple of dicts, default=False

Only used if y is not None. If False, call check_X_y. Else, it must be a tuple of kwargs to be used for calling check_array on X and y respectively.

estimator=self is automatically added to these dicts to generate more informative error messages in case of invalid input data.

skip_check_array : bool, default=False

If True, X and y are unchanged and only feature_names_in_ and n_features_in_ are checked. Otherwise, check_array is called on X and y.

**check_params : kwargs

Parameters passed to check_array or check_X_y. Ignored if validate_separately is not False.

estimator=self is automatically added to these params to generate more informative error messages in case of invalid input data.

Returns:
out : {ndarray, sparse matrix} or tuple of these

The validated input. A tuple is returned if both X and y are validated.

Gallery examples

Release Highlights for scikit-learn 1.6
