MAINT add support for feature_names_in_ #959

Merged
4 changes: 4 additions & 0 deletions doc/whats_new/v0.10.rst
@@ -22,6 +22,10 @@ Compatibility
- Add support for automatic parameters validation as in scikit-learn >= 1.2.
:pr:`955` by :user:`Guillaume Lemaitre <glemaitre>`.

- Add support for `feature_names_in_` as well as `get_feature_names_out` for
all samplers.
:pr:`959` by :user:`Guillaume Lemaitre <glemaitre>`.

Deprecation
...........

14 changes: 13 additions & 1 deletion imblearn/base.py
@@ -8,6 +8,12 @@

import numpy as np
from sklearn.base import BaseEstimator

try:
    # scikit-learn >= 1.2
    from sklearn.base import OneToOneFeatureMixin
except ImportError:
    from sklearn.base import _OneToOneFeatureMixin as OneToOneFeatureMixin
from sklearn.preprocessing import label_binarize
from sklearn.utils.multiclass import check_classification_targets
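The fallback import in this hunk can be read as "use the first name that exists". A minimal sketch of that idea follows; `import_first_available` is a hypothetical helper that is not part of this PR, and it is demonstrated on the standard library so that it runs without scikit-learn installed.

```python
import importlib


def import_first_available(module_name, candidate_names):
    """Return the first attribute of `module_name` listed in `candidate_names`.

    Hypothetical helper: it mirrors the try/except ImportError fallback by
    trying the preferred (public) name first and the alias second.
    """
    names = tuple(candidate_names)
    module = importlib.import_module(module_name)
    for name in names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {names!r} found in {module_name!r}")


# "NoSuchName" plays the role of the missing public name and
# "OrderedDict" plays the role of the fallback that does exist.
resolved = import_first_available("collections", ["NoSuchName", "OrderedDict"])
print(resolved.__name__)
```

With scikit-learn installed, the equivalent call would pass "sklearn.base" and the two names from the hunk, `OneToOneFeatureMixin` and `_OneToOneFeatureMixin`. The plain try/except used in the PR is arguably clearer for a single symbol; the helper only pays off when several such fallbacks are needed.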

@@ -133,7 +139,7 @@ class attribute, which is a dictionary `param_name: list of constraints`. See
)


-class BaseSampler(SamplerMixin, _ParamsValidationMixin):
+class BaseSampler(SamplerMixin, OneToOneFeatureMixin, _ParamsValidationMixin):
"""Base class for sampling algorithms.

Warning: This class should not be used directly. Use the derive classes
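Adding `OneToOneFeatureMixin` to the bases is what gives every sampler a `get_feature_names_out` method. The sketch below is a simplified, dependency-free stand-in for that behaviour, not scikit-learn's implementation: the class name `OneToOneNamesSketch` and its constructor are hypothetical, and the error messages are illustrative.

```python
class OneToOneNamesSketch:
    """Simplified, hypothetical stand-in for a one-to-one feature-name mixin."""

    def __init__(self, n_features, feature_names=None):
        # In scikit-learn these attributes are set during `fit`; here they are
        # passed in directly to keep the sketch dependency-free.
        self.n_features_in_ = n_features
        if feature_names is not None:
            self.feature_names_in_ = list(feature_names)

    def get_feature_names_out(self, input_features=None):
        """Return output names, which map one-to-one onto the input names."""
        seen = getattr(self, "feature_names_in_", None)
        if input_features is not None:
            names = list(input_features)
            if len(names) != self.n_features_in_:
                raise ValueError("input_features has the wrong length")
            if seen is not None and names != seen:
                raise ValueError("input_features does not match feature_names_in_")
            return names
        if seen is not None:
            return list(seen)
        # No names were recorded: fall back to generated placeholders.
        return [f"x{i}" for i in range(self.n_features_in_)]


named = OneToOneNamesSketch(2, ["age", "income"])
unnamed = OneToOneNamesSketch(3)
print(named.get_feature_names_out())    # ['age', 'income']
print(unnamed.get_feature_names_out())  # ['x0', 'x1', 'x2']
```

Because samplers add or remove rows but never columns, a one-to-one mapping of feature names is the natural fit here.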
@@ -260,6 +266,12 @@ class FunctionSampler(BaseSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
sklearn.preprocessing.FunctionTransfomer : Stateless transformer.
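The docstring entry added in this hunk (and repeated for every sampler below) says `feature_names_in_` is defined only when all feature names of `X` are strings. A minimal sketch of that rule follows. `feature_names_if_all_strings` is a hypothetical helper, and raising `TypeError` for mixed labels is an assumption of this sketch rather than a statement about the library.

```python
def feature_names_if_all_strings(columns):
    """Return the column labels as a list when every label is a str.

    Hypothetical helper. Returns None when no label is a string, meaning
    `feature_names_in_` would stay undefined. Mixed labels raise TypeError in
    this sketch; that choice is an assumption, not a claim about the library.
    """
    labels = list(columns)
    string_flags = [isinstance(label, str) for label in labels]
    if labels and all(string_flags):
        return labels
    if any(string_flags):
        raise TypeError("column labels mix strings and non-strings")
    return None


print(feature_names_if_all_strings(["f0", "f1"]))  # ['f0', 'f1']
print(feature_names_if_all_strings([0, 1]))        # None
```

In practice this means a DataFrame with integer column labels behaves like a plain array as far as feature names are concerned.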
6 changes: 6 additions & 0 deletions imblearn/combine/_smote_enn.py
@@ -67,6 +67,12 @@ class SMOTEENN(BaseSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTETomek : Over-sample using SMOTE followed by under-sampling removing
6 changes: 6 additions & 0 deletions imblearn/combine/_smote_tomek.py
@@ -66,6 +66,12 @@ class SMOTETomek(BaseSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTEENN : Over-sample using SMOTE followed by under-sampling using Edited
11 changes: 11 additions & 0 deletions imblearn/metrics/pairwise.py
@@ -71,6 +71,17 @@ class ValueDifferenceMetric(BaseEstimator, _ParamsValidationMixin):
List of length `n_features` containing the conditional probabilities
for each category given a class.

n_features_in_ : int
Number of features in the input dataset.

.. versionadded:: 0.10

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
sklearn.neighbors.DistanceMetric : Interface for fast metric computation.
6 changes: 6 additions & 0 deletions imblearn/over_sampling/_adasyn.py
@@ -73,6 +73,12 @@ class ADASYN(BaseOverSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTE : Over-sample using SMOTE.
6 changes: 6 additions & 0 deletions imblearn/over_sampling/_random_over_sampler.py
@@ -76,6 +76,12 @@ class RandomOverSampler(BaseOverSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
BorderlineSMOTE : Over-sample using the borderline-SMOTE variant.
18 changes: 18 additions & 0 deletions imblearn/over_sampling/_smote/base.py
@@ -264,6 +264,12 @@ class SMOTE(BaseSMOTE):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTENC : Over-sample using SMOTE for continuous and categorical features.
@@ -442,6 +448,12 @@ class SMOTENC(SMOTE):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTE : Over-sample using SMOTE.
@@ -759,6 +771,12 @@ class SMOTEN(SMOTE):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTE : Over-sample using SMOTE.
6 changes: 6 additions & 0 deletions imblearn/over_sampling/_smote/cluster.py
@@ -93,6 +93,12 @@ class KMeansSMOTE(BaseSMOTE):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTE : Over-sample using SMOTE.
12 changes: 12 additions & 0 deletions imblearn/over_sampling/_smote/filter.py
@@ -100,6 +100,12 @@ class BorderlineSMOTE(BaseSMOTE):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTE : Over-sample using SMOTE.
@@ -352,6 +358,12 @@ class SVMSMOTE(BaseSMOTE):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
SMOTE : Over-sample using SMOTE.
16 changes: 16 additions & 0 deletions imblearn/tests/test_common.py
@@ -3,6 +3,7 @@
# Christos Aridas
# License: MIT

import warnings
from collections import OrderedDict

import numpy as np
@@ -19,6 +20,7 @@
from imblearn.under_sampling import NearMiss, RandomUnderSampler
from imblearn.utils.estimator_checks import (
    _set_checking_parameters,
    check_dataframe_column_names_consistency,
    check_param_validation,
    parametrize_with_checks,
)
@@ -92,3 +94,17 @@ def test_strategy_as_ordered_dict(Sampler):
    X_res, y_res = sampler.fit_resample(X, y)
    assert X_res.shape[0] == sum(strategy.values())
    assert y_res.shape[0] == sum(strategy.values())


@pytest.mark.parametrize(
    "estimator", _tested_estimators(), ids=_get_check_estimator_ids
)
def test_pandas_column_name_consistency(estimator):
    _set_checking_parameters(estimator)
    with ignore_warnings(category=(FutureWarning)):
        with warnings.catch_warnings(record=True) as record:
            check_dataframe_column_names_consistency(
                estimator.__class__.__name__, estimator
            )
        for warning in record:
            assert "was fitted without feature names" not in str(warning.message)
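The new test records every warning raised during the check and asserts that none of them mentions a missing-feature-names fit. The recording pattern can be sketched with the standard library alone. `recorded_messages`, `noisy_fit` and `quiet_fit` are hypothetical names for illustration, the warning text in `noisy_fit` is made up to contain the same substring, and the `simplefilter("always")` call is an addition of this sketch that the test above does not make.

```python
import warnings


def recorded_messages(func, *args, **kwargs):
    """Call `func` and return the text of every warning it emitted."""
    with warnings.catch_warnings(record=True) as record:
        # Make sure no warning is swallowed by an earlier filter.
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return [str(w.message) for w in record]


def noisy_fit():
    # Hypothetical stand-in for an estimator call that loses feature names.
    warnings.warn(
        "X has feature names, but Sampler was fitted without feature names",
        UserWarning,
    )


def quiet_fit():
    # Hypothetical stand-in for an estimator call that keeps feature names.
    pass


needle = "was fitted without feature names"
print(any(needle in m for m in recorded_messages(noisy_fit)))  # True
print(any(needle in m for m in recorded_messages(quiet_fit)))  # False
```

Checking for a substring rather than an exact message keeps the assertion robust to the estimator name that appears in the warning text.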
@@ -78,6 +78,12 @@ class ClusterCentroids(BaseUnderSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
EditedNearestNeighbours : Under-sampling by editing samples.
@@ -69,6 +69,12 @@ class CondensedNearestNeighbour(BaseCleaningSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
EditedNearestNeighbours : Undersample by editing samples.
@@ -76,6 +76,12 @@ class EditedNearestNeighbours(BaseCleaningSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
CondensedNearestNeighbour : Undersample by condensing samples.
@@ -251,6 +257,12 @@ class RepeatedEditedNearestNeighbours(BaseCleaningSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
CondensedNearestNeighbour : Undersample by condensing samples.
@@ -454,6 +466,12 @@ class without early stopping.

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
CondensedNearestNeighbour: Under-sampling by condensing samples.
@@ -67,6 +67,12 @@ class InstanceHardnessThreshold(BaseUnderSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
NearMiss : Undersample based on near-miss search.
6 changes: 6 additions & 0 deletions imblearn/under_sampling/_prototype_selection/_nearmiss.py
@@ -72,6 +72,12 @@ class NearMiss(BaseUnderSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
RandomUnderSampler : Random undersample the dataset.
@@ -83,6 +83,12 @@ class NeighbourhoodCleaningRule(BaseCleaningSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
EditedNearestNeighbours : Undersample by editing noisy samples.
@@ -68,6 +68,12 @@ class OneSidedSelection(BaseCleaningSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
EditedNearestNeighbours : Undersample by editing noisy samples.
@@ -50,6 +50,12 @@ class RandomUnderSampler(BaseUnderSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
NearMiss : Undersample using near-miss samples.
@@ -48,6 +48,12 @@ class TomekLinks(BaseCleaningSampler):

.. versionadded:: 0.9

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during `fit`. Defined only when `X` has feature
names that are all strings.

.. versionadded:: 0.10

See Also
--------
EditedNearestNeighbours : Undersample by samples edition.