Closed
Description
Describe the bug
RandomUnderSampler performs checks on the X argument, which are unnecessary, as they do not affect the choice of resampled indices.
This is an issue if I pass a pandas DataFrame that contains a datetime column.
The exception is not raised if I pass a NumPy array with timestamps.
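To illustrate the claim that the selected indices depend only on the labels, here is a minimal sketch of random under-sampling by index. It is not imbalanced-learn's implementation: the helper `undersample_indices` and its `seed` parameter are hypothetical names introduced for illustration, and it assumes the default strategy of shrinking every class to the minority class size.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def undersample_indices(y, seed=0):
    """Hypothetical helper: pick row positions so that all classes are equally sized.

    Only the labels are inspected; the feature matrix is never touched.
    """
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()  # size of the smallest class
    picked = [
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ]
    return np.sort(np.concatenate(picked))


df = pd.DataFrame({"label": [0, 0, 0, 1], "td": [datetime.now()] * 4})
idx = undersample_indices(df["label"], seed=2342374)
balanced = df.iloc[idx]  # the datetime column is carried along untouched
```

Because the rows are selected positionally, the dtype of any column in X is irrelevant to the result.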
Steps/Code to Reproduce
from datetime import datetime
import pandas as pd
import imblearn.under_sampling

df = pd.DataFrame({"label": [0, 0, 0, 1], "td": [datetime.now()] * 4})
rus = imblearn.under_sampling.RandomUnderSampler(random_state=2342374)
rus.fit_resample(df, df.label)
Expected Results
No error is thrown.
Actual Results
TypeError: The DType <class 'numpy.dtype[int64]'> could not be promoted by <class 'numpy.dtype[datetime64]'>. This means that no common DType exists for the given inputs. For example they cannot be stored in a single array unless the dtype is `object`. The full list of DTypes is: (<class 'numpy.dtype[int64]'>, <class 'numpy.dtype[datetime64]'>)
Versions
Linux-5.15.0-60-generic-x86_64-with-glibc2.35
Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
NumPy 1.24.1
SciPy 1.9.3
Scikit-Learn 1.2.1
Imbalanced-Learn 0.10.0
My current workaround
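from datetime import datetime
import pandas as pd
import imblearn.under_sampling

df = pd.DataFrame({"label": [0, 0, 0, 1], "td": [datetime.now()] * 4})
rus = imblearn.under_sampling.RandomUnderSampler(random_state=2342374)
# Pass a NumPy object array instead of the DataFrame to avoid the dtype promotion error
downsampled_df, _ = rus.fit_resample(df.to_numpy(), df.label)
# Rebuild the DataFrame; the columns come back with object dtype
downsampled_df = pd.DataFrame(downsampled_df, columns=df.columns)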
P.S. Huge thanks for this useful library.