Description
Is your feature request related to a problem? Please describe
By default, `imblearn` handles 2D data (samples, features). I often work with time series and try to classify them, and class imbalance occurs there as well. However, I cannot use the `imblearn` package for this, because time series data is 3-dimensional (e.g. samples, features, sequence_length).
Describe the solution you'd like
I would like to have the option to also pass 3D time series data to the many applications `imblearn` offers. For now I have written my own oversampler, which I include in the alternatives section below. The authors of `imblearn` are of course free to reuse this code for the described enhancement.
Describe alternatives you've considered
```python
import numpy as np
import pandas as pd


def oversample(x_train, y_train):
    # x_train: array of shape (samples, features, sequence_length)
    # y_train: single-column DataFrame holding integer labels 0-4
    labels = y_train.to_numpy().flatten()
    slope_types = [x_train[labels == k] for k in range(5)]
    majority_class_length = max(len(i) for i in slope_types)

    # Placeholder first row so np.concatenate has something to append to;
    # it is dropped again after the loop.
    oversampled_x_data = np.empty([1, x_train.shape[1], x_train.shape[2]])
    oversampled_y_data = np.empty([1])

    for slope_number, slope_data in enumerate(slope_types):
        slope_data_length = len(slope_data)
        # Draw random samples (with replacement) until the class is as large
        # as the majority class.
        while slope_data_length < majority_class_length:
            idx = np.random.choice(np.arange(slope_data.shape[0]))
            drawn_sample = slope_data[idx].reshape(1, slope_data.shape[1], slope_data.shape[2])
            oversampled_x_data = np.concatenate((oversampled_x_data, drawn_sample), axis=0)
            oversampled_y_data = np.concatenate((oversampled_y_data, np.array([slope_number])), axis=0)
            slope_data_length += 1

    # Drop the placeholder rows
    oversampled_x_data = oversampled_x_data[1:]
    oversampled_y_data = oversampled_y_data[1:]

    # Append the drawn samples to the original training data
    x_train = np.concatenate((x_train, oversampled_x_data), axis=0)
    y_train = pd.DataFrame(
        np.concatenate((y_train, oversampled_y_data.reshape(len(oversampled_y_data), 1)), axis=0),
        columns=['label'],
    )
    return x_train, y_train
```
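For comparison, here is a more general sketch of the same idea (random oversampling with replacement up to the majority class size) that does not hard-code the five labels and avoids repeated concatenation. The function name `oversample_3d` and its `seed` parameter are made up for this illustration and are not part of `imblearn`; it also assumes `y` is a 1D array-like of labels rather than a DataFrame.

```python
import numpy as np


def oversample_3d(x, y, seed=None):
    """Randomly duplicate minority-class samples until every class
    has as many samples as the majority class.

    x: array of shape (samples, features, sequence_length)
    y: 1D array-like of class labels, one per sample
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y).ravel()
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()

    # Start from all original indices, then add randomly drawn
    # indices for each class that is smaller than the majority.
    parts = [np.arange(len(y))]
    for label, count in zip(classes, counts):
        if count < target:
            members = np.flatnonzero(y == label)
            parts.append(rng.choice(members, size=target - count, replace=True))

    idx = np.concatenate(parts)
    # Indexing along axis 0 keeps each (features, sequence_length) block intact.
    return x[idx], y[idx]


# Small usage example with synthetic data: 8, 3 and 1 samples per class
x = np.random.default_rng(0).normal(size=(12, 3, 20))
y = np.array([0] * 8 + [1] * 3 + [2] * 1)
x_res, y_res = oversample_3d(x, y, seed=0)
print(x_res.shape)          # (24, 3, 20)
print(np.bincount(y_res))   # [8 8 8]
```

Because random oversampling only duplicates whole samples, working with indices along the first axis is enough here and the number of dimensions of `x` does not matter. Samplers that interpolate between samples would need a real decision on how to treat the sequence axis.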