FIX BorderlineSMOTE-2 use the full dataset to generate new samples #1023
Conversation
self.nn_k_.fit(X_to_sample_from)
nns = self.nn_k_.kneighbors(X_danger, return_distance=False)[:, 1:]
X_new, y_new = self._make_samples(
This implementation does not fully reflect the description of Borderline-SMOTE2 in the paper. According to the paper, when a synthetic sample is created by interpolating between a minority sample and a majority-class neighbour, the difference is multiplied by a random factor between 0 and 0.5 (instead of 0 to 1), so that the synthetic sample stays closer to the minority sample.
If I understand this code correctly, we are multiplying everything by a factor between 0 and 1. Please correct me if I am wrong.
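To make the point concrete, here is a minimal standalone sketch of the interpolation rule as described in this comment. It is not the imbalanced-learn implementation: the helper name `interpolate_borderline2` and its parameters are hypothetical, and the 0 to 0.5 range for majority neighbours is taken from the reading of the paper given above.

```python
import random


def interpolate_borderline2(x_min, x_nn, nn_is_majority, rng):
    """Hypothetical helper: build one synthetic sample between x_min and x_nn."""
    # Assumption from the comment above: the random factor is capped at 0.5
    # when the neighbour belongs to the majority class, and at 1.0 otherwise.
    upper = 0.5 if nn_is_majority else 1.0
    gap = rng.uniform(0.0, upper)
    # Move from the minority sample towards the neighbour by the factor `gap`.
    return [a + gap * (b - a) for a, b in zip(x_min, x_nn)]


rng = random.Random(0)
x_min = [0.0, 0.0]  # minority sample in the "danger" set
x_maj = [2.0, 4.0]  # majority-class neighbour

samples = [interpolate_borderline2(x_min, x_maj, True, rng) for _ in range(1000)]
# Fraction of the way from x_min to x_maj reached by the furthest sample.
max_fraction = max(s[0] / x_maj[0] for s in samples)
```

With a majority neighbour, every synthetic sample stays in the half of the segment nearest to the minority sample. A factor drawn from 0 to 1 would allow samples anywhere along the segment.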
No, you are right. I forgot to look at the next page of the article. I will try to propose a fix.
closes #861
Make sure that we use the full dataset to generate new samples in BorderlineSMOTE version 2.