DOC improve documentation of NCR #1017
base:master
@glemaitre ready for review
doc/under_sampling.rst Outdated
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The :class:`NeighbourhoodCleaningRule` is another "cleaning" algorithm. It removes
samples from the majority class that are closest to the boundary with the minority
samples from the majority class that are closest to the boundary with the minority
samples from the majority class that are the closest to the boundary formed by the samples of the minority class
I don't totally understand this sentence. Let me try a modification in a new commit.
doc/under_sampling.rst Outdated
The :class:`NeighbourhoodCleaningRule` expands on the cleaning performed by
:class:`EditedNearestNeighbours` by eliminating additional majority class samples if
they are among the 3 closest neighbours of a sample from the minority class.
We have a parameter controlling the 3-NN.
they are among the 3 closest neighbours of a sample from the minority class.
they are among the :math:`N` closest neighbours (i.e. using the parameter `n_neighbours`) of a sample from the minority class.
Throughout the docs we use K for the number of neighbours, not N. I guess the n in n_neighbours stands for "number". I'd rather stick to K for consistency, if that's alright with you. I'll fix this in a separate commit.
Actually, I removed this sentence altogether, as per the suggestion below.
The procedure for the :class:`NeighbourhoodCleaningRule` is as follows:
1. Remove observations from the majority class with edited nearest neighbors (ENN).
2. Remove additional samples from the majority class if they are one of the k closest
Since we are repeating the same sentence as above, I would remove the paragraph above and keep only the numbered steps.
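For readers following this thread, the two steps under discussion can be sketched in plain Python. This is only an illustration of the wording quoted above, not the imbalanced-learn implementation: `k_nearest` and `ncr_sketch` are hypothetical helpers, the search is brute force, step 2 flags every majority neighbour of every minority sample (the library applies further conditions), and the `threshold_cleaning` check is left out.

```python
import math
from collections import Counter


def k_nearest(X, i, k):
    """Indices of the k samples closest to X[i] (brute force, excluding i)."""
    others = (j for j in range(len(X)) if j != i)
    return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]


def ncr_sketch(X, y, k=3):
    """Return the indices kept after the two cleaning steps (hypothetical helper)."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)

    # Step 1 (ENN-style): flag majority samples whose label disagrees with
    # the most common label among their k nearest neighbours.
    enn_removed = set()
    for i, label in enumerate(y):
        if label == minority:
            continue
        votes = Counter(y[j] for j in k_nearest(X, i, k))
        if votes.most_common(1)[0][0] != label:
            enn_removed.add(i)

    # Step 2 (simplified): flag majority samples that appear among the
    # k nearest neighbours of a minority sample.
    neighbour_removed = set()
    for i, label in enumerate(y):
        if label != minority:
            continue
        for j in k_nearest(X, i, k):
            if y[j] != minority:
                neighbour_removed.add(j)

    removed = enn_removed | neighbour_removed
    return [i for i in range(len(y)) if i not in removed]


# Toy 1-D data: five majority points on the left, one majority point (index 5)
# sitting inside the minority cluster on the right.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (10.5,), (10.0,), (11.0,), (12.0,)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
kept = ncr_sketch(X, y, k=3)
print(kept)
```

On this toy data only the majority point embedded in the minority cluster is dropped. Both steps flag it, which is why the thread suggests describing the procedure once, as a numbered sequence.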
doc/under_sampling.rst Outdated
To carry out step 2 there is one condition: a sample will only be removed if its class
has a minimum number of observations. The minimum number of observations is regulated
by the `threshold_cleaning` parameter. In the original article
:cite:`laurikkala2001improving`, samples would be removed if the class had at
I would not go into detail about the original paper. Instead, just state that we check that the number of samples in the class to under-sample is above the threshold times the number of samples in the minority class.
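The check described in this comment can be written out as a short sketch. It assumes the condition is exactly "class count greater than `threshold_cleaning` times the minority count". The helper name `classes_eligible_for_cleaning` is hypothetical and not part of the library.

```python
from collections import Counter


def classes_eligible_for_cleaning(y, threshold_cleaning=0.5):
    """Classes whose size exceeds threshold_cleaning * (minority class size).

    Hypothetical helper written for illustration only.
    """
    counts = Counter(y)
    minority_class = min(counts, key=counts.get)
    cutoff = counts[minority_class] * threshold_cleaning
    return sorted(
        label
        for label, n_samples in counts.items()
        if label != minority_class and n_samples > cutoff
    )


# Class 2 is the minority with 4 samples.
y = [0] * 100 + [1] * 10 + [2] * 4
print(classes_eligible_for_cleaning(y, threshold_cleaning=0.5))  # [0, 1]
print(classes_eligible_for_cleaning(y, threshold_cleaning=5))    # [0]
```

With a threshold of 0.5 the cutoff is 2 samples, so both non-minority classes qualify. Raising the threshold to 5 moves the cutoff to 20 samples and leaves only class 0.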
imblearn/under_sampling/_prototype_selection/_neighbourhood_cleaning_rule.py Outdated
How can I check the linting error message?
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Thank you!
Reword documentation and docstrings for the NCR.
Related to #854