Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[MRG] ENH: safe-level SMOTE#626

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
laurallu wants to merge5 commits intoscikit-learn-contrib:master
base:master
Choose a base branch
Loading
fromlaurallu:safe-level

Conversation

laurallu
Copy link

@laurallulaurallu commentedNov 4, 2019
edited
Loading

This is an implementation of the safe-level SMOTE proposed in the following paper:

C. Bunkhumpornpat, K. Sinapiromsaran, C. Lursinsap, "Safe-level-SMOTE: Safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem," In: Theeramunkong T.,
Kijsirikul B., Cercone N., Ho TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5476. Springer, Berlin, Heidelberg, 475-482, 2009.

Todo list:

  • add unit tests

@lgtm-com
Copy link

lgtm-combot commentedNov 4, 2019

This pull requestintroduces 1 alert when mergingbcc3069 into321b751 -view on LGTM.com

new alerts:

  • 1 for Redundant comparison

@codecov
Copy link

codecovbot commentedNov 4, 2019
edited
Loading

Codecov Report

Merging#626 intomaster willincrease coverage by0.05%.
The diff coverage is100%.

Impacted file tree graph

@@            Coverage Diff             @@##           master     #626      +/-   ##==========================================+ Coverage   97.93%   97.98%   +0.05%==========================================  Files          83       84       +1       Lines        4784     4911     +127     ==========================================+ Hits         4685     4812     +127  Misses         99       99
Impacted FilesCoverage Δ
imblearn/over_sampling/_smote.py97.73% <100%> (+0.52%)⬆️
imblearn/over_sampling/__init__.py100% <100%> (ø)⬆️
...blearn/over_sampling/tests/test_safelevel_smote.py100% <100%> (ø)

Continue to review full report at Codecov.

Legend -Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing data
Powered byCodecov. Last updateafbf781...866a04f. Read thecomment docs.

@glemaitre
Copy link
Member

Thanks for the contribution.

You will need to add tests to check that the new function is giving expected results.

@glemaitre
Copy link
Member

Oh I see that you mentioned it now :)

@laurallu
Copy link
Author

I just added some tests. Any suggestions?

@lgtm-com
Copy link

lgtm-combot commentedNov 6, 2019

This pull requestintroduces 1 alert when merging394d686 into321b751 -view on LGTM.com

new alerts:

  • 1 for Redundant comparison

sampling_strategy=BaseOverSampler._sampling_strategy_docstring,
random_state=_random_state_docstring,
)
class SLSMOTE(BaseSMOTE):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@glemaitreSafeLevelSMOTE vsSLSMOTE


self.m_neighbors = m_neighbors

def _assign_sl(self, nn_estimator, samples, target_class, y):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I would use the name_assign_safe_levels unless it hurts the readability in the calling code.

@chkoar
Copy link
Member

Thanks!

I just added some tests. Any suggestions?

  1. Please use full names in variables, if you can. E.g.sl should besafe_lavels. Unless it hurts the readability.
  2. Could you add a section in thedocumentation?

@lgtm-com
Copy link

This pull requestintroduces 1 alert when mergingfd11e32 intoafbf781 -view on LGTM.com

new alerts:

  • 1 for Redundant comparison

@laurallulaurallu changed the title[WIP] ENH: safe-level SMOTE[MRG] ENH: safe-level SMOTENov 12, 2019
@laurallu
Copy link
Author

1. Please use full names in variables, if you can. E.g. [`sl`](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/394d686364725763de8ea2cc3f504d8c08fe111a/imblearn/over_sampling/_smote.py#L1469) should be `safe_lavels`. Unless it hurts the readability.2. Could you add a section in the [documentation](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/doc/over_sampling.rst)?

I've made changes accordingly. I think it's probably ready to go through a detailed review.

@glemaitre
Copy link
Member

I would suggest moving this implementation intosmote_variants. The idea behind this move is to benchmark the smote variants on a common benchmark on a large number of datasets and include in imbalanced-learn only the versions that show an advantage. You can see the discussion and contribute to it:https://github.com/gykovacs/smote_variants/issues/14

@laurallu would this strategy would be fine with you?

@chkoar
Copy link
Member

Since, Safe Level SMOTE exists there IMHO I believe that we should review@laurallu PR and merge it inimblearn.

@glemaitre
Copy link
Member

OK, let's do that. Let's open an issue to discuss the inclusion criterion to explain what we are expecting in the future. I will review this PR in a near future.

chkoar and laurallu reacted with thumbs up emoji

@laurallu
Copy link
Author

Thanks for pointing out the smote_variants to me. I will check it out. I would love to see the inclusion criterion too since I might code up more methods.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@chkoarchkoarchkoar left review comments

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@laurallu@glemaitre@chkoar

[8]ページ先頭

©2009-2025 Movatter.jp