C. Bunkhumpornpat, K. Sinapiromsaran, C. Lursinsap, "Safe-level-SMOTE: Safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem," in T. Theeramunkong, B. Kijsirikul, N. Cercone, T. B. Ho (eds), Advances in Knowledge Discovery and Data Mining, PAKDD 2009, Lecture Notes in Computer Science, vol. 5476, Springer, Berlin, Heidelberg, pp. 475-482, 2009.

Todo list:

add unit tests

added safe-level-smote method

bcc3069

lgtm-com bot commented Nov 4, 2019

This pull request introduces 1 alert when merging bcc3069 into 321b751 - view on LGTM.com

new alerts:

1 for Redundant comparison

codecov bot commented Nov 4, 2019 (edited)

Codecov Report

Merging #626 into master will increase coverage by 0.05%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #626      +/-   ##
==========================================
+ Coverage   97.93%   97.98%   +0.05%
==========================================
  Files          83       84       +1
  Lines        4784     4911     +127
==========================================
+ Hits         4685     4812     +127
  Misses         99       99

| Impacted Files | Coverage Δ | |
|---|---|---|
| imblearn/over_sampling/_smote.py | `97.73% <100%> (+0.52%)` | ⬆️ |
| imblearn/over_sampling/__init__.py | `100% <100%> (ø)` | ⬆️ |
| ...blearn/over_sampling/tests/test_safelevel_smote.py | `100% <100%> (ø)` | |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update afbf781...866a04f. Read the comment docs.

Member

glemaitre commented Nov 4, 2019

Thanks for the contribution.

You will need to add tests to check that the new function is giving expected results.

Member

glemaitre commented Nov 4, 2019

Oh I see that you mentioned it now :)

unit tests added for safe-level SMOTE

394d686

Author

laurallu commented Nov 6, 2019

I just added some tests. Any suggestions?

lgtm-com bot commented Nov 6, 2019

This pull request introduces 1 alert when merging 394d686 into 321b751 - view on LGTM.com

new alerts:

1 for Redundant comparison

chkoar reviewed

Nov 6, 2019

imblearn/over_sampling/_smote.py Outdated

    sampling_strategy=BaseOverSampler._sampling_strategy_docstring,
    random_state=_random_state_docstring,
)
class SLSMOTE(BaseSMOTE):

Member

chkoar Nov 6, 2019

@glemaitre SafeLevelSMOTE vs SLSMOTE

chkoar reviewed

Nov 6, 2019

imblearn/over_sampling/_smote.py Outdated


        self.m_neighbors = m_neighbors

    def _assign_sl(self, nn_estimator, samples, target_class, y):

Member

chkoar Nov 6, 2019

I would use the name `_assign_safe_levels` unless it hurts the readability in the calling code.
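
For context on what such a helper assigns: in the cited Safe-level-SMOTE paper, the safe level of a sample is the number of minority-class points among its k nearest neighbours. The following is a minimal pure-Python sketch of that idea, not the code from this PR. The function name `assign_safe_levels`, its signature, and the brute-force Euclidean search are assumptions made for illustration, and it assumes every sample is itself a row of `X`.

```python
import math


def assign_safe_levels(samples, X, y, target_class, k=5):
    """Count target-class points among the k nearest neighbours of each sample.

    Hypothetical standalone helper for illustration only.
    Assumes each sample is a row of X, so one zero-distance
    match (the sample itself) is skipped.
    """
    safe_levels = []
    for sample in samples:
        # Brute-force distances to every point, sorted nearest first.
        distances = sorted(
            (math.dist(sample, point), index) for index, point in enumerate(X)
        )
        # Skip the sample itself, which sits at distance zero.
        if distances and distances[0][0] == 0.0:
            distances = distances[1:]
        neighbours = [index for _, index in distances[:k]]
        safe_levels.append(sum(1 for i in neighbours if y[i] == target_class))
    return safe_levels


# Toy data: class 1 is the minority class.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [1, 1, 0, 0, 0, 1]
minority = [point for point, label in zip(X, y) if label == 1]
print(assign_safe_levels(minority, X, y, target_class=1, k=2))  # [1, 1, 0]
```

A safe level of 0, as for the last point, means no minority neighbours at all. A real implementation would more likely reuse a fitted nearest-neighbours estimator than a brute-force loop.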

Member

chkoar commented Nov 6, 2019

Thanks!

I just added some tests. Any suggestions?

1. Please use full names in variables, if you can. E.g. `sl` should be `safe_levels`, unless it hurts the readability.
2. Could you add a section in the documentation?

glemaitre force-pushed the master branch from 65132db to 68123d0

November 8, 2019 22:54

laurallu added 2 commits

November 11, 2019 13:53

fixed variable name, added doc and test

609c4fc

Merge remote-tracking branch 'upstream/master' into safe-level

fd11e32

lgtm-com bot commented Nov 11, 2019

This pull request introduces 1 alert when merging fd11e32 into afbf781 - view on LGTM.com

new alerts:

1 for Redundant comparison

=removed redundant lines

866a04f

laurallu changed the title ~~[WIP] ENH: safe-level SMOTE~~ [MRG] ENH: safe-level SMOTE

Nov 12, 2019

Author

laurallu commented Nov 12, 2019

1. Please use full names in variables, if you can. E.g. [`sl`](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/394d686364725763de8ea2cc3f504d8c08fe111a/imblearn/over_sampling/_smote.py#L1469) should be `safe_levels`, unless it hurts the readability.
2. Could you add a section in the [documentation](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/master/doc/over_sampling.rst)?

I've made changes accordingly. I think it's probably ready to go through a detailed review.

Member

glemaitre commented Nov 17, 2019

I would suggest moving this implementation into smote_variants. The idea behind this move is to benchmark the SMOTE variants on a common benchmark across a large number of datasets and include in imbalanced-learn only the versions that show an advantage. You can see the discussion and contribute to it: https://github.com/gykovacs/smote_variants/issues/14

@laurallu would this strategy be fine with you?

Member

chkoar commented Nov 17, 2019

Since Safe Level SMOTE already exists there, IMHO we should review @laurallu's PR and merge it into imblearn.

Member

glemaitre commented Nov 17, 2019

OK, let's do that. Let's open an issue to discuss the inclusion criterion, so we can explain what we will expect in the future. I will review this PR in the near future.

Author

laurallu commented Nov 21, 2019

Thanks for pointing smote_variants out to me. I will check it out. I would love to see the inclusion criterion too, since I might code up more methods.