Closed
Description
Describe the bug
When using SVMSMOTE on a dataset whose minority class has very few samples (possibly fewer than 10), it raises `ValueError: Found array with 0 sample(s) (shape=(0, 600)) while a minimum of 1 is required.`
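A quick way to confirm that a dataset falls into this situation is to count the samples per class before resampling. The sketch below uses only the standard library; the helper name `find_small_classes` and the default threshold of 10 are assumptions taken from the "< 10" estimate above, not part of imbalanced-learn.

```python
from collections import Counter


def find_small_classes(y, min_samples=10):
    """Return {label: count} for classes with fewer than min_samples samples.

    Hypothetical helper, not part of imbalanced-learn; the threshold is an
    assumption based on the "< 10" estimate in this report.
    """
    counts = Counter(y)
    return {label: n for label, n in counts.items() if n < min_samples}


# Class counts reported in this issue: {2: 544, 1: 451, 0: 5}
y_example = [2] * 544 + [1] * 451 + [0] * 5
small = find_small_classes(y_example)
print(small)  # {0: 5}
```

A non-empty result only flags classes that may be too small for SVMSMOTE; it does not by itself guarantee that the error will occur.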
Steps/Code to Reproduce
```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SVMSMOTE  # doctest: +NORMALIZE_WHITESPACE

X, y = make_classification(n_classes=3, class_sep=0,
                           weights=[0.004, 0.451, 0.545], n_informative=3,
                           n_redundant=0, flip_y=0, n_features=3,
                           n_clusters_per_class=2, n_samples=1000,
                           random_state=10)
print('Original dataset shape %s' % Counter(y))

sm = SVMSMOTE(random_state=42, k_neighbors=4)
X_res, y_res = sm.fit_resample(X, y)
print('Resampled dataset shape %s' % Counter(y_res))
```
Expected Results
The code runs without error and prints the resampled dataset shape.
Actual Results
```
Original dataset shape Counter({2: 544, 1: 451, 0: 5})
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-78-8f5d2308c2bd> in <module>()
     10 
     11 sm = SVMSMOTE(random_state=42, k_neighbors=4)
---> 12 X_res, y_res = sm.fit_resample(X, y)
     13 print('Resampled dataset shape %s' % Counter(y_res))

~/anaconda3/lib/python3.6/site-packages/imblearn/base.py in fit_resample(self, X, y)
     82             self.sampling_strategy, y, self._sampling_type)
     83 
---> 84         output = self._fit_resample(X, y)
     85 
     86         if binarize_y:

~/anaconda3/lib/python3.6/site-packages/imblearn/over_sampling/_smote.py in _fit_resample(self, X, y)
    530     def _fit_resample(self, X, y):
    531         # print("_fit_resample X shape", X.shape)
--> 532         return self._sample(X, y)
    533 
    534     def _sample(self, X, y):

~/anaconda3/lib/python3.6/site-packages/imblearn/over_sampling/_smote.py in _sample(self, X, y)
    569 
    570             danger_bool = self._in_danger_noise(
--> 571                 self.nn_m_, support_vector, class_sample, y, kind='danger')
    572             safety_bool = np.logical_not(danger_bool)
    573 

~/anaconda3/lib/python3.6/site-packages/imblearn/over_sampling/_smote.py in _in_danger_noise(self, nn_estimator, samples, target_class, y, kind)
    213         # print("kind", kind)
    214         # print("_in_danger_noise samples shape", samples.shape)
--> 215         x = nn_estimator.kneighbors(samples, return_distance=False)[:, 1:]
    216         # print("x", x)
    217         nn_label = (y[x] != target_class).astype(int)

~/anaconda3/lib/python3.6/site-packages/sklearn/neighbors/base.py in kneighbors(self, X, n_neighbors, return_distance)
    400         if X is not None:
    401             query_is_train = False
--> 402             X = check_array(X, accept_sparse='csr')
    403         else:
    404             query_is_train = True

~/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    548                              " minimum of %d is required%s."
    549                              % (n_samples, array.shape, ensure_min_samples,
--> 550                                 context))
    551 
    552     if ensure_min_features > 0 and array.ndim == 2:

ValueError: Found array with 0 sample(s) (shape=(0, 3)) while a minimum of 1 is required.
```
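As a possible stopgap while this error persists, the `ValueError` can be caught and the unresampled data returned instead. This is only a sketch: `resample_or_passthrough` is a hypothetical helper, it assumes the sampler exposes `fit_resample(X, y)` as in the reproduction code, and it hides the failure rather than fixing it.

```python
import warnings


def resample_or_passthrough(sampler, X, y):
    """Try sampler.fit_resample(X, y); on ValueError return X, y unchanged.

    Hypothetical workaround helper, not part of imbalanced-learn.
    """
    try:
        return sampler.fit_resample(X, y)
    except ValueError as exc:
        # Surface the failure as a warning so it is not silently ignored.
        warnings.warn(
            "Resampling failed (%s); returning the original data." % exc
        )
        return X, y
```

With the reproduction code, this would be called as `X_res, y_res = resample_or_passthrough(SVMSMOTE(random_state=42, k_neighbors=4), X, y)`. Because the fallback leaves the class imbalance untouched, the warning should be treated as a signal to pick a different sampler for the affected data.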
Versions
System:
python: 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
executable: /home/allenyl/anaconda3/bin/python
machine: Linux-4.15.0-112-generic-x86_64-with-debian-buster-sid
Python deps:
pip: 19.2.2
setuptools: 41.0.1
sklearn: 0.21.3
numpy: 1.15.1
scipy: 1.4.1
Cython: 0.28.2
pandas: 0.24.1