# Wrapper-Feature-Selection-Toolbox-Python
"Toward Talent Scientist: Sharing and Learning Together"---Jingwei Too
- This toolbox offers 13 wrapper feature selection methods (PSO, GA, GWO, HHO, BA, WOA, etc.) with examples; it is simple and easy to use
- The `Demo_PSO` script provides an example of how to apply PSO to a benchmark dataset
- The source code of these methods is written based on the pseudocode in the original papers
## Usage

The main function `jfs` performs feature selection. You can switch the algorithm by changing the `pso` in `from FS.pso import jfs` to another method's abbreviation.

- If you wish to use particle swarm optimization (PSO), write `from FS.pso import jfs`
- If you want to use differential evolution (DE), write `from FS.de import jfs`
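Since every algorithm module in the `FS` package exposes the same `jfs` entry point, the method can also be chosen at run time. A minimal sketch of that idea (the helper `get_jfs` is ours, not part of the toolbox):

```python
import importlib

def get_jfs(abbrev):
    """Return the jfs function for a method abbreviation such as 'pso' or 'de'."""
    return importlib.import_module('FS.' + abbrev).jfs

# equivalent to `from FS.pso import jfs`
jfs = get_jfs('pso')
```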
## Input

- `feat` : feature vector matrix (instances × features)
- `label` : label matrix (instances × 1)
- `opts` : parameter settings
  - `N` : number of solutions / population size (for all methods)
  - `T` : maximum number of iterations (for all methods)
  - `k` : k-value in k-nearest neighbor
## Output

- `Acc` : accuracy of the validation model
- `fmdl` : feature selection model (contains several results)
  - `sf` : indices of the selected features
  - `nf` : number of selected features
  - `c` : convergence curve
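The shape conventions above matter in practice: `feat` must be 2-D and there must be one label per instance. A small pre-flight check you might run before calling `jfs` (`check_inputs` is our own illustrative helper, not part of the toolbox):

```python
import numpy as np

def check_inputs(feat, label):
    """Sanity-check the (instances x features) / (instances x 1) contract for jfs."""
    feat  = np.asarray(feat)
    label = np.asarray(label).ravel()  # flatten an (instances x 1) column to 1-D
    assert feat.ndim == 2, "feat must be an (instances x features) matrix"
    assert feat.shape[0] == label.shape[0], "need exactly one label per instance"
    return feat, label
```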
### Example 1 : Particle Swarm Optimization (PSO)

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from FS.pso import jfs   # change this to switch algorithm
import matplotlib.pyplot as plt

# load data
data  = pd.read_csv('ionosphere.csv')
data  = data.values
feat  = np.asarray(data[:, 0:-1])   # feature vector
label = np.asarray(data[:, -1])     # label vector

# split data into train & validation (70 -- 30)
xtrain, xtest, ytrain, ytest = train_test_split(feat, label, test_size=0.3, stratify=label)
fold = {'xt': xtrain, 'yt': ytrain, 'xv': xtest, 'yv': ytest}

# parameters
k  = 5     # k-value in KNN
N  = 10    # number of particles
T  = 100   # maximum number of iterations
w  = 0.9   # inertia weight
c1 = 2     # cognitive factor
c2 = 2     # social factor
opts = {'k': k, 'fold': fold, 'N': N, 'T': T, 'w': w, 'c1': c1, 'c2': c2}

# perform feature selection
fmdl = jfs(feat, label, opts)
sf   = fmdl['sf']

# model with selected features
num_train = np.size(xtrain, 0)
num_valid = np.size(xtest, 0)
x_train   = xtrain[:, sf]
y_train   = ytrain.reshape(num_train)   # ensure labels are 1-D
x_valid   = xtest[:, sf]
y_valid   = ytest.reshape(num_valid)    # ensure labels are 1-D

mdl = KNeighborsClassifier(n_neighbors=k)
mdl.fit(x_train, y_train)

# accuracy
y_pred = mdl.predict(x_valid)
Acc    = np.sum(y_valid == y_pred) / num_valid
print("Accuracy:", 100 * Acc)

# number of selected features
num_feat = fmdl['nf']
print("Feature Size:", num_feat)

# plot convergence
curve = fmdl['c']
curve = curve.reshape(np.size(curve, 1))
x     = np.arange(0, opts['T'], 1.0) + 1.0

fig, ax = plt.subplots()
ax.plot(x, curve, 'o-')
ax.set_xlabel('Number of Iterations')
ax.set_ylabel('Fitness')
ax.set_title('PSO')
ax.grid()
plt.show()
```
### Example 2 : Genetic Algorithm (GA)

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from FS.ga import jfs   # change this to switch algorithm
import matplotlib.pyplot as plt

# load data
data  = pd.read_csv('ionosphere.csv')
data  = data.values
feat  = np.asarray(data[:, 0:-1])   # feature vector
label = np.asarray(data[:, -1])     # label vector

# split data into train & validation (70 -- 30)
xtrain, xtest, ytrain, ytest = train_test_split(feat, label, test_size=0.3, stratify=label)
fold = {'xt': xtrain, 'yt': ytrain, 'xv': xtest, 'yv': ytest}

# parameters
k  = 5      # k-value in KNN
N  = 10     # number of chromosomes
T  = 100    # maximum number of generations
CR = 0.8    # crossover rate
MR = 0.01   # mutation rate
opts = {'k': k, 'fold': fold, 'N': N, 'T': T, 'CR': CR, 'MR': MR}

# perform feature selection
fmdl = jfs(feat, label, opts)
sf   = fmdl['sf']

# model with selected features
num_train = np.size(xtrain, 0)
num_valid = np.size(xtest, 0)
x_train   = xtrain[:, sf]
y_train   = ytrain.reshape(num_train)   # ensure labels are 1-D
x_valid   = xtest[:, sf]
y_valid   = ytest.reshape(num_valid)    # ensure labels are 1-D

mdl = KNeighborsClassifier(n_neighbors=k)
mdl.fit(x_train, y_train)

# accuracy
y_pred = mdl.predict(x_valid)
Acc    = np.sum(y_valid == y_pred) / num_valid
print("Accuracy:", 100 * Acc)

# number of selected features
num_feat = fmdl['nf']
print("Feature Size:", num_feat)

# plot convergence
curve = fmdl['c']
curve = curve.reshape(np.size(curve, 1))
x     = np.arange(0, opts['T'], 1.0) + 1.0

fig, ax = plt.subplots()
ax.plot(x, curve, 'o-')
ax.set_xlabel('Number of Iterations')
ax.set_ylabel('Fitness')
ax.set_title('GA')
ax.grid()
plt.show()
```
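The two examples differ only in the imported module and the algorithm-specific keys in `opts`, so several methods can be compared in one loop. A minimal sketch, assuming `feat`, `label`, and `fold` have been prepared as in the examples above (the extra-parameter values are the ones those examples use):

```python
from FS.pso import jfs as jfs_pso
from FS.ga import jfs as jfs_ga

# method name -> (entry point, algorithm-specific extras from the examples above)
methods = {
    'PSO': (jfs_pso, {'w': 0.9, 'c1': 2, 'c2': 2}),
    'GA':  (jfs_ga,  {'CR': 0.8, 'MR': 0.01}),
}

base = {'k': 5, 'fold': fold, 'N': 10, 'T': 100}   # settings shared by all methods
for name, (jfs, extra) in methods.items():
    fmdl = jfs(feat, label, {**base, **extra})
    print(name, '->', fmdl['nf'], 'features selected')
```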
## Requirements

- Python 3
- NumPy
- Pandas
- scikit-learn
- Matplotlib
## List of available wrapper feature selection methods

- Note that the methods are altered so that they can be used for feature selection tasks
- The extra parameters are the parameter(s) other than the population size and the maximum number of iterations
- Click on the name of a method to view how to set its extra parameter(s)
- Use `opts` to set the specific parameters
- If you do not set the extra parameters, the algorithm will use its default settings (see the sketch after the table below)
| No. | Abbreviation | Name | Year | Extra Parameters |
|-----|--------------|------|------|------------------|
| 13 | hho | Harris Hawks Optimization | 2019 | No |
| 12 | ssa | Salp Swarm Algorithm | 2017 | No |
| 11 | woa | Whale Optimization Algorithm | 2016 | Yes |
| 10 | sca | Sine Cosine Algorithm | 2016 | Yes |
| 09 | ja | Jaya Algorithm | 2016 | No |
| 08 | gwo | Grey Wolf Optimizer | 2014 | No |
| 07 | fpa | Flower Pollination Algorithm | 2012 | Yes |
| 06 | ba | Bat Algorithm | 2010 | Yes |
| 05 | fa | Firefly Algorithm | 2010 | Yes |
| 04 | cs | Cuckoo Search Algorithm | 2009 | Yes |
| 03 | de | Differential Evolution | 1997 | Yes |
| 02 | pso | Particle Swarm Optimization | 1995 | Yes |
| 01 | ga | Genetic Algorithm | - | Yes |
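To illustrate the Extra Parameters column: for a method marked Yes you may add its specific keys to `opts`, and if you omit them the toolbox falls back to its built-in defaults. A short sketch using GA's parameters from Example 2 (with `fold` prepared as in the examples above):

```python
# GA with its extra parameters set explicitly (CR: crossover rate, MR: mutation rate)
opts_explicit = {'k': 5, 'fold': fold, 'N': 10, 'T': 100, 'CR': 0.8, 'MR': 0.01}

# GA relying on the toolbox defaults: simply omit the extra keys
opts_default = {'k': 5, 'fold': fold, 'N': 10, 'T': 100}
```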