Feature engineering package with sklearn like functionality


feature-engine/feature_engine


Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow Scikit-learn's functionality, with fit() and transform() methods that learn the transformation parameters from the data and then apply them to transform it.
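As a rough illustration of the fit() / transform() contract described above, here is a minimal standard-library sketch. The class name `MedianImputerSketch` and the dict-of-lists data layout are hypothetical and chosen only for this example. This is not Feature-engine's implementation, which operates on pandas DataFrames.

```python
from statistics import median


class MedianImputerSketch:
    """Hypothetical illustration of the fit()/transform() contract.

    Not Feature-engine's code: it works on plain dicts of lists
    rather than pandas DataFrames.
    """

    def __init__(self, variables):
        self.variables = variables

    def fit(self, X):
        # Learn one parameter (the median) per variable from the training data.
        self.imputer_dict_ = {
            var: median(v for v in X[var] if v is not None)
            for var in self.variables
        }
        return self

    def transform(self, X):
        # Apply the learned parameters to a copy; the input is left unmodified.
        X_new = {var: list(values) for var, values in X.items()}
        for var in self.variables:
            fill = self.imputer_dict_[var]
            X_new[var] = [fill if v is None else v for v in X_new[var]]
        return X_new


train = {"age": [20, None, 40, 30]}
test = {"age": [None, 50]}

imputer = MedianImputerSketch(variables=["age"]).fit(train)
test_imputed = imputer.transform(test)
print(imputer.imputer_dict_)  # {'age': 30}
print(test_imputed)           # {'age': [30, 50]}
```

The point of the split is that the median is learned once from the training data and then reused on new data, so the test set never influences the learned parameter.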

Feature-engine features in the following resources

Blogs about Feature-engine

Documentation

Pst! How did you find us?

We want to share Feature-engine with more people. It'd help us loads if you tell us how you discovered us.

Then we'd know what we are doing right and which channels to use to share the love.

Please share your story by answering 1 quick question at this link. 😃

Feature-engine's transformers currently include functionality for:

  • Missing Data Imputation
  • Categorical Encoding
  • Discretisation
  • Outlier Capping or Removal
  • Variable Transformation
  • Variable Creation
  • Variable Selection
  • Datetime Features
  • Time Series
  • Preprocessing
  • Scaling
  • Scikit-learn Wrappers

Imputation Methods

  • MeanMedianImputer
  • ArbitraryNumberImputer
  • RandomSampleImputer
  • EndTailImputer
  • CategoricalImputer
  • AddMissingIndicator
  • DropMissingData

Encoding Methods

  • OneHotEncoder
  • OrdinalEncoder
  • CountFrequencyEncoder
  • MeanEncoder
  • WoEEncoder
  • RareLabelEncoder
  • DecisionTreeEncoder
  • StringSimilarityEncoder

Discretisation methods

  • EqualFrequencyDiscretiser
  • EqualWidthDiscretiser
  • GeometricWidthDiscretiser
  • DecisionTreeDiscretiser
  • ArbitraryDiscretiser

Outlier Handling methods

  • Winsorizer
  • ArbitraryOutlierCapper
  • OutlierTrimmer

Variable Transformation methods

  • LogTransformer
  • LogCpTransformer
  • ReciprocalTransformer
  • ArcsinTransformer
  • PowerTransformer
  • BoxCoxTransformer
  • YeoJohnsonTransformer

Variable Scaling methods

  • MeanNormalizationScaler

Variable Creation:

  • MathFeatures
  • RelativeFeatures
  • CyclicalFeatures
  • DecisionTreeFeatures

Feature Selection:

  • DropFeatures
  • DropConstantFeatures
  • DropDuplicateFeatures
  • DropCorrelatedFeatures
  • SmartCorrelationSelection
  • ShuffleFeaturesSelector
  • SelectBySingleFeaturePerformance
  • SelectByTargetMeanPerformance
  • RecursiveFeatureElimination
  • RecursiveFeatureAddition
  • DropHighPSIFeatures
  • SelectByInformationValue
  • ProbeFeatureSelection
  • MRMR

Datetime

  • DatetimeFeatures
  • DatetimeSubtraction
  • DatetimeOrdinal

Time Series

  • LagFeatures
  • WindowFeatures
  • ExpandingWindowFeatures

Pipelines

  • Pipeline
  • make_pipeline

Preprocessing

  • MatchCategories
  • MatchVariables

Wrappers:

  • SklearnTransformerWrapper

Installation

From PyPI using pip:

pip install feature_engine

From Anaconda:

conda install -c conda-forge feature_engine

Or simply clone it:

git clone https://github.com/feature-engine/feature_engine.git

Example Usage

>>> import pandas as pd
>>> from feature_engine.encoding import RareLabelEncoder
>>> data = {'var_A': ['A']*10 + ['B']*10 + ['C']*2 + ['D']*1}
>>> data = pd.DataFrame(data)
>>> data['var_A'].value_counts()
Out[1]:
A    10
B    10
C     2
D     1
Name: var_A, dtype: int64
>>> rare_encoder = RareLabelEncoder(tol=0.10, n_categories=3)
>>> data_encoded = rare_encoder.fit_transform(data)
>>> data_encoded['var_A'].value_counts()
Out[2]:
A       10
B       10
Rare     3
Name: var_A, dtype: int64
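To see why C and D end up grouped in the example above: the column has 23 rows, so C accounts for 2/23 (about 8.7%) and D for 1/23 (about 4.3%), both below tol=0.10, while A and B each account for 10/23 (about 43.5%). The sketch below reproduces that arithmetic using only the standard library. The function `group_rare_labels` is hypothetical, and the exact comparison rules (share at least `tol`, and grouping only when there are more than `n_categories` distinct labels) are assumptions about the encoder's behaviour, not a copy of the library's code.

```python
from collections import Counter


def group_rare_labels(values, tol=0.10, n_categories=3, replace_with="Rare"):
    """Hypothetical re-implementation sketch, not the library's code."""
    counts = Counter(values)
    # Assumption: with too few distinct labels, nothing is grouped.
    if len(counts) <= n_categories:
        return list(values)
    total = len(values)
    # Assumption: a label counts as frequent when its share is at least `tol`.
    frequent = {label for label, n in counts.items() if n / total >= tol}
    return [v if v in frequent else replace_with for v in values]


values = ["A"] * 10 + ["B"] * 10 + ["C"] * 2 + ["D"] * 1
encoded = group_rare_labels(values)
print(Counter(encoded))  # Counter({'A': 10, 'B': 10, 'Rare': 3})
```

In this data both readings of the `n_categories` rule give the same result, since the variable has 4 distinct labels.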

Find more examples in our Jupyter Notebook Gallery or in the documentation.

Contribute

Details about how to contribute can be found in the Contribute Page.

Briefly:

  • Fork the repo
  • Clone your fork onto your local computer:
git clone https://github.com/<YOURUSERNAME>/feature_engine.git
  • Navigate into the repo folder:
cd feature_engine
  • Install Feature-engine as a developer:
pip install -e .
  • Optional: Create and activate a virtual environment with any tool of choice
  • Install Feature-engine developer dependencies:
pip install -e ".[tests]"
  • Create a feature branch with a meaningful name for your feature:
git checkout -b myfeaturebranch
  • Develop your feature, tests and documentation
  • Make sure the tests pass
  • Make a PR

Thank you!!

Documentation

Feature-engine documentation is built using Sphinx and is hosted on Read the Docs.

To build the documentation, first make sure you have the dependencies installed. From the root directory, run:

pip install -r docs/requirements.txt

Now you can build the docs using:

sphinx-build -b html docs build

License

The content of this repository is licensed under a BSD 3-Clause license.

Sponsor us

Sponsor us and further support our mission to democratize machine learning and programming tools through open-source software.

