Wind Power Forecasting using Machine Learning techniques.


vchaparro/wind-power-forecasting

This repository contains the source code of my Master's final project in Decision Systems Engineering, titled Wind Power Forecasting using Machine Learning techniques, completed at Rey Juan Carlos University. It is based on the Data Science challenge posed by the Compagnie nationale du Rhône.

For further information, you can read the master's thesis here.

Introduction

This application is intended to be a flexible, configurable tool for building and analyzing models for this forecasting problem. It is built on the Kedro API in order to apply software engineering best practices to data and machine-learning pipelines. MLflow tracking is used to record and query experiments (code, data, config, and results).

Installation

The packages needed to re-create the conda environment are listed in ./requirements.txt.

Implemented pipelines

The main pipelines implemented are:

  1. Prepare data for EDA (eda). Transforms raw data into a proper format for Exploratory Data Analysis.
  2. Data engineering (de). Gets the data ready to be consumed by Machine Learning algorithms.
  3. Feature engineering (fe). Lets you explore and add new features to the data sets.
  4. Modeling (mdl). Trains the selected algorithm from among the following: MARS, KNN, RF, SVM. It also optimizes the model hyperparameters and makes predictions on the test set.
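
To make the modeling step more concrete, the following is a minimal pure-Python sketch of what "train KNN and optimize its hyperparameters" can mean: a k-nearest-neighbors regressor whose k is chosen by the error on a held-out validation set. It is not the repository's code. The function names, the toy wind-speed and power values, and the use of mean absolute error are all assumptions made for illustration.

```python
from math import sqrt


def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    distances = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(row, x))), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def select_k(train_X, train_y, valid_X, valid_y, candidates):
    """Return (best_k, best_error) over the candidate values of k."""
    scores = {}
    for k in candidates:
        preds = [knn_predict(train_X, train_y, x, k) for x in valid_X]
        scores[k] = mean_absolute_error(valid_y, preds)
    best_k = min(scores, key=scores.get)
    return best_k, scores[best_k]


# Hypothetical toy data: one feature (wind speed) and a power target.
train_X = [[3.0], [5.0], [7.0], [9.0], [11.0]]
train_y = [0.1, 0.5, 1.2, 1.8, 2.0]
valid_X = [[4.0], [8.0], [10.0]]
valid_y = [0.3, 1.5, 1.9]

best_k, best_error = select_k(train_X, train_y, valid_X, valid_y, [1, 2, 3])
print(f"selected k={best_k} with validation MAE={best_error:.3f}")
```

A real run would use many features, cross-validation rather than a single hold-out split, and a library implementation of the chosen algorithm.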

There are two additional pipelines:

  1. CNR pipeline. It contains several subpipelines that produce the predictions and the submission file for the CNR Data Science Challenge.
  2. Neural Networks. In progress ...

Configuration files

Every pipeline has its own configuration, consisting of a parameters.yml file and a catalog.yml file. The first contains all the parameters required for the pipeline run. The second is the project-shareable Data Catalog: a registry of all data sources available to the project, which also manages loading and saving of data. Both configuration files are located in conf/base.

CLI commands

As this is a Kedro application, pipelines are run from the Kedro CLI; the Kedro documentation describes the remaining options. The following basic commands run the main pipelines of this project, selecting the wind farm (wf) and the algorithm (alg) used to build the model:

  1. Prepare data for EDA: kedro run --pipeline eda --params wf:WF1
  2. Data engineering: kedro run --pipeline de --params wf:WF1
  3. Feature engineering: kedro run --pipeline fe --params wf:WF1,max_k_bests:3
  4. Modeling: kedro run --pipeline mdl --params wf:WF1,alg:KNN
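
When the same pipelines have to be run for several wind farms or algorithms, the commands above can be composed programmatically. The sketch below only builds and prints the argument lists; build_run_command is a hypothetical helper, not part of Kedro or of this repository, and the key:value separator simply mirrors the commands listed here (other Kedro versions may expect a different parameter syntax).

```python
def build_run_command(pipeline, **params):
    """Compose the argument list for `kedro run` with key:value parameters."""
    command = ["kedro", "run", "--pipeline", pipeline]
    if params:
        joined = ",".join(f"{key}:{value}" for key, value in params.items())
        command += ["--params", joined]
    return command


# The four main pipelines, in the order listed above, for wind farm WF1.
commands = [
    build_run_command("eda", wf="WF1"),
    build_run_command("de", wf="WF1"),
    build_run_command("fe", wf="WF1", max_k_bests=3),
    build_run_command("mdl", wf="WF1", alg="KNN"),
]

for command in commands:
    print(" ".join(command))
```

Each list can be passed to subprocess.run(command, check=True) from the project root to actually launch the pipeline.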

You can override any parameter value defined in the parameter configuration files, as well as the data set used as the first input, provided it is defined in one of the existing data catalogs.
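
Conceptually, a command-line override replaces the value stored under the same key in the parameter configuration file and leaves every other parameter untouched. The sketch below illustrates that merge with plain dictionaries. It is not Kedro's implementation (Kedro does this internally and also converts value types); the default values shown are hypothetical, and parse_params and apply_overrides are illustrative names only.

```python
def parse_params(text):
    """Parse a 'key:value,key:value' string into a dict of strings."""
    overrides = {}
    for item in text.split(","):
        key, _, value = item.partition(":")
        overrides[key.strip()] = value.strip()
    return overrides


def apply_overrides(base, overrides):
    """Return a copy of base in which overridden keys take the new values."""
    merged = dict(base)
    merged.update(overrides)
    return merged


# Hypothetical defaults, standing in for the contents of a parameters file.
base_parameters = {"wf": "WF1", "alg": "RF", "max_k_bests": "5"}

effective = apply_overrides(base_parameters, parse_params("wf:WF2,alg:KNN"))
print(effective)
```

Here wf and alg take the command-line values, while max_k_bests keeps the value from the configuration.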

Important: Raw data must be placed in data/01_raw/. Raw data is available here (free registration for the challenge is required).

Pipeline visualization

With the kedro-viz plugin (which needs to be installed), running kedro viz lets you visualize the data and machine-learning pipelines. For instance, this is the visualization of the data engineering pipeline:

[Image: kedro-viz visualization of the data engineering pipeline]

Other useful commands

  • MLflow tracking UI: kedro mlflow ui. It serves the tracking tool as a web application on localhost (port 5000 by default).
  • Jupyter notebook: kedro jupyter notebook. It launches Jupyter Notebook with all the Kedro context variables loaded, so you can easily access pipelines, data catalogs, parameters and other useful objects from your notebook.

To use kedro mlflow ui you need to install the kedro-mlflow plugin.

License: CC BY 4.0
