Automates KBO data collection and deployment with Airflow.

leewr9/kbo-data-pipeline

This repository automates the collection and deployment of KBO data using Apache Airflow. It manages the flow of data from collection to visualization in the Data Portal.

Deployment

This pipeline runs on Apache Airflow and is deployed using Docker Compose. To set up and run the pipeline, ensure Docker is installed and configured properly.

Running Locally

To run the pipeline locally using Docker Compose:

  1. Clone this repository and initialize submodules:
    git clone --recurse-submodules https://github.com/leewr9/kbo-data-pipeline.git
    cd kbo-data-pipeline
  2. Ensure that your GCP service account key is placed in the config folder and renamed to key.json:
    mv your-service-account-key.json config/key.json
  3. Start the Airflow services using Docker Compose:
    docker-compose up -d
  4. Access the Airflow web UI at http://localhost:8080
    • Log in with Username: admin, Password: admin
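
Before starting the services in step 3, it can help to confirm that the key from step 2 is in place. The following is a minimal sketch, not part of this repository: the function name `check_service_account_key` is hypothetical, and the field list is an assumption based on the usual shape of a GCP service account key file.

```python
import json
from pathlib import Path

# Fields a GCP service account key file normally contains (assumed, not
# taken from this repository).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path="config/key.json"):
    """Return a list of problems found in the key file (empty means it looks fine)."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]
    # Report every missing field so the user can fix them in one pass.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append("type is not service_account")
    return problems
```

Calling `check_service_account_key()` from the repository root returns an empty list when `config/key.json` looks like a service account key, and a list of readable messages otherwise. It only inspects the file's shape and does not contact GCP.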

DAGs Overview

The following DAGs are currently implemented:

  • fetch_kbo_games_daily - Runs daily at 00:00, parsing the latest KBO game results.
  • fetch_kbo_players_weekly - Runs every Sunday at 00:00, parsing player records up to the current week.
  • fetch_kbo_schedules_weekly - Runs every Sunday at 00:00, parsing the schedule for the upcoming week.
  • fetch_kbo_historical_data - Runs every year on January 1st at 00:00, parsing the schedule for the upcoming year.
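
The schedules above can be written as standard five-field cron expressions. The sketch below is illustrative only: the expressions are inferred from the descriptions rather than copied from the DAG files, the helper names `matches` and `dags_due` are hypothetical, and the time zone is not stated in this README. The matcher handles only plain numbers and `*`, which is enough for these four schedules.

```python
from datetime import datetime

# Cron expressions inferred from the schedule descriptions (assumption:
# the actual DAG definitions may express them differently).
SCHEDULES = {
    "fetch_kbo_games_daily": "0 0 * * *",       # every day at 00:00
    "fetch_kbo_players_weekly": "0 0 * * 0",    # every Sunday at 00:00
    "fetch_kbo_schedules_weekly": "0 0 * * 0",  # every Sunday at 00:00
    "fetch_kbo_historical_data": "0 0 1 1 *",   # January 1st at 00:00
}


def matches(cron, moment):
    """Check a datetime against a cron expression made only of numbers and '*'."""
    fields = cron.split()
    actual = (
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        moment.isoweekday() % 7,  # cron counts Sunday as 0
    )
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


def dags_due(moment):
    """Return the IDs of the DAGs whose schedule matches the given moment."""
    return sorted(dag_id for dag_id, cron in SCHEDULES.items() if matches(cron, moment))
```

For example, `dags_due(datetime(2025, 1, 5, 0, 0))` falls on a Sunday and returns the daily DAG plus both weekly DAGs, while a moment at noon returns an empty list.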

Data Storage Structure

The collected data is stored in Google Cloud Storage (GCS) under the kbo-data bucket with the following structure:

  • schedules/
    • weekly/ (Upcoming game schedules, weekly basis)
    • historical/ (Past game schedules by year)
  • games/
    • daily/ (Game details collected daily)
    • historical/ (Historical game details by year)
  • players/
    • daily/ (Player statistics per game)
    • weekly/ (Aggregated player statistics per week)
    • historical/ (Past player statistics by year)
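
As a rough illustration of the layout above, the helper below builds the GCS prefix for a given category and granularity and rejects combinations that the structure does not list. The function name `gcs_prefix` is hypothetical, and object names beneath each prefix are not documented here, so the sketch stops at the folder level.

```python
# Folder layout copied from the structure listed above.
LAYOUT = {
    "schedules": ("weekly", "historical"),
    "games": ("daily", "historical"),
    "players": ("daily", "weekly", "historical"),
}

BUCKET = "kbo-data"


def gcs_prefix(category, granularity):
    """Return the gs:// prefix for a category/granularity pair in the layout."""
    if category not in LAYOUT:
        raise ValueError(f"unknown category: {category}")
    if granularity not in LAYOUT[category]:
        raise ValueError(f"{category} has no {granularity} folder")
    return f"gs://{BUCKET}/{category}/{granularity}/"
```

Here `gcs_prefix("games", "daily")` yields `gs://kbo-data/games/daily/`, whereas `gcs_prefix("games", "weekly")` raises a `ValueError` because no weekly folder is listed under games.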

Data Collection

The parsing modules are managed through the kbo-data-collector repository, which is included as a Git submodule in this project.

License

This project is licensed under the MIT License. See the LICENSE file for details.
