Streaming event pipeline with Apache Kafka and its ecosystem to simulate and display the status of train lines in real time


milamarcan/streaming_etl_optimize_public_transportation


Project overview

A streaming event pipeline built around Apache Kafka and its ecosystem (REST Proxy, Kafka Connect) that simulates and displays the status of train lines in real time.

Project assignment

You are constructing a streaming event pipeline around Apache Kafka and its ecosystem. Using public data from the Chicago Transit Authority, you will construct an event pipeline around Kafka that allows us to simulate and display the status of train lines in real time. When the project is completed, you will be able to monitor a website to watch trains move from station to station.

Project architecture

(image: Architecture for the project)

Running the project

Pre-requisites

  • Install Docker, and make sure Docker Compose is installed too
  • If you are on a Windows machine, install Windows Subsystem for Linux (WSL) version 2 (link)
  • Install Ubuntu 20.04
  • If you are on a Windows machine, also install the librdkafka library (link), and make sure to install what is needed from this link too
  • Inside Ubuntu, run docker-compose up in a terminal instance
    • You can check the status of the environment by running docker-compose ps in a new terminal instance
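
Before starting the containers, it can help to confirm that the command-line tools named above are on the PATH. The following is a minimal sketch and not part of the repository; the function name missing_tools and its default tool list are assumptions made for illustration.

```python
import shutil


def missing_tools(tools=("docker", "docker-compose")):
    """Return the names from `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("Docker and Docker Compose were found on the PATH.")
```

This only checks that the executables exist; it does not verify that the Docker daemon is running or that the containers are healthy, which docker-compose ps reports.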

How to run the project

The simulation has 2 pieces, producer and consumer, and each can be run separately. To run the end-to-end simulation, run all the pieces together (in different terminal windows):

  1. To run the producer:

    • cd producers
    • virtualenv venv
    • . venv/bin/activate
    • pip install -r requirements.txt
    • python simulation.py (hit Ctrl+C at any time to exit)
  2. To run the Faust Stream Processing Application:

    • cd consumers
    • virtualenv venv
    • . venv/bin/activate
    • pip install -r requirements.txt
    • faust -A faust_stream worker -l info
  3. To run the KSQL Creation Script:

    • cd consumers
    • virtualenv venv
    • . venv/bin/activate
    • pip install -r requirements.txt
    • python ksql.py
  4. To run the consumer:

    • cd consumers
    • virtualenv venv
    • . venv/bin/activate
    • pip install -r requirements.txt
    • python server.py (hit Ctrl+C at any time to exit)
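
The four steps above share the same setup commands and differ only in the directory and the launch command. As a rough sketch (not part of the repository), the steps can be written down as data and turned into one shell line per piece. The names COMPONENTS, SETUP and shell_line are hypothetical, and the consumer entry assumes the intended command is python server.py.

```python
# Directory and launch command for each piece, copied from the steps above.
COMPONENTS = {
    "producer": ("producers", "python simulation.py"),
    "faust": ("consumers", "faust -A faust_stream worker -l info"),
    "ksql": ("consumers", "python ksql.py"),
    "consumer": ("consumers", "python server.py"),  # assumed spelling of the command
}

# Setup commands shared by every piece.
SETUP = ("virtualenv venv", ". venv/bin/activate", "pip install -r requirements.txt")


def shell_line(name):
    """Build one shell line that sets up and launches the named piece."""
    directory, launch = COMPONENTS[name]
    return " && ".join((f"cd {directory}",) + SETUP + (launch,))


if __name__ == "__main__":
    # Print the lines so each can be pasted into its own terminal window.
    for name in COMPONENTS:
        print(f"{name}: {shell_line(name)}")
```

The sketch only prints the commands rather than running them, since each long-running piece needs its own terminal window.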

How to stop the project

  • To stop Docker, run docker-compose stop in a terminal instance
  • To clean up the containers and reclaim disk space, run docker-compose rm -v

Output

The output of the project looks like this: (image: Project output)
