sungchun12/schedule-python-script-using-Google-Cloud
Use Case: Automates pulling live Chicago traffic data and loading it into BigQuery for interactive real-time analysis
Technical Concept: Schedules a simple Python script to append data into BigQuery using Google Cloud's App Engine with a cron job.
Source Data: https://data.cityofchicago.org/Transportation/Chicago-Traffic-Tracker-Congestion-Estimates-by-Se/n4j6-wkkf
Architecture Reference: http://zablo.net/blog/post/python-apache-beam-google-dataflow-cron
Shout out to Mylin Ackermann for all his help. His personal touch saved me weeks of research: https://www.linkedin.com/in/mylin-ackermann-25a00445/
Check me out on LinkedIn: https://www.linkedin.com/in/sungwonchung1/
Setup Prerequisites:
- Sign up for a Google Cloud account and enable billing
- Enable the BigQuery API, Stackdriver API, Google Cloud Deployment Manager V2 API, and Google Compute Engine API
Order of Operations:
- Develop scripts with Google Cloud Shell or the SDK
- Deploy on App Engine
- Deploy the cron job
- Check BigQuery
- Connect with a dataviz tool such as Tableau
Development Instructions:
- Clone the GitHub repository into the SDK or Google Cloud Shell (thankfully it has persistent storage, so you don't have to recopy the folder structure): git clone https://github.com/sungchun12/schedule-python-script-using-Google-Cloud.git
- Create BigQuery dataset: "chicago_traffic"
Deploy Instructions:
- Remember to put __init__.py files into all local packages
- Change directory: cd ~/chicago-traffic
- Install all required packages into local lib folder: pip install -r requirements.txt -t lib
- To deploy App Engine app, run: gcloud app deploy app.yaml
- To deploy App Engine CRON, run: gcloud app deploy cron.yaml
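Once cron.yaml is deployed, App Engine requests a URL on the app on the configured schedule and marks those requests with an X-Appengine-Cron: true header. The sketch below shows one way an entry point could react to that request, using only the standard library. The "/append" path, the make_app helper, and the injected job callable are hypothetical names for illustration; the repository's main.py and cron.yaml may be wired differently.

```python
# Minimal WSGI sketch of a cron-triggered endpoint (standard library only).
# Assumptions: the "/append" path and the injected `job` callable are
# hypothetical stand-ins for whatever main.py and append_data.py define.


def make_app(job, cron_path="/append"):
    """Build a WSGI app that runs `job` when App Engine cron calls `cron_path`."""

    def app(environ, start_response):
        text_plain = [("Content-Type", "text/plain")]

        if environ.get("PATH_INFO") != cron_path:
            start_response("404 Not Found", text_plain)
            return [b"not found"]

        # App Engine adds "X-Appengine-Cron: true" to cron requests and strips
        # that header from external traffic; WSGI exposes it under this key.
        if environ.get("HTTP_X_APPENGINE_CRON") != "true":
            start_response("403 Forbidden", text_plain)
            return [b"cron only"]

        # `job` is assumed to return the number of rows it appended.
        appended = job()
        start_response("200 OK", text_plain)
        return [("appended %d rows" % appended).encode("utf-8")]

    return app
```

Passing the job in as an argument keeps the request handling testable without touching BigQuery: in a local test, make_app(lambda: 3) behaves like a deployment that appended three rows.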
Folder Structure:
__init__.py - needed to properly deploy within App Engine
append_data.py - calls the Chicago live traffic API and appends the results into BigQuery
app.yaml - definition of Google App Engine application
appengine_config.py - adds dependencies to locally installed packages (from lib folder)
cron.yaml - definition of Google App Engine CRON job
main.py - entry point for the web application and calls the function contained within "append_data.py"
requirements.txt - file for pip package manager, which contains list of all required packages to run the application and the pipeline
lib - local folder with all pip-installed packages from requirements.txt file
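To make the role of append_data.py more concrete, here is a rough Python 3 sketch of the "call the API" half of that step. It is not the repository's code. It assumes the dataset ID from the Source Data link (n4j6-wkkf) is served as JSON at Socrata's usual /resource/<id>.json path, and the fetched_at column and the function names are hypothetical.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the dataset ID n4j6-wkkf from the Source Data link is exposed
# as JSON at Socrata's usual /resource/<id>.json path.
DATASET_URL = "https://data.cityofchicago.org/resource/n4j6-wkkf.json"


def build_url(limit=1000):
    """Return the request URL; $limit is Socrata's row-count parameter."""
    return DATASET_URL + "?" + urlencode({"$limit": limit}, safe="$")


def fetch_records(limit=1000, timeout=30):
    """Download the current snapshot as a list of dicts (needs network access)."""
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def to_rows(records, fetched_at=None):
    """Copy each record and stamp it with a hypothetical `fetched_at` column,
    so repeated appends of the same road segment stay distinguishable."""
    stamp = (fetched_at or datetime.now(timezone.utc)).isoformat()
    return [dict(record, fetched_at=stamp) for record in records]
```

The list returned by to_rows(fetch_records()) is plain JSON-style data, so it could then be handed to whichever BigQuery client call the project uses to append rows to a table in the chicago_traffic dataset.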