🕓 Schedules a Python script to append data into BigQuery using Google Cloud's App Engine with a cron job


sungchun12/schedule-python-script-using-Google-Cloud


Use Case: Automates ingestion of live Chicago traffic data and flows it into BigQuery for interactive, real-time analysis

Technical Concept: Schedules a simple Python script to append data into BigQuery using Google Cloud's App Engine with a cron job.

Source Data: https://data.cityofchicago.org/Transportation/Chicago-Traffic-Tracker-Congestion-Estimates-by-Se/n4j6-wkkf

Architecture Reference: http://zablo.net/blog/post/python-apache-beam-google-dataflow-cron
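At a high level, a scheduled App Engine request fetches the live traffic feed and appends the rows to BigQuery. Below is a minimal, hedged sketch of that core step; the SODA endpoint is inferred from the dataset id (n4j6-wkkf), the table name is a placeholder, and the repo itself may use a different BigQuery client library.

```python
# Hedged sketch of the core pipeline step: pull the live Chicago Traffic
# Tracker feed and append the rows into a BigQuery table.
import requests
from google.cloud import bigquery

SODA_URL = "https://data.cityofchicago.org/resource/n4j6-wkkf.json"  # assumed endpoint
TABLE_ID = "your-project.chicago_traffic.traffic_segments"           # placeholder table

def append_traffic_snapshot():
    rows = requests.get(SODA_URL, timeout=30).json()   # list of dicts, one per road segment
    client = bigquery.Client()
    errors = client.insert_rows_json(TABLE_ID, rows)   # streaming insert (append-only)
    if errors:
        raise RuntimeError("BigQuery insert errors: %s" % errors)

if __name__ == "__main__":
    append_traffic_snapshot()
```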

Shout out to Mylin Ackermann for all his help; he saved me weeks of research with his personal touch. https://www.linkedin.com/in/mylin-ackermann-25a00445/

Check me out on LinkedIn: https://www.linkedin.com/in/sungwonchung1/

Setup Prerequisites:

  1. Sign up for a Google Cloud account and enable billing
  2. Enable the BigQuery API, Stackdriver API, Google Cloud Deployment Manager V2 API, and Google Compute Engine API

Order of Operations:

  1. Develop scripts with Google Cloud Shell or the Cloud SDK
  2. Deploy on App Engine
  3. Deploy the cron job
  4. Check BigQuery to confirm rows are appending (see the verification sketch after this list)
  5. Connect with a data viz tool such as Tableau
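Step 4 can be done in the BigQuery console, or with a small script such as the sketch below. It assumes the chicago_traffic dataset created during development and the google-cloud-bigquery client; it is not part of the repo.

```python
# Sanity check: confirm rows are landing in the chicago_traffic dataset.
from google.cloud import bigquery

client = bigquery.Client()
for table in client.list_tables("chicago_traffic"):   # every table in the dataset
    query = "SELECT COUNT(*) AS n FROM `{}.{}.{}`".format(
        table.project, table.dataset_id, table.table_id)
    count = list(client.query(query).result())[0].n
    print("{}: {} rows".format(table.table_id, count))
```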

Development Instructions:

  1. Clone the GitHub repository into the Cloud SDK or Google Cloud Shell (thankfully it has persistent storage, so you don't have to recopy the folder structure): git clone https://github.com/sungchun12/schedule-python-script-using-Google-Cloud.git
  2. Create a BigQuery dataset named "chicago_traffic" (see the dataset-creation sketch after this list)
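The dataset for step 2 can be created in the BigQuery UI or with the bq CLI; the sketch below does the same thing programmatically with the google-cloud-bigquery client (an assumption, not something the repo requires).

```python
# Hedged sketch: create the "chicago_traffic" dataset programmatically.
from google.cloud import bigquery

client = bigquery.Client()
dataset = bigquery.Dataset("{}.chicago_traffic".format(client.project))
dataset.location = "US"                          # assumption: US multi-region
client.create_dataset(dataset, exists_ok=True)   # no-op if it already exists
```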

Deploy Instructions:

  1. Remember to put __init__.py files into all local packages
  2. Change directory: cd ~/chicago-traffic
  3. Install all required packages into the local lib folder: pip install -r requirements.txt -t lib (see the appengine_config.py sketch after this list)
  4. To deploy the App Engine app, run: gcloud app deploy app.yaml
  5. To deploy the App Engine cron job, run: gcloud app deploy cron.yaml
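The lib folder from step 3 only works if App Engine is told to look there. That is the job of appengine_config.py; below is a minimal sketch of the standard Python 2.7 standard-environment vendoring pattern it follows (the repo's actual file may differ slightly).

```python
# appengine_config.py -- make packages pip-installed into lib/ importable.
from google.appengine.ext import vendor

# Add any libraries installed in the "lib" folder to the import path.
vendor.add('lib')
```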

Folder Structure:

(screenshot: repository folder structure)

__init__.py - needed to properly deploy within App Engine

append_data.py - calls the Chicago live traffic API and appends the results into BigQuery

app.yaml - definition of Google App Engine application

appengine_config.py - makes the locally installed packages (from the lib folder) importable by the application

cron.yaml - definition of Google App Engine CRON job

main.py - entry point for the web application; calls the function contained within append_data.py (see the wiring sketch below)

requirements.txt - file for the pip package manager, listing all packages required to run the application and the pipeline

lib - local folder with all pip-installed packages from requirements.txt file
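How the pieces fit together: cron.yaml tells App Engine to request a URL on a schedule, that request is routed to main.py, and main.py calls the append logic in append_data.py. A minimal sketch of that wiring follows; the web framework, route path, and function name are assumptions, not the repo's actual code.

```python
# Hedged sketch of main.py: the HTTP handler the cron schedule hits.
from flask import Flask

import append_data  # module that calls the traffic API and appends to BigQuery

app = Flask(__name__)

@app.route('/append_data')      # cron.yaml points its "url" at this path
def run_append():
    append_data.main()          # placeholder name for the append function
    return 'Data appended', 200
```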
