jleetutorial/python-spark-streaming
Project source code for James Lee's Apache Spark with Python (PySpark) course.
Tools like Spark are incredibly useful for processing data that is continuously appended. The Python bindings (PySpark) not only allow you to do that, but also let you combine Spark Streaming with other Python tools for data science and machine learning. This course covers the basics of Apache Spark as well as more advanced topics: accumulators, combining PySpark with Apache Kafka, using PySpark with AWS tools like Kinesis, streaming data from sources like Twitter, and getting the most out of the Structured Streaming paradigm in the recently released Spark 2.3.0.
This course is a one-stop shop for all your PySpark streaming education needs.
This repo contains the notebooks, data files, exercise files, and everything else you need to learn how to use the streaming capabilities of PySpark.
Check out the full list of DevOps and Big Data courses that James and Tao teach here.