A list of useful resources to learn Data Engineering from scratch
adilkhash/Data-Engineering-HowTo
- The AI Hierarchy of Needs
- The Rise of Data Engineer
- The Downfall of the Data Engineer
- A Beginner’s Guide to Data Engineering
- Functional Data Engineering — a modern paradigm for batch data processing
- How to become a Data Engineer (Ru, En)
- Introduction to Apache Airflow (Ru, En)
- Apache Airflow Alternatives
- Data Engineering Principles - Build frameworks not pipelines by Gatis Seja
- Functional Data Engineering - A Set of Best Practices by Maxime Beauchemin
- Advanced Data Engineering Patterns with Apache Airflow by Maxime Beauchemin
- Creating a Data Engineering Culture by Jesse Anderson
- Streaming 101: Hello Streaming by Josh Fischer
- Algorithmic Toolbox in Russian
- Data Structures in Russian
- Data Structures & Algorithms Specialization on Coursera
- Algorithms Specialization from Stanford on Coursera
- Comprehensive SQL Tutorial by Mode Analytics
- SQL Practice on Leetcode
- Modern SQL, a website about modern SQL syntax
- Introduction to Window Functions (En, Ru)
- Scala School by Twitter
- Fluent Python, an intermediate-level book about Python
- Intro to Scala in Russian on Stepik by Tinkoff Bank
- The Hitchhiker’s Guide to Python by Kenneth Reitz & Tanya Schlusser
- Learn Python 3 The Hard Way by Zed A. Shaw
- Intro to Database Systems by Carnegie Mellon University
- Advanced Database Systems by Carnegie Mellon University
- On Disk IO
- Distributed systems for fun and profit by Mikito Takada
- Distributed Systems by Maarten van Steen & Andrew S. Tanenbaum
- CSE138: Distributed Systems by Lindsey Kuper
- CS 436: Distributed Computer Systems by University of Waterloo
- MIT 6.824: Distributed Systems by Robert Morris from MIT
- Distributed consensus reading list maintained by Heidi Howard from University of Cambridge
- Designing Data-Intensive Applications by Martin Kleppmann
- Fundamentals of Data Engineering: Plan and Build Robust Data Systems by Joe Reis & Matt Housley
- Introduction to Algorithms by Thomas Cormen
- The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
- Star Schema: The Complete Reference
- Database Internals: A Deep Dive into How Distributed Data Systems Work
- Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing
- A Philosophy of Software Design
- Grokking Streaming Systems by Josh Fischer & Ning Wang
- Guide to High Performance Distributed Computing by K.G. Srinivasa & Anil Kumar Muppalla
- Data Pipelines with Apache Airflow by Bas P. Harenslak and Julian Rutger de Ruiter
- Data Engineering on Google Cloud Platform Specialization by Google
- Data Engineer Nanodegree by Udacity
- Data Engineering with Python by DataCamp
- Martin Kleppmann, author of Designing Data-Intensive Applications
- BaseDS by Vaidehi Joshi about Distributed Systems
- Apache Airflow is a platform to programmatically author, schedule and monitor workflows in Python
- Apache Spark is a unified analytics engine for large-scale data processing
- Apache Kafka is a distributed streaming platform
- Luigi is a Python package that helps you build complex pipelines of batch jobs.
- Dagster.io is a system for building modern data applications.
- Prefect includes everything you need to create and run data applications.
- Metaflow helps you build and manage real-life data science projects with ease
- lakeFS lets you build repeatable, atomic and versioned data lake operations, from complex ETL jobs to data science and analytics.
- Data Engineering - Telegram chat about data engineering
- Data Engineering Subreddit - subreddit about data engineering
- DataEng Telegram channel - Telegram channel about data engineering (rus/eng)
- Data Engineering Weekly
- SF Data Weekly - A weekly email of useful links for people interested in building data platforms
- Data Elixir - Data Elixir is an email newsletter that keeps you on top of the tools and trends in Data Science.
- Data Governance, Privacy and Security - DbAdmin News is a newsletter on the technology behind Data Governance, Security and Privacy