This is a repo with links to everything you'd ever want to learn about data engineering
DataExpert-io/data-engineer-handbook
This repo has all the resources you need to become an amazing data engineer!
If you are new to data engineering, start by following this 2024 breaking into data engineering roadmap
If you are here for the 4-week free beginner boot camp you can check out:
If you are here for the 6-week free intermediate boot camp you can check out:
For more applied learning:
- Check out the projects section for more hands-on examples!
- Check out the interviews section for more advice on how to pass data engineering interviews!
- Check out the books section for a list of high-quality data engineering books
- Check out the communities section for a list of high-quality data engineering communities to join
- Check out the newsletter section to learn via email
Top 3 must read books are:
- Fundamentals of Data Engineering
- Designing Data-Intensive Applications
- Designing Machine Learning Systems
Top must-join communities for DE:
Top must-join communities for ML:
- Orchestration
- Data Lake / Cloud
- Data Warehouse
- Data Quality
- Education Companies
- Analytics / Visualization
- Data Integration
- Semantic Layers
- Modern OLAP
- LLM application library
- Real-Time Data
- Data Lineage
- Netflix
- Uber
- Databricks
- Airbnb
- Amazon AWS Blog
- Microsoft Data Architecture Blogs
- Microsoft Fabric Blog
- Oracle
- Meta
- Onehouse
- Estuary Blog
- A Five-Layered Business Intelligence Architecture
- Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics
- Big Data Quality: A Data Quality Profiling Model
- The Data Lakehouse: Data Warehousing and More
- Spark: Cluster Computing with Working Sets
- The Google File System
- Building a Universal Data Lakehouse
- XTable in Action: Seamless Interoperability in Data Lakes
- MapReduce: Simplified Data Processing on Large Clusters
- Tidy Data
- Data Engineering Whitepapers
Here's a mostly comprehensive list of data engineering creators (you need at least 5,000 followers on some platform to be added):
Name | YouTube Channel | Follower Count |
---|---|---|
ByteByteGo | ByteByteGo | 1,000,000+ |
Zach Wilson | Data with Zach | 150,000+ |
Shashank Mishra | E-learning Bridge | 100,000+ |
Seattle Data Guy | Seattle Data Guy | 100,000+ |
TrendyTech | TrendyTech | 100,000+ |
Darshil Parmar | Darshil Parmar | 100,000+ |
Andreas Kretz | Andreas Kretz | 100,000+ |
The Ravit Show | The Ravit Show | 100,000+ |
Guy in a Cube | Guy in a Cube | 100,000+ |
Adam Marczak | Adam Marczak | 100,000+ |
nullQueries | nullQueries | 100,000+ |
TECHTFQ by Thoufiq | TECHTFQ by Thoufiq | 100,000+ |
SQLBI | SQLBI | 100,000+ |
Alex Freberg | Alex The Analyst | 100,000+ |
Ankur Ranjan | Big Data Show | 100,000+ |
Prashanth Kumar Pandey | ScholarNest | 77,000+ |
ITVersity | ITVersity | 67,000+ |
Soumil Shah | Soumil Shah | 50,000+ |
Ansh Lamba | Ansh Lamba | 18,000+ |
Azure Lib | Azure Lib | 10,000+ |
Advancing Analytics | Advancing Analytics | 10,000+ |
Kahan Data Solutions | Kahan Data Solutions | 10,000+ |
Ankit Bansal | Ankit Bansal | 10,000+ |
Mr. K Talks Tech | Mr. K Talks Tech | 10,000+ |
Samuel Focht | Python Basics | 10,000+ |
Mehdi Ouazza | Mehdio DataTV | 3,000+ |
Alex Merced | Alex Merced Data | N/A |
John Kutay | John Kutay | N/A |
Emil Kaminski | Databricks For Professionals | 5,000+ |
Name | LinkedIn Profile | Follower Count |
---|---|---|
Zach Wilson | Zach Wilson | 400,000+ |
Chip Huyen | Chip Huyen | 250,000+ |
Shashank Mishra | Shashank Mishra | 100,000+ |
Seattle Data Guy | Ben Rogojan | 100,000+ |
TrendyTech | Sumit Mittal | 100,000+ |
Darshil Parmar | Darshil Parmar | 100,000+ |
Andreas Kretz | Andreas Kretz | 100,000+ |
ByteByteGo (Alex Xu) | Alex Xu | 100,000+ |
Azure Lib (Deepak Goyal) | Deepak Goyal | 100,000+ |
Alex Freberg | Alex Freberg | 100,000+ |
SQLBI (Marco Russo) | Marco Russo | 50,000+ |
Ankit Bansal | Ankit Bansal | 50,000+ |
Marc Lamberti | Marc Lamberti | 50,000+ |
Ankur Ranjan | Ankur Ranjan | 48,000+ |
ITVersity (Durga Gadiraju) | Durga Gadiraju | 48,000+ |
Prashanth Kumar Pandey | Prashanth Kumar Pandey | 37,000+ |
Alex Merced | Alex Merced | 30,000+ |
Ijaz Ali | Ijaz Ali | 24,000+ |
Mehdi Ouazza | Mehdi Ouazza | 20,000+ |
Ananth Packkildurai | Ananth Packkildurai | 18,000+ |
Ansh Lamba | Ansh Lamba | 13,000+ |
Manojkumar Vadivel | Manojkumar Vadivel | 12,000+ |
Advancing Analytics | Simon Whiteley | 10,000+ |
Li Yin | Li Yin | 10,000+ |
Jaco van Gelder | Jaco van Gelder | 10,000+ |
Joseph Machado | Joseph Machado | 10,000+ |
Eric Roby | Eric Roby | 10,000+ |
Simon Späti | Simon Späti | 10,000+ |
Constantin Lungu | Constantin Lungu | 10,000+ |
Lakshmi Sontenam | Lakshmi Sontenam | 9,500+ |
Dani Pálma | Daniel Pálma | 9,000+ |
Soumil Shah | Soumil Shah | 8,000+ |
Arnaud Milleker | Arnaud Milleker | 7,000+ |
Dimitri Visnadi | Dimitri Visnadi | 7,000+ |
Lenny | Lenny A | 6,000+ |
Dipankar Mazumdar | Dipankar Mazumdar | 5,000+ |
Daniel Ciocirlan | Daniel Ciocirlan | 5,000+ |
Hugo Lu | Hugo Lu | 5,000+ |
Tobias Macey | Tobias Macey | 5,000+ |
Marcos Ortiz | Marcos Ortiz | 5,000+ |
Julien Hurault | Julien Hurault | 5,000+ |
John Kutay | John Kutay | 5,000+ |
Hassaan Akbar | Hassaan Akbar | 5,000+ |
Subhankar | Subhankar | 5,000+ |
Nitin | Nitin | N/A |
Hassaan | Hassaan | 5,000+ |
Javier de la Torre | Javier | 5,000+ |
Name | X/Twitter Profile | Follower Count |
---|---|---|
ByteByteGo | alexxubyte | 100,000+ |
Dan Kornas | @dankornas | 66,000+ |
Zach Wilson | EcZachly | 30,000+ |
Seattle Data Guy | SeattleDataGuy | 10,000+ |
SQLBI | marcorus | 10,000+ |
Joseph Machado | startdataeng | 5,000+ |
Alex Merced | @amdatalakehouse | N/A |
John Kutay | @JohnKutay | N/A |
Mehdi Ouazza | mehd_io | N/A |
Name | Instagram Profile | Follower Count |
---|---|---|
Sundas Khalid | sundaskhalidd | 300,000+ |
Zach Wilson | eczachly | 150,000+ |
Andreas Kretz | learndataengineering | 5,000+ |
Alex Merced | @alexmercedcoder | N/A |
Name | TikTok Profile | Follower Count |
---|---|---|
Zach Wilson | @eczachly | 70,000+ |
Alex Freberg | @alex_the_analyst | 10,000+ |
Mehdi Ouazza | @mehdio_datatv | N/A |
- The Data Engineering Show
- Data Engineering Podcast
- DataTopics
- The Data Engineering Side Of Data
- DataWare
- The Data Coffee Break Podcast
- The Datastack show
- Intricity101 Data Sharks Podcast
- Drill to Detail with Mark Rittman
- Analytics Power Hour
- Catalog & cocktails
- Datatalks
- Data Brew by Databricks
- The Data Cloud Podcast by Snowflake
- What's New in Data
- Open||Source||Data by DataStax
- Streaming Audio by Confluent
- The Data Scientist Show
- MLOps.community
- Monday Morning Data Chat
- The Data Chief
- The Joe Reis Show
- Data Bytes
- Super Data Science: ML & AI Podcast with Jon Krohn
Top must follow newsletters for data engineering:
- Data Engineering Vault
- Airbyte Data Glossary
- Data Engineering Wiki by Reddit
- Secoda Glossary
- Glossary Databricks
- Airtable Glossary
- Data Engineering Glossary by Dagster
- DataExpert.io course (use code HANDBOOK10 for a discount!)
- LearnDataEngineering.com
- Technical Freelancer Academy (use code zwtech for a discount!)
- IBM Data Engineering for Everyone
- Qwiklabs
- DataCamp
- Udemy Courses from Shruti Mantri
- Rock the JVM teaches Spark (in Scala), Flink and others
- Data Engineering Zoomcamp by DataTalksClub
- Efficient Data Processing in Spark
- Scaler
- DataTeams - Data Engineer hiring platform
- Udemy Courses from Daniel Blanco
- DeepLearning.AI Data Engineering Professional Certificate
- Google Cloud Certified - Professional Data Engineer
- Databricks - Certified Associate Developer for Apache Spark
- Databricks - Data Engineer Associate
- Databricks - Data Engineer Professional
- Microsoft DP-203: Data Engineering on Microsoft Azure
- Microsoft DP-600: Fabric Analytics Engineer Associate
- Microsoft DP-700: Fabric Data Engineer Associate
- AWS Certified Data Engineer - Associate