forked from DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
duyhuynhdev/data-engineer-handbook
This repo has all the resources you need to become an amazing data engineer!
Make sure to check out the projects section for more hands-on examples!
Make sure to check out the interviews section for more advice on how to pass data engineering interviews!
Great books:
- The Fundamentals of Data Engineering
- Designing Data-Intensive Applications
- Designing Machine Learning Systems
- The Hundred Page Machine Learning Book
- Kimball - The Data Warehouse Toolkit
- Data Mesh
- Machine Learning System Design Interview
- Streaming Systems
- High Performance Spark
- Building Evolutionary Architectures, 2nd Edition
- Data Management at Scale, 2nd Edition
- Deciphering Data Architectures
- 97 Things Every Data Engineer Should Know: Collective Wisdom from the Experts
- Data Governance: The Definitive Guide
- Delta Lake: The Definitive Guide
- Hadoop: The Definitive Guide
- Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications
- Data Engineering with dbt: A practical guide to building a dependable data platform with SQL
Communities:
- Seattle Data Guy Discord
- EcZachly Data Engineering Discord
- Chip Huyen MLOps Discord
- Data Engineer Things Slack
- dbt Community
- r/dataengineering
- Microsoft Fabric Community
- r/MicrosoftFabric
- Data Talks Club Slack
- SylphAI for data professional matchmaking
Companies:
- Tabular
- Starburst
- Preset
- Astronomer
- Mage
- Dagster
- Prefect
- AlgoExpert
- ByteByteGo
- Databricks
- Spark
- dbt
- Cube
- Airbyte
- Microsoft
- Snowflake
- Onehouse
Data Engineering blogs of companies:
- Netflix
- Uber
- Databricks
- Airbnb
- Amazon AWS Blog
- Microsoft Data Architecture Blogs
- Microsoft Fabric Blog
- Oracle
- Meta
Data Engineering Whitepapers:
- A Five-Layered Business Intelligence Architecture
- Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics
- Big Data Quality: A Data Quality Profiling Model
- The Data Lakehouse: Data Warehousing and More
Great YouTube Channels:
- Data with Zach
- Seattle Data Guy
- TrendyTech
- E-learning Bridge
- Darshil Parmar
- Andreas Kretz
- ByteByteGo
- The Ravit Show
- Azure Lib
- Eric Roby
- Guy in a Cube
- Advancing Analytics
- Adam Marczak
- nullQueries
- Kahan Data Solutions
- Ankit Bansal
- TECHTFQ by Thoufiq
Great Podcasts
- The Data Engineering Show
- Data Engineering Podcast
- DataTopics
- The Data Engineering Side Of Data
- DataWare
- The Data Coffee Break Podcast
- The Data Stack Show
- Intricity101 Data Sharks Podcast
- Drill to Detail with Mark Rittman
- Analytics Power Hour
- Catalog & Cocktails
- Datatalks
- Data Brew by Databricks
- The Data Cloud Podcast by Snowflake
- What's New in Data
- Open||Source||Data by DataStax
- Streaming Audio by Confluent
- The Data Scientist Show
- MLOps.community
Newsletters:
- DataEngineer.io Newsletter
- Seattle Data Guy
- Joe Reis
- Data Engineering Weekly
- Data Engineering Central
- Dutch Engineer
- ByteByteGo
- Start Data Engineering
- Developing Dev
- High Growth Engineer
- Learn Analytics Engineering
- Marvelous MLOps
- Medium Data Engineering Newsletter
- Benn Stancil
- Metadata Weekly
- Technically
- Blef.fr Data News
- All Hands on Data
- Modern Data 101
LinkedIn
- Zach Wilson
- Ben Rogojan
- Joe Reis
- Sumit Mittal
- Shashank Mishra
- Darshil Parmar
- Joseph Machado
- Chip Huyen
- Alex Xu
- Deepak Goyal
- Eric Roby
- Andreas Kretz
- Tobias Macey
- Shruti Mantri
- Hugo Lu
- Daniel Ciocirlan
- Marc Lamberti
- Simon Whiteley
- Dipankar Mazumdar
Twitter / X
- Zach Wilson
- Seattle Data Guy
- Sumit Mittal
- Joseph Machado
- Alex Xu
- Eric Roby
- Andreas Kretz
- Marc Lamberti
- Dipankar Mazumdar
- Start Data Engineering
- Data Cyborg
TikTok
Design Patterns
Courses / Academies
- DataEngineer.io Bootcamp/course: use code HANDBOOK10 for a discount!
- LearnDataEngineering.com
- Technical Freelancer Academy: use code zwtech for a discount!
- IBM Data Engineering for Everyone
- Qwiklabs
- DataCamp
- Udemy Courses from Shruti Mantri
- Rock the JVM teaches Spark (in Scala), Flink and others
- Data Engineering Zoomcamp by DataTalksClub
Certification Courses
- Google Cloud Certified - Professional Data Engineer
- Databricks - Data Engineer Professional
- Azure Data Engineer Associate
- Microsoft Fabric Analytics Engineer Associate
- Exam DP-203: Data Engineering on Microsoft Azure
- AWS Certified Data Engineer - Associate
Conferences