A curated list of free courses from reputable universities that meet the requirements of an undergraduate curriculum in Data Science, excluding general education, with projects and supporting materials in an organized structure.


Data Science Banner

📊 Data Science & Analytics Self-Taught Curriculum

A structured roadmap to learn data science for free


🧠 About

This Self-Taught Data Science Curriculum is a structured roadmap I designed to guide my own journey into data science — using only free and high-quality online resources.

My goal was to create a program that covered the full stack of data science knowledge, from basic concepts to advanced applications, including:

  • Programming
  • Mathematics and Statistics
  • Machine Learning and Deep Learning
  • Databases, Big Data, and Cloud Computing

This roadmap is ideal for:

  • Aspiring data scientists learning independently
  • Professionals who want to deepen their analytics skills
  • Anyone seeking a structured, free alternative to paid bootcamps

📌 This is a living document — I update it regularly as I complete courses or discover better resources.


🎯 Learning Goals

This curriculum is designed to help you gain practical, theoretical, and technical proficiency across key areas of data science:

1️⃣ Programming for Data Science

  • Python: Data manipulation, visualization, ML tools

2️⃣ Mathematics & Statistics

  • Linear Algebra, Calculus, Probability
  • Inferential Statistics, Bayesian Methods, Regression
  • ML theory and algorithmic foundations

3️⃣ Databases, Warehousing, and Big Data

  • SQL & NoSQL
  • Data lakes, pipelines, and ETL
  • Tools like Hadoop, Spark, and cloud-native storage

4️⃣ Machine Learning & Deep Learning

  • Supervised & Unsupervised Learning
  • Neural Networks, CNNs, NLP, RL
  • AI ethics and responsible modeling

🗂️ Curriculum Overview

The curriculum is structured into 10 sections, grouped by learning stage and topic. Each includes carefully selected resources with estimated time commitment.

| Section | Area | Approx. Hours |
|---------|------|---------------|
| 01 | Fundamentals | ~40h |
| 02 | Mathematics & Statistics | ~90h |
| 03 | Programming (Python) | ~215h |
| 04 | Data Mining | ~120h |
| 05 | Databases & SQL | ~80h |
| 06 | Big Data | ~85h |
| 07 | Machine Learning | ~120h |
| 08 | Deep Learning | ~125h |
| 09 | Data Warehousing | ~300h |
| 10 | Cloud Computing | ~120h |

🧩 Detailed tables with links, skills, and certificates are available in each section.


📌 How to Use This Curriculum

This roadmap is flexible and can be adapted based on your learning pace and background:

  • ✅ Follow it sequentially if you're starting from scratch.
  • ✅ Skip sections if you already have knowledge in a particular area.
  • ✅ Combine different resources, projects, and additional readings.

Each module contains curated courses with estimated effort and certification options when available.

📚 Section 01 - Fundamentals (~40h)

In this first section, my goal is to establish a solid foundation in data science by understanding the role of data in decision-making, the fundamentals of the field, and the key tools used by professionals. Additionally, I aim to develop a clear understanding of what it means to be a data scientist, the essential skills required, and how to apply this knowledge in practice.

The main skills I want to acquire in this stage include:

  • ✅ Understanding what data is and how it can be used
  • ✅ Fundamental concepts of data science and its impact on various industries
  • ✅ Familiarity with essential tools for data analysis and manipulation

Courses

📚 Data – What It Is, What We Can Do With It (Johns Hopkins University)

This course provides a clear introduction to what data is, how it is generated, and how it can be used to answer questions and support decision-making. I chose this course to build a conceptual foundation before moving on to more complex techniques.

Skills developed:

  • Understanding the concept of data and its different forms
  • Practical applications of data usage in problem-solving
  • Introduction to data collection, organization, and interpretation

| Course | Offered by | Effort |
|--------|------------|--------|
| Data – What It Is, What We Can Do With It | Johns Hopkins University | ~11h |

📚 What is Data Science? (IBM Skills Network)

This course offers an overview of the field of data science, exploring the responsibilities of a data scientist, the stages of the data analysis process, and its applications. It helps to better understand the career and the importance of data science in the modern world.

Skills developed:

  • Understanding what data science is and its applications
  • Insights into the data science lifecycle
  • Knowledge of the key tools and technologies used in the field

| Course | Offered by | Effort |
|--------|------------|--------|
| What is Data Science? | IBM Skills Network | ~11h |

📚 The Data Scientist's Toolbox (Johns Hopkins University)

This course is essential for gaining familiarity with the fundamental tools used by data scientists. It introduces basic programming concepts, version control, and project organization—essential elements for working with data in a structured and efficient way.

Skills developed:

  • Introduction to R and RStudio
  • Basic concepts of Git and GitHub for version control
  • Insights into data science workflows

| Course | Offered by | Effort |
|--------|------------|--------|
| The Data Scientist's Toolbox | Johns Hopkins University | ~18h |

📐 Section 02 - Mathematics and Statistics for Data Science (~90h)

This section is essential to understand the mathematical and statistical foundations of data science. My goal here is to acquire strong theoretical tools to support more complex models, especially in machine learning and inferential analysis.

The content covers linear algebra, calculus, probability, and statistics — from basic concepts to the application of Bayesian methods and ML theory.

The main skills I want to develop at this stage include:

  • ✅ Understanding matrix operations, vectors, eigenvalues, and decompositions
  • ✅ Derivatives, gradients, and optimization for ML
  • ✅ Concepts of probability, distributions, and statistical inference
  • ✅ Bayesian thinking and probabilistic reasoning
  • ✅ Theoretical foundation behind supervised and unsupervised models

Courses

📚 Linear Algebra for Machine Learning and Data Science – DeepLearning.AI

Description: This course offers an applied and visual introduction to linear algebra — one of the most crucial areas for working with data. It explores matrices, vector spaces, linear transformations, and the math behind dimensionality reduction and neural networks.

Why I chose this course: It’s part of the Mathematics for Machine Learning and Data Science specialization by DeepLearning.AI, created with Andrew Ng’s endorsement. The practical approach with visual tools makes it ideal for learners in applied data science.

Skills developed:

  • Matrix operations, vector norms, and projections
  • Singular Value Decomposition (SVD)
  • Applications in data compression and feature extraction

| Course | Offered by | Effort |
|--------|------------|--------|
| Linear Algebra for ML and DS | DeepLearning.AI | ~34h |
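
As a small taste of the linear algebra involved, here is a minimal NumPy sketch (my own illustration, not course material) of a singular value decomposition and a rank-1 reconstruction; the matrix values are invented.

```python
import numpy as np

# A small made-up data matrix (3 samples x 2 features).
A = np.array([[3.0, 2.0],
              [2.0, 3.0],
              [1.0, 0.5]])

# Full (thin) SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print("singular values:", s)

# Rank-1 approximation keeps only the largest singular value,
# which is the idea behind compression and dimensionality reduction.
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])
print("rank-1 approximation:\n", A1)
```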

📚 Calculus for Machine Learning and Data Science – DeepLearning.AI

Description: This course provides a practical introduction to calculus, focusing on how derivatives and integrals are used to train and optimize machine learning models.

Why I chose this course: Traditional calculus courses are very theoretical. This one, however, is laser-focused on real applications like gradient descent, cost functions, and model convergence — essential concepts for data scientists.

Skills developed:

  • Derivatives and chain rule
  • Optimization using gradients
  • Applications of calculus in ML training

| Course | Offered by | Effort |
|--------|------------|--------|
| Calculus for ML and DS | DeepLearning.AI | ~25h |
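
To make the link to model training concrete, here is a minimal sketch (my own, not course material) of gradient descent minimizing a simple quadratic cost; the function, starting point, and learning rate are arbitrary.

```python
# Minimize f(x) = (x - 3)^2 with plain gradient descent.
def grad(x):
    # Derivative of (x - 3)^2 is 2 * (x - 3).
    return 2 * (x - 3)

x = 0.0      # starting point
lr = 0.1     # learning rate (step size)
for step in range(50):
    x -= lr * grad(x)

print(f"x after 50 steps: {x:.4f}")  # converges toward the minimum at x = 3
```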

📚 Probability and Statistics for Machine Learning and Data Science – DeepLearning.AI

Description: A comprehensive course that introduces both descriptive and inferential statistics, with a focus on applications in machine learning. Topics include probability theory, conditional probability, hypothesis testing, and Bayesian methods.

Why I chose this course: This course doesn't just teach "classical stats" — it explicitly bridges the gap between statistics and ML, making it perfect for applied work in data science.

Skills developed:

  • Descriptive statistics and probability distributions
  • Conditional probability and Bayes' Theorem
  • Confidence intervals and hypothesis testing
  • Probabilistic thinking in ML

| Course | Offered by | Effort |
|--------|------------|--------|
| Probability & Stats for ML | DeepLearning.AI | ~33h |
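
As a quick illustration of the probabilistic reasoning covered, here is a small worked Bayes' Theorem example in plain Python; the prevalence and test-accuracy numbers are invented.

```python
# Bayes' Theorem: P(disease | positive) = P(pos | disease) * P(disease) / P(pos)
p_disease = 0.01             # prior: 1% prevalence (made-up)
p_pos_given_disease = 0.95   # sensitivity (made-up)
p_pos_given_healthy = 0.05   # false positive rate (made-up)

p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(disease | positive test) = {p_disease_given_pos:.3f}")  # ~0.161
```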

🐍 Section 03-A - Python Language for Data Analysis (~140h)

In this section, my goal is to master Python as the primary language for data analysis, visualization, and machine learning. Python is the industry standard in data science, widely adopted thanks to its simplicity, community, and rich ecosystem of libraries like NumPy, pandas, matplotlib, scikit-learn, and many others.

The focus here is on hands-on experience, building the ability to:

  • Write clean, efficient code for data manipulation
  • Use Python tools to explore, visualize, and analyze data
  • Implement and evaluate machine learning models
  • Work with real datasets, pipelines, and applied problems

The main skills I want to develop at this stage include:

  • ✅ Python programming for data manipulation (NumPy, pandas)
  • ✅ Data visualization using matplotlib and seaborn
  • ✅ Building and validating machine learning models with scikit-learn
  • ✅ Natural language processing and social network analysis
  • ✅ Applying Python in real-world projects across different domains

Courses

🐍 Introduction to Data Science in Python – University of Michigan

Description: A foundational course that introduces data manipulation with pandas, working with DataFrames, and the basics of cleaning and transforming data for analysis.

Why I chose this course: It’s the first course of the Applied Data Science with Python Specialization, one of the most respected Python tracks on Coursera. It provides a smooth learning curve for practical data tasks.

Skills developed:

  • Data structures in pandas
  • Handling missing values and data types
  • Basic exploratory data analysis (EDA)

| Course | Offered by | Effort |
|--------|------------|--------|
| Intro to Data Science in Python | Univ. of Michigan | ~34h |
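
For a feel of the day-to-day pandas work this course targets, here is a minimal sketch of imputing a missing value and doing a quick exploratory aggregation; the column names and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-dataset with a missing value.
df = pd.DataFrame({
    "city": ["Austin", "Boston", "Austin", "Boston"],
    "sales": [120.0, np.nan, 150.0, 95.0],
})

df["sales"] = df["sales"].fillna(df["sales"].median())  # impute the missing value
print(df.dtypes)                            # inspect data types
print(df.groupby("city")["sales"].mean())   # quick exploratory aggregation
```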

🐍 Applied Plotting, Charting & Data Representation in Python – University of Michigan

Description: This course introduces practical techniques to create visualizations using matplotlib and other Python libraries, focusing on choosing the right type of plot for different data contexts.

Why I chose this course: Data visualization is often underestimated, but it’s critical for communicating insights. This course strengthens your ability to create professional, informative visuals.

Skills developed:

  • Line plots, histograms, scatterplots, and advanced charts
  • Visual perception principles
  • Interactive plotting and dashboard elements

| Course | Offered by | Effort |
|--------|------------|--------|
| Plotting in Python | Univ. of Michigan | ~24h |
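
Below is a minimal matplotlib sketch in the spirit of the course, producing a histogram and a scatterplot from randomly generated data.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.hist(x, bins=20)            # distribution of one variable
ax1.set_title("Histogram")
ax2.scatter(x, y, alpha=0.6)    # relationship between two variables
ax2.set_title("Scatterplot")
plt.tight_layout()
plt.show()
```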

🐍 Applied Machine Learning in Python – University of Michigan

Description: Focuses on implementing machine learning models using scikit-learn, including classification, regression, and clustering.

Why I chose this course: It emphasizes not just the use of models but also best practices like train/test splits, model evaluation, overfitting, and performance metrics — all essential for a solid ML foundation.

Skills developed:

  • scikit-learn pipelines
  • Supervised learning (logistic regression, decision trees)
  • Model evaluation and validation techniques

| Course | Offered by | Effort |
|--------|------------|--------|
| ML in Python | Univ. of Michigan | ~31h |
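
Here is a minimal scikit-learn sketch of the workflow the course emphasizes: split the data, fit a model inside a pipeline, and evaluate on held-out data. It uses the bundled iris dataset purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# A pipeline keeps preprocessing and the model together.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```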

🐍 Applied Text Mining in Python – University of Michigan

Description: Covers the fundamentals of natural language processing (NLP) in Python, including tokenization, TF-IDF, and basic text classification.

Why I chose this course: Text data is everywhere — and this course provides the essential tools to process and analyze it using real-world datasets.

Skills developed:

  • Working with text data using pandas and NLTK
  • Document-term matrices
  • Basic text classifiers (e.g., Naive Bayes)

| Course | Offered by | Effort |
|--------|------------|--------|
| Text Mining in Python | Univ. of Michigan | ~25h |
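
A minimal sketch of a bag-of-words text classifier with scikit-learn, in the spirit of the course; the toy documents and labels are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labeled documents (made up for illustration).
docs = ["great product, works well", "terrible, broke after a day",
        "absolutely love it", "waste of money"]
labels = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(docs, labels)
print(clf.predict(["really love this, works great"]))  # expected: ['pos']
```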

🐍 Applied Social Network Analysis in Python – University of Michigan

Description: Explores how to analyze network structures such as social graphs, user connections, and centrality using NetworkX and Python.

Why I chose this course: Social network analysis is increasingly useful in marketing, user behavior, fraud detection, and influence modeling.

Skills developed:

  • Graph theory and network metrics
  • Using NetworkX for social graphs
  • Identifying influential nodes and clusters

| Course | Offered by | Effort |
|--------|------------|--------|
| Social Network Analysis | Univ. of Michigan | ~26h |
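
A minimal NetworkX sketch of building a tiny social graph and computing centrality measures; the edge list is invented.

```python
import networkx as nx

# Tiny made-up friendship graph.
G = nx.Graph()
G.add_edges_from([("ana", "bob"), ("ana", "carol"), ("bob", "carol"),
                  ("carol", "dan"), ("dan", "eve")])

degree = nx.degree_centrality(G)              # how connected each node is
betweenness = nx.betweenness_centrality(G)    # how often a node bridges others
print(sorted(betweenness.items(), key=lambda kv: kv[1], reverse=True))
```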

🧪 Section 04 - Data Mining (~120h)

In this section, I explore how to extract useful patterns, knowledge, and structures from large volumes of data — both structured and unstructured. The focus is on practical techniques in text mining, clustering, and pattern discovery, with applications in business intelligence, recommendation systems, and behavioral analysis.

The goals here are to:

  • ✅ Understand fundamental data mining concepts
  • ✅ Learn how to extract insights from text data
  • ✅ Apply clustering and pattern recognition algorithms
  • ✅ Improve decision-making with data visualization

All the courses come from the Data Mining Specialization by the University of Illinois Urbana-Champaign.


🧪 Data Visualization – University of Illinois

Description:
This course covers the essentials of visualizing data effectively — not just creating pretty charts, but telling meaningful stories through data. It introduces principles of design, perception, and interpretation.

Why I chose this course:
Understanding how to communicate insights visually is just as important as the analysis itself. This course emphasizes design thinking and good visualization practices.

Skills developed:

  • Best practices for chart selection and design
  • Use of color, layout, and perception in data storytelling
  • Hands-on experience building visualizations

| Course | Offered by | Effort |
|--------|------------|--------|
| Data Visualization | University of Illinois | ~15h |

🔎 Text Retrieval and Search Engines – University of Illinois

Description:
Explores how modern search engines work, including indexing, ranking, and retrieval of large text collections. Introduces TF-IDF, inverted indexes, and Boolean models.

Why I chose this course:
It offers a solid foundation for building search systems and working with large-scale text data — crucial for recommendation systems, search platforms, and NLP.

Skills developed:

  • Document indexing and search algorithms
  • TF-IDF and cosine similarity
  • Evaluation of retrieval performance (precision, recall)

| Course | Offered by | Effort |
|--------|------------|--------|
| Text Retrieval and Search Engines | University of Illinois | ~30h |
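
To illustrate the retrieval idea, here is a minimal TF-IDF ranking sketch using scikit-learn rather than a purpose-built search engine; the document collection and query are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["spark makes distributed processing easier",
        "sql is used to query relational databases",
        "search engines rank documents by relevance"]
query = ["how do search engines rank documents"]

vec = TfidfVectorizer()
doc_matrix = vec.fit_transform(docs)
scores = cosine_similarity(vec.transform(query), doc_matrix)[0]

# Rank documents by similarity to the query (highest first).
for i in sorted(range(len(docs)), key=lambda i: scores[i], reverse=True):
    print(f"{scores[i]:.3f}  {docs[i]}")
```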

🧾 Text Mining and Analytics – University of Illinois

Description:
Delves into the mining of unstructured text data, covering key topics like topic modeling, sentiment analysis, and named entity recognition.

Why I chose this course:
Text is one of the most abundant data formats today. This course builds NLP fundamentals that are crucial for applications in marketing, product reviews, and social media analysis.

Skills developed:

  • Text preprocessing and feature engineering
  • Topic modeling (e.g., LDA)
  • Sentiment analysis and classification

| Course | Offered by | Effort |
|--------|------------|--------|
| Text Mining and Analytics | University of Illinois | ~33h |

🔁 Pattern Discovery in Data Mining – University of Illinois

Description:
Focuses on algorithms to discover frequent patterns, associations, and sequences in datasets. Introduces the Apriori algorithm and association rule mining.

Why I chose this course:
It’s essential for market basket analysis, fraud detection, and behavior prediction, giving insights into repetitive and meaningful patterns.

Skills developed:

  • Frequent itemset mining (Apriori, FP-Growth)
  • Association rules (support, confidence, lift)
  • Sequential pattern mining

| Course | Offered by | Effort |
|--------|------------|--------|
| Pattern Discovery in Data Mining | University of Illinois | ~17h |
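
The core association-rule metrics are easy to compute by hand; here is a small plain-Python sketch over an invented set of transactions.

```python
# Made-up market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]
n = len(transactions)

def support(itemset):
    # Fraction of transactions that contain the whole itemset.
    return sum(itemset <= t for t in transactions) / n

# Rule: {bread} -> {milk}
sup_rule = support({"bread", "milk"})
confidence = sup_rule / support({"bread"})
lift = confidence / support({"milk"})
print(f"support={sup_rule:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```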

🔗 Cluster Analysis in Data Mining – University of Illinois

Description:
Introduces unsupervised learning techniques to group similar items without labeled outcomes. Covers clustering metrics, methods, and applications.

Why I chose this course:
Clustering is a powerful tool for customer segmentation, anomaly detection, and unsupervised exploration of datasets.

Skills developed:

  • k-means and hierarchical clustering
  • Density-based clustering (DBSCAN)
  • Cluster evaluation and visualization

| Course | Offered by | Effort |
|--------|------------|--------|
| Cluster Analysis in Data Mining | University of Illinois | ~16h |
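
A minimal scikit-learn k-means sketch on synthetic data, just to show the shape of the workflow; the cluster count and data are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic blobs of 2-D points.
X = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
               rng.normal(3, 0.5, size=(50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster centers:\n", km.cluster_centers_)
print("first ten labels:", km.labels_[:10])
```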

🗄️ Section 05 - Databases and SQL (~80h)

This section focuses on mastering relational databases and SQL, the backbone of storing and querying structured data. Understanding how databases work — from design principles to advanced querying — is essential for any data analyst or data scientist.

Main goals in this section:

  • ✅ Learn how to design and normalize relational databases
  • ✅ Query data efficiently using SQL
  • ✅ Understand advanced database topics, including emerging technologies
  • ✅ Build a solid foundation for data warehousing and backend data engineering

All courses are part of the Databases for Data Scientists Specialization by the University of Colorado.


🗄️ Relational Database Design – University of Colorado

Description:
Introduces the foundations of relational databases, including normalization, entity-relationship modeling, and schema design for structured data storage.

Why I chose this course:
Before writing any SQL, it’s essential to understand how databases are structured and why proper design ensures data integrity and performance.

Skills developed:

  • Entity-Relationship (ER) modeling
  • Normalization (1NF to 3NF)
  • Schema creation and database logic

| Course | Offered by | Effort |
|--------|------------|--------|
| Relational Database Design | University of Colorado | ~34h |

🧾 The Structured Query Language (SQL) – University of Colorado

Description:
A hands-on introduction to SQL, covering SELECT statements, joins, subqueries, filtering, aggregation, and working with multiple tables.

Why I chose this course:
SQL is a must-have skill for data professionals. This course reinforces the fundamentals while also preparing for complex queries and real-world use cases.

Skills developed:

  • SELECT, WHERE, GROUP BY, and JOIN clauses
  • Writing nested queries and subqueries
  • Filtering, sorting, and aggregating data

| Course | Offered by | Effort |
|--------|------------|--------|
| The Structured Query Language (SQL) | University of Colorado | ~26h |
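
A minimal sketch of the kind of SQL the course covers, run here through Python's built-in sqlite3 module; the tables and rows are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Bruno');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 30.0), (3, 2, 20.0);
""")

# JOIN + GROUP BY: total order amount per customer.
rows = con.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('Ana', 80.0), ('Bruno', 20.0)]
```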

🚀 Advanced Topics and Future Trends in Database Technologies – University of Colorado

Description:
Covers cutting-edge and emerging database topics such as NoSQL, NewSQL, distributed databases, and database scalability.

Why I chose this course:
As data ecosystems evolve, it’s important to understand where database technology is heading — especially with big data, real-time systems, and cloud-native tools.

Skills developed:

  • Concepts of NoSQL, document, key-value, and columnar stores
  • Distributed database systems and CAP theorem
  • Emerging trends: scalability, cloud databases, and database-as-a-service

| Course | Offered by | Effort |
|--------|------------|--------|
| Advanced Topics and Future Trends in Database Technologies | University of Colorado | ~16h |

🧱 Section 06 - Big Data (~85h)

This section introduces the architecture, tools, and methods used to work with massive volumes of data that exceed the capabilities of traditional systems. The courses cover everything from data storage and integration to distributed processing and machine learning at scale.

Key goals for this section:

  • ✅ Understand the foundations of big data systems and architectures
  • ✅ Explore tools for storing, querying, and integrating large datasets
  • ✅ Learn scalable machine learning techniques
  • ✅ Apply graph analytics to uncover relationships in complex data

All courses come from the Big Data Specialization by the University of California, San Diego.


🧱 Introduction to Big Data – University of California

Description:
A high-level overview of what big data is, why it matters, and how it’s transforming business and research. Covers the big data ecosystem, including Hadoop and NoSQL.

Why I chose this course:
It provides a clear introductory framework for the concepts, challenges, and technologies of working with large-scale data.

Skills developed:

  • Definitions and scope of big data
  • Overview of the Hadoop ecosystem
  • Real-world applications and case studies

| Course | Offered by | Effort |
|--------|------------|--------|
| Introduction to Big Data | University of California | ~17h |

🗃️ Big Data Modeling and Management Systems – University of California

Description:
Covers how to structure and organize data in distributed systems, including NoSQL databases like HBase, Cassandra, and MongoDB.

Why I chose this course:
To understand the different paradigms of data storage and how schema design affects performance and scalability.

Skills developed:

  • Data modeling in big data environments
  • NoSQL systems: document, columnar, and key-value stores
  • Data consistency and availability trade-offs

| Course | Offered by | Effort |
|--------|------------|--------|
| Big Data Modeling and Management Systems | University of California | ~13h |

🔄 Big Data Integration and Processing – University of California

Description:
Focuses on data ingestion and transformation at scale, using Apache Spark, MapReduce, and ETL pipelines for distributed processing.

Why I chose this course:
Efficient processing is key in big data — this course builds hands-on skills for integrating and transforming large datasets.

Skills developed:

  • Distributed data processing (Spark, MapReduce)
  • ETL and data integration pipelines
  • Batch vs. stream processing

| Course | Offered by | Effort |
|--------|------------|--------|
| Big Data Integration and Processing | University of California | ~17h |
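
To illustrate the map/reduce idea behind tools like Spark and Hadoop without a cluster, here is a tiny pure-Python word-count sketch; a real Spark or MapReduce job would distribute these same steps across many machines.

```python
from collections import defaultdict

lines = ["big data needs distributed processing",
         "spark and mapreduce process big data"]

# Map step: emit (word, 1) pairs.
pairs = [(word, 1) for line in lines for word in line.split()]

# Shuffle + reduce step: group by key and sum the counts.
counts = defaultdict(int)
for word, one in pairs:
    counts[word] += one

print(dict(counts))
```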

🤖 Machine Learning with Big Data – University of California

Description:
Teaches how to build and scale machine learning models using tools like Apache Spark’s MLlib, focusing on classification, clustering, and recommendation systems.

Why I chose this course:
It connects machine learning theory with big data tools, which is essential for working in real-world production environments.

Skills developed:

  • Scalable machine learning with Spark MLlib
  • Model training and evaluation in distributed systems
  • Feature engineering at scale

| Course | Offered by | Effort |
|--------|------------|--------|
| Machine Learning with Big Data | University of California | ~23h |

🌐 Graph Analytics for Big Data – University of California

Description:
Explores how to analyze relationships in large graphs, such as social networks or web link structures, using graph theory and distributed algorithms.

Why I chose this course:
Graph analytics is a powerful approach for understanding structure and influence in connected datasets — from fraud detection to recommendation systems.

Skills developed:

  • Graph modeling and structure
  • Graph traversal and centrality
  • Distributed graph processing (e.g., GraphX)

| Course | Offered by | Effort |
|--------|------------|--------|
| Graph Analytics for Big Data | University of California | ~13h |

🤖 Section 07 - Machine Learning (~100h)

This section builds the foundation for understanding and applying machine learning algorithms, from basic regression to advanced techniques like ensemble learning, recommendation systems, and reinforcement learning.

The focus is on both conceptual understanding and hands-on implementation, using real-world datasets to develop practical, production-ready ML pipelines.

Main goals for this section:

  • ✅ Master core ML algorithms (supervised and unsupervised)
  • ✅ Build and evaluate models using regression, classification, and clustering
  • ✅ Understand trade-offs in model complexity, bias, and variance
  • ✅ Explore recommender systems and reinforcement learning techniques

All courses are part of the Machine Learning Specialization by DeepLearning.AI, taught by Andrew Ng.


📈 Supervised Machine Learning: Regression and Classification – DeepLearning.AI

Description:
This course introduces the most fundamental machine learning techniques: linear regression, logistic regression, and decision boundaries — all explained with practical coding examples.

Why I chose this course:
It provides the best conceptual intro to supervised learning, with hands-on notebooks and real-world exercises. Andrew Ng’s teaching style makes even complex topics accessible.

Skills developed:

  • Linear and logistic regression
  • Gradient descent and loss functions
  • Bias-variance tradeoff and regularization

| Course | Offered by | Effort |
|--------|------------|--------|
| Supervised Machine Learning: Regression and Classification | DeepLearning.AI | ~33h |
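
Here is a compact NumPy sketch of the core loop this course teaches: fitting a linear model by minimizing a squared-error cost with gradient descent; the synthetic data and learning rate are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 2.5 * x + 1.0 + rng.normal(scale=1.0, size=100)  # true line: y = 2.5x + 1

w, b, lr = 0.0, 0.0, 0.01
for _ in range(2000):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the mean squared error cost.
    w -= lr * 2 * np.mean(error * x)
    b -= lr * 2 * np.mean(error)

print(f"learned w={w:.2f}, b={b:.2f}")  # should land close to 2.5 and 1.0
```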

🧠 Advanced Learning Algorithms – DeepLearning.AI

Description:
Goes deeper into supervised learning with advanced algorithms such as decision trees, random forests, XGBoost, and support vector machines.

Why I chose this course:
To expand beyond linear models and gain confidence in implementing some of the most powerful ML algorithms used in industry.

Skills developed:

  • Decision trees and ensemble methods (Random Forests, XGBoost)
  • SVMs and kernel tricks
  • Model selection and hyperparameter tuning

| Course | Offered by | Effort |
|--------|------------|--------|
| Advanced Learning Algorithms | DeepLearning.AI | ~34h |

🧩 Unsupervised Learning, Recommenders, Reinforcement Learning – DeepLearning.AI

Description:
Covers powerful unsupervised learning techniques such as clustering, anomaly detection, and PCA, along with real-world applications like recommendation systems and Q-learning.

Why I chose this course:
It connects theory to application, showing how clustering and reinforcement learning power modern platforms — from YouTube recommendations to game AIs.

Skills developed:

  • k-means clustering and anomaly detection
  • Dimensionality reduction (PCA)
  • Recommender systems and collaborative filtering
  • Reinforcement learning and Q-learning

| Course | Offered by | Effort |
|--------|------------|--------|
| Unsupervised Learning, Recommenders, Reinforcement Learning | DeepLearning.AI | ~37h |
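
As a tiny illustration of the reinforcement-learning piece, here is a self-contained Q-learning sketch on a made-up five-state corridor where only the rightmost state gives a reward; all parameters are arbitrary.

```python
import numpy as np

n_states, n_actions = 5, 2        # actions: 0 = step left, 1 = step right
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9
rng = np.random.default_rng(0)

for _ in range(300):              # episodes
    s = 0
    while s != n_states - 1:      # episode ends at the rightmost (goal) state
        a = int(rng.integers(n_actions))  # random exploration; Q-learning is off-policy
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update rule.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(np.round(Q, 2))  # "right" (column 1) ends up best in every state
```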

🧬 Section 08 - Deep Learning (~125h)

This section dives deep into neural networks and deep learning, the foundation of modern AI systems. It covers a full pipeline from basic neural networks to advanced architectures like CNNs and RNNs, with an emphasis on practical techniques for building and improving deep models.

Main goals in this section:

  • ✅ Understand the math and mechanics behind deep neural networks
  • ✅ Learn how to tune, train, and optimize deep learning models
  • ✅ Apply deep learning to images, sequences, and NLP tasks
  • ✅ Gain experience with TensorFlow/Keras and real-world use cases

All courses are part of the Deep Learning Specialization by DeepLearning.AI, taught by Andrew Ng.


🧠 Neural Networks and Deep Learning – DeepLearning.AI

Description:
Introduces the fundamentals of deep learning, including perceptrons, forward/backpropagation, activation functions, and basic architectures.

Why I chose this course:
It lays the core theoretical foundation for all deep learning work and presents it in an accessible, structured way.

Skills developed:

  • Basics of deep neural networks
  • Forward and backward propagation
  • Activation functions and weight initialization

| Course | Offered by | Effort |
|--------|------------|--------|
| Neural Networks and Deep Learning | DeepLearning.AI | ~24h |
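
A compact NumPy sketch of forward and backward propagation for a one-hidden-layer network on the XOR problem; the architecture, learning rate, and iteration count are arbitrary toy choices, and convergence depends on the random initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))     # hidden layer, 4 units
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))     # output layer
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradients of the squared-error loss.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent parameter updates.
    W2 -= 0.5 * (h.T @ d_out)
    b2 -= 0.5 * d_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * (X.T @ d_h)
    b1 -= 0.5 * d_h.sum(axis=0, keepdims=True)

print(out.round(2).ravel())  # should move toward [0, 1, 1, 0]
```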

⚙️ Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization – DeepLearning.AI

Description:
Covers practical tools for improving deep learning models: optimization strategies, hyperparameter tuning, batch normalization, dropout, and more.

Why I chose this course:
It bridges the gap between theory and practice, offering hands-on techniques for boosting model performance.

Skills developed:

  • Learning rate decay and mini-batch gradient descent
  • Regularization: L2, dropout
  • Hyperparameter tuning and optimizers (Adam, RMSprop)

| Course | Offered by | Effort |
|--------|------------|--------|
| Improving Deep Neural Networks | DeepLearning.AI | ~23h |

🛠️ Structuring Machine Learning Projects – DeepLearning.AI

Description:
Focuses on the mindset and best practices for managing ML projects — how to prioritize errors, build scalable pipelines, and iterate effectively.

Why I chose this course:
It offers strategic thinking that is often overlooked: how to debug, scale, and manage ML projects in real-world environments.

Skills developed:

  • Error analysis and ceiling analysis
  • Avoiding data leakage
  • Managing train/dev/test splits in production

| Course | Offered by | Effort |
|--------|------------|--------|
| Structuring Machine Learning Projects | DeepLearning.AI | ~6h |

🧿 Convolutional Neural Networks (CNNs) – DeepLearning.AI

Description:
Explores convolutional architectures used in image recognition, detection, and segmentation tasks — including ResNet and YOLO.

Why I chose this course:
CNNs are essential for working with image data — this course gives a hands-on introduction to convolutional layers and computer vision tasks.

Skills developed:

  • Convolutions, pooling, padding
  • Deep CNN architectures (ResNet, Inception)
  • Image classification and object detection

| Course | Offered by | Effort |
|--------|------------|--------|
| Convolutional Neural Networks | DeepLearning.AI | ~35h |
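
A minimal sketch of a small convolutional network defined with Keras, just to show the typical building blocks; it assumes TensorFlow is installed and uses a 28x28 grayscale input shape as a stand-in.

```python
import tensorflow as tf

# Small CNN for 28x28 grayscale images, 10 classes (e.g., digits).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),   # convolution
    tf.keras.layers.MaxPooling2D((2, 2)),                    # pooling
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # training would follow with model.fit(x_train, y_train, ...)
```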

🔁 Sequence Models – DeepLearning.AI

Description:
Covers how to build models for sequential data, such as time series or natural language, using RNNs, GRUs, LSTMs, and attention mechanisms.

Why I chose this course:
Sequence models power everything from chatbots to music generation — and this course gives the tools to implement them.

Skills developed:

  • Recurrent neural networks (RNN, LSTM, GRU)
  • Natural language processing basics
  • Attention and sequence-to-sequence models

| Course | Offered by | Effort |
|--------|------------|--------|
| Sequence Models | DeepLearning.AI | ~37h |

🏗️ Section 09 - Data Warehousing (~300h)

This section focuses on the architecture and implementation of data warehouses and business intelligence systems — critical infrastructure for enterprise analytics. It covers everything from relational database theory to ETL pipelines and BI reporting.

Main goals in this section:

  • ✅ Understand how data warehouses are designed and structured
  • ✅ Learn how to build scalable ETL processes and integrate data from multiple sources
  • ✅ Apply business intelligence tools to extract actionable insights
  • ✅ Prepare for roles in backend analytics, data engineering, and BI architecture

All courses are part of the Data Warehousing for Business Intelligence Specialization by the University of Colorado Boulder.


🏗️ Database Management Essentials – Colorado Boulder

Description:
Covers relational database foundations: relational algebra, SQL queries, schema design, and data integrity enforcement.

Why I chose this course:
It provides the core theoretical and technical background needed for understanding how relational databases support analytical workloads.

Skills developed:

  • Relational model, ER modeling, and constraints
  • SQL for data definition and manipulation
  • Foundations for OLAP vs. OLTP systems

| Course | Offered by | Effort |
|--------|------------|--------|
| Database Management Essentials | Colorado Boulder | ~122h |

🧱 Data Warehouse Concepts, Design, and Data Integration – Colorado Boulder

Description:
Introduces dimensional modeling, star/snowflake schemas, and the processes of integrating data from disparate sources into a central warehouse.

Why I chose this course:
It focuses on the design principles behind scalable data warehouses, which are crucial for efficient querying and reporting.

Skills developed:

  • Dimensional data modeling (facts/dimensions)
  • Star, snowflake, and constellation schemas
  • ETL design and implementation

| Course | Offered by | Effort |
|--------|------------|--------|
| Data Warehouse Concepts, Design, and Data Integration | Colorado Boulder | ~62h |
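
To make the star-schema idea concrete, here is a tiny sqlite3 sketch with one fact table, one dimension table, and a roll-up query over them; the schema and rows are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Dimension table: descriptive attributes.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    -- Fact table: measures keyed by dimension ids.
    CREATE TABLE fact_sales (product_id INTEGER, sale_date TEXT, amount REAL);

    INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
    INSERT INTO fact_sales VALUES (1, '2024-01-05', 20.0), (1, '2024-01-06', 35.0),
                                  (2, '2024-01-05', 60.0);
""")

# Roll up sales by product category (a typical warehouse query).
for row in con.execute("""
        SELECT d.category, SUM(f.amount) AS total_sales
        FROM fact_sales f JOIN dim_product d USING (product_id)
        GROUP BY d.category
        """):
    print(row)  # ('books', 55.0) and ('games', 60.0)
```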

🧮 Relational Database Support for Data Warehouses – Colorado Boulder

Description:
Explores how relational systems support warehouse workloads, including indexing, query optimization, and data partitioning.

Why I chose this course:
It connects relational database theory with warehousing practice, helping understand performance and scalability challenges.

Skills developed:

  • Query performance tuning
  • Materialized views and indexing strategies
  • Physical schema design for OLAP

| Course | Offered by | Effort |
|--------|------------|--------|
| Relational Database Support for Data Warehouses | Colorado Boulder | ~71h |

📊 Business Intelligence Concepts, Tools, and Applications – Colorado Boulder

Description:
Covers how BI tools are used to extract, visualize, and act on business data — with case studies and practical examples of analytics dashboards.

Why I chose this course:
To connect data infrastructure to end-user decision-making, focusing on storytelling, KPIs, and dashboards.

Skills developed:

  • BI tool landscape and use cases
  • OLAP operations (roll-up, drill-down)
  • Data-driven decision frameworks

| Course | Offered by | Effort |
|--------|------------|--------|
| Business Intelligence Concepts, Tools, and Applications | Colorado Boulder | ~21h |

🛠️ Design and Build a Data Warehouse for BI Implementation – Colorado Boulder

Description:
A capstone-style course that guides you through designing and implementing a working data warehouse, integrating ETL processes and building reports.

Why I chose this course:
It offers hands-on experience that ties together all previous concepts — from schema design to final BI delivery.

Skills developed:

  • Full warehouse architecture lifecycle
  • Data sourcing, transformation, and loading
  • Reporting and BI dashboard implementation

| Course | Offered by | Effort |
|--------|------------|--------|
| Design and Build a Data Warehouse for Business Intelligence Implementation | Colorado Boulder | ~31h |

☁️ Section 10 - Cloud Computing (~120h)

This section focuses on the core principles of cloud computing, including infrastructure, applications, networking, and practical project deployment. It builds a foundational understanding of how cloud systems work and how to design scalable, distributed applications in the cloud.

Key goals for this section:

  • ✅ Understand cloud infrastructure, virtualization, and scalability
  • ✅ Learn how to design and deploy cloud-native applications
  • ✅ Explore networking, security, and orchestration in the cloud
  • ✅ Complete a practical project simulating real-world deployment

All courses are part of the Cloud Computing Specialization by the University of Illinois Urbana-Champaign.


☁️ Cloud Concepts Part 1 – University of Illinois

Description:
Introduces the fundamental building blocks of cloud computing, including data centers, virtualization, and service models like IaaS, PaaS, and SaaS.

Why I chose this course:
It builds the foundational knowledge needed to understand the economics, architecture, and design of modern cloud systems.

Skills developed:

  • Cloud service models and deployment strategies
  • Virtualization and resource allocation
  • Intro to AWS, Google Cloud, and Azure paradigms

| Course | Offered by | Effort |
|--------|------------|--------|
| Cloud Concepts 1 | University of Illinois | ~24h |

🌩️ Cloud Concepts Part 2 – University of Illinois

Description:
Expands on the first course by discussing elasticity, fault tolerance, containers, and scalability strategies in cloud architecture.

Why I chose this course:
It dives deeper into cloud resilience and elasticity, key aspects for high-availability systems.

Skills developed:

  • Containers and microservices
  • Cloud scalability and elasticity
  • Managing reliability and availability

| Course | Offered by | Effort |
|--------|------------|--------|
| Cloud Concepts 2 | University of Illinois | ~19h |

🧩 Cloud Applications Part 1 – University of Illinois

Description:
Focuses on developing cloud-native applications using APIs, data storage services, and managed compute instances.

Why I chose this course:
It introduces the developer’s perspective, teaching how to design and deploy real applications on the cloud.

Skills developed:

  • Cloud APIs and storage models
  • Stateless and stateful application design
  • Handling scale and concurrency

| Course | Offered by | Effort |
|--------|------------|--------|
| Cloud Applications 1 | University of Illinois | ~15h |

⚙️ Cloud Applications Part 2 – University of Illinois

Description:
Continues development topics with a focus on performance, monitoring, container orchestration, and user authentication.

Why I chose this course:
This course emphasizes operational excellence and monitoring, which are crucial for real-world systems in production.

Skills developed:

  • Logging and monitoring cloud apps
  • Load balancing and caching
  • Authentication and access control

| Course | Offered by | Effort |
|--------|------------|--------|
| Cloud Applications 2 | University of Illinois | ~19h |

🌐 Cloud Networking – University of Illinois

Description:
Covers how networking works in cloud environments, including virtual networks, firewalls, routing, and SDNs.

Why I chose this course:
To understand how services communicate at scale, securely and efficiently across virtualized infrastructure.

Skills developed:

  • Virtual Private Clouds (VPCs)
  • Network configuration and subnetting
  • Load balancers and security groups

| Course | Offered by | Effort |
|--------|------------|--------|
| Cloud Networking | University of Illinois | ~22h |

🛠️ Cloud Computing Project – University of Illinois

Description:
A hands-on capstone project where you build and deploy a full-stack application in the cloud, integrating all concepts from the specialization.

Why I chose this course:
To apply all concepts in a realistic, end-to-end scenario, simulating a true production deployment pipeline.

Skills developed:

  • App deployment using cloud platforms
  • Integrating storage, compute, and networking
  • Debugging and monitoring a cloud-native app

| Course | Offered by | Effort |
|--------|------------|--------|
| Cloud Computing Project | University of Illinois | ~21h |

📖 Extra Bibliography

If you're looking for deeper insights, consider these additional resources:

Mathematics

Machine Learning & AI

Programming & Databases

These resources cover a wide range of topics from foundational mathematics and statistical theory to advanced machine learning and artificial intelligence.

📝 Notes and Clarifications

  • Course durations are approximate and based on platform estimates.
  • Some books were accessed through university partnerships, but if you don't have access... well, explore alternative ways. If possible, support authors by purchasing them.
  • The curriculum is continuously evolving as new resources become available.

🔗 References

Sources used to structure this curriculum:


Developer Roadmap

