KevinAbrahamRepo/Data-Analytics-Projects


Big data manipulation and modelling projects


Data-Analytics-Projects

Included in this repo are some interesting data manipulation and modelling projects that I worked on over the last few months. All analysis was performed in Python 3 (Jupyter Notebook). Below is a brief introduction to each of the projects included.

For more information on the individual projects, including some interesting findings from exploratory analysis, please see the sub-folders. I am also looking to improve the existing code and extend its functionality, so if you have ideas or suggestions for future work, please let me know!

Projects using Supervised Learning Models:

Analysis of the United Kingdom's road safety and traffic demographics dataset, obtained from UK Traffic Dataset - Kaggle, with the following key goals:
- Identify common factors responsible for higher accident rates through various feature engineering techniques
- Carry out a retrospective study of the historical dataset and perform descriptive analysis (Tableau, Power BI and Excel Power Pivot)
- Attempt to correct an imbalanced target class (SMOTE, Cluster Centroids, Tomek Links)
- Perform hyper-parameter tuning using GridSearchCV (scikit-learn Python package) to enhance the predictive power of several supervised learning models (KNN, SVM, Naive Bayes, Logistic Regression, Random Forest, Gradient Boosting - scikit-learn)
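The class-balancing goal above can be illustrated with a minimal sketch of the idea behind SMOTE: new minority samples are created by interpolating between an existing minority sample and one of its nearest minority neighbours. This is a simplified, standard-library-only illustration, not the project's notebook code and not the imbalanced-learn implementation; the function name `smote_oversample`, its parameters, and the toy data are all assumptions made for the example.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature lists (minority class only).
    Hypothetical helper for illustration; not part of any library.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point
        # (the base point itself is excluded by index).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: 8 majority points (label 0), 3 minority points (label 1).
majority = [[float(x), float(x % 3)] for x in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate just enough synthetic minority points to balance the two classes.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
X = majority + minority + new_points
y = [0] * len(majority) + [1] * (len(minority) + len(new_points))
print(y.count(0), y.count(1))
```

In practice, resampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used for evaluation.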
Analysis of several thousand tweets, collected in JSON format using Twitter's Streaming API, to perform sentiment analysis and classify them into sub-categories for a more general consensus. The topic for this NLP project was the 106th #Greycup/#greycup, held in Edmonton in November 2018. Key analytic goals:
- Perform a clean data pull from Twitter and transform the data for analysis in Python (Tweepy)
- Perform various descriptive and time series analyses for insights (matplotlib (Basemap), Mapboxgl)
- Build predictive models to classify the sentiment of a tweet (Naive Bayes, SVM - Linear/Polynomial)
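As a rough illustration of the sentiment classification goal, the sketch below implements a small multinomial Naive Bayes classifier with add-one (Laplace) smoothing using only the standard library. It is not the project's code, which may well rely on scikit-learn; the function names `tokenize`, `train_naive_bayes` and `predict`, and the four example tweets, are hypothetical and exist only for this example.

```python
import math
from collections import Counter, defaultdict

PUNCTUATION = ".,!?:;\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in words if w]


def train_naive_bayes(samples):
    """samples: list of (text, label). Returns the counts needed for prediction."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-written example tweets (not taken from the collected data).
training_tweets = [
    ("What a great game, loved the halftime show #GreyCup", "positive"),
    ("Amazing atmosphere in Edmonton tonight #GreyCup", "positive"),
    ("Terrible officiating, worst game ever #GreyCup", "negative"),
    ("Awful field conditions, so disappointing #greycup", "negative"),
]

model = train_naive_bayes(training_tweets)
print(predict(model, "loved the amazing show"))
print(predict(model, "worst officiating, so disappointing"))
```

A real pipeline would train on far more labelled tweets and would usually add steps such as stop-word removal and handling of URLs, mentions and emoji before counting words.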
