KevinAbrahamRepo/Data-Analytics-Projects
Included in this repo are some interesting data manipulation and modelling projects that I worked on over the last few months. All analysis was performed in Python 3 (Jupyter Notebook). Below is a brief introduction to each of the projects included.
For more information on the individual projects, including some interesting finds during exploratory analysis, please go into the sub-folders. I am also looking to improve the existing code and extend its current functionality, so if anyone has interesting ideas or suggestions for future work, please do let me know!
Analysis of the United Kingdom's road safety and traffic demographics dataset, obtained from UK Traffic Dataset - Kaggle, with the following key goals:
- Identify common factors responsible for higher accident rates through various feature engineering techniques
- Carry out a retrospective study of the historical dataset and perform descriptive analysis (Tableau, Power BI and Excel Power Pivot)
- Attempt to correct an imbalanced target class (SMOTE, Cluster Centroid, Tomek Links)
- Perform hyper-parameter tuning using GridSearchCV (scikit-learn Python package) to enhance the predictive power of several supervised learning models (KNN, SVM, Naive Bayes, Logistic Regression, Random Forest, Gradient Boost - scikit-learn)
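To illustrate the class-balancing goal above, here is a minimal, standard-library-only sketch of the idea behind SMOTE: new minority samples are created by interpolating between an existing minority sample and one of its nearest minority neighbours. This is not the code used in the project (which would more likely rely on a package such as imbalanced-learn); the function name `smote`, its parameters, and the toy points are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority-class points.

    Simplified SMOTE: pick a minority sample, pick one of its k nearest
    minority neighbours, and place a new point somewhere on the segment
    between the two. (Hypothetical helper, not the project's code.)
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours are all other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # fraction of the way from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data: 4 minority points versus an assumed 10 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 10

new_points = smote(minority, n_new=majority_count - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority), "minority samples after oversampling")
```

Because every synthetic point lies between two real minority points, the oversampled class stays within the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that the test data keeps its original class balance.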
Analyze several thousand tweets collected in JSON format using Twitter's Streaming API to perform sentiment analysis and classify them into sub-categories for a more general consensus. The topic for this NLP project was the 106th #Greycup/#greycup, held in Edmonton in November 2018. Key analytic goals:
- Perform a clean data pull from Twitter and transform the data for analysis in Python (Tweepy)
- Perform various descriptive and time-series analyses for insights (matplotlib (Basemap), Mapboxgl)
- Build predictive models to classify sentiment of a tweet (Naive Bayes, SVM - Linear/Polynomial)
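As a rough illustration of the sentiment-classification goal above, the following is a small multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written with only the standard library. It is a sketch of the technique, not the notebook's implementation; the function names and the four example tweets are invented for demonstration and are not taken from the collected data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(text, model):
    """Return the label with the highest log posterior for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities.
            likelihood = (word_counts[label][token] + 1) / (
                total_words + len(vocab)
            )
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up example tweets, for illustration only.
training = [
    ("what a great game", "positive"),
    ("love the halftime show", "positive"),
    ("terrible call by the referee", "negative"),
    ("boring game and awful weather", "negative"),
]
model = train_naive_bayes(training)
print(predict("great show", model))
print(predict("awful referee", model))
```

On real tweets, the whitespace tokenizer would usually be replaced by proper cleaning (removing URLs, mentions, and punctuation) before counting words.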