data-exploration-and-preprocessing
Here are 33 public repositories matching this topic...
Comprehensive Vector Data Tooling. The universal interface for all vector databases, datasets, and RAG platforms. Easily export, import, back up, re-embed (using any model), or access your vector data from any vector database or repository.
- Updated
Dec 15, 2025 - Jupyter Notebook
Successfully developed a machine learning model which can accurately predict whether a firm will become bankrupt or not, depending on various features such as net value growth rate, borrowing dependency, cash/total assets, etc.
- Updated
Oct 22, 2023 - Jupyter Notebook
Successfully developed a fine-tuned BERT transformer model which can classify symptoms to their corresponding diseases with an accuracy of up to 89%.
- Updated
May 6, 2024 - Jupyter Notebook
The Employee Attrition Control project uses data analysis and predictive modeling to understand and address employee turnover. It provides insights and recommendations to reduce attrition and improve employee satisfaction and retention.
- Updated
Jun 16, 2023 - Jupyter Notebook
Successfully developed a fine-tuned DistilBERT transformer model which can predict the overall sentiment of a piece of financial news with an accuracy of nearly 81.5%.
- Updated
May 6, 2024 - Jupyter Notebook
- Updated
Sep 15, 2024 - Jupyter Notebook
Successfully developed a machine learning model which can predict, with up to 100% accuracy, whether a given applicant's credit card application will be approved, based on several demographic features such as applicant age, total income, marital status, total years of work experience, etc.
- Updated
Oct 27, 2023 - Jupyter Notebook
CHL5230 - Applied Machine Learning for Health Data (Fall 2023) @ University of Toronto: Datathon #1 - Lifestyle and Health Factors
- Updated
Oct 31, 2023 - Jupyter Notebook
Final website for the Introduction to Data Science course. Our project focuses on the distribution of public health establishments and pharmacies across Chile.
- Updated
Dec 7, 2025 - HTML
BOM (Bill of Materials) Business Analyst Case Study Solution using Python, pandas data manipulation, and visualization techniques.
- Updated
Feb 8, 2025 - Jupyter Notebook
This project consists of exploratory data analysis and the application of supervised learning models for classification using a Hepatocellular Carcinoma dataset. Second Semester of the First Year of the Bachelor's Degree in Bioinformatics at FCUP.
- Updated
May 21, 2024 - Jupyter Notebook
Successfully built a machine learning model which can accurately predict whether an employee of a given company will leave it in the near future, based on several employee details and employment metrics.
- Updated
Oct 15, 2023 - Jupyter Notebook
Assignment Project - Exploring Rental Market Dynamics in Indian Cities using R Programming.
- Updated
Jul 15, 2025 - R
Explored a dataset of planes while learning PySpark commands.
- Updated
Jan 31, 2024 - Jupyter Notebook
Successfully created a machine learning model which can accurately predict the fare of a taxi trip based on several features such as trip duration, tip amount, etc.
- Updated
Oct 26, 2023 - Jupyter Notebook
CHL5230 - Applied Machine Learning for Health Data (Fall 2023) @ University of Toronto: Datathon #5 - mHealth and ML
- Updated
Nov 23, 2023 - Jupyter Notebook
CHL5230 - Applied Machine Learning for Health Data (Fall 2023) @ University of Toronto: Datathon #2 - Early Prediction of Heart Failure
- Updated
Nov 14, 2023 - Jupyter Notebook
This exploratory data analysis (EDA) of a student performance dataset aims to identify and understand the factors that influence students' academic performance. Through data cleaning, transformation, and visualization, it seeks to uncover meaningful patterns and relationships.
- Updated
Jan 31, 2025 - Jupyter Notebook
Performed ETL on retail sales data in an end-to-end data analytics project using Python and SQL. The dataset is downloaded with the Kaggle API, cleaned and transformed using pandas, and loaded into SQL Server.
- Updated
Mar 14, 2025 - Jupyter Notebook
A machine learning project implemented from scratch, involving web scraping, data engineering, exploratory data analysis, NLP, and ML, to build a content-based movie recommender system.
- Updated
Feb 10, 2025 - HTML