Data analysis project combining Python and SQL to clean, query, and analyze jewelry sales data. Demonstrates skills in data preprocessing with Pandas, writing advanced SQL queries (joins, aggregations, subqueries), and optimizing database performance with indexes and views.
This project focuses on analyzing a jewelry e-commerce dataset to extract meaningful insights about user purchases, product categories, and sales performance. Python (Pandas) is used for data cleaning and preprocessing, while SQL queries handle complex data retrieval, aggregation, joining, and optimization.
🚀 What I Did
Cleaned and preprocessed raw sales data with Python to prepare it for database import
Designed and created relational database tables (users, products, and jewelry_sales)
Performed advanced SQL queries including filtering, grouping, aggregation, and multiple types of JOINs
Used subqueries and created SQL views to simplify and reuse common analysis queries
Optimized query performance by adding indexes on key columns
Extracted key business insights such as total revenue, popular categories, and material-wise sales
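The steps above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the project's actual schema or queries: the README names the three tables but not their columns, so the column split, the index name idx_sales_product, the view name revenue_by_category, and the sample rows are all assumptions made up for this example.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: table names come from the README, columns are assumed.
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    category   TEXT,
    material   TEXT
);
CREATE TABLE jewelry_sales (
    event_time TEXT,
    user_id    INTEGER REFERENCES users(user_id),
    product_id INTEGER REFERENCES products(product_id),
    quantity   INTEGER,
    price      REAL
);
-- Index on the join column (assumed name).
CREATE INDEX idx_sales_product ON jewelry_sales(product_id);
-- View wrapping a JOIN plus aggregation (assumed name).
CREATE VIEW revenue_by_category AS
SELECT p.category, SUM(s.quantity * s.price) AS revenue
FROM jewelry_sales AS s
JOIN products AS p ON p.product_id = s.product_id
GROUP BY p.category;
""")

# Made-up sample rows, for illustration only.
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(10, "Ring", "Gold"), (11, "Necklace", "Silver")],
)
conn.executemany(
    "INSERT INTO jewelry_sales VALUES (?, ?, ?, ?, ?)",
    [
        ("2021-01-01", 1, 10, 2, 100.0),
        ("2021-01-02", 2, 10, 1, 100.0),
        ("2021-01-03", 1, 11, 1, 50.0),
    ],
)

# Query the view: revenue per category, highest first.
rows = conn.execute(
    "SELECT category, revenue FROM revenue_by_category ORDER BY revenue DESC"
).fetchall()

# Subquery: products on sale lines worth more than the average line value.
above_avg = conn.execute("""
SELECT product_id FROM jewelry_sales
WHERE quantity * price > (SELECT AVG(quantity * price) FROM jewelry_sales)
ORDER BY product_id
""").fetchall()

print(rows)
print(above_avg)
```

Material-wise sales would follow the same pattern by grouping on p.material instead of p.category. Note that a plain view mainly saves retyping the query; any speed-up in this sketch would come from the index, not the view.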
🗂️ Dataset Overview
| Field Name | Description |
| --- | --- |
| event_time | Timestamp of the event |
| user_id | Unique user identifier |
| item_id | Unique item identifier |
| quantity | Quantity of items in the event |
| product_id | Product identifier |
| category | Product category (e.g., Ring, Necklace) |
| is_purchase | Flag indicating purchase (1 = purchase) |
| price | Price per item |
| session_id | User session identifier |
| unknown_flag | Unknown data flag |
| color | Item color |
| material | Item material (e.g., Gold, Silver) |
| gem | Gemstone type |
🧹 Data Cleaning (Python)
The dataset was cleaned with Pandas to give every column a consistent name and to prepare the data for import into the SQL database.
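A minimal sketch of the column-name cleanup might look like the following. The raw header spellings and the single data row are invented for illustration, and the naming rule (trim, lower-case, spaces to underscores) is an assumption; the README does not show the original cleaning script.

```python
import io

import pandas as pd

# Made-up sample with inconsistent headers (illustration only).
raw_csv = io.StringIO(
    " Event Time ,User ID,Price\n"
    "2021-01-01 10:00:00,1,100.0\n"
)
df = pd.read_csv(raw_csv)

# Normalise headers: trim whitespace, lower-case, spaces to underscores.
df.columns = (
    df.columns.str.strip()
    .str.lower()
    .str.replace(" ", "_", regex=False)
)

print(list(df.columns))
```

After this step the frame could be written back out with df.to_csv(..., index=False) so the headers match the database column names on import.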
Python: Data cleaning and CSV manipulation using Pandas
SQL: Complex query writing including JOINs, subqueries, grouping, aggregation
Database Design: Creating normalized tables and views for modular querying
Query Optimization: Using indexes to speed up database performance
Data Analysis: Extracting actionable business insights from sales data
Data Visualization: (Optional) Plotting charts with Matplotlib if applicable