Data analysis project combining Python and SQL to clean, query, and analyze jewelry sales data. Demonstrates skills in data preprocessing with Pandas, writing advanced SQL queries (joins, aggregations, subqueries), and optimizing database performance with indexes and views.


yuvraj0412s/Jewelry-Sales-Data-Analysis-SQL-Python


Python · SQL · Pandas · SQLite · Data Cleaning · Matplotlib


🧠 Project Objective

This project focuses on analyzing a jewelry e-commerce dataset to extract meaningful insights about user purchases, product categories, and sales performance.
Python (Pandas) is used for data cleaning and preprocessing, while SQL queries handle complex data retrieval, aggregation, joining, and optimization.


🚀 What I Did

  • Cleaned and preprocessed raw sales data with Python to prepare it for database import
  • Designed and created relational database tables (users, products, and jewelry_sales)
  • Performed advanced SQL queries including filtering, grouping, aggregation, and multiple types of JOINs
  • Used subqueries and created SQL views to simplify and speed up analysis
  • Optimized query performance by adding indexes on key columns
  • Extracted key business insights such as total revenue, popular categories, and material-wise sales
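
The table design, join, view, and index steps listed above can be sketched with Python's built-in sqlite3 module. The README names the three tables but not their columns, so the column layout, the index names, the view name `sales_with_product`, and the sample rows below are all assumptions made for illustration, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: only the table names come from the README.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    category   TEXT,
    material   TEXT
);
CREATE TABLE jewelry_sales (
    item_id    INTEGER PRIMARY KEY,
    event_time TEXT,
    user_id    INTEGER REFERENCES users(user_id),
    product_id INTEGER REFERENCES products(product_id),
    quantity   INTEGER,
    price      REAL
);
-- Indexes on the join keys (names are illustrative).
CREATE INDEX idx_sales_user ON jewelry_sales(user_id);
CREATE INDEX idx_sales_product ON jewelry_sales(product_id);
-- A view that hides the sales/products join for repeated analysis.
CREATE VIEW sales_with_product AS
SELECT s.item_id, s.user_id, p.category, p.material,
       s.quantity * s.price AS line_total
FROM jewelry_sales AS s
JOIN products AS p ON p.product_id = s.product_id;
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [(10, "Ring", "Gold"), (11, "Necklace", "Silver")],
    )
    conn.executemany(
        "INSERT INTO jewelry_sales VALUES (?, ?, ?, ?, ?, ?)",
        [
            (100, "2021-01-01 10:00:00", 1, 10, 1, 200.0),
            (101, "2021-01-02 11:00:00", 2, 11, 2, 50.0),
            (102, "2021-01-03 12:00:00", 1, 10, 1, 200.0),
        ],
    )
    conn.commit()
    return conn


def spend_per_user(conn):
    """LEFT JOIN keeps users without sales; their total is reported as 0."""
    return conn.execute(
        """
        SELECT u.user_id, COALESCE(SUM(v.line_total), 0) AS total_spend
        FROM users AS u
        LEFT JOIN sales_with_product AS v ON v.user_id = u.user_id
        GROUP BY u.user_id
        ORDER BY u.user_id
        """
    ).fetchall()


if __name__ == "__main__":
    for user_id, total in spend_per_user(build_demo_db()):
        print(user_id, total)
```

Note that a plain SQLite view is a stored query rather than a cached result, so in this sketch the view mainly shortens the queries written on top of it, while the indexes are what help the join lookups.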

🗂️ Dataset Overview

| Field Name   | Description                              |
|--------------|------------------------------------------|
| event_time   | Timestamp of the event                   |
| user_id      | Unique user identifier                   |
| item_id      | Unique item identifier                   |
| quantity     | Quantity of items in the event           |
| product_id   | Product identifier                       |
| category     | Product category (e.g., Ring, Necklace)  |
| is_purchase  | Flag indicating purchase (1 = purchase)  |
| price        | Price per item                           |
| session_id   | User session identifier                  |
| unknown_flag | Unknown data flag                        |
| color        | Item color                               |
| material     | Item material (e.g., Gold, Silver)       |
| gem          | Gemstone type                            |
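
To show how these fields combine into the category-wise and material-wise figures mentioned earlier, here is a small sketch in plain Python. The project itself does this in SQL. Treating revenue as `price * quantity` over rows where `is_purchase` is 1 is an assumption based on the field descriptions, and the helper name `revenue_by` and the sample rows are made up for illustration.

```python
from collections import defaultdict


def revenue_by(rows, field):
    """Sum price * quantity over purchase rows, grouped by `field`.

    `rows` are dicts keyed by the field names above, with string values
    as csv.DictReader would produce them. Revenue = price * quantity is
    an assumption, not a definition taken from the project.
    """
    totals = defaultdict(float)
    for row in rows:
        if int(row["is_purchase"]) != 1:
            continue  # skip non-purchase events
        totals[row[field]] += float(row["price"]) * int(row["quantity"])
    # Largest revenue first
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample rows, limited to the fields this sketch needs.
SAMPLE_ROWS = [
    {"category": "Ring", "material": "Gold", "is_purchase": "1",
     "price": "200.0", "quantity": "1"},
    {"category": "Necklace", "material": "Silver", "is_purchase": "1",
     "price": "50.0", "quantity": "2"},
    {"category": "Ring", "material": "Silver", "is_purchase": "0",
     "price": "80.0", "quantity": "1"},
    {"category": "Ring", "material": "Gold", "is_purchase": "1",
     "price": "150.0", "quantity": "1"},
]

if __name__ == "__main__":
    print(revenue_by(SAMPLE_ROWS, "category"))
    print(revenue_by(SAMPLE_ROWS, "material"))
```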

🧹 Data Cleaning (Python)

The dataset was cleaned using Pandas to ensure consistent column names and prepare for further SQL analysis.

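    import pandas as pd

    # Load the raw export
    df = pd.read_csv("jewelry.csv")

    # Assign consistent column names (13 columns, in file order)
    df.columns = [
        "event_time", "user_id", "item_id", "quantity", "product_id",
        "category", "is_purchase", "price", "session_id", "unknown_flag",
        "color", "material", "gem",
    ]

    # Write the cleaned file, without the DataFrame index, for database import
    df.to_csv("jewelry_cleaned.csv", index=False)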
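
The cleaned file is described as ready for database import, but the import step itself is not shown. One possible way to do it, using only the standard library, is sketched below. The staging table name `jewelry_raw`, the helper `load_csv`, and the sample data are hypothetical, and the sketch reads from an in-memory string rather than from `jewelry_cleaned.csv` so that it runs on its own.

```python
import csv
import io
import sqlite3

COLUMNS = [
    "event_time", "user_id", "item_id", "quantity", "product_id",
    "category", "is_purchase", "price", "session_id", "unknown_flag",
    "color", "material", "gem",
]

# Made-up sample in the cleaned layout; the second row has no gem.
SAMPLE = (
    ",".join(COLUMNS) + "\n"
    "2021-01-01 10:00:00,1,100,1,10,Ring,1,200.0,s1,0,white,Gold,diamond\n"
    "2021-01-02 11:00:00,2,101,2,11,Necklace,1,50.0,s2,0,yellow,Silver,\n"
)


def load_csv(conn, csv_file, table="jewelry_raw"):
    """Copy a cleaned CSV into an untyped staging table; return the row count.

    `table` is a hypothetical staging name. Columns are declared without
    types, so values are stored as text unless cast later in SQL.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    column_defs = ", ".join(f'"{name}"' for name in COLUMNS)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_defs})')
    placeholders = ", ".join("?" for _ in COLUMNS)
    # Empty CSV fields become NULL instead of empty strings.
    rows = [tuple(row[name] or None for name in COLUMNS) for row in reader]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(load_csv(connection, io.StringIO(SAMPLE)), "rows loaded")
```

For the real file, the same function could be called with `open("jewelry_cleaned.csv", newline="")` in place of the `io.StringIO` object.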

🎯 Skills & Tools Demonstrated

  • Python: Data cleaning and CSV manipulation using Pandas
  • SQL: Complex query writing including JOINs, subqueries, grouping, aggregation
  • Database Design: Creating normalized tables and views for modular querying
  • Query Optimization: Using indexes to speed up database performance
  • Data Analysis: Extracting actionable business insights from sales data
  • Data Visualization: (Optional) Plotting charts with Matplotlib if applicable
