Leyan0109/Python_Classification-Model-Credit-Risk

Classification models to predict 2-year serious delinquency risk using raw and WOE-transformed datasets with logistic regression and machine learning techniques.

Folders and files

README.md
credit-score-classification-python-code.ipynb
credit-score-result.pdf
credit_score-raw-data.csv

💳 Credit Risk Prediction

A Comparative Study of Classification Models and WOE-Based Feature Transformation

This project focuses on predicting whether a client will experience serious delinquency within the next two years using classification models. The dataset was preprocessed to ensure high data quality, including imputation, outlier treatment, and transformation using Weight of Evidence (WOE) to enhance interpretability and model effectiveness, especially for logistic regression.
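The README does not say which imputation or outlier rules were used, so the following is only a minimal sketch of one common approach: median imputation followed by clipping to the 1st and 99th percentiles. The function name `impute_and_cap`, the quantile bounds, and the toy column names are assumptions made for illustration, not the project's actual choices.

```python
import numpy as np
import pandas as pd


def impute_and_cap(df, lower_q=0.01, upper_q=0.99):
    """Fill missing numeric values with the column median, then clip
    each numeric column to its [lower_q, upper_q] quantile range."""
    out = df.copy()
    for col in out.select_dtypes(include="number").columns:
        out[col] = out[col].fillna(out[col].median())
        low, high = out[col].quantile([lower_q, upper_q])
        out[col] = out[col].clip(lower=low, upper=high)
    return out


# Toy data; the column names are illustrative only
toy = pd.DataFrame({
    "monthly_income": [3000.0, np.nan, 4500.0, 5200.0, 1_000_000.0],
    "age": [25, 40, 58, 33, 47],
})
clean = impute_and_cap(toy)
```

After this step `clean` has no missing values and the extreme income is pulled back toward the rest of the distribution. Whether clipping, removal, or binning is the right outlier treatment depends on the feature.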

Three machine learning models were evaluated to identify the best-performing approach in terms of accuracy, recall, and business relevance.

🧠 Models Compared

Logistic Regression (with and without WOE)
Random Forest Classifier
XGBoost Classifier
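A comparison of this kind can be sketched with scikit-learn as below. This is not the notebook's code: the data is synthetic, and the hyperparameters and the `class_weight="balanced"` setting are assumptions. XGBoost is left out to keep the sketch dependency-light; an `xgboost.XGBClassifier` follows the same `fit` / `predict_proba` interface and could be added to the `models` dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data standing in for the real dataset (about 7% positives)
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.93], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "random_forest": RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
}

# Fit each model and record ROC AUC (from probabilities) and recall (from labels)
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "roc_auc": roc_auc_score(y_test, proba),
        "recall": recall_score(y_test, model.predict(X_test)),
    }
```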

🧩 Feature Strategy

Raw Features – Cleaned but untransformed dataset
WOE-Transformed Features – Variables transformed using Weight of Evidence to support interpretability and improve logistic regression performance
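To make the WOE transformation concrete, here is a small sketch of how per-bin WOE and Information Value can be computed with pandas. It assumes the convention WOE = ln(share of goods / share of bads), quantile binning, and a 0.5 smoothing term to avoid log(0); the project's binning tool may use different conventions. The function name `woe_table` and the simulated data are hypothetical.

```python
import numpy as np
import pandas as pd


def woe_table(feature, target, bins=4):
    """Per-bin WOE and IV for a numeric feature and a binary target (1 = bad)."""
    df = pd.DataFrame({
        "bin": pd.qcut(feature, q=bins, duplicates="drop"),
        "bad": target,
    })
    grouped = df.groupby("bin", observed=True)["bad"].agg(bad="sum", total="count")
    grouped["good"] = grouped["total"] - grouped["bad"]
    # 0.5 is added to every count so that empty cells do not produce log(0)
    n_bins = len(grouped)
    dist_good = (grouped["good"] + 0.5) / (grouped["good"].sum() + 0.5 * n_bins)
    dist_bad = (grouped["bad"] + 0.5) / (grouped["bad"].sum() + 0.5 * n_bins)
    grouped["woe"] = np.log(dist_good / dist_bad)
    grouped["iv"] = (dist_good - dist_bad) * grouped["woe"]
    return grouped


# Simulated feature where a higher value means a higher chance of default
rng = np.random.default_rng(0)
utilization = pd.Series(rng.uniform(0, 1, 1000))
default = pd.Series((rng.uniform(0, 1, 1000) < 0.05 + 0.3 * utilization).astype(int))
table = woe_table(utilization, default)
```

Under this convention, bins with a larger share of bads get a lower (more negative) WOE, so `table["woe"]` falls as utilization rises. Replacing each raw value with its bin's WOE gives logistic regression a monotone, log-odds-scaled input.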

🔍 Key Findings

Random Forest delivered the best balance of performance metrics:
- ROC AUC: 0.8569
- Recall: 0.72
- Precision: 0.22
Logistic Regression performed well with WOE-transformed features:
- Recall: 0.70
- Precision: 0.19
- Enabled creation of an interpretable scorecard
XGBoost had the highest precision for non-delinquents but a low recall (0.16), making it less suitable for minimizing false negatives.
Key predictors across models included:
- Revolving Utilization of Unsecured Lines
- Past Due Counts (30–59, 60–89, 90+)
- Age
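The recall and precision figures above are easier to interpret next to their definitions. The sketch below uses made-up confusion-matrix counts, chosen only so that the ratios resemble those reported for Random Forest; they are not taken from the project's results.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 100 true delinquents, 72 of them flagged,
# plus 255 non-delinquents flagged by mistake
precision, recall = precision_recall(tp=72, fp=255, fn=28)
```

With these counts recall is 0.72 and precision is about 0.22: most delinquents are caught, but roughly four out of five flagged clients are false alarms. That trade-off is typical when the positive class is rare and the model is tuned to minimize false negatives.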

🛠️ Tools & Libraries Used

Python, Jupyter Notebook
pandas, numpy, seaborn, matplotlib
scikit-learn, XGBoost
WOE Binning tools, scikit-plot

📁 Repository Structure

credit-risk-prediction/

data/ # Raw dataset
notebooks/ # Model training and evaluation (LR, RF, XGBoost)
results/ # Evaluation metrics, ROC curves, confusion matrices, scorecard
README.md # Project overview

📌 Conclusion

This project demonstrates that:

Random Forest is the most effective model for credit delinquency prediction, offering the highest reported recall (0.72) at a precision of 0.22.
Logistic Regression with WOE remains highly interpretable and practical for deployment via scorecards.
Combining Random Forest's accuracy with Logistic Regression's explainability provides a strong, business-ready solution for credit scoring systems.
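For readers unfamiliar with scorecards, the sketch below shows one widely used way to turn a predicted probability of delinquency into a score, using "points to double the odds" (PDO) scaling. The README does not describe the scaling used in the project, so `base_score=600`, `base_odds=50`, and `pdo=20` are illustrative assumptions, as are the function names.

```python
import math


def scorecard_scaling(base_score=600, base_odds=50, pdo=20):
    """Return (factor, offset) so that good:bad odds of `base_odds` map to
    `base_score`, and every doubling of the odds adds `pdo` points."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def score_from_probability(p_bad, factor, offset):
    """Convert a predicted probability of delinquency into a score."""
    odds_good = (1 - p_bad) / p_bad
    return offset + factor * math.log(odds_good)


factor, offset = scorecard_scaling()
score_at_base = score_from_probability(1 / 51, factor, offset)      # odds 50:1
score_at_double = score_from_probability(1 / 101, factor, offset)   # odds 100:1
```

Here a client with 50:1 odds of being good scores 600 and one with 100:1 odds scores 620. In a full scorecard the same factor and offset are applied to each feature's coefficient times WOE, so the total splits into per-attribute points.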

