Loan approval system using SVM
Kshitij-Shresth/Loan-Status-Prediction
The pipeline encodes categorical features, generates a correlation heatmap for feature interaction analysis, and trains a linear SVM to predict whether a loan is approved (binary classification). In a market context, this project mimics the decision-making process used by financial institutions for loan risk assessment. Encoding categorical variables such as employment status, property area, and education level turns qualitative data into numeric inputs the model can use. SVMs are effective in high-dimensional spaces, which makes them a strong candidate for binary classification problems in finance where the margin between approved and denied loans may be subtle.
This framework serves as a prototype for building scalable loan approval models that financial institutions can use to automate risk evaluation, improve accuracy in decision-making, and minimize default rates.
Heatmap Generation: The numerical columns in the dataset are selected, and a heatmap is created to visualize the correlation between the features using seaborn.
numeric_data = data.select_dtypes(include=[np.number])
sns.heatmap(numeric_data.corr(), annot=True, cmap='coolwarm', linewidths=0.5)
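To see which columns actually reach the heatmap, the selection and correlation steps can be run on their own. The sketch below uses a made-up three-row-plus frame; the column names `ApplicantIncome` and `LoanAmount` and all values are assumptions for illustration, not taken from the project's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: column names and values are invented for illustration.
data = pd.DataFrame({
    "ApplicantIncome": [5000, 3000, 4000, 6000],
    "LoanAmount": [130.0, 66.0, 120.0, 141.0],
    "Property_Area": ["Urban", "Rural", "Urban", "Semiurban"],
})

# Keep only numeric columns; the string column is dropped here,
# so it would not appear in the heatmap until it is encoded.
numeric_data = data.select_dtypes(include=[np.number])

# Pairwise Pearson correlation: this matrix is what sns.heatmap would draw.
corr = numeric_data.corr()

print(list(numeric_data.columns))
print(corr.round(2))
```

Because `select_dtypes` runs before any encoding in this sketch, categorical columns are left out of the correlation matrix; running the heatmap after the encoding step would include them.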
Feature Encoding: Categorical features are encoded into numerical values for use in machine learning:
- Loan_Status: Approved Y (1), Not Approved N (0)
- Dependents: Number of Dependents (3+ -> 4)
- Married: Married (Yes -> 1), Not Married (No -> 0)
- Gender: Male (1), Female (0)
- Education: Graduate (1), Not Graduate (0)
- Self_Employed: Self-Employed (Yes -> 1), Not Self-Employed (No -> 0)
- Property_Area: Urban (2), Rural (0), Semiurban (1)
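The mapping above could be applied with pandas roughly as follows. This is a minimal sketch, not necessarily how the repository does it: the two sample rows are invented, and the use of `Series.map` with a dictionary per column is an assumption.

```python
import pandas as pd

# Two invented rows; only the column names and the codes come from the list above.
data = pd.DataFrame({
    "Loan_Status": ["Y", "N"],
    "Dependents": ["0", "3+"],
    "Married": ["Yes", "No"],
    "Gender": ["Male", "Female"],
    "Education": ["Graduate", "Not Graduate"],
    "Self_Employed": ["No", "Yes"],
    "Property_Area": ["Urban", "Semiurban"],
})

# One dictionary per categorical column, mirroring the documented codes.
encoding = {
    "Loan_Status": {"Y": 1, "N": 0},
    "Married": {"Yes": 1, "No": 0},
    "Gender": {"Male": 1, "Female": 0},
    "Education": {"Graduate": 1, "Not Graduate": 0},
    "Self_Employed": {"Yes": 1, "No": 0},
    "Property_Area": {"Rural": 0, "Semiurban": 1, "Urban": 2},
}

encoded = data.copy()
for column, mapping in encoding.items():
    # Values missing from the mapping would become NaN, which makes gaps easy to spot.
    encoded[column] = encoded[column].map(mapping)

# Dependents is stored as text; "3+" is replaced by 4 before casting to int.
encoded["Dependents"] = encoded["Dependents"].replace("3+", "4").astype(int)

print(encoded)
```

Note that coding Property_Area as 0, 1, 2 imposes an order (Rural < Semiurban < Urban) that a linear model will treat as meaningful; one-hot encoding is a common alternative when no such order is intended.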
The dataset is split into training and test sets using a 90-10 split. Stratified sampling is used to maintain the proportion of labels in both sets. The random seed is set to 7, chosen based on the observed training and test mean values.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1, stratify=Y, random_state=7)
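The effect of `stratify=Y` can be checked by comparing label means before and after the split. The sketch below uses synthetic labels (70 approved, 30 not approved) that are assumed for illustration only; the split call itself matches the one above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data, invented for illustration: 70 approved (1) and 30 not approved (0).
Y = np.array([1] * 70 + [0] * 30)
X = np.arange(200).reshape(100, 2)  # placeholder features

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.1, stratify=Y, random_state=7
)

# With stratification, the share of approved loans should be close to
# the overall share in both subsets.
print(len(X_train), len(X_test))
print(Y.mean(), Y_train.mean(), Y_test.mean())
```

Without `stratify`, a 10% test set this small could by chance contain noticeably more or fewer approvals than the full dataset, which would make the test accuracy harder to interpret.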
A Support Vector Classifier (SVC) with a linear kernel is trained on the training set.
classifier = svm.SVC(kernel='linear')
classifier.fit(X_train, Y_train)
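One practical benefit of the linear kernel is that the fitted model exposes one weight per feature through `coef_`, which gives a rough view of which inputs drive the decision. The sketch below illustrates this on synthetic data where the label depends only on the first feature; the data and the seed are assumptions, not part of the project.

```python
import numpy as np
from sklearn import svm

# Synthetic data, invented for illustration: the label depends only on feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = (X[:, 0] > 0).astype(int)

classifier = svm.SVC(kernel="linear")
classifier.fit(X, Y)

# coef_ is only available for the linear kernel; for a binary problem
# it has shape (1, n_features).
weights = classifier.coef_[0]
print(weights)
print(classifier.intercept_)
```

On the loan data, the same attribute could be paired with the feature column names to see which encoded variables carry the most weight, keeping in mind that weights are only comparable when features are on similar scales.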
After training the model, predictions are made on the training set and the accuracy score is calculated. The model achieves approximately 80% accuracy on the training data.
X_train_prediction = classifier.predict(X_train)
training_data_accuracy = accuracy_score(Y_train, X_train_prediction)
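Training accuracy alone does not show how the model behaves on unseen applicants, so it is usually worth scoring the held-out test set as well. The sketch below runs the same split, fit, and scoring steps end to end on synthetic data; the data-generating rule and the name `test_data_accuracy` are assumptions for illustration, and the numbers it prints say nothing about the real dataset.

```python
import numpy as np
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data, invented for illustration: a noisy linear rule on two of four features.
rng = np.random.default_rng(7)
X = rng.normal(size=(300, 4))
noise = rng.normal(scale=0.5, size=300)
Y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.1, stratify=Y, random_state=7
)

classifier = svm.SVC(kernel="linear")
classifier.fit(X_train, Y_train)

# accuracy_score expects (y_true, y_pred); accuracy is symmetric,
# but keeping this order matters for other metrics such as precision.
training_data_accuracy = accuracy_score(Y_train, classifier.predict(X_train))
test_data_accuracy = accuracy_score(Y_test, classifier.predict(X_test))

print("train:", training_data_accuracy)
print("test:", test_data_accuracy)
```

A large gap between the two scores would suggest overfitting; with only 10% of the data held out, the test score is also fairly noisy, so cross-validation may give a steadier estimate.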