# Import essential libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from keras.models import Sequential
from keras.layers import Dense, Input
# Download the dataset directly from the UCI repository
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00350/default%20of%20credit%20card%20clients.xls'
df = pd.read_excel(url, header=1)  # Row 0 is a banner row; row 1 holds the actual column headers
# Display the first few rows of the DataFrame
print(df.head())
# Encode the categorical columns (SEX, EDUCATION, and MARRIAGE) as integers for model training.
# SEX: 1 = male, 2 = female
df['SEX'] = df['SEX'].map({1: 0, 2: 1})
# EDUCATION: 1 = graduate school, 2 = university, 3 = high school; the dataset also contains
# undocumented codes (0, 4, 5, 6), which we fold into a single 'other' category
df['EDUCATION'] = df['EDUCATION'].map({1: 0, 2: 1, 3: 2}).fillna(3).astype(int)
# MARRIAGE: 1 = married, 2 = single; codes 0 and 3 are other/unknown
df['MARRIAGE'] = df['MARRIAGE'].map({1: 0, 2: 1}).fillna(2).astype(int)
# Split data into features (X) and target variable (y)
X = df.drop(['ID', 'default payment next month'], axis=1) # Drop ID and target column
y = df['default payment next month'] # The target variable (whether the user defaulted)
# Split data into train and test sets (80% train, 20% test), stratified to preserve the class ratio
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
# Standardize the features, fitting the scaler on the training set only to avoid test-set leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
   ID  LIMIT_BAL  SEX  EDUCATION  MARRIAGE  AGE  PAY_0  PAY_2  PAY_3  PAY_4  \
0   1      20000    2          2         1   24      2      2     -1     -1
1   2     120000    2          2         2   26     -1      2      0      0
2   3      90000    2          2         2   34      0      0      0      0
3   4      50000    2          2         1   37      0      0      0      0
4   5      50000    1          2         1   57     -1      0     -1      0

   ...  BILL_AMT4  BILL_AMT5  BILL_AMT6  PAY_AMT1  PAY_AMT2  PAY_AMT3  \
0  ...          0          0          0         0       689         0
1  ...       3272       3455       3261         0      1000      1000
2  ...      14331      14948      15549      1518      1500      1000
3  ...      28314      28959      29547      2000      2019      1200
4  ...      20940      19146      19131      2000     36681     10000

   PAY_AMT4  PAY_AMT5  PAY_AMT6  default payment next month
0         0         0         0                           1
1      1000         0      2000                           1
2      1000      1000      5000                           0
3      1100      1069      1000                           0
4      9000       689       679                           0

[5 rows x 25 columns]
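# Before modeling, it is worth checking the class balance: in the full UCI dataset only
# about 22% of clients default, so raw accuracy can look better than a model really is.
# A minimal check, assuming the X/y split above has already run:
print(y.value_counts(normalize=True))  # fraction of non-defaults (0) vs. defaults (1)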
# Logistic Regression Model
logreg = LogisticRegression(max_iter=1000)  # raise the lbfgs iteration cap to ensure convergence
logreg.fit(X_train, y_train)
# Predicting the test set results
y_pred_logreg = logreg.predict(X_test)
# Evaluate the Logistic Regression model
print("Logistic Regression Accuracy: ", accuracy_score(y_test, y_pred_logreg))
print("Logistic Regression Classification Report:\n", classification_report(y_test, y_pred_logreg))
# Confusion Matrix for Logistic Regression
sns.heatmap(confusion_matrix(y_test, y_pred_logreg), annot=True, fmt='d', cmap='Blues', xticklabels=['No Default', 'Default'], yticklabels=['No Default', 'Default'])
plt.title('Logistic Regression Confusion Matrix')
plt.show()
Logistic Regression Accuracy: 0.47
Logistic Regression Classification Report:
               precision    recall  f1-score   support

           0        0.47      0.33      0.39       102
           1        0.47      0.61      0.53        98

    accuracy                            0.47       200
   macro avg        0.47      0.47      0.46       200
weighted avg        0.47      0.47      0.46       200
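# Accuracy at a fixed 0.5 threshold hides the precision/recall trade-off, so a
# threshold-independent metric is a useful companion. A minimal sketch, assuming the
# fitted logreg model from the cell above:
from sklearn.metrics import roc_auc_score
y_prob_logreg = logreg.predict_proba(X_test)[:, 1]  # predicted probability of default
print("Logistic Regression ROC AUC: ", roc_auc_score(y_test, y_prob_logreg))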
# Decision Tree Model
dtree = DecisionTreeClassifier(random_state=42)
dtree.fit(X_train, y_train)
# Predicting the test set results
y_pred_dtree = dtree.predict(X_test)
# Evaluate the Decision Tree model
print("Decision Tree Accuracy: ", accuracy_score(y_test, y_pred_dtree))
print("Decision Tree Classification Report:\n", classification_report(y_test, y_pred_dtree))
# Confusion Matrix for Decision Tree
sns.heatmap(confusion_matrix(y_test, y_pred_dtree), annot=True, fmt='d', cmap='Blues', xticklabels=['No Default', 'Default'], yticklabels=['No Default', 'Default'])
plt.title('Decision Tree Confusion Matrix')
plt.show()
Decision Tree Accuracy: 0.495
Decision Tree Classification Report:
               precision    recall  f1-score   support

           0        0.51      0.46      0.48       102
           1        0.49      0.53      0.51        98

    accuracy                            0.49       200
   macro avg        0.50      0.50      0.49       200
weighted avg        0.50      0.49      0.49       200
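# An unconstrained decision tree usually memorizes the training set, which helps explain
# near-chance test accuracy. A quick, illustrative check of how depth affects the
# train/test gap (a hypothetical tuning loop, not part of the original run):
for depth in [3, 5, 10, None]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: train={tree.score(X_train, y_train):.2f}, test={tree.score(X_test, y_test):.2f}")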
# Random Forest Model
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
# Predicting the test set results
y_pred_rf = rf.predict(X_test)
# Evaluate the Random Forest model
print("Random Forest Accuracy: ", accuracy_score(y_test, y_pred_rf))
print("Random Forest Classification Report:\n", classification_report(y_test, y_pred_rf))
# Confusion Matrix for Random Forest
sns.heatmap(confusion_matrix(y_test, y_pred_rf), annot=True, fmt='d', cmap='Blues', xticklabels=['No Default', 'Default'], yticklabels=['No Default', 'Default'])
plt.title('Random Forest Confusion Matrix')
plt.show()
Random Forest Accuracy: 0.505
Random Forest Classification Report:
               precision    recall  f1-score   support

           0        0.52      0.36      0.43       102
           1        0.50      0.65      0.56        98

    accuracy                            0.51       200
   macro avg        0.51      0.51      0.50       200
weighted avg        0.51      0.51      0.49       200
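# Random forests expose per-feature importances, which can hint at which repayment-history
# columns drive the predictions. A minimal sketch using the fitted rf model and the column
# names from the original DataFrame X:
importances = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importances.head(10))  # the ten most influential features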
# SVM Model
svm = SVC(kernel='linear', random_state=42)
svm.fit(X_train, y_train)
# Predicting the test set results
y_pred_svm = svm.predict(X_test)
# Evaluate the SVM model
print("SVM Accuracy: ", accuracy_score(y_test, y_pred_svm))
print("SVM Classification Report:\n", classification_report(y_test, y_pred_svm))
# Confusion Matrix for SVM
sns.heatmap(confusion_matrix(y_test, y_pred_svm), annot=True, fmt='d', cmap='Blues', xticklabels=['No Default', 'Default'], yticklabels=['No Default', 'Default'])
plt.title('SVM Confusion Matrix')
plt.show()
SVM Accuracy: 0.465
SVM Classification Report:
               precision    recall  f1-score   support

           0        0.47      0.37      0.42       102
           1        0.46      0.56      0.51        98

    accuracy                            0.47       200
   macro avg        0.47      0.47      0.46       200
weighted avg        0.47      0.47      0.46       200
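# A linear-kernel SVC trains in roughly quadratic time in the number of samples, which
# gets slow on the full 30,000-row dataset. A sketch of a faster linear alternative,
# assuming the same scaled splits (LinearSVC optimizes a slightly different objective,
# so scores may differ a little):
from sklearn.svm import LinearSVC
svm_fast = LinearSVC(random_state=42, max_iter=5000)
svm_fast.fit(X_train, y_train)
print("LinearSVC Accuracy: ", svm_fast.score(X_test, y_test))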
# Neural Network Model
ann = Sequential()
# Input layer: declare the feature dimension explicitly (input_dim on Dense is deprecated)
ann.add(Input(shape=(X_train.shape[1],)))
# Hidden layers
ann.add(Dense(units=64, activation='relu'))
ann.add(Dense(units=32, activation='relu'))
# Output Layer
ann.add(Dense(units=1, activation='sigmoid')) # Sigmoid for binary classification
# Compile the ANN model
ann.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Train the model
ann.fit(X_train, y_train, epochs=20, batch_size=32, verbose=1)
# Predicting the test set results
y_pred_ann = (ann.predict(X_test) > 0.5).astype(int).ravel()  # threshold the sigmoid outputs at 0.5
# Evaluate the ANN model
print("Neural Network Accuracy: ", accuracy_score(y_test, y_pred_ann))
print("Neural Network Classification Report:\n", classification_report(y_test, y_pred_ann))
# Confusion Matrix for Neural Network
sns.heatmap(confusion_matrix(y_test, y_pred_ann), annot=True, fmt='d', cmap='Blues', xticklabels=['No Default', 'Default'], yticklabels=['No Default', 'Default'])
plt.title('Neural Network Confusion Matrix')
plt.show()
Epoch 1/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 6s 7ms/step - accuracy: 0.5537 - loss: 0.7256
Epoch 2/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.5628 - loss: 0.6892
Epoch 3/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 1s 20ms/step - accuracy: 0.5913 - loss: 0.6700
Epoch 4/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 1s 24ms/step - accuracy: 0.6065 - loss: 0.6442
Epoch 5/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 1s 18ms/step - accuracy: 0.6539 - loss: 0.6362
Epoch 6/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - accuracy: 0.6412 - loss: 0.6340
Epoch 7/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.6932 - loss: 0.6016
Epoch 8/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7007 - loss: 0.6024
Epoch 9/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.7290 - loss: 0.5780
Epoch 10/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.6993 - loss: 0.5842
Epoch 11/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.7370 - loss: 0.5601
Epoch 12/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.7592 - loss: 0.5485
Epoch 13/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.7631 - loss: 0.5508
Epoch 14/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.7523 - loss: 0.5306
Epoch 15/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.7679 - loss: 0.5294
Epoch 16/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.8134 - loss: 0.4902
Epoch 17/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.8027 - loss: 0.4838
Epoch 18/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.8028 - loss: 0.4866
Epoch 19/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.8393 - loss: 0.4661
Epoch 20/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8603 - loss: 0.4406
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step
Neural Network Accuracy: 0.495
Neural Network Classification Report:
               precision    recall  f1-score   support

           0        0.50      0.50      0.50       102
           1        0.48      0.49      0.49        98

    accuracy                            0.49       200
   macro avg        0.49      0.49      0.49       200
weighted avg        0.50      0.49      0.50       200
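# A side-by-side view makes the final comparison easier to read. A minimal sketch,
# assuming the five prediction arrays from the cells above are still in scope:
accuracies = {
    'LogReg': accuracy_score(y_test, y_pred_logreg),
    'Tree': accuracy_score(y_test, y_pred_dtree),
    'Forest': accuracy_score(y_test, y_pred_rf),
    'SVM': accuracy_score(y_test, y_pred_svm),
    'ANN': accuracy_score(y_test, y_pred_ann),
}
plt.bar(list(accuracies.keys()), list(accuracies.values()), color='steelblue')
plt.ylabel('Test accuracy')
plt.title('Model comparison on the credit default test set')
plt.show()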