DL - LR - 1.ipynb - Colab

The document outlines a project to develop a Linear Regression Model for predicting house prices in the USA using a provided dataset. It includes steps for data import, exploratory data analysis, model training, and evaluation. Key features used for prediction include average area income, house age, number of rooms, number of bedrooms, and area population.

4/25/25, 9:12 AM DL_LR_1.ipynb - Colab

1. Problem Statement: A real estate agent wants help predicting house prices for
regions in the USA. He gave you the dataset to work on, and you decided to use a
Linear Regression model. Create a model that will help him estimate what a house
would sell for. URL for the dataset:
https://github.com/huzaifsayed/Linear-Regression-Model-for-House-PricePrediction/blob/master/USA_Housing.csv

Linear Regression Machine Learning Project for House Price Prediction
Import Libraries

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

Importing the Data and Checking It Out

HouseDF = pd.read_csv('USA_Housing.csv')

HouseDF.head()

   Avg. Area Income  Avg. Area House Age  Avg. Area Number of Rooms  Avg. Area Number of Bedrooms  Area Population         Price  Address
0      79545.458574             5.682861                   7.009188                          4.09     23086.800503  1.059034e+06  208 Michael Ferry Apt. 674\nLaurabury, NE 3701...
1      79248.642455             6.002900                   6.730821                          3.09     40173.072174  1.505891e+06  188 Johnson Views Suite 079\nLake Kathleen, CA...
2      61287.067179             5.865890                   8.512727                          5.13     36882.159400  1.058988e+06  9127 Elizabeth Stravenue\nDanieltown, WI 06482...

HouseDF.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Avg. Area Income 5000 non-null float64
1 Avg. Area House Age 5000 non-null float64
2 Avg. Area Number of Rooms 5000 non-null float64
3 Avg. Area Number of Bedrooms 5000 non-null float64
4 Area Population 5000 non-null float64
5 Price 5000 non-null float64
6 Address 5000 non-null object
dtypes: float64(6), object(1)
memory usage: 273.6+ KB

HouseDF.describe()

       Avg. Area Income  Avg. Area House Age  Avg. Area Number of Rooms  Avg. Area Number of Bedrooms  Area Population         Price
count       5000.000000          5000.000000                5000.000000                   5000.000000      5000.000000  5.000000e+03
mean       68583.108984             5.977222                   6.987792                      3.981330     36163.516039  1.232073e+06
std        10657.991214             0.991456                   1.005833                      1.234137      9925.650114  3.531176e+05
min        17796.631190             2.644304                   3.236194                      2.000000       172.610686  1.593866e+04
25%        61480.562388             5.322283                   6.299250                      3.140000     29403.928702  9.975771e+05
50%        68804.286404             5.970429                   7.002902                      4.050000     36199.406689  1.232669e+06
75%        75783.338666             6.650808                   7.665871                      4.490000     42861.290769  1.471210e+06
max       107701.748378             9.519088                  10.759588                      6.500000     69621.713378  2.469066e+06

HouseDF.columns

Index(['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
'Avg. Area Number of Bedrooms', 'Area Population', 'Price', 'Address'],
dtype='object')

sns.pairplot(HouseDF, diag_kind='kde')
plt.show()


Exploratory Data Analysis for House Price Prediction


sns.histplot(HouseDF['Price'], kde=True)
plt.show()


corr = HouseDF.select_dtypes(include='number').corr()['Price'].drop('Price')
corr.sort_values().plot(kind='barh', figsize=(8, 5), title='Correlation with Price')
plt.show()
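The bar chart above ranks features by their correlation with Price. pandas' `.corr()` uses the Pearson coefficient by default; the following is a minimal plain-Python sketch of that formula. The helper name `pearson_r` and the sample numbers are made up for illustration and are not taken from USA_Housing.csv.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: price rises exactly linearly with income here.
income = [50000.0, 60000.0, 70000.0, 80000.0]
price = [900000.0, 1100000.0, 1300000.0, 1500000.0]
r = pearson_r(income, price)
print(r)
```

A value near 1 means the feature rises almost linearly with price, a value near -1 means it falls, and a value near 0 means there is little linear relationship.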

Training a Linear Regression Model


X (Features) and y (Target)

X = HouseDF[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
             'Avg. Area Number of Bedrooms', 'Area Population']]

y = HouseDF['Price']

Split Data into Train, Test

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)
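With 5000 rows and `test_size=0.4`, the call above should leave 3000 rows for training and 2000 for testing. The sketch below illustrates the idea of a seeded, shuffled split using only the standard library. The helper `shuffled_split` is hypothetical and does not reproduce scikit-learn's exact shuffling, so the selected rows will differ from those chosen by `train_test_split`.

```python
import random

def shuffled_split(n_rows, test_size, seed):
    """Return (train_indices, test_indices) for a seeded random split."""
    indices = list(range(n_rows))
    # A fixed seed makes the shuffle, and so the split, reproducible.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_size))
    test_indices = indices[:n_test]
    train_indices = indices[n_test:]
    return train_indices, test_indices

train_idx, test_idx = shuffled_split(n_rows=5000, test_size=0.4, seed=101)
print(len(train_idx), len(test_idx))
```

Fixing the seed plays the same role as `random_state=101` in the notebook: rerunning the cell gives the same partition, so results stay comparable between runs.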

Creating and Training the LinearRegression Model


from sklearn.linear_model import LinearRegression

lm = LinearRegression()

lm.fit(X_train,y_train)

LinearRegression()

LinearRegression Model Evaluation


print(lm.intercept_)

-2640159.7968526953

coeff_df = pd.DataFrame(lm.coef_, index=X.columns, columns=['Coefficient'])
coeff_df

                              Coefficient
Avg. Area Income                21.528276
Avg. Area House Age         164883.282027
Avg. Area Number of Rooms   122368.678027
Avg. Area Number of Bedrooms  2233.801864
Area Population                 15.150420
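Each coefficient is the change in predicted price for a one-unit increase in that feature with the others held fixed; for example, one more year of average house age adds about 164,883 to the prediction. As a rough sketch of how the fitted model turns a row into a price, the snippet below applies the intercept and coefficients printed above to the first row shown by `HouseDF.head()`. The variable names are illustrative, and whether that row fell into the training or the test split is not known from the output.

```python
# Intercept and coefficients copied from the notebook output above.
intercept = -2640159.7968526953
coefficients = {
    "Avg. Area Income": 21.528276,
    "Avg. Area House Age": 164883.282027,
    "Avg. Area Number of Rooms": 122368.678027,
    "Avg. Area Number of Bedrooms": 2233.801864,
    "Area Population": 15.150420,
}

# Feature values of row 0 from HouseDF.head().
first_row = {
    "Avg. Area Income": 79545.458574,
    "Avg. Area House Age": 5.682861,
    "Avg. Area Number of Rooms": 7.009188,
    "Avg. Area Number of Bedrooms": 4.09,
    "Area Population": 23086.800503,
}

# Linear regression prediction: intercept plus the weighted sum of features.
predicted_price = intercept + sum(
    coefficients[name] * first_row[name] for name in coefficients
)
print(round(predicted_price))
```

The result can be compared with the recorded Price for that row (1.059034e+06) to get a feel for the size of the error on a single house.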

Predictions from our Linear Regression Model


predictions = lm.predict(X_test)

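plt.scatter(y_test, predictions)
plt.xlabel('Actual price')   # label the axes so the plot is readable on its own
plt.ylabel('Predicted price')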
plt.show()
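The scatter plot gives a visual check only; a numeric summary usually accompanies it. The sketch below computes MAE, MSE, RMSE and R² in plain Python to make the formulas explicit. The helper `regression_metrics` and the sample values are made up for illustration; in the notebook itself, `sklearn.metrics` offers `mean_absolute_error`, `mean_squared_error` and `r2_score`, which could be applied to `y_test` and `predictions`.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n       # mean absolute error
    mse = sum(e * e for e in errors) / n        # mean squared error
    rmse = math.sqrt(mse)                       # same units as the target
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)         # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical values, not results from the housing model.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is often the easiest of these to read because it is in the same units as Price, while R² reports the share of variance in the target that the model explains.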
