Uploaded by

ssstutuorial

Multiple Linear Regression

1. Multiple Linear Regression


Multiple Linear Regression models the relationship between a single continuous
dependent variable and two or more independent variables.

2. Problem statement
Suppose we are planning to buy a house and need to predict its price.
Here the price depends on the area (in square feet), the number of bedrooms, and the
age of the home (in years).
Given the known home prices in the dataset below, predict the price of a home that has:

• 3000 sq ft area, 3 bedrooms, 40 years old

• 2500 sq ft area, 4 bedrooms, 5 years old

3. Dataset

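We use homeprices1.csv, which contains six rows. The column names are lowercase in
the file, because the programs access them as df.bedrooms and df.price. The bedrooms
value in the third row is missing (NaN).

area    bedrooms    age    price
2600    3           20     550000
3000    4           15     565000
3200    NaN         18     610000
3600    3           30     595000
4000    5           8      760000
4100    6           8      810000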
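
The programs in section 5 expect homeprices1.csv to exist in the working directory. As a minimal sketch, the file could be recreated from the table with Python's standard csv module. The lowercase column names and the empty cell for the missing value are assumptions based on how the programs access the data, and write_homeprices is a hypothetical helper name.

```python
import csv

# Rows copied from the dataset table; None marks the missing bedrooms value.
ROWS = [
    (2600, 3, 20, 550000),
    (3000, 4, 15, 565000),
    (3200, None, 18, 610000),
    (3600, 3, 30, 595000),
    (4000, 5, 8, 760000),
    (4100, 6, 8, 810000),
]


def write_homeprices(path="homeprices1.csv"):
    """Write the dataset to a CSV file (hypothetical helper, assumed column names)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["area", "bedrooms", "age", "price"])
        for area, bedrooms, age, price in ROWS:
            # An empty cell is read back by pandas.read_csv as NaN.
            writer.writerow([area, "" if bedrooms is None else bedrooms, age, price])
    return path


if __name__ == "__main__":
    print("Wrote", write_homeprices())
```

Running this once before the demo programs should give pd.read_csv the same six rows shown above.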

4. Machine Learning Terminology

4.1 Features and label

• Area, Bedrooms, Age → Independent variables (features)

• Price → Dependent variable (label)

4.2 Models
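A machine learning model is a formula that predicts a label from features. Here the
formula is price = m1*area + m2*bedrooms + m3*age + b, where m1, m2, and m3 are the
coefficients and b is the intercept.

4.3 Prediction
The prediction is the output of the model, here the estimated price for a given area,
number of bedrooms, and age.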

5. Programs

demo1.py – Loading dataset

import pandas as pd
df = pd.read_csv("homeprices1.csv")
print(df)

demo2.py – Finding median of bedrooms

import pandas as pd

df = pd.read_csv("homeprices1.csv")
print("Median of the bedrooms")
# median() skips the missing (NaN) entry by default
print(df.bedrooms.median())
# Output: 4.0

demo3.py – Fill NA with median

import pandas as pd

df = pd.read_csv("homeprices1.csv")
print("Filling missing value with median\n")
m = df.bedrooms.median()
# Replace the NaN in the bedrooms column with the median (4.0)
df.bedrooms = df.bedrooms.fillna(m)
print(df)
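
To make the median-fill step concrete without pandas, here is a small sketch using only the standard library. The list copies the bedrooms column from the dataset, None stands in for NaN, and fill_missing_with_median is a hypothetical helper. It mirrors the pandas behaviour used above, where median() ignores missing entries by default.

```python
from statistics import median

# Bedrooms column from the dataset; None stands in for the NaN entry.
bedrooms = [3, 4, None, 3, 5, 6]


def fill_missing_with_median(values):
    """Return (filled copy, median), ignoring None when computing the median."""
    known = [v for v in values if v is not None]
    m = median(known)  # middle value of the sorted known entries
    return [m if v is None else v for v in values], m


if __name__ == "__main__":
    filled, m = fill_missing_with_median(bedrooms)
    print("Median:", m)
    print("Filled:", filled)
```

The five known values sort to 3, 3, 4, 5, 6, so the middle value 4 replaces the missing entry.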

demo4.py – Model training

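import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("homeprices1.csv")
# Fill the missing bedrooms value with the median
m = df.bedrooms.median()
df.bedrooms = df.bedrooms.fillna(m)
# Features: every column except the label (price)
a = df.drop('price', axis='columns')

reg = LinearRegression()
# Fit on a plain NumPy array so that predict() can later take a plain list
reg.fit(a.values, df.price)
print("Model trained")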

demo5.py – Finding intercept

import pandas as pd
from sklearn.linear_model import LinearRegression
df = pd.read_csv("homeprices1.csv")
m = df.bedrooms.median()
df.bedrooms = df.bedrooms.fillna(m)
a = df.drop('price', axis='columns')
reg = LinearRegression()
reg.fit(a.values, df.price)
print("Intercept is:")
print(reg.intercept_)
# Output: 221323.00186540408

demo6.py – Finding coefficients

import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("homeprices1.csv")
m = df.bedrooms.median()
df.bedrooms = df.bedrooms.fillna(m)
a = df.drop('price', axis='columns')

reg = LinearRegression()
reg.fit(a.values, df.price)

print("Coefficients are:")
print(reg.coef_)
# Output: [112.06244194 23388.88007794 -3231.71790863]
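
To see where these numbers come from, the following sketch fits the same data by ordinary least squares using only the standard library. It is one way to obtain a least-squares fit (centered normal equations solved by Gauss-Jordan elimination), not a description of how scikit-learn computes it internally. The function name fit_least_squares is hypothetical, and the data assumes the missing bedrooms value has already been filled with the median 4.

```python
# Features (area, bedrooms, age) and labels (price) from the dataset,
# with the missing bedrooms value already replaced by the median 4.
X = [
    [2600, 3, 20],
    [3000, 4, 15],
    [3200, 4, 18],
    [3600, 3, 30],
    [4000, 5, 8],
    [4100, 6, 8],
]
y = [550000, 565000, 610000, 595000, 760000, 810000]


def fit_least_squares(X, y):
    """Ordinary least squares with an intercept (hypothetical helper)."""
    n, k = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    # Center features and labels so the intercept can be recovered afterwards.
    Xc = [[row[j] - x_mean[j] for j in range(k)] for row in X]
    yc = [v - y_mean for v in y]

    # Augmented normal equations: [Xc^T Xc | Xc^T yc].
    A = [
        [sum(r[i] * r[j] for r in Xc) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(Xc, yc))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    coef = [A[i][k] / A[i][i] for i in range(k)]
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return coef, intercept


if __name__ == "__main__":
    coef, intercept = fit_least_squares(X, y)
    print("Coefficients:", coef)
    print("Intercept:", intercept)
```

The result should agree with the reported coefficients and intercept up to floating-point rounding. Read as a formula, each coefficient is the change in predicted price per unit of its feature with the other two held fixed.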

demo7.py – Predict price for 3000 sq ft, 3 bedrooms, 40 years old

import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("homeprices1.csv")
m = df.bedrooms.median()
df.bedrooms = df.bedrooms.fillna(m)
a = df.drop('price', axis='columns')

reg = LinearRegression()
reg.fit(a.values, df.price)

print("Price of home with 3000 sq ft area, 3 bedrooms, 40 years old")
print(reg.predict([[3000, 3, 40]]))
# Output: [498408.25158031]

demo8.py – Manual calculation of price

import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("homeprices1.csv")
m = df.bedrooms.median()
df.bedrooms = df.bedrooms.fillna(m)
a = df.drop('price', axis='columns')

reg = LinearRegression()
reg.fit(a.values, df.price)

print("Price of home with 3000 sq ft area, 3 bedrooms, 40 years old")
# price = m1*area + m2*bedrooms + m3*age + intercept,
# using the rounded coefficients from demo6.py and the intercept from demo5.py
b = 112.06244194*3000 + 23388.88007794*3 + (-3231.71790863)*40 + 221323.00186540408
print(b)
# Output: approximately 498408.2516 (agrees with demo7.py up to rounding of the coefficients)
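
The problem statement also asks about a second home (2500 sq ft, 4 bedrooms, 5 years old), which the programs above do not cover. As a minimal sketch, the same manual formula can be wrapped in a function and applied to both homes. The coefficients and intercept are the values reported by demo5.py and demo6.py, and predict_price is a hypothetical helper name.

```python
# Values reported by demo5.py (intercept) and demo6.py (coefficients).
INTERCEPT = 221323.00186540408
COEFFICIENTS = (112.06244194, 23388.88007794, -3231.71790863)  # area, bedrooms, age


def predict_price(area, bedrooms, age):
    """Apply price = m1*area + m2*bedrooms + m3*age + b (hypothetical helper)."""
    m_area, m_bedrooms, m_age = COEFFICIENTS
    return m_area * area + m_bedrooms * bedrooms + m_age * age + INTERCEPT


if __name__ == "__main__":
    for home in [(3000, 3, 40), (2500, 4, 5)]:
        print(home, round(predict_price(*home), 2))
```

With these rounded coefficients the two estimates come out to about 498408.25 and 578876.04. Calling reg.predict([[2500, 4, 5]]) on the trained model should give the same value for the second home up to rounding.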
