
A

Project Report
on
REAL ESTATE PRICE PREDICTION

Submitted By

R MANJU BHARGAV [R200158]
THOUSIF [R200810]
S SANIYA BEGUM [R201085]

Under the Guidance of


Dr. PENUGONDA RAVI KUMAR
M.E. (IISc Bangalore), Ph.D. (University of Aizu, Japan)
Assistant Professor
Computer Science and Engineering

RAJIV GANDHI UNIVERSITY OF KNOWLEDGE TECHNOLOGIES (APIIIT)
R K VALLEY, VEMPALLI, YSR (Dist.) – 516330
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
2024-2025
Rajiv Gandhi University of Knowledge Technologies
RK Valley, YSR (Dist.), Andhra Pradesh - 516330
________________________________________________________________

CERTIFICATE
This is to certify that the project report titled “REAL ESTATE PRICE
PREDICTION”, carried out by R Manju Bhargav [R200158], Thousif
[R200810] and S Saniya Begum [R201085], has been completed under my
guidance and supervision. This report is submitted to the Department of
Computer Science and Engineering in partial fulfilment of the requirements
for the Mini Project, as part of the curriculum for the Bachelor of Technology
in Computer Science and Engineering during the Academic Year 2024–2025.
The work embodied in this report has been reviewed and is found to be in
accordance with the academic requirements of the University.

Signature of Internal Guide Signature of HOD


Dr. P. RAVI KUMAR, M.E, Ph.D. Dr. CH. RATNA KUMARI,
Assistant Professor Head of the Department
Computer Science and Engineering Computer Science and Engineering
RGUKT RK VALLEY RGUKT RK VALLEY

Signature of External Examiner

DECLARATION

We hereby declare that the project work titled “REAL ESTATE PRICE
PREDICTION”, submitted to the Department of Computer Science and
Engineering, is a genuine piece of work carried out by our team under the
guidance of Dr. P. Ravi Kumar. This report is submitted in partial fulfilment of
the requirements for the Mini Project, as part of the curriculum for the
Bachelor of Technology in Computer Science and Engineering during the
Academic Year 2024–2025.

We further declare that this work has not been submitted elsewhere for
the award of any other degree or diploma.

Project Team
R Manju Bhargav [R200158]
Thousif [R200810]
S Saniya Begum [R201085]

ACKNOWLEDGEMENT

We would like to express our sincere gratitude to all those who supported and
guided us throughout the successful completion of this mini-project titled
“REAL ESTATE PRICE PREDICTION.”
We are deeply thankful to our project guide, Dr. P. Ravi Kumar M.E, Ph.D.,
for his invaluable guidance, constant encouragement, and constructive
suggestions at every stage of the project. His mentorship played a crucial role in
shaping the direction and outcome of our work.
We extend our heartfelt thanks to Dr. Ch. Ratna Kumari M.Tech, Ph.D., Head
of the Department, Computer Science and Engineering, for her continuous
support and for providing us with the resources and academic environment
necessary for this project.
We are also grateful to Prof. A. V. S. S. Kumara Swami Gupta M.Tech, Ph.D.,
Director of RGUKT RK Valley, for his encouragement and for fostering a
culture of research and innovation.
Our sincere thanks go to the faculty members of the Department of
Computer Science and Engineering for their guidance and academic support.
Finally, we appreciate the collaborative efforts of our team members and the
support extended by our peers, which made this project a rewarding learning
experience.

R Manju Bhargav [R200158]
Thousif [R200810]
S Saniya Begum [R201085]

CONTENTS

CERTIFICATE 2

DECLARATION 3

ACKNOWLEDGEMENT 4

CH. NO. INDEX PAGE NO.
ABSTRACT 6
LIST OF FIGURES 7
1 INTRODUCTION 8
1.1 Background and Importance of Noise Suppression 8
1.2 Primary Challenge and the Need for Audio Refinement 8
1.3 Inspiration for Solutions in Deep Noise Suppression 9
1.4 Existing Solutions for Noise Suppression 9
1.5 Literature Survey 10
1.6 Proposed Solution and Contribution 12
1.7 Conclusion 13
2 LITERATURE REVIEW 14
2.1 Introduction to Speech Denoising 14
2.2 Traditional Noise Suppression Methods 15
2.3 Machine Learning Based Approaches 15
2.4 Deep Learning for DNS 15
2.5 Related Work 16
2.6 Why CNN For This Project 17
3 SYSTEM ARCHITECTURE AND DESIGN 19
3.1 System Overview 19
3.2 Module-Wise Breakdown 20
3.3 Module-Wise Breakdown 21
3.4 Technologies Used 22
3.5 Design Considerations 22
4 METHODOLOGY 23

4.1 Algorithm and Pipeline 23
5 OPTIMISING DNS USING CNN ARCHITECTURE 28
5.1 Data Augmentation via Synthetic Noisy Data 28
5.2 Time Domain Learning Strategy 29
5.3 Modular Design with Torchaudio and Librosa 29
5.4 Integration with Gradio Web UI 29
6 RESULTS AND DISCUSSION 30
6.1 Metrics Used 30
6.2 Metrics 31
6.3 Limitations 31
6.4 Summary 32
7 CONCLUSION AND FUTURE ENHANCEMENTS 33
7.1 Conclusion 33
7.2 Future Enhancements 33
REFERENCES 35

ABSTRACT

The Real Estate Price Predictor is a web-based machine learning application designed
to estimate residential property prices in Bangalore. The project follows a structured pipeline
beginning with data collection, data cleaning, and outlier removal, followed by feature
engineering to select the most influential variables such as location, area (square footage),
number of bedrooms (BHK), and bathrooms. A Linear Regression model was developed and
trained on the processed dataset to predict housing prices accurately.

The trained model was deployed using a Python Flask server, forming the backend of
an interactive web application. The frontend interface allows users to input property details
and obtain real-time price predictions. Additional features such as a dynamic dashboard, a
basic FAQ chatbot, and an interactive map for location-based selection have been
incorporated to enhance usability and user engagement. Location data, along with
geographical coordinates, is managed through a structured JSON file to support map
functionalities.

The project effectively demonstrates the complete lifecycle of a machine learning
solution, from data preparation to model deployment and user interface design, providing a
practical and scalable tool for real estate price estimation.

LIST OF FIGURES

Figure No. Title Page No.


Figure 1.3 Project Architecture Diagram 10
Figure 3.2 Outlier Removal 16
Figure 4.1 Model Building Diagram 19
Figure 5.1 WEB App UI (before execution) 23
Figure 5.3 WEB App UI (after execution) 25

CHAPTER 1
INTRODUCTION

1.1 Purpose of the Project:


The real estate market in Bangalore has witnessed tremendous growth in recent years.
For buyers, sellers, and real estate professionals, accurate property price prediction is
essential for making informed decisions. However, traditional methods of property valuation
are time-consuming and often not as precise as desired. The purpose of this project is to
develop an intelligent real estate price prediction system that uses machine learning to predict
property prices based on various features such as area, number of bedrooms (BHK), and
location. Additionally, a web application is developed to make the system accessible to users,
allowing them to input property details and receive accurate price predictions in real-time.

1.2 Problem Statement:

The problem addressed by this project is the difficulty in predicting property prices in
Bangalore, which vary significantly depending on location, area, type of property, and other
factors. Existing systems do not provide a dynamic and interactive solution for users to get
real-time price predictions based on these factors. This project aims to fill that gap by
providing an easy-to-use web interface that allows users to input key features and receive an
estimated property price based on a machine learning model.

1.3 Objectives:

The primary objectives of this project are:

1. Data Collection and Preprocessing: To gather and clean the dataset, ensuring that only
relevant and accurate data is used for training the model.

2. Model Building: To develop a predictive model using Linear Regression to estimate
property prices based on selected features like area, number of bedrooms, and location.

3. Web Application Development: To create a user-friendly web application using Flask that
allows users to input property details and receive price predictions.

4. Location-based Mapping: To integrate an interactive map feature that visually marks the
location selected by the user on the map after predicting the price.

1.4 Scope of the Project:

This project focuses specifically on the real estate market in Bangalore. The features
considered for predicting the property price include the area of the property (in square feet),
number of bedrooms (BHK), and location. The model is built using Linear Regression and
evaluated based on performance metrics such as R² and Mean Squared Error (MSE). The
application is deployed using Flask, with the trained model being loaded and served via
Pickle. The location-based mapping is integrated as a feature in the web application, which
allows users to select their location from a dropdown menu, and after receiving the price
prediction, the location is marked on an interactive map.

1.5 Methodology Overview:


The methodology adopted for this project consists of the following key steps:

1. Data Collection: The dataset is gathered from various online real estate sources and
includes details about property prices, area, BHK, and location.

2. Data Preprocessing: The data is cleaned using Pandas to handle missing values, remove
outliers, and normalize the data for better model performance.

3. Feature Engineering: Key features such as location are encoded numerically to make them
usable in the model, and feature selection techniques are applied to retain the most influential
features.

4. Model Building: The Linear Regression model is trained using Scikit-learn and evaluated
for accuracy.

5. Web Application Development: The web application is developed using Flask to interact
with users and provide price predictions.

6. Deployment: The trained model is serialized using Pickle and integrated into the web app
to handle real-time price prediction.

CHAPTER 2
LITERATURE REVIEW

2.1 Overview of Real Estate Price Prediction

Real estate price prediction is a critical task for both buyers and sellers in the housing market.
Accurate pricing models help potential buyers make informed decisions and assist sellers in setting
competitive prices for their properties. Traditionally, property prices have been determined based on
market trends, location, and expert opinions. However, with the advent of data science and machine
learning, automated systems have been developed to predict property prices more accurately by
analyzing large datasets containing various factors such as property features, location, and historical
prices.

Machine learning models, specifically regression models, have become popular in predicting
real estate prices. These models use input variables (features) such as area, number of bedrooms
(BHK), location, amenities, and neighborhood characteristics to predict the price of a property.
Various machine learning techniques, such as Linear Regression, Decision Trees, and Lasso
Regression, have been explored in previous studies for this purpose.

2.2 Techniques in Real Estate Price Prediction

1. Linear Regression:

Linear Regression has been widely used in the real estate domain due to its simplicity and
interpretability. It works well when the relationship between the target variable (property price) and
independent variables (property features) is linear. Several studies have shown that Linear
Regression can perform effectively in predicting property prices, especially when there is a clear
correlation between features like area, BHK, and price. It also allows for easy interpretation of the
impact of individual features on the final price, making it a popular choice in real estate price
prediction. A key advantage of Linear Regression is that it is computationally efficient, making it
suitable for real-time predictions. However, it may not perform as well when the relationships
between features are complex or non-linear.

2. Lasso Regression:

Lasso Regression (Least Absolute Shrinkage and Selection Operator) is an extension of Linear
Regression that includes a regularization term to prevent overfitting by shrinking some coefficients
to zero. It is especially useful when dealing with datasets that have a large number of features,
allowing the model to automatically select the most important features for prediction. While Lasso
can improve model performance in some cases, it may not always outperform simpler models like
Linear Regression when the data does not contain multicollinearity or a large number of irrelevant
features.

3. Decision Trees:

Decision Trees are non-linear models that partition the data into subsets based on the most
significant features. They work well with complex datasets and can model non-linear relationships
between features and the target variable. Decision Trees have the advantage of being able to capture
interactions between features, which Linear Regression may miss. However, they are prone to
overfitting, especially when the tree is deep and the dataset is small. To mitigate overfitting,
techniques like pruning or using ensemble methods like Random Forests can be employed.

2.3 Model Evaluation Metrics

The performance of real estate price prediction models is typically evaluated using various metrics,
including:

• R² (Coefficient of Determination): Measures the proportion of variance in the target variable that is
explained by the model. A higher R² value indicates a better fit of the model.

• Mean Squared Error (MSE): Measures the average squared difference between the predicted and
actual prices. Lower MSE values indicate better model performance.
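
As a minimal illustration of the two metrics, the sketch below computes them directly from their definitions. The price values are hypothetical and are not taken from the project dataset; the function names are chosen here for illustration only.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n


def r2_score(actual, predicted):
    """1 - SS_res / SS_tot: the proportion of variance explained by the model."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical prices (in lakhs), for illustration only.
actual_prices = [50.0, 80.0, 120.0, 150.0]
predicted_prices = [55.0, 75.0, 125.0, 145.0]

mse = mean_squared_error(actual_prices, predicted_prices)
r2 = r2_score(actual_prices, predicted_prices)
print(f"MSE = {mse:.2f}, R² = {r2:.4f}")
```

Because MSE is expressed in squared price units, it is usually read alongside R², which is unit-free.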

2.4 Comparative Analysis of Models

In previous studies, several models have been compared for their ability to predict real estate
prices. Some studies suggest that Linear Regression performs well when the relationship between
the features and target is linear, while other studies find that more complex models like Decision
Trees or Random Forests provide better performance when the relationships are non-linear or there
are interactions between features.

In this project, different models were evaluated for predicting the price of properties in
Bangalore. Linear Regression was initially considered along with other models such as Lasso
Regression and Decision Trees. Among these, Linear Regression provided the highest R² score,
indicating the best model fit and most accurate predictions. As a result, Linear Regression was
selected as the final model for predicting property prices.

2.5 Importance of Location in Real Estate Price Prediction

One of the most crucial factors in real estate price prediction is location. Various studies have
shown that location significantly influences property prices, and properties in prime areas or near
key infrastructure (e.g., business districts, transportation hubs) tend to have higher prices. In this
project, the location feature plays a critical role in predicting the property price. Geospatial data,
including coordinates and local amenities, can further enhance prediction accuracy, allowing the
model to consider location-based price variations.

2.6 Existing Web-based Real Estate Prediction Systems

Several online platforms offer property price prediction tools using machine learning models.
These platforms often integrate features such as location, area, and number of rooms to predict
property prices. However, many of these systems lack user-friendly interfaces and do not provide
real-time, interactive prediction results. This project aims to bridge this gap by developing a Flask-
based web application where users can select their property details and receive accurate predictions
instantly. The interactive map feature, which visually marks the selected location after prediction,
enhances the user experience by providing spatial context for the price predictions.

CHAPTER 3
DATA COLLECTION AND PREPROCESSING

3.1 Data Collection:

The first step in developing an effective real estate price prediction system is to gather
relevant data. For this project, data was collected from publicly available real estate platforms that
provide detailed information about properties in Bangalore. The dataset includes the following
features for each property:

• Area: The size of the property in square feet.

• Number of Bedrooms (BHK): The number of bedrooms in the property.

• Location: The geographical location of the property (e.g., neighborhood or area within Bangalore).

• Price: The market price of the property, which is the target variable for prediction.

In addition to these primary features, some supplementary data such as the year of
construction and proximity to key landmarks (like IT parks, schools, etc.) were also considered, but
ultimately, only the core features mentioned above were retained for the model due to their direct
impact on property prices.

3.2 Data Cleaning:

Data cleaning is crucial to ensure the quality of the dataset and the reliability of the predictions. The
following steps were undertaken during data cleaning:

1. Handling Missing Values:

Missing values in the dataset can lead to biased predictions. Properties with missing values
for essential features like area, BHK, or price were removed to maintain the integrity of the dataset.
Any missing values in non-critical fields were imputed using median or mode imputation methods.

2. Removing Duplicates:

Duplicate entries can distort model training, leading to overfitting. Therefore, any duplicate
records in the dataset were identified and removed.

3. Outlier Removal:

Outliers can skew model performance, especially for linear regression models. Outliers in
the area and price columns were identified using statistical methods (such as the Z-score) and
removed from the dataset. This was particularly important as certain properties with unusually high
prices or areas could distort the prediction model.

Figure 3.2: Outlier Removal
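
The three cleaning steps above can be sketched with Pandas as follows. This is an illustrative outline under stated assumptions, not the project's actual code: the column names (area_sqft, bhk, price_lakh), the sample rows, and the Z-score threshold of 3 are all hypothetical.

```python
import pandas as pd

# Hypothetical listings: 20 ordinary rows, one exact duplicate, one row with a
# missing price, and one row with an extreme area. Column names are assumptions.
rows = [
    {"location": "Whitefield", "area_sqft": 1000 + 25 * i, "bhk": 2,
     "price_lakh": 50.0 + 1.5 * i}
    for i in range(20)
]
rows.append(dict(rows[0]))  # exact duplicate of the first listing
rows.append({"location": "Hebbal", "area_sqft": 1200, "bhk": 2, "price_lakh": None})
rows.append({"location": "Hebbal", "area_sqft": 30000, "bhk": 3, "price_lakh": 95.0})
df = pd.DataFrame(rows)


def clean(frame, z_threshold=3.0):
    """Drop incomplete rows, duplicates, and Z-score outliers in area and price."""
    essential = ["area_sqft", "bhk", "price_lakh"]
    out = frame.dropna(subset=essential).drop_duplicates()
    for col in ["area_sqft", "price_lakh"]:
        z = (out[col] - out[col].mean()) / out[col].std()
        out = out[z.abs() <= z_threshold]
    return out.reset_index(drop=True)


cleaned = clean(df)
print(len(df), "rows before cleaning,", len(cleaned), "rows after")
```

Note that a Z-score filter can only flag a point when the sample is large enough; with very few rows no value can exceed a threshold of 3.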

3.3 Feature Engineering:

Feature engineering involves creating new features from the existing ones to improve model
performance. In this project, the following steps were performed:

1. Encoding Categorical Variables:

The location feature is categorical (i.e., the name of the neighborhood), and machine
learning models typically require numerical input. The location variable was encoded into numerical
format using techniques such as Label Encoding or One-Hot Encoding, depending on the number of
unique locations. For this project, One-Hot Encoding was chosen to create binary columns
representing each unique location, thus enabling the model to treat each location as a distinct feature.

2. Feature Scaling:

Features like area and price are measured on very different scales. Standardization
(scaling the features to a mean of 0 and a standard deviation of 1) was performed on the area and
price features so that their values are directly comparable. Ordinary least-squares Linear
Regression produces the same predictions with or without scaling, but scaling makes the coefficients
easier to compare and matters for regularized models such as Lasso, whose penalty depends on the
magnitude of the coefficients.
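
A minimal sketch of the two feature engineering steps is given below, assuming a small hypothetical table. The column names and the loc_ prefix are illustrative choices and are not taken from the project code.

```python
import pandas as pd

# Hypothetical sample; column names are assumptions.
df = pd.DataFrame({
    "location": ["Whitefield", "Hebbal", "Whitefield", "Yelahanka"],
    "area_sqft": [1000.0, 1200.0, 1500.0, 1300.0],
    "bhk": [2, 2, 3, 3],
})

# One-Hot Encoding: one binary column per unique location.
encoded = pd.get_dummies(df, columns=["location"], prefix="loc", dtype=int)

# Standardization: rescale area to mean 0 and standard deviation 1
# (population standard deviation, ddof=0).
area = encoded["area_sqft"]
encoded["area_scaled"] = (area - area.mean()) / area.std(ddof=0)

print(encoded)
```

In practice the mean and standard deviation should be computed on the training set only and then reused for the test set and for user input at prediction time.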

3.4 Data Transformation:

The data was transformed to improve the model’s ability to understand patterns:

1. Log Transformation:

For features with highly skewed distributions, such as price, a log transformation was
applied. This made the distribution closer to normal, which reduces the influence of extreme values
and helps keep the residuals of a Linear Regression model approximately normal with constant variance.

2. Normalization:

Min-Max Scaling was applied to numerical features, such as area, to ensure that they were
on the same scale. This helped improve the model’s ability to converge during training.
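
The two transformations can be illustrated with the standard library alone. The values below are hypothetical, and the helper name min_max_scale is introduced here only for the example.

```python
import math

# Hypothetical right-skewed prices (in lakhs) and areas (in sqft).
prices = [40.0, 55.0, 70.0, 95.0, 600.0]
areas = [600.0, 1000.0, 1200.0, 1500.0, 3000.0]

# Log transformation compresses the long right tail of the price distribution.
log_prices = [math.log(p) for p in prices]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


scaled_areas = min_max_scale(areas)

# A model trained on log prices predicts in log space, so predictions
# must be mapped back with the exponential before being shown to users.
recovered_prices = [math.exp(v) for v in log_prices]

print(scaled_areas)
```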

3.5 Splitting the Data:

The dataset was divided into training and testing sets to evaluate the model’s performance:

• Training Set: 80% of the dataset was used for training the model. This subset was used to fit the
model and learn the relationships between the features and the target variable.

• Testing Set: 20% of the dataset was set aside as the testing set to evaluate the model’s performance
on unseen data. This allowed for a fair assessment of the model’s ability to generalize to new data.
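
An 80/20 split of this kind is commonly done with Scikit-learn's train_test_split. The sketch below uses hypothetical data, and the random_state value is an arbitrary choice made only so that the split is reproducible.

```python
from sklearn.model_selection import train_test_split

# Hypothetical features [area_sqft, bhk] and prices; not the project dataset.
X = [[1000 + 10 * i, 2 + i % 3] for i in range(50)]
y = [50.0 + 0.5 * i for i in range(50)]

# test_size=0.2 holds out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), "training rows,", len(X_test), "testing rows")
```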

3.6 Summary:
In summary, the data collection and preprocessing steps were crucial to ensure that the dataset
used for training the model was clean, relevant, and suitable for machine learning tasks. The
preprocessing techniques, including handling missing values, removing outliers, and encoding
categorical variables, played a significant role in improving the performance of the model. In the
next chapter, the model building process is discussed, where the preprocessed data is used to train a
predictive model for real estate price prediction.

CHAPTER 4
MODEL BUILDING

4.1 Overview of Model Building:


The goal of this chapter is to describe the process of developing the predictive model for real
estate price prediction using the preprocessed data. In this project, several machine learning models
were evaluated to identify the best-performing model. The models considered were Linear
Regression, Lasso Regression, and Decision Trees. After evaluating each model’s performance,
Linear Regression emerged as the most effective model for predicting property prices based on the
features of the dataset.

4.2 Model Selection:
Multiple models were tested during the model selection phase:
1. Linear Regression:
Linear Regression is a simple and interpretable machine learning algorithm that models the
relationship between the dependent variable (property price) and one or more independent variables
(property features like area, BHK, and location). The primary advantage of Linear Regression is its
ease of implementation and efficiency, especially when the relationship between the features and the
target variable is linear.
2. Lasso Regression:
Lasso Regression is an extension of Linear Regression that includes regularization to prevent
overfitting by shrinking less important feature coefficients to zero. While Lasso helps with feature
selection and reducing model complexity, its performance was comparable to Linear Regression in
this case, and it did not provide significant improvements.
3. Decision Trees:
Decision Trees are non-linear models that split the data into subsets based on the most
significant feature. They can capture interactions between features and are more flexible than Linear
Regression. However, Decision Trees tend to overfit when trained on smaller datasets, and their
performance was slightly inferior to that of Linear Regression in this particular project.

After evaluating the performance of each model using key metrics like R² (coefficient of
determination) and Mean Squared Error (MSE), Linear Regression was chosen due to its ability to
provide accurate and consistent predictions on the test data.
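
A comparison of this kind might be set up as in the sketch below. It runs on synthetic data that is linear by construction, so its scores say nothing about the results obtained in this project; the Lasso alpha value and the random seeds are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: price depends linearly on area and BHK, plus noise.
rng = np.random.default_rng(0)
area = rng.uniform(500, 3000, size=200)
bhk = rng.integers(1, 5, size=200)
price = 0.05 * area + 10 * bhk + rng.normal(0, 5, size=200)
X = np.column_stack([area, bhk])

X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

candidates = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=1.0),
    "tree": DecisionTreeRegressor(random_state=0),
}

# Fit each candidate on the training set and score it on the held-out set.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "r2": r2_score(y_test, pred),
        "mse": mean_squared_error(y_test, pred),
    }

best = max(scores, key=lambda name: scores[name]["r2"])
print(scores, "best by R²:", best)
```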

4.3 Model Training:


The Linear Regression model was trained on the preprocessed data, which included the
features: area, number of bedrooms (BHK), and location. The following steps were followed during
the model training process:

1. Fitting the Model:
The training data was used to fit the Linear Regression model. This process involved finding the
optimal coefficients for the linear equation that best represents the relationship between the features
and the target variable (property price).

2. Hyperparameter Tuning:
Although Linear Regression has few hyperparameters compared to other models,
adjustments were made to ensure optimal performance. The fit_intercept parameter was set to True
to include an intercept term in the regression equation.
3. Cross-validation:
To ensure that the model generalizes well to unseen data, cross-validation was applied. This
technique involved splitting the data into several folds, training the model on all but one fold, and
evaluating it on the held-out fold, repeating the process so that each fold serves once as the
validation set. This helped to reveal overfitting and provided a more reliable estimate of the
model’s performance.
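
The training steps above can be sketched with Scikit-learn as follows. The data is synthetic, and the choice of 5 folds is an assumption, since the number of folds used in the project is not stated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: price = 0.05 * area + 20 + noise.
rng = np.random.default_rng(1)
X = rng.uniform(500, 3000, size=(100, 1))
y = 0.05 * X[:, 0] + 20 + rng.normal(0, 3, size=100)

# fit_intercept=True (the default) adds the constant term to the linear equation.
model = LinearRegression(fit_intercept=True)

# 5-fold cross-validation: each fold is held out once while the model
# is trained on the remaining four folds.
cv_scores = cross_val_score(model, X, y, cv=5, scoring="r2")

# Final fit on all available data.
model.fit(X, y)
print("fold R² scores:", cv_scores)
print("coefficient:", model.coef_[0], "intercept:", model.intercept_)
```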

4.4 Model Evaluation:


The performance of the Linear Regression model was evaluated using several metrics:
1. R² (Coefficient of Determination):
The R² value indicates how well the model fits the data, with values closer to 1 indicating a
better fit. For the final model, the R² score was found to be high, demonstrating that the model could
explain a significant proportion of the variance in the property prices.
2. Mean Squared Error (MSE):
The MSE measures the average squared difference between the predicted and actual prices. A
lower MSE indicates a better model performance. The final model achieved a reasonably low MSE,
indicating that its predictions were close to the actual property prices.
3. Residual Analysis:
The residuals (differences between the predicted and actual prices) were plotted to check for
patterns. Ideally, residuals should be randomly distributed with no obvious patterns, indicating that
the model is appropriate for the data. In this case, the residuals showed a random distribution,
supporting the validity of the Linear Regression model.
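
A simple numerical version of the residual check is sketched below on synthetic data. It evaluates residuals on a held-out set, because for a least-squares fit with an intercept the training residuals always average to zero by construction. The summary statistics chosen here are illustrative and are not the project's own diagnostics.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with a truly linear relationship.
rng = np.random.default_rng(2)
X = rng.uniform(500, 3000, size=(150, 1))
y = 0.05 * X[:, 0] + 20 + rng.normal(0, 4, size=150)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# Residual = actual price minus predicted price, on unseen data.
predicted = model.predict(X_test)
residuals = y_test - predicted

# For a well-specified model the residuals should centre on zero and
# show little correlation with the predicted values.
mean_residual = residuals.mean()
corr_with_pred = np.corrcoef(predicted, residuals)[0, 1]
print("mean residual:", mean_residual, "correlation:", corr_with_pred)
```

A scatter plot of residuals against predicted values conveys the same information visually and also reveals curvature or changing spread.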

4.5 Model Results:


After training and evaluating the Linear Regression model, the results indicated that the model
could predict real estate prices with a reasonable degree of accuracy. The model successfully learned
the relationships between area, BHK, and location, providing predictions that were in close
alignment with actual property prices. The model was saved using Pickle, allowing it to be
loaded into the web application for real-time predictions. This allows users to input the features of a
property, and the model can quickly provide an estimated price based on its learned patterns.
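
Saving and reloading a model with Pickle can be sketched as below. The file name price_model.pickle and the training rows are hypothetical; a temporary directory is used so the example leaves no files behind.

```python
import os
import pickle
import tempfile

from sklearn.linear_model import LinearRegression

# Tiny hypothetical training set: [area_sqft, bhk] -> price in lakhs.
model = LinearRegression().fit(
    [[1000, 2], [1500, 3], [2000, 3], [2500, 4]],
    [50.0, 80.0, 100.0, 130.0],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "price_model.pickle")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(model, f)       # serialize the trained model to disk
    with open(path, "rb") as f:
        loaded = pickle.load(f)     # restore it, as the web app would at startup

sample = [[1800, 3]]
print(model.predict(sample)[0], loaded.predict(sample)[0])
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code, and the Scikit-learn version used for loading should match the one used for saving.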

4.6 Limitations and Future Improvements:
Although the Linear Regression model performed well, it does have certain limitations:

• The model assumes a linear relationship between features and price, which may not always hold in
real-world scenarios.

• Location as a categorical feature could be expanded to include more granular location data, such as
proximity to key amenities like schools, hospitals, and transportation hubs.

Future improvements could involve experimenting with more advanced models like Random Forests
or Gradient Boosting to capture non-linear relationships and interactions between features.
Additionally, integrating more detailed spatial data (e.g., neighborhood amenities) could further
enhance prediction accuracy.

4.7 Summary:
In summary, Linear Regression was selected as the final model due to its high performance,
simplicity, and ability to make accurate predictions based on the given features. The model was
trained, evaluated, and validated using cross-validation and performance metrics such as R² and
MSE. The trained model was then integrated into the Flask-based web application, where users can
input property details and receive real-time price predictions.

CHAPTER 5
WEB APPLICATION DEVELOPMENT
5.1 Overview of the Web Application:
In this chapter, the process of developing the Flask-based web application for real estate
price prediction is described. The web application serves as an interface that allows users to interact
with the trained machine learning model and obtain property price predictions based on the features
they input. It also incorporates an interactive map to display the location after the price prediction.
The application was designed with a focus on simplicity, usability, and visual appeal. The user
interface is designed to be intuitive, ensuring that users with no prior experience in data science can
easily use the tool to predict property prices.

5.2 Flask Web Framework:
Flask is a lightweight Python web framework used for building web applications. It is
known for its simplicity and flexibility, making it ideal for projects like this one, where the goal is to
create a web interface for machine learning models. Flask allows for easy integration with Python
code, enabling us to serve the trained Linear Regression model and provide real-time predictions.
The main components of the Flask application include:
1. Flask Server:
The Flask server handles incoming HTTP requests from users and returns responses, which
include the predicted price and the location-based visualization. The server uses routes to link the
front-end interface to the back-end logic (e.g., invoking the prediction model).
2. HTML Templates:
HTML templates define the structure of the web pages and are used to render dynamic
content. In this project, HTML was used to create the input form, display results, and render the
interactive map. The Jinja2 template engine, which comes bundled with Flask, was used to
dynamically generate HTML content based on user input.
3. Static Files:
Static files, such as CSS and JavaScript, are used to style the webpage and enhance
interactivity. Custom styles were applied to create a visually engaging design for the input fields,
buttons, and results display.
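
A minimal Flask route of the kind described above might look like the following sketch. The /predict path, the form field names, and the estimate_price stand-in are assumptions made for illustration. For brevity the route returns JSON rather than rendering a Jinja2 template.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def estimate_price(location, sqft, bhk, bath):
    """Stand-in for the trained model; the formula is purely illustrative."""
    return round(0.05 * sqft + 10 * bhk + 5 * bath, 2)


@app.route("/predict", methods=["POST"])
def predict():
    """Read the submitted form, validate it, and return the estimated price."""
    form = request.form
    try:
        location = form["location"]
        sqft = float(form["sqft"])
        bhk = int(form["bhk"])
        bath = int(form["bath"])
    except (KeyError, ValueError):
        # Missing or non-numeric fields are rejected with HTTP 400.
        return jsonify({"error": "invalid input"}), 400
    return jsonify({"estimated_price": estimate_price(location, sqft, bhk, bath)})


if __name__ == "__main__":
    # Exercise the route without starting a server.
    client = app.test_client()
    reply = client.post(
        "/predict",
        data={"location": "Whitefield", "sqft": "1000", "bhk": "2", "bath": "2"},
    )
    print(reply.get_json())
```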

5.3 Key Features of the Web Application:


1. User Input for Property Features:
The primary feature of the application is the input form, where users can provide details
about the property, such as:
• Location: The user can select the location of the property from a dropdown menu, which contains a
list of neighborhoods in Bangalore.
• BHK: The number of bedrooms in the property, entered as a text input.
• Bathrooms: The number of bathrooms in the property, entered as a text input.
• Area (sqft): The area of the property in square feet, entered as a text input.
2. Price Prediction:
After the user inputs the property details and selects the location, they can click the “Predict
Price” button. The application uses the trained Linear Regression model to predict the price of the
property based on the input features. The predicted price is displayed on the results page.
3. Interactive Map:
After predicting the price, the location of the property is displayed on an interactive map.
The location is marked based on the coordinates stored in the coordinates.json file. This feature
provides a spatial context to the prediction, allowing users to visually locate the property within the
city.
4. Real-time Results:
The application provides real-time results by using the trained machine learning model in the
backend. Once the user submits their input, the server processes the data, invokes the model to make
a prediction, and returns the result to the front end. This ensures that users receive immediate
feedback on the predicted property price.
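
The report does not show the layout of the coordinates.json file. The sketch below assumes a simple mapping from location name to latitude and longitude, with rounded illustrative values, and shows how such a file could supply both the dropdown options and the map marker. The helper names are hypothetical.

```python
import json

# Assumed structure of coordinates.json; keys and values are illustrative only.
coordinates_text = """
{
  "Whitefield": {"lat": 12.97, "lng": 77.75},
  "Hebbal": {"lat": 13.04, "lng": 77.60}
}
"""
coordinates = json.loads(coordinates_text)


def dropdown_options(coords):
    """Location names in alphabetical order, for the dropdown menu."""
    return sorted(coords)


def marker_for(location, coords):
    """Return (lat, lng) for the map marker, or None if the location is unknown."""
    entry = coords.get(location)
    if entry is None:
        return None
    return (entry["lat"], entry["lng"])


print(dropdown_options(coordinates))
print(marker_for("Hebbal", coordinates))
```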

5.4 Integrating the Machine Learning Model
The trained Linear Regression model was saved using pickle, a module in the Python standard
library for serializing Python objects. The pickle file was loaded into the Flask application to
allow real-time model inference. When a user submits the input form, the application:
1. Loads the pre-trained model.
2. Retrieves the user’s input data.
3. Preprocesses the input data (e.g., encoding categorical variables, scaling numerical features).
4. Passes the preprocessed data into the model to get the predicted price.
5. Returns the predicted price to the user.
This integration of machine learning and Flask enables the application to provide dynamic
predictions based on real-time user input.
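
The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the pickled object is a plain dictionary of invented coefficients standing in for the fitted scikit-learn estimator, and the column order, the two location names, and the helper names build_features and predict_price are hypothetical. Only the pickle.dumps and pickle.loads calls are taken from the standard library as-is.

```python
import pickle

# Assumed column order: numeric features first, then one-hot location columns.
COLUMNS = ["total_sqft", "bath", "bhk", "Indiranagar", "Whitefield"]

def build_features(location, sqft, bath, bhk):
    """Step 3: encode raw form values in the layout used for training."""
    row = [0.0] * len(COLUMNS)
    row[0], row[1], row[2] = float(sqft), float(bath), float(bhk)
    if location in COLUMNS[3:]:
        row[COLUMNS.index(location)] = 1.0  # one-hot flag for the chosen location
    return row

def predict_price(params, row):
    """Step 4: linear prediction, intercept + sum(coef_i * x_i)."""
    return params["intercept"] + sum(c * x for c, x in zip(params["coef"], row))

# Training side: serialize the fitted parameters (values invented for the example).
blob = pickle.dumps({"coef": [0.08, 2.0, 5.0, 60.0, 20.0], "intercept": -10.0})

# Request side: load the model, encode the input, predict, round for display.
params = pickle.loads(blob)
row = build_features("Whitefield", sqft="1000", bath="2", bhk="2")
predicted_price = round(predict_price(params, row), 2)
print(predicted_price)
```

In a running server it is common to load the pickle file once at startup rather than on every request, since the model does not change between submissions. Because unpickling can execute arbitrary code, the model file should only ever be loaded from a trusted source.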

5.5 User Interface Design:


The user interface (UI) was designed to be simple yet functional, with an emphasis on a
seamless user experience. The following design elements were included:
1. Dropdown for Location Selection:
The location of the property is selected from a dropdown menu, which displays all available
neighborhoods in Bangalore. This menu is dynamically populated using the coordinates stored in the
coordinates.json file. Each location is associated with its corresponding coordinates to facilitate
accurate location marking on the map after prediction.
2. Input Fields:
Three primary input fields were provided:
• BHK: A text input field where the user can enter the number of bedrooms.
• Bathrooms: A text input field where the user can enter the number of bathrooms.
• Area (sqft): A text input field where the user can enter the area of the property in square feet.
3. Predict Button:
The “Predict Price” button triggers the prediction process. Once clicked, it sends the input
data to the Flask server for processing, and the predicted price is displayed on the webpage.
4. Result Display:
After the prediction is made, the predicted property price is displayed clearly on the page,
along with the location shown on an interactive map.

5. Interactive Map:

The interactive map is powered by a JavaScript library (e.g., Leaflet.js or Google Maps
API), which renders the map and marks the selected location. This feature enhances the user
experience by allowing users to see the geographic context of the property price prediction.
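
As a rough sketch of how the coordinates file can drive both the dropdown and the map marker, the example below parses a small JSON document and looks up a location. The file layout (location name mapped to a latitude/longitude pair), the sample entries, and the helper name marker_for are assumptions made for illustration; the report does not show the real contents of coordinates.json.

```python
import json

# Assumed file layout: {"<location name>": [latitude, longitude], ...}.
# The entries below are approximate, illustrative values, not the project's data.
COORDINATES_JSON = """
{
  "Whitefield": [12.9698, 77.7500],
  "Indiranagar": [12.9719, 77.6412],
  "Hebbal": [13.0358, 77.5970]
}
"""

coordinates = json.loads(COORDINATES_JSON)

# Dropdown options: location names in alphabetical order.
dropdown_options = sorted(coordinates)

def marker_for(location):
    """Return a {"lat", "lng"} dict for the map marker, or None if unknown."""
    pair = coordinates.get(location)
    if pair is None:
        return None
    lat, lng = pair
    return {"lat": lat, "lng": lng}

print(dropdown_options)
print(marker_for("Hebbal"))
```

Keeping the names and coordinates in one file means the dropdown and the map cannot drift apart: a location that appears in the menu always has a marker position.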

5.6 Application Flow:


1. The user navigates to the web application and is presented with an input form.

2. The user selects the location from the dropdown menu, enters the area, BHK, and bathrooms, and
clicks the “Predict Price” button.

3. The server receives the input, processes it, and uses the pre-trained model to make a price
prediction.

4. The predicted price is displayed on the webpage, and the location is marked on the interactive
map.

5.7 Deployment and Hosting:


The web application was hosted on a cloud server (such as Heroku or AWS), making it
accessible to users via a public URL. This ensures that users can access the application from any
device with an internet connection and interact with the model for real-time price predictions.

5.8 Summary:
The web application serves as the primary interface for users to input property details,
receive real-time price predictions, and see the selected location on an interactive map. By using
Flask for the backend and integrating the trained Linear Regression model, the application is able to
provide dynamic predictions from user input. The UI is designed to be user-friendly, with simple input
forms and an interactive map to enhance the user experience.

CHAPTER 6
RESULTS AND DISCUSSION

6.1 Overview:
In this chapter, we present the results of the real estate price prediction model and discuss the
performance of the model. This includes evaluating the effectiveness of the machine learning model,
analyzing the user interactions with the web application, and reviewing the overall accuracy and
usefulness of the predicted property prices. The chapter also examines the strengths and limitations
of the application and suggests possible improvements.

6.2 Model Performance Evaluation:


The performance of the machine learning model, specifically the Linear Regression model,
was evaluated using three metrics: R-squared (R²), Mean Absolute Error (MAE), and Mean Squared
Error (MSE). The key metric was R², which measures the proportion of the variance in the actual
prices that the model’s predictions explain.
1. Training and Testing Split:
The dataset was divided into two parts: a training set and a testing set. The training set was
used to train the Linear Regression model, and the testing set was used to evaluate its performance
on unseen data. This checks whether the model generalizes rather than just memorizing the training data.
2. R-squared (R²):
The R² score for the Linear Regression model was calculated to be approximately 0.84, which
indicates that the model explains approximately 84% of the variance in property prices. This is
considered a good result, suggesting that the model is effective in predicting property prices based
on the input features.
3. Mean Absolute Error (MAE):
The Mean Absolute Error (MAE) was also computed to measure the average magnitude of
the prediction errors in the model. The MAE was found to be 17.98, expressed in the same units as
the price, meaning that a predicted price differs from the actual price by about 17.98 on average.
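4. Mean Squared Error (MSE):
The Mean Squared Error (MSE), which penalizes larger errors more heavily, was also used
to evaluate the model. The reported value was 34.98. Because an MSE can never be smaller than
the square of the MAE (17.98² ≈ 323.3), this figure is more plausibly the root mean squared
error (RMSE), which is expressed in the same units as the price.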
These metrics provide a clear indication that the Linear Regression model is performing well
and producing accurate predictions.
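
To make the three metrics concrete, the sketch below computes them in plain Python on a handful of invented prices. The numbers are illustrative only and are not the project's test set. The function names mirror the ones scikit-learn provides in sklearn.metrics, but these are standalone reimplementations written for this example.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted)^2; large errors dominate."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Invented test-set prices and predictions, for illustration only.
actual = [50.0, 80.0, 120.0, 200.0]
predicted = [55.0, 70.0, 125.0, 180.0]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = mse ** 0.5
r2 = r2_score(actual, predicted)

print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}  R2={r2:.4f}")
```

MAE and RMSE share the units of the price, while MSE is in squared units, so MAE and RMSE are the easier pair to compare directly. RMSE is always at least as large as MAE, and the gap between them grows when a few predictions are far off.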

6.3 Web Application Performance and User Interaction:
The web application was tested for functionality, speed, and usability. The key aspects of the web
application were evaluated as follows:
1. Input Field Functionality:
The input fields for Location, BHK, Bathrooms, and Area (sqft) worked as expected, with
users able to enter values and select the appropriate location from the dropdown. The form
submission triggered the prediction process and displayed the result.
2. Real-time Prediction:
Once the user submits the data, the Flask server processes the input, invokes the trained
Linear Regression model, and returns the predicted price. The process of receiving a prediction was
fast, with no noticeable lag, providing a seamless user experience.
3. Interactive Map:
After predicting the price, the location was marked on the interactive map. The map
displayed the property’s location correctly, and the coordinates were retrieved from the
coordinates.json file. This feature enhanced the user experience by providing a spatial context to the
price prediction.
4. Error Handling:
The application was tested for robustness by entering invalid inputs, such as non-numeric
values in the BHK, Bathrooms, and Area fields. The application responded with appropriate error
messages, ensuring that users received clear feedback when invalid data was entered.
5. User Feedback:
Initial user testing showed that the interface was user-friendly and that the predictions were
easy to understand. Users appreciated the simplicity of the input fields and the inclusion of the
interactive map to visualize the property location.
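
The kind of input checking described under Error Handling could look roughly like the sketch below. It is an assumption about one reasonable approach, not the application's actual validation code: the function name parse_form, the field names, and the error messages are all hypothetical, and a plain dictionary stands in for the submitted form data.

```python
import math

def parse_form(form, known_locations):
    """Validate raw form strings; return (cleaned_values, errors)."""
    cleaned, errors = {}, {}

    location = form.get("location", "").strip()
    if location in known_locations:
        cleaned["location"] = location
    else:
        errors["location"] = "Please choose a location from the list."

    # BHK and bathrooms must be positive whole numbers.
    for field in ("bhk", "bath"):
        raw = form.get(field, "").strip()
        if raw.isdecimal() and int(raw) > 0:
            cleaned[field] = int(raw)
        else:
            errors[field] = "Enter a whole number greater than zero."

    # Area may be fractional but must be a finite, positive number.
    try:
        area = float(form.get("sqft", "").strip())
    except ValueError:
        area = math.nan
    if math.isfinite(area) and area > 0:
        cleaned["sqft"] = area
    else:
        errors["sqft"] = "Enter the area as a positive number."

    return cleaned, errors

cleaned, errors = parse_form(
    {"location": "Hebbal", "bhk": "2", "bath": "two", "sqft": "1200.5"},
    known_locations={"Hebbal", "Whitefield"},
)
print(cleaned, errors)
```

Returning the errors keyed by field name lets the page show each message next to the input that caused it, and the model is only called when the errors dictionary is empty.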

6.4 Discussion:
The Linear Regression model performed well in predicting property prices in Bangalore,
with high accuracy as evidenced by the R² score and low error metrics (MAE and MSE). This
suggests that the model is a reliable tool for predicting real estate prices based on location, area,
BHK, and bathroom features.
The integration of the interactive map was a key feature that added value to the application,
as it allowed users to visually locate the property after receiving the prediction. This was particularly
useful in a city like Bangalore, where location plays a critical role in property pricing.
However, there are some limitations to the model and application:

1. Feature Limitation:

The model currently relies on only a few features (location, BHK, bathrooms, and area) for
prediction. In reality, factors such as the age of the property, proximity to amenities, and economic
conditions may also influence property prices but are not considered in the current model.

2. Data Quality:

The accuracy of the predictions is also dependent on the quality and representativeness of the
training data. If the dataset used to train the model contains errors or biases, the model’s predictions
may be skewed.

3. Location Granularity:

The application uses predefined locations from the coordinates.json file, which may not
include all possible neighborhoods or areas in Bangalore. The lack of granularity in location data
may limit the applicability of the model in certain cases.

4. Model Improvements:

Future versions of the model could incorporate more features and use more sophisticated
machine learning algorithms, such as Random Forest or Gradient Boosting, to improve accuracy.
Additionally, feature engineering and hyperparameter tuning could further optimize the model’s
performance.

6.5 Summary:
The model showed strong performance in predicting real estate prices with a high R² score
and low error metrics. The web application successfully integrates the trained model, providing a
seamless user experience with real-time price predictions and an interactive map for location
visualization. Although there are some limitations to the model and application, the results indicate
that the system is effective for predicting property prices based on available features. Future
improvements could be made by incorporating additional features, improving the data quality, and
exploring more advanced machine learning techniques.

CHAPTER 7
CONCLUSION AND FUTURE ENHANCEMENT

7.1 Conclusion
The Real Estate Price Predictor project successfully demonstrates the practical application of
machine learning techniques in predicting property prices based on user-provided features such as
location, area (in square feet), number of bedrooms (BHK), and number of bathrooms.

The project covered all the major stages of a typical data science workflow, including:

• Data Collection: Gathering real estate data specific to Bangalore city.

• Data Cleaning: Removing inconsistencies, handling missing values, and formatting the dataset.

• Feature Engineering: Transforming raw data into meaningful features for model training.

• Outlier Removal: Identifying and eliminating anomalous data points to improve model accuracy.

• Model Building: Experimenting with multiple algorithms such as Linear Regression, Lasso
Regression, and Decision Tree Regression, and finalizing Linear Regression based on performance
metrics.

• Deployment: Developing a Flask-based web application that allows users to input property details
and receive price predictions instantly.

The web application also incorporates an interactive map feature, where, after predicting the
price, the selected location is dynamically marked based on coordinates data.

The model achieved good performance metrics, indicating a strong fit to the data and
reliable price predictions within the context of the given features.

Overall, the project meets its objective of creating a working prototype for property price
prediction, providing both technical learning and practical insights into building real-world machine
learning applications.

7.2 Future Enhancements
While the project achieves its current goals, there are several areas where future
enhancements can be made:

• Incorporate More Features:

Include additional influential factors like the age of the property, proximity to metro stations,
nearby schools, hospitals, and market areas, to improve prediction accuracy.

• Use Advanced Machine Learning Models:

Explore more sophisticated models such as Random Forest Regressor, Gradient Boosting
Regressor, or even Neural Networks for better performance and capturing complex
relationships between features.

• Dynamic Location Updates:

Instead of a fixed set of locations, use dynamic fetching from real-time databases or APIs to
keep location options updated.

• Improve UI/UX:

Enhance the front-end design further by adding features like form validation, prediction
history, and better error handling for invalid inputs.

• Deploy to a Robust Hosting Platform:

Deploy the application on a full-scale production server such as AWS, Azure, or GCP for
handling larger user traffic and better performance.

• Mobile Compatibility:

Create a responsive design or a mobile application version to make it accessible on
smartphones.

• Security Improvements:

Add security layers to protect the web application and the model from potential attacks and
misuse.
