Bachelor of Technology
in
Computer Science & Engineering
By
November, 2024
HEART DISEASE PREDICTION
CERTIFICATE
It is certified that the work contained in the project report titled "HEART DISEASE PREDICTION"
by T.TARUNN TEZAA (22UECM2022), G.NIKHIL KUMAR (22UECT2005), K.UJWAL (22UECM2003)
has been carried out under my supervision and that this work has not been submitted elsewhere for a
degree.
Signature of Supervisor
Mrs.A.SATHYA
B.E,M.E.
Computer Science & Engineering
School of Computing
Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science & Technology
November, 2024
DECLARATION
We declare that this written submission represents our ideas in our own words and where others’
ideas or words have been included, we have adequately cited and referenced the original sources. We
also declare that we have adhered to all principles of academic honesty and integrity and have not
misrepresented or fabricated or falsified any idea/data/fact/source in our submission. We understand
that any violation of the above will be cause for disciplinary action by the Institute and can also
evoke penal action from the sources which have thus not been properly cited or from whom proper
permission has not been taken when needed.
(Signature)
T.TARUNN TEZAA
Date: / /
(Signature)
G.NIKHIL KUMAR
Date: / /
(Signature)
K.UJWAL
Date: / /
APPROVAL SHEET
This project report entitled HEART DISEASE PREDICTION by T.TARUNN TEZAA (22UECM2022),
G.NIKHIL KUMAR (22UECT2005), K.UJWAL (22UECM2003) is approved for the degree of B.Tech
in Computer Science & Engineering with specialization in AIML
Examiners Supervisor
Ms.A.SATHYA, B.E,M.E.,
Date: / /
Place:
ACKNOWLEDGEMENT
We express our deepest gratitude to our Honorable Founder Chancellor and President Col.
Prof. Dr. R. RANGARAJAN B.E. (Electrical), B.E. (Mechanical), M.S (Automobile), D.Sc., and
Foundress President Dr. R. SAGUNTHALA RANGARAJAN M.B.B.S., Vel Tech Rangarajan Dr.
Sagunthala R&D Institute of Science and Technology, for their blessings.
We express our sincere thanks to our respected Chairperson and Managing Trustee
Mrs. RANGARAJAN MAHALAKSHMI KISHORE, B.E., Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science and Technology, for her blessings.
We are very grateful to our beloved Vice Chancellor Prof. Dr. RAJAT GUPTA, for providing
us with an environment to complete our project successfully.
We record our indebtedness to our Professor & Dean, Department of Computer Science &
Engineering, School of Computing, Dr. S P. CHOKKALINGAM, M.Tech., Ph.D., & Associate Dean,
Dr. V. DHILIP KUMAR, M.E., Ph.D., for their immense care and encouragement throughout
the course of this project.
We are thankful to our Professor & Head, Department of Computer Science & Engineering,
Dr. N. VIJAYARAJ, M.E., Ph.D., and Associate Professor & Assistant Head, Dr. M. S. MURALI
DHAR, M.E., Ph.D., for providing immense support in all our endeavors.
We also take this opportunity to express a deep sense of gratitude to our Internal Supervisor
Mrs. A. SATHYA, B.E., M.E., for her cordial support, valuable information, and guidance, which
helped us complete this project through its various stages.
We thank our department faculty, supporting staff, and friends for their help and guidance in
completing this project.
ABSTRACT
Predicting heart disease has become one of the most challenging tasks in the medical
sector. The heart is one of the most essential organs of the body: it pumps and circulates
blood throughout the body. Heart disease accounts for a large number of cases worldwide,
and at present approximately one person dies of it every minute. Because predicting heart
disease is a complicated task, the prediction process needs to be automated to avoid the
pitfalls associated with it and to warn patients in advance. The model was built using
machine learning algorithms such as random forest, K-nearest neighbor (KNN), logistic
regression, and decision tree. The study shows that, compared with the other ML techniques,
logistic regression and KNN provide better prediction accuracy in a shorter amount of time.
The heart disease prediction GUI lets the user enter values such as age, gender, and
cholesterol, and displays the result on the page after the values are submitted.
Keywords:
- Heart Disease Prediction
- Machine Learning in Healthcare
- Cardiovascular Risk Assessment
- Risk Factors for Heart Disease
- Artificial Intelligence (AI) in Cardiology
- Predictive Analytics
- Logistic Regression
- Classification Algorithms
- Deep Learning for Health Prediction
- Decision Trees in Medical Diagnosis
- Support Vector Machine (SVM)
- Random Forest for Disease Prediction
- Neural Networks in Medicine
- Healthcare Data Analysis
- UCI Heart Disease Dataset
- Framingham Risk Score
- Medical Feature Engineering
- Healthcare Data Integration
- Explainable AI (XAI) in Healthcare
LIST OF FIGURES
6.1 Output 1
6.2 Output 2
LIST OF TABLES
LIST OF ACRONYMS AND
ABBREVIATIONS
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
1 INTRODUCTION
1.1 Introduction
1.2 Aim of the project
1.3 Project Domain
1.4 Scope of the Project
2 LITERATURE REVIEW
2.1 Literature Review
2.2 Gap Identification
3 PROJECT DESCRIPTION
3.1 Existing System
3.2 Problem statement
3.3 System Specification
3.3.1 Hardware Specification
3.3.2 Software Specification
3.3.3 Standards and Policies
4 METHODOLOGY
4.1 Proposed System
4.2 General Architecture
4.3 Design Phase
4.3.1 Data Flow Diagram
4.3.2 Use Case Diagram
4.3.3 Class Diagram
4.3.4 Sequence Diagram
4.3.5 Collaboration diagram
4.3.6 Activity Diagram
4.4 Algorithm & Pseudo Code
4.4.1 Algorithm
4.4.2 Pseudo Code
4.4.3 Data Set / Generation of Data (Description only)
4.5 Module Description
4.5.1 Module1
4.5.2 Module2
4.5.3 Module3
8 PLAGIARISM REPORT
Appendices
References
Chapter 1
INTRODUCTION
1.1 Introduction
This project presents a Heart Disease Prediction System that applies predictive analytics to
the rising incidence of cardiovascular disease. In response to the critical need for proactive
healthcare solutions, the system analyzes diverse data, from medical histories to lifestyle
factors, and provides personalized risk assessments aimed at early detection and prevention.
A user-centric design makes complex health insights accessible to all users and encourages a
proactive attitude towards heart health.
The integration of machine learning algorithms and wearable technology is intended to
enhance the precision and adaptability of the predictive models. Beyond individual well-being,
we aim to collaborate with healthcare professionals and researchers and to contribute to the
collective effort to advance preventive medicine. The broader goal is to improve cardiovascular
health one prediction at a time.
can empower both doctors and patients to make informed, data-driven decisions that improve health
outcomes and reduce risks associated with heart disease.
Chapter 2
LITERATURE REVIEW
• Support Vector Machines (SVM): Research has shown that SVM can effectively classify patients
based on complex feature sets, outperforming traditional methods in terms of accuracy.
• Random Forests: This ensemble method has gained popularity due to its ability to handle large
datasets and its robustness against overfitting. Studies report improved accuracy and feature
importance insights, making it a favorable choice for heart disease prediction.
• Neural Networks: Deep learning approaches have emerged as powerful tools in predictive
modeling. Research indicates that artificial neural networks can learn complex patterns from vast
datasets, yielding high accuracy in diagnosing heart disease.
3. Hybrid Models
Recent works have investigated hybrid models that combine multiple machine learning algorithms
to improve prediction performance. For example, studies have integrated decision trees with neural
networks, leveraging the strengths of both approaches to achieve superior results in heart disease
prediction.
model performance. Research has demonstrated that well-processed input data significantly impacts
the accuracy of prediction models.
5. Preventive Recommendations - Generic Advice: Current systems may provide broad recommendations
without personalization based on individual risk factors. Enhancing the recommendation
engine to deliver tailored health tips could improve user engagement and effectiveness.
6. Integration with Healthcare Systems - Lack of Integration: The prediction system may operate
in isolation, lacking integration with electronic health records (EHR) or other healthcare management
systems. Integration can facilitate seamless access to patient data and improve care coordination.
7. Continuous Learning and Updates - Static Model: Existing systems may not adapt to new
data or trends over time. Implementing a continuous learning framework can ensure the model stays
current with emerging research and changing health patterns.
Chapter 3
PROJECT DESCRIPTION
understand the outcomes.
This project aims to develop a robust heart disease prediction system that leverages advanced
machine learning techniques to analyze diverse and comprehensive health data. The system will focus
on providing accurate predictions, personalized recommendations, and an intuitive user interface to
enhance accessibility and user engagement. By addressing the gaps in current methodologies and
integrating innovative approaches, this system seeks to improve early detection, reduce the incidence
of heart disease, and ultimately contribute to better health outcomes for individuals at risk.
Advantages of Proposed system
1. Enhanced Accuracy: Utilizes advanced machine learning algorithms to analyze diverse datasets,
leading to more accurate predictions of heart disease risk compared to traditional methods.
2. Personalized Insights: Provides tailored health recommendations based on individual risk
factors, lifestyle, and medical history, enhancing user engagement and encouraging proactive health
management.
3. Improved Data Handling: Incorporates comprehensive data preprocessing techniques to handle
missing values, outliers, and normalization, resulting in cleaner and more reliable input for the
predictive model.
4. User-Friendly Interface: Features an intuitive and accessible user interface that allows users,
regardless of technical expertise, to easily input data and understand results.
Standard Used: ISO/IEC 27001
Jupyter
Jupyter Notebook is an open-source web application for creating and sharing documents that
contain live code, equations, visualizations, and narrative text. It can be used for data cleaning and
transformation, numerical simulation, statistical modeling, data visualization, and machine learning.
Standard Used: ISO/IEC 27001
Chapter 4
METHODOLOGY
Initially, the patient registers by providing certain parameters, and the registered data is
collected and stored in a database. When the patient later checks their health condition, the
stored values are retrieved from the database and the relevant features are extracted. The
extracted data then passes through the processing steps, the disease is predicted, and a report is
generated. This is the overview of the heart disease prediction system using machine learning
techniques.
4.3 Design Phase
The data flow diagram captures the initial idea of how data moves through the system. The user
enters their details and sends them to the server, and the server returns the disease prediction
to the user. All communication takes place between the user and the server.
4.3.2 Use Case Diagram
The steps from user registration to the final generation of the report can be represented by a use case diagram
in which the users are the actors. Users register with certain parameters, log in to their accounts, and enter
their health conditions and values, which are collected and stored in the database. Whenever needed, the data
is extracted and matched against the stored values to check for and predict the disease, and finally a report
is generated.
4.3.3 Class Diagram
A class diagram is a type of UML diagram that illustrates the static structure of a system by representing its classes,
attributes, methods, and relationships. It provides a high-level overview of the system’s architecture, showing how
different components interact. Attributes define the properties or data members of a class, while methods specify
the operations or functions that a class can perform. The diagram also includes relationships such as associations,
inheritance, aggregation, and composition to depict how classes are connected. Class diagrams are widely used
during the design phase of software development to help developers and designers understand the system’s structure.
They serve as a blueprint for implementation, guiding the coding and testing process. Additionally, these
diagrams are essential for creating well-structured, scalable, and maintainable systems.
4.3.4 Sequence Diagram
The data provided by the user flows to the server, where it undergoes a process of matching with the
existing data sets, often referred to as training data. This involves comparing the user input with the
stored data to identify patterns or similarities. During this process, the system calculates probabilities
by analyzing and evaluating the values between the user data and the training data. The goal is
to determine how closely the input matches the pre-existing data in the system. This probabilistic
comparison helps in making predictions or classifications based on the input. Once the matching and
probability computations are complete, the system generates a detailed report. This report typically
summarizes the results of the comparison and highlights relevant insights. Such a mechanism is
commonly used in machine learning, recommendation systems, or decision-support applications to
provide accurate and meaningful outputs.
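The probability step described above can be illustrated with logistic regression, the model this report deploys. The following is a minimal sketch only: the weights, bias, and the three feature names are hypothetical and are not the values learned in lr.pkl, and the inputs are assumed to be already scaled to the range 0 to 1.

```python
import math

def predict_probability(features, weights, bias):
    """Return P(heart disease) for one patient under a logistic regression model."""
    # Linear score: weighted sum of the input features plus the bias term.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    # The sigmoid squashes the score into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights for three scaled features (age, cholesterol, blood pressure).
WEIGHTS = [1.2, 0.8, 0.5]
BIAS = -1.0

def classify(features, threshold=0.5):
    """Map the probability to the 0/1 label used in the report."""
    return 1 if predict_probability(features, WEIGHTS, BIAS) >= threshold else 0
```

The 0.5 threshold is the conventional default; a screening tool might lower it to reduce missed cases at the cost of more false alarms.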
4.3.5 Collaboration diagram
The collaboration involves several key components working together to provide users with health
predictions and recommendations. The User interacts with the User Interface (UI) by entering
health-related data, such as age and cholesterol levels. This data is then sent to the Flask Application Server,
which manages the overall data flow. The server forwards the information to the Data Preprocessing
Module, responsible for cleaning and preparing the data for analysis. Once processed, the data is sent
to the Machine Learning Model, which predicts the likelihood of heart disease based on the input
features. After receiving the prediction, the Flask server communicates with the Recommendation
Engine, which generates personalized health tips based on the model’s output.
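The hand-offs between these components can be sketched as plain functions. This is an illustrative outline under stated assumptions, not the project's implementation: the feature ranges, the tip texts, and `toy_model` are hypothetical stand-ins for the real preprocessing parameters, recommendation content, and trained model.

```python
from typing import Callable, Dict, List

# Hypothetical (min, max) ranges used to scale each input to [0, 1].
FEATURE_RANGES = {"age": (20.0, 90.0), "chol": (100.0, 600.0)}

def preprocess(raw: Dict[str, str]) -> List[float]:
    """Data Preprocessing Module: parse form strings and min-max scale them."""
    scaled = []
    for name, (lo, hi) in FEATURE_RANGES.items():
        value = float(raw[name])
        scaled.append((value - lo) / (hi - lo))
    return scaled

def recommend(prediction: int) -> List[str]:
    """Recommendation Engine: map the model output to health tips."""
    if prediction == 1:
        return ["Consult a cardiologist.", "Monitor cholesterol regularly."]
    return ["Maintain your current healthy habits."]

def handle_request(raw: Dict[str, str], model: Callable[[List[float]], int]) -> dict:
    """Application server: route data through preprocessing, model, and tips."""
    features = preprocess(raw)
    prediction = model(features)
    return {"prediction": prediction, "tips": recommend(prediction)}

def toy_model(features: List[float]) -> int:
    """Stand-in model: flags high risk when the mean scaled value exceeds 0.5."""
    return 1 if sum(features) / len(features) > 0.5 else 0
```

Keeping each component behind its own function means the model or the tip list can be replaced without touching the server logic.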
4.3.6 Activity Diagram
The above activity diagram involves a user as the primary actor. The user begins by selecting the
document set or dataset on which the classification process will be carried out. After selecting the dataset,
the user can proceed to build a classification model based on the loaded data. Once the classification
model is constructed, the user can input new patient details, such as symptoms, into the predictor
frame. After populating the predictor with the necessary details, the user can retrieve the status of
heart disease, whether it is likely or not. This process provides a systematic flow for performing heart
disease prediction using machine learning techniques. Additionally, the system ensures that the user
can iteratively refine the dataset or model for improved accuracy. The predictor frame acts as an
interactive interface for analyzing new patient inputs. The classification model leverages the trained
dataset to provide results based on probabilities and patterns. This entire flow aims to make heart
disease prediction efficient, user-friendly, and reliable.
4.4 Algorithm & Pseudo Code
4.4.1 Algorithm
1. Set Up: Import required libraries (Flask, numpy, pickle) and initialize the Flask app.
2. Load Model: Load the pre-trained model (lr.pkl) using pickle.
3. Home Route (/): Display the main page (HOMEhtml.html) with a form for user input.
4. Prediction Route (/predict):
- Collect and convert form input to a NumPy array.
- Predict the likelihood of heart disease using the model.
- Display the result:
- 0: Show "Low likelihood of heart disease."
- 1: Show "High likelihood of heart disease" and health tips.
5. Run App: Start the Flask app with debugging enabled (debug=True).
# Home route: display the main page with the input form
@app.route("/", methods=["GET", "POST"])
def Home():
    return render_template("HOMEhtml.html")
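The prediction route from step 4 of the algorithm is not shown in the report, so the following is a hedged sketch of how it might look. `StubModel` is a hypothetical stand-in for the pickled model in lr.pkl, its decision rule is invented for illustration, and the route returns plain text rather than rendering HOMEhtml.html so that the example stays self-contained.

```python
import numpy as np
from flask import Flask, request

app = Flask(__name__)

class StubModel:
    """Stand-in for the pickled model; the report loads lr.pkl instead."""
    def predict(self, X):
        # Hypothetical rule: flag rows whose mean feature value exceeds 100.
        return (np.asarray(X).mean(axis=1) > 100).astype(int)

model = StubModel()

@app.route("/predict", methods=["POST"])
def predict():
    # Collect the submitted form values and convert them to a 1 x n array.
    values = [float(v) for v in request.form.values()]
    features = np.array(values).reshape(1, -1)
    # The model returns an array of labels; take the single prediction.
    label = int(model.predict(features)[0])
    if label == 1:
        return "High likelihood of heart disease. Please consult a doctor."
    return "Low likelihood of heart disease."
```

In the report's setup the app would then be started with app.run(debug=True), as in step 5. A production route would also need to validate the inputs before calling float().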
4.4.3 Data Set / Generation of Data (Description only)
The dataset for a heart disease prediction model typically includes medical and lifestyle-related
features known to influence heart health. Common datasets, such as the Cleveland Heart Disease
Dataset from the UCI Machine Learning Repository, contain records of patients with both numerical
and categorical data on various risk factors.
1. Demographic Information:
- Age: Age of the individual, as heart disease risk generally increases with age.
- Gender: Male or female, as gender can influence risk levels differently.
2. Medical Metrics:
- Blood Pressure: Resting blood pressure (in mm Hg), as high blood pressure is a known risk factor.
- Cholesterol Levels: Serum cholesterol (mg/dl), since high cholesterol can lead to artery blockage.
- Resting Electrocardiographic Results: ECG results to detect abnormalities in heart function.
- Fasting Blood Sugar: Blood sugar levels after fasting, to indicate potential diabetes risks.
3. Lifestyle-Related Factors:
- Exercise-Induced Angina: Indicates if chest pain occurs during physical activity.
- Physical Activity Levels: Captures general activity level, which influences heart health.
4. Target Variable:
- Presence of Heart Disease: The outcome (often 0 or 1), indicating the presence or absence of heart
disease.
4.5.1 Module1
Data Collection Module
• Purpose: Gather data from various sources like medical records, patient questionnaires, or
real-time health monitors.
• Components:
– Input Interfaces: Forms for patient input, integration with electronic health records (EHR),
wearables, or health monitoring devices.
– Data Validation: Ensures data accuracy and completeness (e.g., checking for missing values
or outliers).
• Types of Data:
– Demographic Data: Age, gender, ethnicity.
– Medical History: Previous conditions, family history of heart disease.
– Clinical Data: Blood pressure, cholesterol levels, ECG results.
– Lifestyle Data: Smoking, alcohol use, physical activity, diet.
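The Data Validation component above can be sketched as a simple range check. The field names and limits below are assumptions chosen for illustration; real limits should come from clinical guidance and from the dataset actually used.

```python
# Hypothetical plausible ranges for three clinical fields.
VALID_RANGES = {
    "age": (1, 120),
    "resting_bp": (50, 250),    # mm Hg
    "cholesterol": (50, 700),   # mg/dl
}

def validate_record(record):
    """Return a list of problems found in one patient record (empty if clean)."""
    problems = []
    for field, (low, high) in VALID_RANGES.items():
        value = record.get(field)
        if value is None:
            # Missing values are reported rather than silently filled in.
            problems.append(f"{field}: missing")
        elif not low <= value <= high:
            # Values outside the plausible range are treated as outliers.
            problems.append(f"{field}: {value} outside [{low}, {high}]")
    return problems
```

Returning a list of problems, instead of raising on the first one, lets the input form show every issue to the user at once.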
4.5.2 Module2
Data Preprocessing Module
• Purpose: Prepare raw data for analysis and prediction by cleaning and transforming it into a
usable format.
• Components:
– Data Cleaning: Handling missing values, noise, and inconsistencies.
– Feature Engineering: Extracting relevant features (e.g., creating a "risk factor" score).
– Normalization/Standardization: Scaling data (important for models like SVM, neural
networks).
– Encoding Categorical Data: Convert categorical variables into numerical values using
techniques like one-hot encoding or label encoding.
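Two of the preprocessing steps listed above, scaling and categorical encoding, can be sketched with NumPy. This is a minimal illustration that assumes the data is already free of missing values; the helper names `min_max_scale` and `one_hot` are hypothetical. The scaling formula is the standard min-max formula, the technique behind the MinMaxScaler that the project code imports.

```python
import numpy as np

def min_max_scale(X):
    """Scale each column of X to [0, 1]; constant columns map to 0."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero
    return (X - col_min) / col_range

def one_hot(values):
    """One-hot encode category labels; columns follow sorted label order."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)), dtype=int)
    for row, v in enumerate(values):
        encoded[row, index[v]] = 1
    return categories, encoded
```

In practice the minimum and range should be computed on the training data only and reused for new patients, so that a single submitted record is scaled consistently with the data the model was trained on.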
4.5.3 Module3
Feature Selection Module
• Purpose: Identify and select the most relevant features that contribute to heart disease prediction,
improving model accuracy.
• Components:
– Correlation Analysis: Identify which features are strongly correlated with heart disease.
– Dimensionality Reduction: Techniques like PCA (Principal Component Analysis) or LDA
(Linear Discriminant Analysis) to reduce the number of features.
– Feature Ranking: Use statistical methods or machine learning techniques (e.g., Recursive
Feature Elimination) to rank features based on importance.
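The correlation analysis component can be sketched as a ranking by absolute Pearson correlation with the target. This is an assumption about how such a ranking might be done, not the project's code; `rank_by_correlation` is a hypothetical helper, and a constant feature column would produce an undefined (NaN) score.

```python
import numpy as np

def rank_by_correlation(X, y, names):
    """Rank features by absolute Pearson correlation with the target."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.corrcoef returns a 2 x 2 matrix; entry [0, 1] is the correlation.
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    # Highest absolute correlation first.
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
```

Pearson correlation only captures linear relationships, so a feature ranked low here may still be useful to a non-linear model such as a random forest.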
Chapter 5
5.1.2 Output Design
5.2 Testing
5.3.4 Test Result
Chapter 6
The proposed heart disease prediction system encompasses several key features:
1. Comprehensive Data Integration: Gather and integrate diverse health data, including
demographics, medical history, and physiological parameters, for a holistic understanding.
2. Advanced Machine Learning Algorithms: Implement state-of-the-art machine learning
algorithms to analyze integrated data, identifying intricate patterns and correlations for accurate
predictions.
3. User-Friendly Interface: Develop an intuitive interface for healthcare professionals, ensuring
efficient input and access to patient data.
4. Personalized Risk Assessment: Provide personalized risk assessments, considering individual
factors like genetics, lifestyle, and medical history.
5. Early Warning System: Integrate an early warning system to detect potential heart disease
risks, facilitating timely interventions.
6. Security Measures: Implement robust security measures to safeguard sensitive health
information and ensure compliance with privacy regulations.
7. Scalability: Design the system to be scalable, accommodating growing datasets and user
volumes as the project expands.
8. Regular Updates and Maintenance: Establish a system for continuous updates and
maintenance to keep the predictive model aligned with evolving medical knowledge.
9. Educational Resources: Include educational resources for healthcare professionals and
individuals, fostering awareness and understanding of cardiovascular health.
The proposed system aims to be a comprehensive, user-friendly, and dynamic tool that uses
advanced technologies to enhance heart disease prediction, prevention, and management.
In heart disease prediction, existing systems often rely on traditional machine learning models
or rule-based algorithms to determine the likelihood of heart disease based on historical medical
data. These systems are generally used in hospitals, clinics, or research settings to assist in
decision-making. Here’s an overview of key characteristics:
1. Data Sources and Features:
- Existing systems typically utilize structured datasets like the Cleveland Heart Disease Dataset
or other similar medical databases.
- Common features include age, gender, cholesterol level, blood pressure, heart rate, and some
basic lifestyle factors.
- Data preprocessing may be minimal, with limited handling of missing values, outliers, or
normalization techniques, which can reduce prediction accuracy.
2. Algorithms and Models:
- Many traditional systems use simple models such as Logistic Regression, Decision Trees, or
Naive Bayes. These models are computationally inexpensive but often lack the sophistication to
capture complex, non-linear relationships within the data.
- Some systems may use statistical analysis rather than machine learning, which can be less
adaptable to new patterns and may generalize poorly on new patient data.
3. Prediction Accuracy:
- These systems usually achieve moderate accuracy. Due to simpler models and basic feature
engineering, they may fail to capture intricate dependencies and interactions between variables,
resulting in limited predictive power.
- Accuracy often stagnates around 70-80%.
4. Interpretability and Insights:
- Most existing models provide only basic interpretability. For instance, logistic regression
might highlight some feature importance, but decision trees and simpler models lack nuanced
explanations.
- Insights are often restricted to general risk levels (high or low) without detailed
recommendations or tailored insights for individual risk factors.
5. User Interface and Accessibility:
- Existing systems are often limited to hospital or clinical environments and may not have
user-friendly interfaces. The use of specialized software or dashboards may require medical
expertise to operate, reducing accessibility for non-expert users or patients directly.
- These systems are generally not interactive or accessible outside medical facilities, which
limits their use for preventive care and early self-assessment.
6. Preventive Recommendations:
- Most traditional systems simply provide a risk score without offering personalized health
recommendations or actionable steps to reduce heart disease risk.
- As a result, these systems serve as diagnostic aids but lack the capability to empower patients
with preventive strategies.
import numpy as np
from flask import Flask, request, jsonify, render_template
import pickle
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

# Create flask app
app = Flask(__name__)

# Load the pre-trained model
with open("lr.pkl", "rb") as f:
    model = pickle.load(f)
Output
Figure 6.2: Output 2
Chapter 7
7.1 Conclusion
After training and testing several machine learning models, we found that Logistic Regression
achieved the highest accuracy among the methods compared. Each algorithm’s performance was
evaluated using its confusion matrix, error metrics, and accuracy score. Logistic Regression
achieved an accuracy of 93.55% on data taken from the UCI repository, and KNN performed
comparably, with an accuracy of 93.01%. In the future, heart disease prediction could be
improved by incorporating more data sources such as genetics, lifestyle, and environmental factors.
Machine learning algorithms could be used to identify patterns in the data and predict the risk of
heart disease. Additionally, artificial intelligence (AI) could be used to better understand the complex
relationships between different risk factors and their impact on heart health. AI could also be used to
develop personalized treatments for individuals based on their risk profiles.
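The confusion matrix and accuracy score used for the evaluation above can be computed as in the following sketch. It only illustrates the metrics; it does not reproduce the reported 93.55% or 93.01% figures, and any labels passed to it in an example are made up.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 means heart disease."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1      # disease correctly detected
        elif truth == 0 and pred == 0:
            tn += 1      # healthy correctly cleared
        elif truth == 0 and pred == 1:
            fp += 1      # false alarm
        else:
            fn += 1      # missed case
    return tn, fp, fn, tp

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)
```

For a screening task, the false-negative count (missed cases) usually deserves as much attention as overall accuracy, since two models with similar accuracy can differ in how many patients they miss.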
Future enhancements for a heart disease prediction project could significantly improve its accuracy,
usability, and impact. First, improving model accuracy through feature engineering, such as
incorporating more health-related factors (like cholesterol trends, lifestyle choices, and genetic
information), could make predictions more robust. Implementing advanced algorithms, such as XGBoost or
deep learning, or using ensemble methods could further boost accuracy by combining the strengths
of different models. For interpretability, techniques like SHAP values or LIME would help users
understand how individual features impact predictions, and decision trees could provide a more
understandable flow of the model’s reasoning. Additionally, real-time data integration, especially with
wearable devices, would allow continuous health monitoring, and building data pipelines for live data
would enable dynamic predictions. Enhancing usability with a user-friendly web or mobile interface
would make it easy for individuals to input health data and receive instant feedback.
Chapter 8
PLAGIARISM REPORT
Appendices
Appendix A
import numpy as np
from flask import Flask, request, jsonify, render_template
import pickle
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

# Create flask app
app = Flask(__name__)

# Load the pre-trained model
with open("lr.pkl", "rb") as f:
    model = pickle.load(f)

# Home route: display the main page with the input form
@app.route("/", methods=["GET", "POST"])
def Home():
    return render_template("HOMEhtml.html")
References
[1] Chen, X., Lin, X. (2019). "Machine Learning Techniques for Heart Disease Prediction: A Survey."
International Journal of Healthcare Information Systems and Informatics, 14(1), 1-19. This paper
surveys various machine learning techniques used in heart disease prediction.
[2] Yusuf, S., Hawken, S., Ounpuu, S., et al. (2004). "Effect of potentially modifiable risk factors
associated with myocardial infarction in 52 countries (the INTERHEART study): case-control study."
The Lancet, 364(9438), 937–952. This large-scale study identifies key risk factors for heart disease,
serving as a foundation for predictive models.
[3] D’Agostino, R. B., Vasan, R. S., Pencina, M. J., et al. (2008). "General cardiovascular risk
profile for use in primary care: the Framingham Heart Study." Circulation, 117(6), 743-753. The
Framingham study provides a widely used risk score model for predicting heart disease based on
longitudinal data.
[4] UCI Machine Learning Repository. (1988). Heart Disease Data Set. Available at
https://archive.ics.uci.edu/ml/datasets/Heart+Disease. This dataset is commonly used in
machine learning projects for heart disease prediction.
[5] Esteva, A., Robicquet, A., Ramsundar, B., et al. (2019). "A guide to deep learning in healthcare."
Nature Medicine, 25(1), 24-29. This article discusses the application of AI, including deep learning,
in healthcare, with a focus on predictive modeling for cardiovascular conditions.