Minor Project 2
ON
PREDICTION OF COVID-19 AND PNEUMONIA
DISEASE USING ML TECHNIQUES
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING
Submitted By
I/We hereby certify that the work presented in the project report entitled
"Prediction of Covid-19 and Pneumonia Diseases using ML Techniques", in partial
fulfillment of the requirements for the award of the degree of Bachelor of Technology
in Computer Science & Engineering from Dr. Akhilesh Das Gupta Institute of
Technology & Management, New Delhi, is an authentic record of our own work carried
out under the guidance of Ms. Pratibha Dabas, Assistant Professor in CSE Department.
The matter presented in this project has not been submitted by us for the award of any
other degree elsewhere.
This is to certify that the above statement made by the candidate is correct to the best
of my knowledge. He/She/They are permitted to appear in the Minor Project External
Examination.
Dr. Rakesh Kumar Arora / Ms. Megha Gupta, Project Coordinators
Prof. (Dr.) Ankit Verma, Head, CSE
ACKNOWLEDGEMENT
I/We would like to acknowledge the contributions of the following persons, without
whose help and guidance this report would not have been completed.
I/We acknowledge, with respect and gratitude, the counsel and support of our project
guide Ms. Pratibha Dabas, Assistant Professor, CSE Department, whose expertise,
guidance, support, encouragement, and enthusiasm have made this report possible. Her
feedback vastly improved the quality of this report and made the work an enriching
experience. I/We are indeed proud and fortunate to be supervised by her.
I/We are thankful to Prof. (Dr.) Ankit Verma, H.O.D. CSE Department, Dr.
Akhilesh Das Gupta Institute of Technology & Management, New Delhi for his
constant encouragement, valuable suggestions and moral support and blessings.
I/We are immensely thankful to our esteemed Prof. (Dr.) Sanjay Kumar, Director,
Dr. Akhilesh Das Gupta Institute of Technology & Management, New Delhi for
his never-ending motivation and support.
I/We shall ever remain indebted to Dr. Rakesh Kumar Arora and Ms. Megha Gupta,
Project Coordinators, CSE Department and faculty and staff members of Dr.
Akhilesh Das Gupta Institute of Technology & Management, New Delhi.
Finally, yet importantly, I/We would like to express our heartfelt thanks to God, to our
beloved parents for their blessings, and to our friends and classmates for their help and
good wishes toward the successful completion of this project.
ABSTRACT
This minor project report explores the integration of machine learning techniques for
the prediction of COVID-19 and pneumonia. With the ongoing global health crisis,
there is an urgent need for advanced diagnostic tools, and this project seeks to address
this demand through the synergy of data science and healthcare.
The project's foundation lies in a comprehensive dataset encompassing clinical records
and medical imaging data related to COVID-19 and pneumonia cases. By harnessing
the power of machine learning algorithms, the project aims to develop a predictive
model capable of discerning subtle patterns indicative of these respiratory conditions.
The dataset serves as a rich source of information, allowing the model to learn and
generalize from diverse cases.
The methodology involves the implementation of various machine learning techniques,
ranging from classical algorithms to more intricate deep learning models. Through
extensive training and optimization, the model will be fine-tuned to recognize nuanced
patterns, ultimately contributing to enhanced diagnostic accuracy. Rigorous validation
procedures will be employed to ensure the model's reliability across diverse datasets.
Recognizing the interdisciplinary nature of this challenge, collaboration with healthcare
professionals is integral. This partnership ensures that the developed model aligns with
clinical nuances and practical considerations. The insights gained from medical experts
contribute to the robustness of the predictive model, bridging the gap between
technological advancements and real-world healthcare scenarios.
The project's significance extends beyond academic exploration, aiming to provide
practical solutions for healthcare practitioners. A successful predictive model holds the
potential to revolutionize the diagnostic landscape, offering timely and accurate
insights for clinical decision-making. The tool developed in this project is intended to
support healthcare professionals, augmenting their capabilities in
combating respiratory diseases.
In conclusion, this minor project report serves as a stepping stone into the convergence
of machine learning and healthcare. By focusing on the prediction of COVID-19 and
pneumonia, the project aligns with the global imperative to strengthen diagnostic
capabilities. The fusion of machine learning expertise and medical insights aims to
contribute to the ongoing efforts to mitigate the impact of respiratory diseases, marking
a significant stride towards more efficient and informed healthcare practices.
TABLE OF CONTENTS
Certificate
Acknowledgement
Abstract
Table of Contents
List of Figures
4.1. Merits
4.2. Demerits
4.3. Applications
REFERENCES
APPENDICES
List of Figures
2 Home Page
3 Covid-19 Prediction
4 Pneumonia Prediction
CHAPTER 1: INTRODUCTION AND LITERATURE REVIEW
1.1 INTRODUCTION
The global healthcare landscape has been profoundly influenced by the COVID-19
pandemic, underscoring the critical need for advanced diagnostic tools to combat
respiratory diseases effectively. In this context, machine learning techniques offer a
promising way to improve predictive capabilities. This minor project report applies
machine learning methods to the prediction of COVID-19 and pneumonia.
The significance of accurate and timely diagnosis in respiratory diseases cannot be
overstated. The rapid spread of COVID-19 has exposed the vulnerabilities of healthcare
systems worldwide, emphasizing the imperative for innovative approaches to disease
detection. Pneumonia, a common respiratory ailment, shares symptoms with COVID-
19, further complicating the diagnostic process. This project aims to contribute to the
ongoing efforts to address these challenges by harnessing the power of machine
learning.
At its core, the project hinges on a comprehensive dataset that amalgamates clinical
information and medical imaging data associated with COVID-19 and pneumonia
cases. This dataset serves as the bedrock for training machine learning models to
recognize intricate patterns and correlations within the data. By employing diverse
algorithms, ranging from traditional methods to more sophisticated deep learning
approaches, the project aspires to develop a predictive model that can navigate the
complexities of respiratory disease diagnosis.
The choice of machine learning techniques is grounded in their capacity to analyze vast
datasets and identify subtle yet crucial patterns that may elude human observation.
Traditional diagnostic methods, while valuable, may fall short in handling the
intricacies presented by diseases with overlapping symptoms, such as COVID-19 and
pneumonia. Machine learning, with its ability to learn from data and adapt to evolving
patterns, provides a complementary approach that holds the promise of improved
accuracy and efficiency in diagnosis.
The methodology involves a systematic approach to model development and validation.
The machine learning algorithms will undergo rigorous training on the dataset, fine-
tuning their parameters to optimize performance. Validation processes will then assess
the model's reliability and generalizability across diverse datasets, ensuring its
applicability beyond the training context.
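The validation step described above can be sketched as a stratified hold-out split. This is a minimal illustration under stated assumptions, not the procedure used in this project: the function name stratified_split, the 20% validation fraction, the fixed seed, and the toy label counts are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) with each class split in the same proportion.

    Hypothetical helper for illustration; val_fraction and seed are assumed values.
    """
    # Group sample indices by their class label
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out at least one sample of every class for validation
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels (assumed counts): 10 COVID-19, 20 pneumonia, 30 healthy images
labels = ["covid"] * 10 + ["pneumonia"] * 20 + ["healthy"] * 30
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), "training samples,", len(val_idx), "validation samples")
```

Keeping the class proportions equal in both subsets matters when one class is much rarer than the others, since a purely random split could leave the validation set with too few examples of it.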
Recognizing the interdisciplinary nature of this endeavor, collaboration with medical
professionals becomes paramount. Input from healthcare experts not only enriches the
project with clinical insights but also ensures that the developed model aligns with real-
world medical practices. This collaboration bridges the gap between technology and
healthcare, fostering a symbiotic relationship that enhances the applicability and
relevance of the predictive model.
The anticipated outcomes of this project extend beyond academic exploration. A
successful predictive model could revolutionize the diagnostic landscape, offering
healthcare professionals a tool to expedite and refine their decision-making processes.
In the face of a global health crisis, such advancements become invaluable, contributing
to more effective patient management and ultimately saving lives.
In summary, this project sets out to explore the intersection of machine learning and
respiratory disease diagnosis. By focusing on predicting COVID-19 and pneumonia, it
aligns with the urgent need for innovative diagnostic solutions. Through this
exploration, the project aspires to contribute to the ongoing global efforts to strengthen
healthcare systems and mitigate the impact of respiratory diseases on a global scale.
1.2 LITERATURE REVIEW
The convergence of machine learning and healthcare has witnessed a surge in research
endeavors focused on predicting respiratory diseases, particularly COVID-19 and
pneumonia. This literature overview navigates through seminal studies, methodological
advancements, and overarching trends, illuminating the multifaceted landscape of
machine learning applications in the domain.
(2018) pioneered the application of convolutional neural networks (CNNs) to chest X-
rays, demonstrating the potential of deep learning in automating pneumonia detection.
However, these early endeavors primarily focused on individual diseases and laid the
groundwork for subsequent research.
As respiratory diseases often share common symptoms, distinguishing between them
poses a diagnostic challenge. Islam et al. (2019) extended the application of machine
learning to distinguish between bacterial and viral pneumonia, showcasing the potential
of computational tools in refining diagnostic specificity. These initial studies
underscored the need for more sophisticated approaches to address the nuances of
respiratory disease diagnosis.
pivotal for addressing the diagnostic challenges posed by diseases such as COVID-19
and pneumonia, where symptoms may overlap.
This ethical dimension becomes crucial as machine learning models transition from
research settings to practical applications. The interpretability of models ensures a
collaborative approach between machine learning experts and healthcare practitioners,
fostering a relationship built on understanding and trust.
1.3 MOTIVATION
3. Overlapping Symptomatology:
COVID-19 and pneumonia share common respiratory symptoms, posing a diagnostic
challenge for healthcare practitioners. Distinguishing between these conditions is
crucial for appropriate treatment and resource allocation. The motivation for this project
lies in addressing this diagnostic dilemma by leveraging machine learning to discern
subtle patterns in clinical and imaging data. The aim is to develop a model that can
differentiate between COVID-19 and pneumonia with a high degree of accuracy,
contributing to more precise and tailored patient care.
1.4 ORGANIZATION OF PROJECT REPORT
1. Introduction:
- Introduce the context and significance of the project.
- Outline the motivation behind predicting COVID-19 and pneumonia using
machine learning.
- Define the research objectives and the anticipated contributions of the
project.
2. Literature Review:
- Provide a comprehensive overview of existing literature in the field.
- Explore early attempts at using machine learning for respiratory disease
prediction.
- Examine studies related to COVID-19 and pneumonia prediction, including
advancements in deep learning and the integration of diverse data sources.
- Highlight the gaps and trends identified in the literature that inform the
current project.
3. Methodology:
- Detail the methodology adopted for the project, including data collection,
preprocessing, and feature extraction.
- Specify the machine learning techniques selected for model development.
- Describe the dataset utilized, encompassing clinical records and medical
imaging data.
- Provide insights into the rationale behind the chosen algorithms and any
adaptations made for the specific context of COVID-19 and pneumonia
prediction.
- Emphasize the importance of data quality and preprocessing in enhancing
the robustness of the predictive model.
5. Model Development:
- Detail the architecture of the machine learning models chosen for
predicting COVID-19 and pneumonia.
- Explain the training process, hyperparameter tuning, and validation
procedures.
- Discuss any transfer learning approaches or fine-tuning strategies applied to
leverage pre-trained models.
- Provide insights into the interpretability and explainability of the chosen
models.
7. Discussion:
- Interpret the findings in the broader context of respiratory disease prediction
and machine learning applications in healthcare.
- Address the implications of the results for clinical practice and future
research.
- Discuss the ethical considerations, limitations, and potential biases
associated with the predictive models.
- Explore avenues for further refinement and optimization of the models.
8. Conclusion:
- Summarize the key findings and contributions of the project.
- Revisit the research objectives and assess the extent to which they have been
achieved.
- Emphasize the project's impact on advancing knowledge in the field of
respiratory disease prediction.
9. Future Work:
- Propose potential avenues for future research and improvement.
- Suggest enhancements to the current model architecture or exploration of
additional data sources.
- Consider the scalability and applicability of the developed models in different
healthcare settings.
10. References:
- Provide a comprehensive list of references, citing relevant literature, research
papers, and resources consulted during the project.
CHAPTER 2: METHODOLOGY ADOPTED
accuracy, precision, recall, F1-score, and ROC-AUC. These metrics provide insights
into the model's performance in correctly classifying COVID-19, pneumonia, and
healthy cases.
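As a rough illustration of how the metrics listed above are computed, the sketch below evaluates a single binary classifier (for example, COVID-19 versus healthy) from its predicted scores. It is an assumption-laden example rather than the evaluation code of this project: the function name binary_metrics, the 0.5 decision threshold, and the toy labels and scores are hypothetical.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Compute accuracy, precision, recall, F1 and ROC-AUC for binary labels (0/1).

    Hypothetical helper; scores at or above `threshold` count as class 1.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # ROC-AUC as the fraction of (positive, negative) pairs in which the
    # positive sample receives the higher score; ties count as half.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    roc_auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "roc_auc": roc_auc}


# Toy data (assumed): three positive and three negative cases
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
metrics = binary_metrics(y_true, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can be misleading when healthy cases far outnumber diseased ones, which is why recall (missed cases) and precision (false alarms) are reported alongside it.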
CHAPTER 3: DESIGNING AND RESULT ANALYSIS
3.2 COVID-19 PREDICTION PAGE
[Figure 3: Covid-19 Prediction (SAFE ZONE and DANGER ZONE screenshots)]
3.3 PNEUMONIA PREDICTION PAGE
[Figure 4: Pneumonia Prediction (SAFE ZONE and DANGER ZONE screenshots)]
CHAPTER 4: MERITS, DEMERITS AND APPLICATIONS
4.1 MERITS
1. Early Detection and Intervention:
- Machine learning models offer the potential for early detection of COVID-19 and
pneumonia, enabling healthcare professionals to intervene promptly. Timely diagnosis
enhances the chances of effective treatment and patient recovery.
3. Resource Optimization:
- Accurate prediction aids in resource optimization by helping healthcare providers
allocate resources more efficiently. This includes the timely deployment of medical
personnel, appropriate use of medical equipment, and effective utilization of hospital
facilities.
4.2 DEMERITS
1. Limited Generalizability:
- Machine learning models may exhibit limited generalizability, especially if trained
on a specific dataset that does not represent the diversity of global populations. This
can lead to biased predictions and hinder the applicability of the models in different
regions.
2. Interpretability Challenges:
- Deep learning models, in particular, may lack interpretability, making it challenging
for healthcare professionals to understand and trust the predictions. This lack of
transparency can be a barrier to widespread adoption in clinical settings.
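One simple way to probe the generalizability concern raised in point 1 is to report accuracy separately for each data source instead of a single overall figure. The sketch below is only an illustration: the function name accuracy_by_group, the group names hospital_a and hospital_b, and the toy labels are hypothetical and are not results from this project.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy}, computed separately for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Toy data (assumed): predictions on images from two different sources
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
groups = ["hospital_a"] * 4 + ["hospital_b"] * 4

per_group = accuracy_by_group(y_true, y_pred, groups)
overall = sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)
print("overall:", overall)
print("per group:", per_group)
```

A large gap between groups, as in this toy data, suggests that the model has fitted characteristics of one source and may not transfer to another.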
4.3 APPLICATIONS
3. Epidemiological Surveillance:
- Machine learning models contribute to epidemiological surveillance by
analyzing patterns in large datasets. This aids in tracking the spread of
respiratory diseases and informing public health strategies.
CHAPTER 5: CONCLUSION AND FUTURE SCOPE
5.1 CONCLUSION
The journey of predicting COVID-19 and pneumonia through machine learning has
been both challenging and rewarding. This minor project has endeavored to bridge the
gap between cutting-edge technology and the pressing healthcare needs posed by
respiratory diseases. As we reflect on the methodology, merits, and demerits, it becomes
evident that while strides have been made, there are areas for improvement and further
exploration.
In the realm of merits, the project has demonstrated the potential for early detection and
improved diagnostic accuracy, offering healthcare professionals valuable tools for
timely intervention. The holistic approach to diagnosis, integrating clinical and imaging
data, marks a significant step toward a comprehensive understanding of respiratory
diseases. The scalability and adaptability of the developed models hint at the possibility
of contributing to healthcare systems globally, particularly in resource-constrained
settings.
However, the demerits underscore the challenges inherent in machine learning
applications for healthcare. Limited generalizability, interpretability issues, and ethical
concerns are hurdles that demand careful consideration. The dependence on training
data quality raises questions about biases and the representativeness of datasets, urging
a closer examination of data sources and the potential impact on prediction outcomes.
As we conclude this minor project, it is crucial to acknowledge the ethical
considerations surrounding patient privacy and the responsible use of healthcare data.
The transparency of machine learning models in healthcare is paramount, and
addressing interpretability challenges remains an ongoing endeavor. Collaboration
between data scientists, healthcare professionals, and ethicists is essential to navigate
these challenges and ensure the ethical deployment of predictive models in real-world
scenarios.
The applications discussed in Chapter 4 highlight the transformative potential of integrating machine
learning into healthcare systems. From clinical decision support systems to
telemedicine platforms and epidemiological surveillance, the versatility of predictive
models promises to reshape the landscape of respiratory disease management. These
applications not only contribute to the immediate needs of healthcare but also lay the
groundwork for a future where technology plays a pivotal role in public health.
2. Enhancing Generalizability:
- Future efforts should focus on addressing the challenges related to limited
generalizability. This can be achieved by incorporating more diverse and representative
datasets, encompassing various demographics and geographic locations. Transfer
learning approaches that leverage pre-trained models on large and diverse datasets can
contribute to enhancing the generalizability of the models.
- The transition from a research setting to real-world implementation is a critical step.
Future endeavors should focus on seamlessly integrating predictive models into
existing healthcare systems. This involves addressing technical challenges, ensuring
compatibility with Electronic Health Record (EHR) systems, and providing training to
healthcare professionals on the practical use of machine learning predictions.
REFERENCES
1. Rajaraman, S., Antani, S. K., Poostchi, M., Silamut, K., Hossain, M. A., Maude,
R. J., ... & Jaeger, S. (2018). Pre-trained convolutional neural networks as
feature extractors toward improved malaria parasite detection in thin blood
smear images. PeerJ, 6, e4568.
2. Islam, M. Z., Islam, M. M., Asraf, A., & Ovais, M. (2019). A combined deep
CNN-LSTM network for the detection of novel coronavirus (COVID-19) using
X-ray images. Informatics in Medicine Unlocked, 20, 100412.
3. Wang, S., Kang, B., Ma, J., Zeng, X., Xiao, M., Guo, J., ... & Tian, J. (2020). A
deep learning algorithm using CT images to screen for Corona Virus Disease
(COVID-19). European Radiology, 30(12), 6829-6837.
4. Xie, J., Hungerford, D., Chen, H., Abrams, S. T., Li, J., & Wang, G. (2020).
Development and external validation of a prognostic multivariable model on
admission for hospitalized patients with COVID-19. medRxiv.
6. Ozturk, T., Talo, M., Yildirim, E. A., Baloglu, U. B., & Yildirim, O. (2020).
Real-time detection of COVID-19 cases using deep learning models:
Unprecedented opportunities and challenges. Journal of the Institute of
Electrical and Electronics Engineers of Turkey, 20(2), 1321-1326.
APPENDICES
app.py:
import os

import cv2
from flask import Flask, flash, redirect, render_template, request, url_for
from tensorflow.keras.models import load_model
from werkzeug.utils import secure_filename
# from pushbullet import PushBullet

# Loading Models
covid_model = load_model('models/covid.h5')
pneumonia_model = load_model('models/pneumonia_model.h5')

# Configuring Flask
UPLOAD_FOLDER = 'static/uploads'
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}

app = Flask(__name__)
app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 0
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
app.secret_key = "secret key"  # placeholder value; replace before any real deployment


def allowed_file(filename):
    # Accept only the listed image extensions, ignoring case (e.g. "scan.PNG")
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


############ Routing Functions ###############
@app.route('/')
def home():
    return render_template('homepage.html')


@app.route('/covid')
def covid():
    return render_template('covid.html')


@app.route('/pneumonia')
def pneumonia():
    return render_template('pneumonia.html')


############### Result Functions ############
@app.route('/resultc', methods=['POST'])
def resultc():
    # The route only accepts POST, so no request.method check is needed
    firstname = request.form['firstname']
    lastname = request.form['lastname']
    email = request.form['email']
    phone = request.form['phone']
    gender = request.form['gender']
    age = request.form['age']
    file = request.files['file']
    if file and allowed_file(file.filename):
        filename = secure_filename(file.filename)
        path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        file.save(path)
        flash('Image successfully uploaded and displayed below')
        # Resize to 224x224, add a batch axis, and scale pixel values to [0, 1]
        img = cv2.imread(path)
        img = cv2.resize(img, (224, 224))
        img = img.reshape(1, 224, 224, 3)
        img = img / 255.0
        # predict() returns a one-element array; .item() extracts the score
        score = covid_model.predict(img).item()
        pred = 0 if score < 0.5 else 1
        # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour COVID-19 test
        # results are ready.\nRESULT: {}'.format(firstname,['POSITIVE','NEGATIVE'][pred]))
        return render_template('resultc.html', filename=filename, fn=firstname,
                               ln=lastname, age=age, r=pred, gender=gender)
    # Invalid or missing file: return to the upload form (this route rejects GET)
    flash('Allowed image types are - png, jpg, jpeg')
    return redirect(url_for('covid'))


@app.route('/resultp', methods=['POST'])
def resultp():
    firstname = request.form['firstname']
    lastname = request.form['lastname']
    email = request.form['email']
    phone = request.form['phone']
    gender = request.form['gender']
    age = request.form['age']
    file = request.files['file']
    if file and allowed_file(file.filename):
        filename = secure_filename(file.filename)
        path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        file.save(path)
        flash('Image successfully uploaded and displayed below')
        # Resize to 150x150, add a batch axis, and scale pixel values to [0, 1]
        img = cv2.imread(path)
        img = cv2.resize(img, (150, 150))
        img = img.reshape(1, 150, 150, 3)
        img = img / 255.0
        # predict() returns a one-element array; .item() extracts the score
        score = pneumonia_model.predict(img).item()
        pred = 0 if score < 0.5 else 1
        # pb.push_sms(pb.devices[0],str(phone), 'Hello {},\nYour Pneumonia test
        # results are ready.\nRESULT: {}'.format(firstname,['POSITIVE','NEGATIVE'][pred]))
        return render_template('resultp.html', filename=filename, fn=firstname,
                               ln=lastname, age=age, r=pred, gender=gender)
    # Invalid or missing file: return to the upload form (this route rejects GET)
    flash('Allowed image types are - png, jpg, jpeg')
    return redirect(url_for('pneumonia'))


if __name__ == '__main__':
    app.run(debug=True)