Face Mask Detection
Using Machine Learning & Deep Learning
Project Report
Submitted by
Saiyam Jain. Roll: 41218002717
Mayank Goyal. Roll: 35118002717
Abhishek Aswal. Roll: 43318002717
Deepak Singh. Roll: 42918002717
Under The Guidance Of
Ms. Upasna Joshi
Assistant Professor
Department of Computer Science & Engineering
CONTENTS
1. INTRODUCTION
2. BACKGROUND OF THE STUDY
2. LITERATURE SURVEY
3. SYSTEM ARCHITECTURE
4. RESULTS
5. SUMMARY
6. REFERENCES
1. INTRODUCTION
The coronavirus originated in Wuhan, China at the end of 2019 and has since spread like wildfire.
Millions have been infected, and around 1,668,356 people had died as of 18 December 2020, almost a
year after the virus emerged. People who contract the illness can take up to two weeks to recover,
with the risk of further medical complications caused by it. Children and old people have proved to
be at the highest risk of contracting the disease, which may even result in death. Containing the
virus has therefore been given priority over curing it. The virus spreads through the air and is
transmitted from one person to another not only by touch but also by speaking and coughing. The
World Health Organization (WHO) has suggested that face masks and social distancing are the answer
until a cure is found. Wearing a face mask greatly reduces the risk of infection, not only for the
wearer but also for the people they come in contact with. Wearing a mask every time we go out takes
little effort and can effectively save lives, which is precisely why masks are in such demand at
this time.
In this paper, we propose a Face Mask Detection project that consists of two phases, namely
training and deployment. The first stage detects human faces. The second stage uses deep learning
to, first, identify the ROI (Region Of Interest), namely the person’s face, and, second, classify
the faces detected in the first stage as either ‘Mask’ or ‘No Mask’, drawing a green or red
boundary around each face depending on the output. The project takes JPG and PNG files as inputs,
and it has also been tested on videos. Set up with a CCTV camera, the project can give accurate
results and track people without masks to ensure the safety and wellbeing of others, thus helping
to control the spread of the virus.
We have also created a website that lets anyone either run the code directly online or download
the Android application through which face mask detection can be started.
2. Background of the study
Object detection
Object detection is a computer vision technique that identifies and locates objects in an image
or video. With this identification and localization, object detection can be used to count the
objects in a scene and to determine and track their exact locations, with every object labeled
correctly. The algorithm outputs a list of the object categories present in the image, together
with an axis-aligned bounding box that gives the position and scale of each instance of each
object category.
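As a minimal sketch of the detector output described above, the snippet below models each detected instance as a category label plus an axis-aligned box and then counts the objects per category. The `Detection` class, its field names and the sample values are illustrative assumptions and are not taken from this project's code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object instance (hypothetical structure, for illustration)."""
    label: str     # object category, e.g. "face"
    x_min: float   # left edge of the axis-aligned box, in pixels
    y_min: float   # top edge
    x_max: float   # right edge
    y_max: float   # bottom edge

    def area(self) -> float:
        """Box area in square pixels; zero if the box is degenerate."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def count_by_label(detections):
    """Count how many instances of each category were detected."""
    return Counter(d.label for d in detections)


# Made-up detections for a single frame.
frame = [
    Detection("face", 10, 20, 110, 140),
    Detection("face", 200, 30, 280, 130),
    Detection("car", 0, 150, 300, 400),
]
counts = count_by_label(frame)
```

Here `counts` maps each category to its number of instances, which is the "count objects in a scene" use mentioned above.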
CNN
Convolutional neural networks (CNNs) play an important role in computer vision pattern
recognition because of their strong spatial feature extraction capability and low computational
cost. A CNN convolves learned kernels with the original image or with intermediate feature maps
to extract higher-level features. How best to design a convolutional neural network architecture
is still an open question. The Inception network lets the network learn the best combination of
kernels. To train very deep neural networks, K. He et al. proposed the residual network (ResNet),
which can learn an identity mapping from the previous layer. Because object detectors are usually
deployed on mobile or embedded devices whose computing resources are very limited, the Mobile
Network (MobileNet) [29] was proposed. It uses depthwise convolutions to extract features and
pointwise (channel-wise) convolutions to adjust the number of channels. The computational cost
of MobileNet is therefore much lower than that of networks that use standard convolutions.
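To make the cost claim concrete, the sketch below counts multiply-accumulate operations for one standard convolution layer and for its depthwise separable replacement (a depthwise convolution followed by a 1x1 pointwise convolution). The layer sizes are made-up example values, and the function names are ours. Biases, stride and padding effects are ignored.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Multiply-accumulates for a k x k standard convolution on an h x w output map."""
    return k * k * c_in * c_out * h * w


def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Multiply-accumulates for a depthwise k x k conv plus a 1x1 pointwise conv."""
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1x1 conv that mixes channels
    return depthwise + pointwise


# Example layer (assumed sizes): 3x3 kernel, 32 -> 64 channels, 112 x 112 output.
std = standard_conv_cost(3, 32, 64, 112, 112)
sep = depthwise_separable_cost(3, 32, 64, 112, 112)
ratio = sep / std   # algebraically equal to 1/c_out + 1/k**2
print(f"standard: {std}, separable: {sep}, ratio: {ratio:.4f}")
```

For a 3x3 kernel the ratio is close to 1/9 once the number of output channels is large, which is where most of the saving comes from.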
Machine learning
Machine learning algorithms are used in a wide variety of applications, such as email filtering
and computer vision, where it is difficult or infeasible to develop conventional algorithms to
perform the needed tasks. Machine learning is closely related to computational statistics,
which focuses on making predictions using computers. Data mining is a related field of study,
focusing on exploratory data analysis through unsupervised learning. In its application across
business problems, machine learning is also referred to as predictive analytics.
Deep Learning
Deep learning methods aim at learning feature hierarchies with features from higher levels of
the hierarchy formed by the composition of lower-level features. Automatically learning
features at multiple levels of abstraction allows a system to learn complex functions mapping
the input to the output directly from data, without depending completely on human-crafted
features. Deep learning allows computational models that are composed of multiple processing
layers to learn representations of data with multiple levels of abstraction.
MobileNetV2
MobileNetV2 is a state-of-the-art architecture for mobile visual recognition, including
classification, object detection and semantic segmentation. It uses depthwise separable
convolutions, which dramatically reduce the computational cost and model size of the network
and make it suitable for mobile devices and other devices with low computational power.
MobileNetV2 also introduces the inverted residual structure, in which the non-linearities in
the narrow layers are removed. With MobileNetV2 as the backbone for feature extraction, strong
performance is achieved for object detection and semantic segmentation.
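The inverted residual structure can be sketched as three steps: a 1x1 convolution that expands the channels, a depthwise convolution, and a 1x1 projection back to a narrow layer with no non-linearity. The function below only counts the convolution weights of one such block. It is an illustration, not this project's code: the expansion factor of 6 is the default from the MobileNetV2 paper, and the function name and the returned dictionary are our own. Batch normalization and biases are left out.

```python
def inverted_residual_params(c_in, c_out, expansion=6, kernel=3, stride=1):
    """Count convolution weights in one inverted residual block (illustrative)."""
    hidden = c_in * expansion               # wide intermediate layer
    expand = c_in * hidden                  # 1x1 expansion conv, followed by a non-linearity
    depthwise = kernel * kernel * hidden    # one kernel x kernel filter per hidden channel
    project = hidden * c_out                # 1x1 projection, linear (no non-linearity)
    # The shortcut connects the two narrow ends, so shapes must match.
    use_shortcut = stride == 1 and c_in == c_out
    return {
        "hidden_channels": hidden,
        "expand": expand,
        "depthwise": depthwise,
        "project": project,
        "total": expand + depthwise + project,
        "use_shortcut": use_shortcut,
    }


block = inverted_residual_params(c_in=24, c_out=24)
print(block)
```

The shortcut is only used when the block keeps both the spatial size and the channel count, which is why `use_shortcut` depends on `stride` and on `c_in == c_out`.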
HTML
The HyperText Markup Language, or HTML, is the standard markup language for documents
designed to be displayed in a web browser. It can be assisted by technologies such as
Cascading Style Sheets (CSS) and scripting languages such as JavaScript.
Web browsers receive HTML documents from a web server or from local storage and render
the documents into multimedia web pages. HTML describes the structure of a web page
semantically and originally included cues for the appearance of the document.
CSS
Cascading Style Sheets (CSS) is a style sheet language used for describing the presentation of
a document written in a markup language such as HTML. CSS is a cornerstone technology of
the World Wide Web, alongside HTML and JavaScript.
CSS is designed to enable the separation of presentation and content, including layout, colors,
and fonts. This separation can improve content accessibility, provide more flexibility and
control in the specification of presentation characteristics, enable multiple web pages to share
formatting by specifying the relevant CSS in a separate .css file which reduces complexity and
repetition in the structural content as well as enabling the .css file to be cached to improve the
page load speed between the pages that share the file and its formatting.
Android Studio
Android Studio is the official integrated development environment (IDE) for Google's Android
operating system, built on JetBrains IntelliJ IDEA software and designed specifically for
Android development. It is available for download on Windows, macOS and Linux based
operating systems or as a subscription-based service in 2020. It is a replacement for the Eclipse
Android Development Tools (E-ADT) as the primary IDE for native Android application
development.
JavaScript
JavaScript, often abbreviated as JS, is a programming language that conforms to the
ECMAScript specification. JavaScript is high-level, often just-in-time compiled, and
multi-paradigm. It has curly-bracket syntax, dynamic typing, prototype-based object-orientation,
and first-class functions.
Alongside HTML and CSS, JavaScript is one of the core technologies of the World Wide Web.
Over 97% of websites use it client-side for web page behavior, often incorporating third-party
libraries. All major web browsers have a dedicated JavaScript engine to execute the code on
the user's device.
3. Literature Survey
Deep learning techniques are useful for big data analysis and have applications in computer
vision, pattern recognition and speech recognition. From Z. Wang, G. Wang et al. (2020), “Masked
face recognition dataset and application”, we note a focus on some of the most commonly
implemented deep learning architectures and their applications. Autoencoders, convolutional
neural networks, Boltzmann machines and deep belief networks are the networks presented in detail.
Deep learning can also be used in unsupervised learning algorithms to process unlabeled data.
Previously, Khandelwal (2020) described a deep learning model that classifies an image in binary
fashion as mask or no mask. 380 images with a mask and 460 images without a mask were used to
train the MobileNetV2 model.
Qin B. and Li D. carried out a face mask recognition project that focuses on capturing real-time
images and indicating whether a person has put on a face mask. The dataset was used for training
to detect the main facial features (eyes, mouth and nose) and to apply the decision-making
algorithm. Wearing glasses showed no negative effect. Rigid masks gave better results, whereas
incorrect detections can occur because of illumination and because of noticeable objects outside
the face.
3. SYSTEM ARCHITECTURE
This system classifies whether a person is wearing a mask by taking input from images and
real-time video streams. Our Face Mask Detection Dataset contains a total of 3847 images
belonging to two labels: with mask (1917 images) and without mask (1930 images). The images are
classified by a model that is built and used in 2 phases:
Phase 1: Training - The model is trained on the dataset using TensorFlow and Keras, with a
classifier such as MobileNetV2, to generate a trained model.
Phase 2: Deployment - The trained model is loaded and the detector is applied to images or a
live video stream.
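The two phases can be sketched roughly as follows. This is a sketch under stated assumptions, not the project's actual code: the frozen ImageNet-pretrained backbone, the pooling, dropout and two-unit softmax head, the 224 x 224 input size, the learning rate, and the mapping of ‘Mask’ to green and ‘No Mask’ to red are all our choices for illustration. `build_mask_classifier` needs TensorFlow 2.x installed and downloads the ImageNet weights on first use. `annotate` is plain Python.

```python
def build_mask_classifier(image_size=224, learning_rate=1e-4):
    """Phase 1 sketch: MobileNetV2 backbone with a small two-class head."""
    import tensorflow as tf  # imported lazily; only needed when training

    base = tf.keras.applications.MobileNetV2(
        input_shape=(image_size, image_size, 3),
        include_top=False,       # drop the 1000-class ImageNet classifier
        weights="imagenet",
    )
    base.trainable = False       # assumption: train only the new head
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(2, activation="softmax"),  # [p_mask, p_no_mask]
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",   # expects one-hot labels
        metrics=["accuracy"],
    )
    return model


GREEN = (0, 255, 0)   # BGR channel order, as used by OpenCV drawing functions
RED = (0, 0, 255)


def annotate(face_box, probabilities):
    """Phase 2 sketch: map one face's class probabilities to a label and box colour.

    face_box is (x_min, y_min, x_max, y_max); probabilities is (p_mask, p_no_mask).
    """
    p_mask, p_no_mask = probabilities
    if p_mask >= p_no_mask:
        return face_box, "Mask", GREEN
    return face_box, "No Mask", RED


result = annotate((10, 20, 50, 60), (0.9, 0.1))
```

In a deployment loop, a face detector would supply `face_box` for each face in the frame, the trained model would supply `probabilities` for the cropped face, and the returned colour would be used to draw the box.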
4. Results
We created a face mask detector using deep learning with Keras, TensorFlow and OpenCV. We trained
it to distinguish between people wearing a mask and people not wearing a mask. We used the
MobileNetV2 classifier with the Adam optimizer, which gave the best result.
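For readers unfamiliar with the Adam optimizer, the sketch below writes out its standard update rule for a single scalar parameter and applies it to a toy problem, minimizing f(x) = x^2. The hyperparameter defaults are the commonly used ones. The toy problem, the step count and the learning rate of 0.1 are illustrative assumptions and say nothing about the settings used to train our model.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step number."""
    m = beta1 * m + (1 - beta1) * grad           # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad    # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimise_quadratic(x0=5.0, steps=300, lr=0.1):
    """Minimise f(x) = x**2 with Adam, starting from x0."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * x                           # derivative of x**2
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x


x_final = minimise_quadratic()
```

Because the step is scaled by the running gradient magnitude, the very first update moves the parameter by approximately `lr` regardless of how large the gradient is.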
Sample images from the with-mask dataset
Sample images from the without-mask dataset
Test Outputs
Output of Face Mask Detector in Uploaded Image
The model reaches an accuracy of 98%.
The Adam optimizer performs well in both training and testing.
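Accuracy here can be read as the fraction of images whose predicted label matches the true label. The sketch below shows that calculation on made-up labels. It is an assumed definition given for illustration, and the sample data is not from our test set.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up example: 10 images, one 'No Mask' face wrongly predicted as 'Mask'.
y_true = ["Mask"] * 5 + ["No Mask"] * 5
y_pred = ["Mask"] * 5 + ["No Mask"] * 4 + ["Mask"]
acc = accuracy(y_true, y_pred)   # 9 of 10 correct
```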
Accuracy/Loss Plot
Website
The website can be accessed at the URL https://mgoyal1903.github.io/Covid-Website/ where a
person can either run the code directly online or download the application. Developing the
website was important so that our project can reach people and be put to good use for society.
5. SUMMARY
As technology advances with emerging trends, we present a novel face mask detector that can
contribute to public healthcare. The architecture uses MobileNet as the backbone, so it can be
used in both high and low computation scenarios. In order to extract more robust features, we
utilize transfer learning to adopt weights from a similar task, face detection, which is trained
on a very large dataset. We used OpenCV, TensorFlow, Keras, PyTorch and a CNN to detect whether
people were wearing face masks or not. The models were tested with images and real-time video
streams. The model reaches the reported accuracy. Optimizing it is a continuous process, and we
are working towards a more accurate solution by tuning the hyperparameters. This specific model
could be used as a use case for edge analytics. Furthermore, the proposed method achieves
state-of-the-art results on a public face mask dataset. By detecting whether a person is wearing
a face mask and allowing their entry accordingly, face mask detection would be of great help to
society.
6. REFERENCES
[1] S. Feng, C. Shen, N. Xia, W. Song, M. Fan, and B. J. Cowling, “Rational use of face masks
in the covid-19 pandemic,” The Lancet Respiratory Medicine, 2020.
[2] Y. Fang, Y. Nie, and M. Penny, “Transmission dynamics of the covid-19 outbreak and
effectiveness of government interventions: A data-driven analysis,” Journal of Medical
Virology, vol. 92, no. 6, pp. 645–659, 2020.
[3] Z. Wang, G. Wang, B. Huang, Z. Xiong, Q. Hong, H. Wu, P. Yi, K. Jiang, N. Wang, Y.
Pei et al., “Masked face recognition dataset and application,” arXiv preprint
arXiv:2003.09093, 2020.
[4] Z.-Q. Zhao, P. Zheng, S.-t. Xu, and X. Wu, “Object detection with deep learning: A
review,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp.
3212–3232, 2019.
[5] Qin B. and Li D., “Identifying face mask wearing condition using image super-resolution
with classification network to prevent COVID-19,” doi: 10.21203/rs.3.rs-28668/v1
[6] Erhan, D., Szegedy, C., Toshev, A., and Anguelov, D. (2014). “Scalable object detection
using deep neural networks,” in Computer Vision and Pattern Recognition Frontiers in
Robotics and AI
[7] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate
object detection and semantic segmentation,” in Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, 2014.
[8] A. Kumar, A. Kaur, and M. Kumar, “Face detection techniques: a review,” Artificial
Intelligence Review, vol. 52, no. 2, pp. 927–948, 2019.
[9] D.-H. Lee, K.-L. Chen, K.-H. Liou, C.-L. Liu, and J.-L. Liu, “Deep learning and control
algorithms of direct perception for autonomous driving,” 2019.