An End-to-End Architecture for Smart Waste Management: Integrating Real-Time Data Processing, IoT, and Machine Learning

Nermin Goran (1, 2), Sanid Muhić (3), Alen Begović (1, 2)
(1) University of Sarajevo, Obala Kulina-bana 7/II, Sarajevo, Bosnia and Herzegovina
(2) BH Telecom d.d. Sarajevo, Franca Lehara 7, 71000 Sarajevo, Bosnia and Herzegovina
(3) University of Zenica, Polytechnic Faculty, Fakultetska 1, Zenica, Bosnia and Herzegovina
nermin.goran@fsk.unsa.ba

66th International Symposium ELMAR-2024, 16-18 September 2024, Zadar, Croatia
2024 International Symposium ELMAR | 979-8-3503-7542-8/24/$31.00 ©2024 IEEE | DOI: 10.1109/ELMAR62909.2024.10694555

Abstract—This paper presents an end-to-end architecture for smart waste management, leveraging real-time data, IoT, AI, and machine learning to optimize operational efficiency and decision-making processes. The architecture is designed for both near real-time and batch data processing, ensuring continuous optimization and adaptation of waste collection routes and resource allocation. Machine learning models are employed to predict possible adverse scenarios and optimize operational plans. Additionally, business intelligence is utilized for data analysis and reporting, providing actionable insights based on real-time and historical data. The presented system is implemented on a scalable Kubernetes infrastructure, supporting the increasing data volumes and processing demands while maintaining system responsiveness and efficiency. This integrated approach demonstrates significant improvements in resource utilization, operational efficiency, and service delivery, highlighting the potential for smarter and more sustainable waste management practices. This research addresses the gap in combining IT architectures with AI models and IoT, paving the way for future advancements in smart waste management systems.

Keywords— Smart Waste Management; Real-time Data Processing; IoT Sensors; Machine Learning; System Architecture

I. INTRODUCTION

The architecture of smart and efficient waste management systems has been the subject of an enormous amount of research, which can be divided into solutions with real-time processing and solutions with batch processing. Regarding real-time data handling, the paper [1] considers using neural networks for that purpose. It emphasizes the use of real-time data through CNNs (Convolutional Neural Networks). The authors consider the use of IoT sensors, Bluetooth for application access, and the sharing of information with a CNN. In this scenario, the CNN processes the data and supports proper waste management. According to [2], the use of real data is advancing in several directions. One direction is support for waste shape recognition systems using artificial intelligence: the authors of [2] report that AI enables waste identification and sorting with precision ranging from 72.8 to 99.95%. Another direction is towards architecture that supports transportation, logistics, and cost and time savings. They concluded that "The use of artificial intelligence in waste logistics can reduce transport distance by up to 36.8%, cost savings by up to 13.35%, and time savings by up to 28.22%."

A systematic review of architectures, smart systems, frameworks, and various models for waste management is provided in [3]. This review covers 40 research papers that propose different frameworks for managing various types of waste, and it presents machine learning algorithms that can be broadly used for this purpose. It provides a comprehensive review of various domains of waste management, such as municipal waste management, smart bin management, household waste management, medical waste management, and construction and industrial waste management. The use of IoT in combination with ML (Machine Learning) algorithms for waste management has been addressed in the papers [4-5]. Besides these works, the number of papers addressing this topic is impressive. However, during our literature review, we found a lack of research that combines IT architectures supporting smart management with the use of AI (Artificial Intelligence) models and IoT (Internet of Things). Parts of an architecture supporting a smart waste management system are outlined in [6]. However, upon closer examination of that paper, we noticed that the presented end-to-end architecture, which encompasses all the necessary data for smart decision-making, does not address the combination of IoT, AI, and architecture.

In addition to the architecture and the smart monitoring and optimization of waste management, a high-quality BI (Business Intelligence) solution is also necessary. BI solutions, considering business benefits, leverage data after a certain amount of processing, i.e., after the ETL (Extract, Transform, Load) process. For business managers, it is crucial to have specific reports on all relevant business information. In the paper [7], a proposal is given for using BI in designing various reports to support business processes within a waste management company. However, we did not find that paper to address BI for smart waste management or reporting on the real-time process of smart waste management. Therefore, after this research, we noticed a lack of works that integrate IoT, AI/ML, BI, and a supporting architecture. Additionally, real-time vs. batch processing of data is a key step in completing the smart management process.

In accordance with the above, this paper presents an end-to-end architecture for smart waste management, which establishes the foundation for utilizing real-time data to make better decisions. Besides this, the architecture aims at optimizing business processes that are generated and proposed by AI/ML algorithms. It should be emphasized that this paper also examines the dual application of AI/ML algorithms. We propose using the conventional application of algorithms in batch processing while, at the same time, extracting useful incoming information, which allows the use of data for AI/ML in near real-time. Incoming real-time data can re-optimize the routes of waste collection trucks or inform the system for other operational decisions. The numerous data from various sources are ingested into a Data Warehouse or Data Lakehouse and can be leveraged for business intelligence. BI tools can be used to generate actionable insights and plans based on data collected from ordinary and new real-time sources that are updated daily or at other proposed time intervals. It is important to highlight that our proposed architecture integrates data from IoT sensors, cameras, and other types of sensors as real-time data. This integration enables the use of these data, following a mini ETL process, to enhance and optimize data previously processed in batch operations.

II. END-TO-END ARCHITECTURE FOR REAL-TIME SMART WASTE MANAGEMENT

Fig. 1 presents the architecture of the system that enables monitoring, optimization, and support for user experience in real-time waste management. It is important to note that the architecture is designed for both near real-time and batch data processing. On the left side of the figure, there are data sources that generate real-time data. These data are ingested into the system via topics in Kafka, using a PUB/SUB (Publish/Subscribe) mechanism. The data enter the system and then proceed to the data transformation and extraction management system. In this section, information is extracted, transformed, and loaded into memory for optimization and for use by machine learning algorithms. Spark or Flink can be used in this step.

Figure 1. IT architecture for smart waste management system

In our solution, we have plenty of data generated in near real-time. It includes GPS (Global Positioning System) and GIS (Geographic Information System) data about the routes taken by transport vehicles, which also carry video information about the locations and bins that need to be emptied. Video streams are currently saved locally on the server using RTSP (Real-Time Streaming Protocol) from cameras mounted on the truck. Images are extracted using a Python script and the OpenCV library and sent to the central waste management system as a sequence of images. Images extracted from the stream are used in this manner because video streaming is more challenging to achieve due to the demand for higher communication bandwidth and data traffic. For this reason, a local server in the truck is used to collect real-time video and extract important images, which are sent to Kafka. These images, together with metadata, are used to specify the zone where the waste was collected. The moment when the waste is loaded from the bin into the container is captured by the camera with this metadata and sent to the central waste management system.

Additionally, various IoT sensors, independent of the truck information, are connected to Kafka. We currently have IoT sensors in the bins that measure the amount of waste inside the bin and report the current fill level. The IoT component is connected entirely independently using MQTT (Message Queuing Telemetry Transport) to network gateways, from which the consolidated information is published as a separate topic in Kafka. Besides this, an application running on the central server can notify the driver of route changes if a problem arises.

This real-time system is completely implemented on a Kubernetes infrastructure. Spark/Flink for ML analysis, which enables in-memory optimization, is installed within Kubernetes. The hardware beneath the Kubernetes system consists of 32 vCPUs and 64 GB of memory. In Fig. 1, a single Data Lakehouse is envisioned, although the architecture can scale to multiple Lakehouses as traffic increases. Presto is also installed on Kubernetes and is used for BI analysis of the data, which reside in object storage accessible to both real-time and batch processing.

After collecting data from databases (the bottom part of the diagram), a local ETL process is conducted to extract useful information and prepare data for batch AI/ML processes. The databases include various types of data, such as user data, billing information, user locations, bins, and other external information. The current number of data sources does not limit the architecture's use. The data sources can be scaled easily, provided that the database connectors are properly managed. For ETL, Apache Airflow is planned to be used, although other
ETL tools could also be utilized. Once the data arrive in Airflow for process orchestration, they are transformed via Spark and written to Iceberg tables on S3.

The batch section of the architecture consists of data sources from various databases: MySQL, PostgreSQL, MongoDB, CSV files, etc. Airflow DAGs are used to define the ETL processes. Spark is used for transforming and loading data into Iceberg tables, which are stored in S3 as a large-scale data storage solution. Apache Iceberg enables efficient access to large datasets, which is beneficial for ML models that often require processing and analyzing substantial amounts of data. To use Iceberg tables for machine learning, the process begins by accessing data from Iceberg tables using Spark, although Trino or Flink can also be used. Next, the data is prepared for ML through cleaning and preprocessing, followed by splitting the data into training and test sets.

ML models are then built and trained using libraries such as Spark MLlib, scikit-learn, and TensorFlow. After training, the models are evaluated and tuned on the test set to improve their accuracy and efficiency. Finally, plans for the next time interval are implemented, and the proposal is sent to the plan database. After the plans and tasks for daily, weekly, or monthly operations are generated, they are distributed to real-time data sources (trucks, workers, and machines). Routes are determined based on information from the previous day, vehicle fuel consumption, bin fill levels, user priority, or any other relevant information found in external and internal databases. The process then repeats. After batch processing, the collected real-time data can influence the optimized data from the previous phase in case any problems arise. This approach integrates various real-time data sources, processes data with ETL workflows, and utilizes fast AI/ML techniques to optimize and support waste management operations in real time.

III. ARCHITECTURE PLANES

As shown in Figure 2, the logical architecture consists of several planes. The identifiable planes are the user plane, the control and data processing plane, and the application plane.

• User Plane: This foundational layer is based on data about users, workers, machines, and trucks. It aggregates data from all sources, including IoT systems, user information (needs, interests, priorities), cameras, worker data, fuel consumption, and similar information. This layer includes all data sources and applications accessed by users.
• Control and Data Processing Plane: This plane manages control data and tools used for ETL processes, data analysis, message routing, data extraction, data transformation, and various types of data storage.
• Application Plane: This top layer represents the intelligence that uses AI/ML tools, BI tools, or simple information from web servers or similar sources to optimize data, generate actions, or provide simple information to users.

Figure 2. Identified planes of smart management system

In accordance with Fig. 2, the user plane integrates various data inputs, ensuring a comprehensive collection of relevant information. This includes real-time data from GPS and GIS systems, sensor data from IoT devices in waste bins, and visual data from cameras mounted on trucks. In the control and data processing plane, advanced ETL workflows orchestrated by Apache Airflow manage the extraction, transformation, and loading of data. Apache Kafka is used to handle real-time data streams, while Apache Spark processes large datasets for analysis and transformation. The application plane employs machine learning models developed using frameworks like TensorFlow and Spark MLlib. These models are trained to predict waste collection needs, optimize routes, and improve resource allocation, ensuring efficient operations. Presto and other BI tools provide powerful data analytics capabilities. They generate reports and dashboards that offer insights into operational efficiency, resource utilization, and areas for improvement, aiding decision-making processes. The entire architecture is deployed on a scalable Kubernetes infrastructure, which allows for easy scaling of computational resources as data volumes and processing demands grow. This ensures that the system remains responsive and efficient under varying loads.

IV. USE CASE AND DISCUSSION

The question that arises is the purpose of real-time data. As observed in the previous discussion, we collect information during a single cycle and optimize it at the end of the day. The model is trained on newly arrived data, tested, and, if it passes the tests, deployed to production. Once populated with real data, it provides results for optimized routes and resource allocation for the next day, week, or month. The process then starts again the next day.

Fig. 3 shows a proposal for three trucks that need to collect waste from a specific area. The red route is the normal optimized route. The blue route is also an optimized route, for the blue truck. The black route experiences a failure, and the red truck, which is closer according to the algorithm, is rerouted to pick up waste from the truck that, for example, broke down. At the same time, after the new routes are issued, the blue truck takes over the path of the black truck, as well as its nearest bins. All this is recorded as information for the next day's plan and potentially as a reward for the other drivers. The truck experiencing issues automatically reports the problem to the center, and a repair service is activated. The service will immediately determine the severity of the issue and estimate when the truck can be operational again. At the end of the day, optimization is performed again in a batch process with this new information.

Figure 3. Use case scenario of changing route

V. CONCLUSION

In conclusion, the research and development of smart and efficient waste management system architecture have shown significant advancements, especially concerning real-time data handling and the integration of various technologies such as IoT, AI, and machine learning. The use of CNNs to process real-time data has proven effective in providing insights into waste management, enhancing both operational efficiency and decision-making processes. The literature review revealed several promising directions, such as AI-supported waste shape recognition systems, which improve sorting accuracy, and intelligent logistics, which significantly reduces transportation distance, costs, and time. However, there is still a noticeable gap in research that combines IT architecture with AI models and IoT to support comprehensive smart waste management. This paper presented an end-to-end architecture that addresses this gap by integrating real-time data sources, control and data processing layers, and an application layer. The architecture leverages Apache Kafka for data ingestion, Apache Airflow and Spark for ETL processes, and advanced machine learning frameworks like TensorFlow and Spark MLlib for real-time and batch data processing. Additionally, business intelligence tools such as Presto enable detailed data analysis and reporting, providing actionable insights for decision-makers. The proposed architecture ensures efficient waste management operations by continuously optimizing routes, monitoring bin fill levels, and adjusting plans based on real-time and batch-processed data. The scalable Kubernetes infrastructure supports the growing data volumes and processing demands, ensuring system responsiveness and efficiency. Overall, this comprehensive approach to smart waste management demonstrates the potential for significant improvements in resource utilization, operational efficiency, and service delivery, paving the way for smarter and more sustainable waste management practices. Further research and development in integrating these technologies will continue to enhance the capabilities and effectiveness of waste management systems.

ACKNOWLEDGMENT

This work was supported by the Federal Ministry of Education and Science, Federation of Bosnia and Herzegovina, Bosnia and Herzegovina.

REFERENCES

[1] Md. Wahidur Rahman, Rahabul Islam, Arafat Hasan, Nasima Islam Bithi, Md. Mahmodul Hasan, Mohammad Motiur Rahman, Intelligent waste management system using deep learning with IoT, Journal of King Saud University - Computer and Information Sciences, Volume 34, Issue 5, 2022, Pages 2072-2087.
[2] Fang, B., Yu, J., Chen, Z. et al. Artificial intelligence for waste management in smart cities: a review. Environ Chem Lett 21, 1959–1989 (2023). https://doi.org/10.1007/s10311-023-01604-3
[3] Sinthiya, N.J., Chowdhury, T.A., Haque, A.K.M.B. (2022). Artificial Intelligence Based Smart Waste Management—A Systematic Review. In: Lahby, M., Al-Fuqaha, A., Maleh, Y. (eds) Computational Intelligence Techniques for Green Smart Cities. Green Energy and Technology. Springer, Cham. https://doi.org/10.1007/978-3-030-96429-0_3
[4] Mookkaiah, S.S., Thangavelu, G., Hebbar, R. et al. Design and development of smart Internet of Things–based solid waste management system using computer vision. Environ Sci Pollut Res 29, 64871–64885 (2022). https://doi.org/10.1007/s11356-022-20428-2
[5] M. K. Hasan, M. A. Khan, G. F. Issa, A. Atta, A. S. Akram and M. Hassan, "Smart Waste Management and Classification System for Smart Cities using Deep Learning," 2022 International Conference on Business Analytics for Technology and Security (ICBATS), Dubai, United Arab Emirates, 2022, pp. 1-7, doi: 10.1109/ICBATS54253.2022.9759087.
[6] Pardini, K., Rodrigues, J. J., Diallo, O., Das, A. K., de Albuquerque, V. H. C., & Kozlov, S. A. (2020). A smart waste management solution geared towards citizens. Sensors, 20(8), 2380.
[7] Strand, M., & Syberfeldt, A. (2020). Using external data in a BI solution to optimise waste management. Journal of Decision Systems, 29(1), 53–68. https://doi.org/10.1080/12460125.2020.1732174
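As an illustration of the image extraction step described in Section II (a Python script with OpenCV on the truck's local server), the following is a minimal sketch rather than the authors' implementation. The two-second sampling interval, the field names in the record, and the function names are assumptions made for the example; the paper does not specify them. The OpenCV import is placed inside the function so that the helper functions can be used without OpenCV installed.

```python
import base64
import json
from typing import Iterator, Tuple


def sampling_step(fps: float, every_seconds: float) -> int:
    """Number of frames between two kept frames (always at least 1)."""
    return max(1, round(fps * every_seconds))


def build_frame_record(truck_id: str, zone: str, frame_index: int,
                       captured_at: str, jpeg_bytes: bytes) -> Tuple[bytes, bytes]:
    """Pack one image and its metadata as a (key, value) pair for a message topic.

    The field names are hypothetical; the paper only states that images are
    sent together with metadata identifying the collection zone.
    """
    value = {
        "truck_id": truck_id,
        "zone": zone,
        "frame_index": frame_index,
        "captured_at": captured_at,
        "image_jpeg_b64": base64.b64encode(jpeg_bytes).decode("ascii"),
    }
    return truck_id.encode("utf-8"), json.dumps(value).encode("utf-8")


def sample_frames(rtsp_url: str, every_seconds: float = 2.0) -> Iterator[Tuple[int, bytes]]:
    """Yield (frame_index, jpeg_bytes) for one frame every `every_seconds`."""
    import cv2  # imported lazily; requires the opencv-python package

    capture = cv2.VideoCapture(rtsp_url)
    if not capture.isOpened():
        raise RuntimeError(f"cannot open stream: {rtsp_url}")
    # Some streams report 0 FPS; fall back to an assumed 25 FPS in that case.
    fps = capture.get(cv2.CAP_PROP_FPS) or 25.0
    step = sampling_step(fps, every_seconds)
    index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # stream ended or connection dropped
            if index % step == 0:
                encoded_ok, jpeg = cv2.imencode(".jpg", frame)
                if encoded_ok:
                    yield index, jpeg.tobytes()
            index += 1
    finally:
        capture.release()
```

Each pair returned by `build_frame_record` could then be handed to a Kafka producer. The producer call is omitted because the paper does not name the client library in use.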

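The path of bin fill-level readings in Section II (sensor, MQTT, network gateway, separate Kafka topic) can be sketched as a small translation step running on the gateway. This is an illustration under stated assumptions: the MQTT topic layout `bins/<bin_id>/fill`, the JSON payload fields, and the Kafka topic name `iot.bin-fill-levels` are hypothetical and do not come from the paper.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple

# Hypothetical name of the Kafka topic that carries consolidated bin readings.
KAFKA_TOPIC = "iot.bin-fill-levels"


@dataclass(frozen=True)
class BinReading:
    bin_id: str
    fill_percent: float
    measured_at: str  # ISO 8601 timestamp as sent by the sensor


def parse_mqtt_message(topic: str, payload: bytes) -> BinReading:
    """Validate one MQTT message, assuming topics of the form bins/<bin_id>/fill."""
    parts = topic.split("/")
    if len(parts) != 3 or parts[0] != "bins" or parts[2] != "fill":
        raise ValueError(f"unexpected MQTT topic: {topic}")
    data = json.loads(payload)
    fill = float(data["fill_percent"])
    if not 0.0 <= fill <= 100.0:
        raise ValueError(f"fill level out of range: {fill}")
    return BinReading(bin_id=parts[1], fill_percent=fill,
                      measured_at=str(data["measured_at"]))


def to_kafka_record(reading: BinReading) -> Tuple[str, bytes, bytes]:
    """Return (topic, key, value); keying by bin keeps each bin's readings ordered."""
    value = json.dumps(asdict(reading), sort_keys=True).encode("utf-8")
    return KAFKA_TOPIC, reading.bin_id.encode("utf-8"), value
```

Keying the record by `bin_id` is a design choice for the sketch: Kafka preserves order within a partition, so readings from one bin would be consumed in the order they were published.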


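The batch steps named in Section II (cleaning, splitting into training and test sets, evaluation) and the check in Section IV that a model is deployed only if it passes its tests can be illustrated with the standard library alone. This sketch makes several assumptions: a mean predictor stands in for the Spark MLlib, scikit-learn, or TensorFlow models the paper mentions, and the error metric and the threshold of 15 percentage points are chosen only for the example.

```python
import random
from statistics import mean
from typing import Dict, List, Tuple

Row = Dict[str, object]


def clean(rows: List[Row]) -> List[Row]:
    """Drop rows whose fill level is missing or outside 0..100 percent."""
    return [
        r for r in rows
        if r.get("fill_percent") is not None and 0 <= r["fill_percent"] <= 100
    ]


def train_test_split(rows: List[Row], test_fraction: float = 0.2,
                     seed: int = 42) -> Tuple[List[Row], List[Row]]:
    """Shuffle reproducibly and hold out a fraction of the rows for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_mean_model(train: List[Row]) -> float:
    """Placeholder model: always predict the mean fill level seen in training."""
    return mean(r["fill_percent"] for r in train)


def mean_absolute_error(model: float, test: List[Row]) -> float:
    """Average absolute difference between the prediction and the held-out values."""
    return mean(abs(r["fill_percent"] - model) for r in test)


def passes_gate(mae: float, max_mae: float = 15.0) -> bool:
    """Deployment gate: accept the model only if its test error is small enough."""
    return mae <= max_mae
```

In a daily cycle, a model that fails `passes_gate` would simply not replace the one already in production, so the previous plan generator stays in use.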

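The rerouting use case of Section IV, where the bins of a failed truck are taken over by the remaining trucks, can be sketched with a greedy nearest-truck rule. The paper does not describe its rerouting algorithm, so this rule is an assumption chosen for simplicity. Positions are treated as planar coordinates, whereas a real system would use road-network distances derived from the GPS and GIS data.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def reassign_bins(truck_positions: Dict[str, Point],
                  routes: Dict[str, List[str]],
                  bin_locations: Dict[str, Point],
                  failed_truck: str) -> Dict[str, List[str]]:
    """Give each bin of the failed truck to the nearest truck still in service.

    Returns new routes for the active trucks; the input routes are not modified.
    """
    # Sorting makes tie-breaking deterministic when two trucks are equally close.
    active = sorted(t for t in routes if t != failed_truck)
    if not active:
        raise ValueError("no active trucks left to take over the route")

    new_routes = {t: list(routes[t]) for t in active}
    for bin_id in routes.get(failed_truck, []):
        bx, by = bin_locations[bin_id]
        nearest = min(
            active,
            key=lambda t: math.hypot(truck_positions[t][0] - bx,
                                     truck_positions[t][1] - by),
        )
        new_routes[nearest].append(bin_id)
    return new_routes
```

For example, with a red truck at (0, 0), a blue truck at (10, 0), and a failed black truck whose bins lie at (1, 1) and (9, 1), the first bin is appended to the red route and the second to the blue route. The reassigned bins are appended in the failed truck's original order; re-sequencing each route would be left to the optimizer.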