
Ho Chi Minh City University of Technology
Course: Industry 4.0 Technologies in Mechanical Engineering
Lecturer: PhD. Tran Quang Phuoc

Chapter 3: Internet of Things (IoT) - Big data - Digitalize


Connection: Sensor and networks

A wireless sensor network (WSN) is a group of specialized transducers with a communications infrastructure for monitoring and recording conditions at diverse locations. Commonly monitored parameters are temperature, humidity, pressure, wind direction and speed, illumination intensity, vibration intensity, sound intensity, power-line voltage, chemical concentrations, pollutant levels, and vital body functions.


Wireless communication Protocols


Ho Chi Minh City
University of Technology
Big data Industry 4.0 technologies in Mechanical Engineering

Connection : Sensor and networks


A sensor network consists of multiple detection stations called sensor nodes, each of which is
small, lightweight and portable. Every sensor node is equipped with a transducer, microcomputer,
transceiver and power source. The transducer generates electrical signals based on sensed physical
effects and phenomena. The microcomputer processes and stores the sensor output. The
transceiver receives commands from a central computer and transmits data to that computer.
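The transducer-microcomputer-transceiver pipeline described above can be sketched as a toy model. Everything here (the class, the linear calibration, the packet format) is an illustrative assumption, not real node firmware:

```python
from dataclasses import dataclass, field

@dataclass
class SensorNode:
    """Toy sensor node: transducer -> microcomputer -> transceiver."""
    node_id: str
    log: list = field(default_factory=list)  # microcomputer storage

    def sense(self, physical_value: float) -> float:
        # Transducer: convert a physical effect into an electrical signal
        # (a simple linear calibration stands in for the electronics).
        return 0.5 * physical_value + 1.0

    def process(self, signal: float) -> dict:
        # Microcomputer: process and store the sensor output.
        reading = {"node": self.node_id, "value": round(signal, 2)}
        self.log.append(reading)
        return reading

    def transmit(self, reading: dict) -> str:
        # Transceiver: serialize the reading for the central computer.
        return f"{reading['node']}:{reading['value']}"

node = SensorNode("temp-01")
packet = node.transmit(node.process(node.sense(22.0)))
print(packet)  # -> temp-01:12.0
```

A real node would replace `sense` with an ADC read and `transmit` with a radio driver, but the division of labor between the three components is the same.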



High priority for data safety
Internet-based remote maintenance connections to SICK are established exclusively by users themselves via highly encrypted data channels in line with HTTPS and SSH authentication standards. The SSH protocol ensures that the data is encrypted with 256-bit encryption while passing from the customer to the service technician, so external access to the customer network is virtually impossible. In addition, all communication by service technicians in the cloud is logged to create further transparency.


In computer science and telecommunications, WSNs are an active research area, with numerous workshops and conferences organized each year.


Characteristics

• Power consumption constraints for nodes using batteries or energy harvesting;
• Ability to cope with node failures (resilience);
• Some mobility of nodes (for highly mobile nodes, see mobile WSNs);
• Heterogeneity or homogeneity of nodes;
• Scalability to large scale of deployment;
• Ability to withstand harsh environmental conditions;
• Ease of use;
• Cross-layer design.



There are several different types of wireless transmission technologies, each with its own advantages and limitations. Some of the most commonly used wireless transmission technologies include:

1. Radio Frequency (RF) transmission: This is the most widely used wireless transmission technology and is used in applications such as Wi-Fi networks, Bluetooth devices, and mobile phones. RF transmission uses radio waves in the range of 3 kHz to 300 GHz to transmit data over short or long distances.

2. Infrared (IR) transmission: This technology uses infrared radiation to transmit signals between devices. IR transmission is commonly used in remote controls for televisions, DVD players, and other devices.

3. Bluetooth: Bluetooth is a short-range wireless transmission technology that uses low-power radio waves to connect devices such as smartphones, laptops, and speakers. Bluetooth operates in the 2.4 GHz frequency band.

4. Near Field Communication (NFC): NFC is a short-range wireless transmission technology that allows two devices to communicate with each other when they are in close proximity. NFC is commonly used in contactless payment systems and access control systems.

5. Satellite communication: This technology uses satellites in orbit around the Earth to transmit signals over long distances. Satellite communication is commonly used for television and radio broadcasting, GPS navigation, and military communications.

There are several different methods of industrial data transmission, including:

1. Wired transmission: This involves using cables to transmit data between devices, such as Ethernet or serial cables.
2. Wireless transmission: This involves using wireless technologies such as Wi-Fi, Bluetooth, Zigbee, or cellular networks to transmit data wirelessly between devices.
3. Powerline communication (PLC): This method uses existing electrical wiring to transmit data between devices.
4. Industrial Ethernet: This is a type of Ethernet specifically designed for industrial applications, featuring ruggedized connectors and cabling.
5. Fieldbus: This is a communication protocol used for connecting field devices, such as sensors and actuators, to control systems.
6. Modbus: This is a popular protocol used for transmitting data between devices in industrial automation systems.
7. Profibus: This is another popular protocol used for communication between field devices and control systems.
8. HART: This protocol allows two-way communication between smart field devices and control systems, providing additional diagnostic and monitoring capabilities.
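Protocols such as Modbus protect every frame with a checksum; Modbus RTU, for instance, appends a CRC-16 to each message. A minimal sketch of that CRC follows (the example frame is illustrative; real deployments would use a protocol library rather than hand-rolled code):

```python
def crc16_modbus(frame: bytes) -> int:
    """CRC-16/Modbus: init 0xFFFF, reflected polynomial 0xA001, no final XOR."""
    crc = 0xFFFF
    for byte in frame:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

# Read-holding-registers request to slave 1 (function 0x03, address 0, count 2).
frame = bytes([0x01, 0x03, 0x00, 0x00, 0x00, 0x02])
crc = crc16_modbus(frame)
wire = frame + crc.to_bytes(2, "little")  # the CRC is sent low byte first

# A receiver validates by recomputing over the full message: the residue is 0.
assert crc16_modbus(wire) == 0
```

The zero-residue trick in the last line is how a device checks an incoming frame without comparing the CRC bytes separately.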


Data Science

Data science is the process of examining data sets to draw conclusions from the information they contain, increasingly with the aid of specialized systems and software, using scientific techniques, models, theories, and hypotheses. Its three pillars have been the mainstay of data science ever since businesses began embracing it over the past two decades, and should continue to be so in the future.

Data science is an idea accepted in both academia and industry. It is an intersection of programming, analytical, and business skills that allows meaningful insights to be extracted from data to benefit business growth. It is also used in social research, scientific and space programs, government planning, and so on.



Methods, Models, Process are industry- and academia-proven practices that form the backbone of data science, including mathematical models, theorems, statistical methods and techniques, and process methodologies such as CRISP-DM, Six Sigma, Lean, and so on.

Computer Science & IT practice is the full range of hardware and software involved in providing computing for processing data, storage for storing and sharing data, and networking for collecting and moving it.

Business Acumen in its purest form means understanding how a business enterprise runs. Any business exists to sell its products or services for a profit, incurs some cost, and generally has functions like HR, supply chain, finance, and sales & marketing to support it.

(Figure: The Data Science model.)

Data Analytics
The Analytics Advancement Model helps define, identify, and illustrate what these types of analysis mean. In this model, we can visualize four possible types of analysis and arrange them in terms of the complexity of the analysis and the volume of analysis, where volume means how often an analysis is performed. There is no apparent relationship between volume and complexity.

(Figure: Analytics advancement model.)



Data Analytics
+ Descriptive analysis is the first step in any analytical problem-solving project and the simplest to perform in the analysis ladder of knowledge. As a foundational analysis, it aims to answer the question "what happened?"
+ Diagnostic analysis delves a little deeper to answer the question "why did it happen?" and helps discover historical context through data. Continuing with the previous context, a diagnostic question would be "how effective was a promotional campaign based on the response in different geographies?" This type of analysis can help to identify causal relationships and anomalies in the data.
+ Predictive analysis is a little more complicated than the previous two and answers "what can happen?", meaning it looks into the future. The results of a predictive analysis should be treated as an estimate of the chance or probability of occurrence of an event. It is widely used; a few examples are: what will the sales volume be for the next time period? What is the propensity to buy for a new product release? Should I offer a loan to a particular applicant or not? This form of analysis uses knowledge and patterns from historical data to predict the future. In the world of uncertainty that businesses operate in, this is a very powerful tool for planning.
+ Prescriptive analysis sits at the other end of the ladder, answering the question "how can we make it happen?" For example, businesses need advice on the future course of action to take from all the available alternatives based on potential return, and prescriptive analysis provides it. To achieve a specific sales outcome, it can suggest an alternative mix of investment in various types of promotions or advertising media. This will be discussed in more depth later with applications in supply chain, sales and marketing, and HR functions.

Data Analytics
Process of Descriptive Analytics

• Define business metrics: Determine which metrics are important for evaluating performance against business goals. Goals include increasing revenue, reducing costs, improving operational efficiency, and measuring productivity. Each goal must have associated key performance indicators (KPIs) to help monitor achievement.
• Identify data required: Data are located in many different sources within the enterprise, including systems of record, databases, desktops, and shadow IT repositories. To measure data accurately against KPIs, companies must catalog and prepare the correct data sources to extract the needed data and calculate metrics based on the current state of the business.
• Extract and prepare data: Data must be prepared for analysis. Deduplication, transformation, and cleansing are a few examples of the data preparation steps that need to occur before analysis. This is often the most time-consuming and labor-intensive step, requiring up to 80% of an analyst's time, but it is critical for ensuring accuracy.
• Analyze data: Data analysts can create models and run analyses such as summary statistics, clustering, and regression analysis on the data to determine patterns and measure performance. Key metrics are calculated and compared with stated business goals to evaluate performance based on historical results. Data scientists often use open source tools such as R and Python to programmatically analyze and visualize data.
• Present data: Results of the analytics are usually presented to stakeholders in the form of charts and graphs. This is where data visualization comes into play. Business intelligence tools give users the ability to present data visually in a way that non-analysts can understand. Many self-service data visualization tools also enable business users to create their own visualizations and manipulate the output.
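The "analyze data" step above can be as simple as computing summary statistics against a KPI. A minimal sketch, with made-up revenue figures and an assumed KPI target:

```python
from statistics import mean

# Hypothetical monthly revenue and KPI target (both invented for illustration).
revenue = {"Jan": 120_000, "Feb": 95_000, "Mar": 143_000, "Apr": 101_000}
kpi_target = 110_000  # business goal: average monthly revenue

avg = mean(revenue.values())               # summary statistic
kpi_met = avg >= kpi_target                # compare with the stated business goal
best_month = max(revenue, key=revenue.get) # "what happened?"
print(avg, kpi_met, best_month)
```

In practice these numbers would come out of a prepared data source rather than a literal dictionary, but the metric-versus-goal comparison is the same.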

Data Analytics
Diagnostic Analytics

At this stage, historical data can be measured against other data to answer the question of why something happened. Diagnostic analytics makes it possible to drill down into the data, find dependencies, and identify patterns. Companies go for diagnostic analytics because it gives deep insight into a particular problem. At the same time, a company should have detailed information at its disposal; otherwise, data collection may have to be repeated for every individual issue, which is time-consuming.
Let's take another look at examples from different industries: a healthcare provider compares patients' responses to a promotional campaign in different regions; a retailer drills sales down into subcategories. Another flashback to our BI projects: in the healthcare industry, customer segmentation coupled with several filters (like diagnoses and prescribed medications) allowed measuring the risk of hospitalization.


Data Analytics

Predictive Analytics Process

1. Define project: Define the project outcomes, deliverables, and scope of the effort, set the business objectives, and identify the data sets to be used.
2. Data collection: Data mining for predictive analytics prepares data from multiple sources for analysis.
3. Data analysis: Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the objective of discovering useful information and arriving at conclusions.

4. Statistics: Statistical analysis validates the assumptions and hypotheses and tests them using standard statistical models.
5. Modeling: Predictive modeling provides the ability to automatically create accurate predictive models of the future, with options to choose the best solution through multi-model evaluation.
6. Deployment: Predictive model deployment provides the option to deploy the analytical results into the everyday decision-making process to obtain results, reports, and output by automating decisions based on the modeling.
7. Model monitoring: Models are managed and monitored to review their performance and ensure they are providing the results expected.
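Steps 4-6 of the process above can be collapsed into the smallest possible predictive model: an ordinary least-squares trend line fitted to past sales and used to forecast the next period. The numbers are invented for illustration:

```python
# Four past periods of sales (made-up figures); fit y = a*x + b by least squares.
xs = [1, 2, 3, 4]
ys = [10.0, 12.0, 13.5, 16.0]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# "What can happen?": forecast sales for period 5.
forecast = slope * 5 + intercept
print(round(forecast, 2))  # -> 17.75
```

A production model would add validation against held-out data (step 4) and monitoring of forecast error over time (step 7); the fit-then-extrapolate core stays the same.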

Data Analytics
Process of Prescriptive Analysis
• Build a business case: Prescriptive analytics is best used when data-driven decision-making goes beyond human capabilities, such as when there are too many input variables or data volumes are high. A business case will help identify whether machine-generated recommendations are appropriate and trustworthy.
• Define rules: Prescriptive analytics requires rules to be codified so that they can be applied to generate recommendations. Business rules thus need to be identified and actions defined for each possible outcome. Rules are decisions that are programmatically implemented in software: the system receives and analyzes data, then prescribes the next best course of action based on predetermined parameters. Prescriptive models can be very complex to implement, and appropriate analytic techniques need to be applied to ensure that all possible outcomes are considered and missteps prevented. This includes the application of optimization and other analytic techniques in conjunction with rules management.
• Test, test, test: As the intent of prescriptive analytics is to automate the decision-making process, testing the models to ensure that they provide meaningful recommendations is imperative to prevent costly mistakes.
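The "define rules" step amounts to encoding an action for each possible outcome. A toy rule set makes this concrete (the thresholds, field names, and actions are invented, not taken from any real system):

```python
def prescribe(churn_prob: float, customer_value: float) -> str:
    """Map a model's churn prediction to a next-best action via codified rules."""
    if churn_prob >= 0.7 and customer_value >= 1000:
        return "offer retention discount"   # high risk, high value
    if churn_prob >= 0.7:
        return "send re-engagement email"   # high risk, low value
    if churn_prob >= 0.3:
        return "monitor"
    return "no action"

# "Test, test, test": assert the expected action for each outcome class.
assert prescribe(0.8, 2500) == "offer retention discount"
assert prescribe(0.8, 200) == "send re-engagement email"
assert prescribe(0.4, 2500) == "monitor"
assert prescribe(0.1, 50) == "no action"
```

Real prescriptive systems layer optimization over such rules, but the principle of one defined action per outcome, verified by tests before automation, is the same.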

Definition

Big Data is defined as the set of tools and platforms used to store, process, and analyze data to identify business insights that were not possible given the limitations of traditional data processing and management technologies. Big Data is also viewed as a technology for processing huge datasets on distributed, scalable platforms.

In Industry 4.0 applications, data that cannot be stored and processed on commodity hardware (typically datasets larger than one terabyte) is called Big Data, since the processing and storage capacity of a single commodity machine is limited to around that size.


7V in Big Data:
- Volume: Volume represents the amount of data, which is growing at an exponential rate, i.e. into Petabytes and Exabytes.
- Velocity: Velocity refers to the speed at which data is generated and grows. Today, yesterday's data is considered stale, and social media is a big contributor to this growth.
- Variety: Variety refers to the heterogeneity of data types. In other words, the data collected comes in many formats like video, audio, CSV, etc., and these different formats represent many types of data.
- Veracity: Veracity refers to the doubtfulness or uncertainty of available data due to inconsistency and incompleteness. Available data can sometimes be messy and hard to trust, and with many forms of big data, quality and accuracy are difficult to control. Volume is often the reason behind the lack of quality and accuracy of the data.
- Validity: The fifth V denotes the validity of data, which is essential in business for identifying valid data patterns when planning business strategies.
- Virality: The sixth V denotes the virality aspect of data, generally used to measure the reach of data.
- Value: It is all well and good to have access to big data, but it is of little use unless we can turn it into value.

5V in Big Data: (figure)

Sample:
The characteristics of Big Data for the coronavirus pandemic are mapped below.
+ Volume: A huge volume of data evolves every hour related to affected patients, illness conditions, precaution measures, diagnoses, and hospital facilities.
+ Velocity: The information about affected people and the ill effects of COVID-19 is streaming in nature and evolves dynamically.
+ Variety: A huge volume of data related to COVID-19 is accumulated as structured data in patient databases, demographics of citizens, clinical diagnoses, travel data, genomic studies, and drug targets. Unstructured data for COVID-19 is voluminous on the social media platforms Twitter, Facebook, and WhatsApp, where preventive measures are shared in the form of text, audio, video, and related chats.
+ Veracity and Virality: The preventive and cure mechanisms mentioned on social media platforms are inconsistent and go viral, leading to uncertainty among people.
+ Validity and Value: Measuring the validity and the value of the content available across the digital globe for the pandemic has become a challenge.


To create big data for a manufacturing process, you can follow these steps:
1. Data Collection: Collect data from various sources such as sensors, machines, production systems, and databases. This data can include production data, machine performance data, quality control data, and supply chain data.
2. Data Integration: Integrate the data from different sources into a centralized repository such as a Hadoop cluster or a data lake. When using big data in production, the data integration process is critical to ensure that the data is correctly formatted and usable by big data technologies.
Step 1: Collect data from different sources
Step 2: Standardize data
Step 3: Integrate data
Step 4: Ensure data consistency and format
Step 5: Store data
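The standardize-integrate-store steps above can be sketched with two toy sources that disagree on units and field names. The schemas and values are invented for illustration:

```python
# Source A: machine sensors reporting Fahrenheit; source B: a quality DB in Celsius.
sensor_feed = [{"id": "m1", "temp_f": 212.0}]
quality_db = [{"machine": "m2", "temp_c": 85.0}]

def standardize(record: dict) -> dict:
    # Map every record onto one schema with temperatures in Celsius.
    if "temp_f" in record:
        return {"machine": record["id"], "temp_c": (record["temp_f"] - 32) * 5 / 9}
    return {"machine": record["machine"], "temp_c": record["temp_c"]}

# Integrate into one consistently formatted, centrally stored collection.
repository = sorted((standardize(r) for r in sensor_feed + quality_db),
                    key=lambda r: r["machine"])
print(repository)
```

At plant scale the repository would be a data lake or Hadoop cluster rather than a Python list, but the unit conversion and schema mapping are exactly the consistency work the step list describes.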

3. Data Processing:

- Data processing converts source data into useful and reliable information. It includes activities such as collecting, storing, organizing, classifying, calculating, analyzing, and presenting information in reports, charts, or other formats.
- Data processing can be performed using various means and tools, including data processing software, database systems, data query tools, algorithms, and different processing techniques. The purpose of data processing is to help managers, researchers, or other organizations find useful information in data and make the right decisions.
- Use big data technologies such as Apache Spark or Apache Flink to process and clean the data and to identify patterns and anomalies.
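At cluster scale the processing above would run in Spark or Flink, but the core idea of "identify patterns and anomalies" fits in a few lines. A stdlib sketch that flags readings more than two standard deviations from the mean (the readings are toy numbers):

```python
from statistics import mean, stdev

readings = [9.8, 10.1, 10.0, 9.9, 10.2, 15.0, 10.1]  # one faulty sensor value
mu, sigma = mean(readings), stdev(readings)
anomalies = [x for x in readings if abs(x - mu) > 2 * sigma]
print(anomalies)  # -> [15.0]
```

The same z-score rule, expressed as a Spark transformation, scales to billions of readings because each record is tested independently once the mean and deviation are known.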

4. Data Analysis:

Data analysis is the process of examining, cleaning, transforming, and modeling data with the purpose of discovering useful information, drawing conclusions, and supporting decision making. Data analysis uses analytical and logical reasoning to interpret the information obtained from data.

Data analysis can be applied to many different fields such as business, science, health, education, and politics. Each field has its own goals and methods of data analysis; in general, however, data analysis aims to solve specific problems using existing or newly collected data.


Analyze the data using tools such as Apache Hive, Apache Impala, or Apache Drill to gain insights into the manufacturing process and make data-driven decisions.

Data analysis methods:
+ Statistical analysis
+ Regression analysis
+ Classification analysis
+ Data mining
+ Machine learning
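Classification analysis, one of the methods listed above, can be illustrated with a one-nearest-neighbor classifier over made-up (vibration, temperature) readings labelled by machine state:

```python
# Training examples: (vibration, temperature) -> machine state (toy data).
train = [((0.1, 40.0), "ok"), ((0.2, 45.0), "ok"),
         ((0.9, 80.0), "fault"), ((1.1, 85.0), "fault")]

def classify(point):
    # 1-nearest-neighbor: label of the closest training example
    # by squared Euclidean distance.
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(train, key=lambda ex: dist(ex[0], point))[1]

print(classify((1.0, 78.0)))  # -> fault
```

Real classifiers (decision trees, SVMs, neural networks) differ in how they draw the boundary, but all share this shape: learn from labelled examples, then assign a label to a new reading.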


5. Data Visualization:

Data visualization is a technique of representing data in the form of images, graphs, and charts in an intuitive, easy-to-understand way to clearly convey the information in the data to readers and users. Instead of keeping the data in spreadsheet form, we convert it into charts and dashboards so it can be read and understood more easily.

Common types of data visualization: column chart, bar chart, line graph, two-axis chart, Mekko chart, pie chart, bubble chart, domain chart, scatter plot, heat map, area chart, and so on.

Visualize the results of the data analysis using tools such as Apache Zeppelin, Tableau, or Power BI.
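Production dashboards would use the tools named above; the underlying idea of a bar chart, mapping a value to a visual length, is small enough for a text sketch (the monthly figures are made up):

```python
# Monthly output rendered as a text bar chart: one '#' per 10 units.
data = {"Jan": 120, "Feb": 80, "Mar": 150}
rows = [f"{month} {'#' * (value // 10)} {value}" for month, value in data.items()]
print("\n".join(rows))
```

Swapping the `#` characters for rectangles drawn by a charting library gives exactly the column chart a BI tool would produce from the same table.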

6. Data Management:

- Data management is the process of managing data within an organization or system, including collecting, organizing, storing, processing, and protecting data. It includes activities such as data entry, data processing, data backup and recovery, and data security.

- Data management plays an important role in modern organizations and information systems because it helps ensure that data is managed effectively and reliably and is available when needed. Good data management also improves an organization's ability to access and use data, improves data quality, minimizes security risks, and supports compliance with regulations related to data management.


Data management methods


+ Database Management System
+ File Management System
+ Distributed Data Management System
+ Document Management System
+ Object-oriented Data Management System
+ Metadata Management System

- Manage the data using tools such as Apache HBase or Apache Cassandra
to ensure data consistency, reliability, and security.
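HBase and Cassandra are distributed stores; the consistency guarantee mentioned above can be sketched on a single machine with the standard library's sqlite3 (the schema and values are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a distributed data store
conn.execute("""CREATE TABLE readings (
                  machine TEXT NOT NULL,
                  ts      INTEGER NOT NULL,
                  temp_c  REAL,
                  PRIMARY KEY (machine, ts))""")

# The primary key enforces consistency: a duplicate (machine, ts) is rejected,
# so the same reading cannot be recorded twice with conflicting values.
rows = [("m1", 1, 70.5), ("m1", 2, 71.0), ("m1", 1, 99.9)]
conn.executemany("INSERT OR IGNORE INTO readings VALUES (?, ?, ?)", rows)

count, = conn.execute("SELECT COUNT(*) FROM readings").fetchone()
print(count)  # duplicate ignored -> 2
```

Distributed stores achieve the same end with replication and conflict-resolution policies instead of a single-node constraint, which is what makes them suitable when the data outgrows one machine.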
