Module3 MRobotics Perception
Outline
• Overview of modern mobile robotic systems
• Perception systems
• Bayes Rule
“Systems engineering is defined as a methodical, multi-disciplinary approach for the design, realization, technical
management, operations, and retirement of a system. A “system” is the combination of elements that function
together to produce the capability required to meet a need. The elements include all hardware, software,
equipment, facilities, personnel, processes, and procedures needed for this purpose; that is, all things required to produce
system-level results.”
Modern mobile robots can be seen/studied as a combination of elements / sub-systems / modules / components:
Examples of mobile robotic systems
Concept/Def. – Robot perception
“Perception is the process by which the robot uses its sensors to obtain information about the state of its environment. For example, a robot might take a camera image, a range scan, or query its tactile sensors to receive information about the state of the environment ...” [1]
“…perception is more than sensing. Perception is also the interpretation of sensed data in meaningful ways.” [2]
“In robotics, perception is understood as a system that endows the robot with the ability to perceive, comprehend, and reason about the surrounding environment.” [3]
“Robots' ability to interact with their surroundings is an essential capability, especially in unstructured human-inhabited environments. The knowledge of such an environment is usually obtained through sensors. The study of acquiring knowledge from sensor data is called robotic perception.” [4]
[1] S. Thrun, W. Burgard and D. Fox. “Probabilistic Robotics”. The MIT Press, 2005.
[2] R. Siegwart, I.R. Nourbakhsh and D. Scaramuzza. “Introduction to Autonomous Mobile Robots”. The MIT Press, 2011.
Subsystems of mobile robots
The key components (or sub-systems or modules) of a Mobile Robot are:
Perception systems
From an AI/ML perspective, the most relevant task in autonomous robotics, including self-driving vehicles, is perception.
Perception systems have to cope with environment/world understanding, i.e., acquiring knowledge about the robot's surroundings. They combine:
- SENSORS (data)
- SOFTWARE
Together, these allow autonomous/intelligent vehicles to model, understand, and react to the surrounding, ever-changing environment.
Perception systems
Main goal: to extract meaningful information from the measurements (data) and/or info (higher-level data) from
exteroceptive* sensors mounted on-board the robot and/or from the ‘infrastructure’.
[Figure: perception pipeline with the blocks Real-world environment, Sensors (data), Calib./Synch., Pre-processing, Representation, AI/ML algorithms, and Output.]
How can we model/characterize the uncertainties that are inherent to the sensors, the data, and consequently the outputs?
A: By using probability theory and statistics; Gaussian probability density functions (pdf) are particularly useful (Module 5).
* The so-called proprioceptive sensors will not be discussed here.
Environment Representation: Occup. Grid Mapping
Examples of data representation for environment understanding/modelling
2D occupancy grid
LiDAR representation
“Deep Multi-modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges”
2D occup. grids
Perception systems: feature space
Feature definition: features are recognizable structures of elements in the environment. They can usually be extracted from measurements and mathematically described. Good features are always perceivable and easily detectable from the environment*, and it is desirable that they are invariant to linear transformations and robust to noise.
* Roland Siegwart, et al. “Introduction to Autonomous Mobile Robots”, Second Edition (2011).
Perception systems: {Lidar, Camera} combination
[Figure panels: Raw data, Pre-processing, Segmentation.]
Perception systems: TP, FP, FN, TN
Classification vs Detection vs Semantic/Instance Segmentation
Supervised Classification: Basics
Detection: Basics
Detection is a much more difficult problem:
- Overlap / multiple detections
- Position / location
- Scale / size (~ distance to the sensors)
Further techniques
• Stochastic filter
• Data alignment
• Data association
• Probabilistic output
• Distance to target
Example – pedestrian detection
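[Figure: two-sensor fusion pipeline for the object “Pedestrian”. Sensor 1 → Processing → Y1; Sensor 2 → Processing → Y2; Data Fusion f → Y_fusion.]
$Y_{fusion} = f(Y_1, Y_2)$, with $Y_1 = 0.923$ and $Y_2 = 0.895$
Candidate fusion rules f: Minimum, Maximum, Average, Product, Bayes rule, DS inference, Fuzzy, Learning-rule, …
$f_{min} \therefore Y_{fusion} = 0.895$
$f_{max} \therefore Y_{fusion} = 0.923$
$f_{ave} \therefore Y_{fusion} = 0.909$
$f_{Bayes} \therefore Y_{fusion} = \,?$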
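As a minimal sketch (not part of the original slides), the simple fusion rules listed above can be evaluated for the two scores Y1 = 0.923 and Y2 = 0.895. The dictionary name `fusion_rules` and the variable names are illustrative assumptions; the Bayes rule is left open here, as on the slide.

```python
# Scores produced by the two processing chains (values from the slide)
y1, y2 = 0.923, 0.895

# Simple fusion rules f(Y1, Y2); the dictionary keys are illustrative names
fusion_rules = {
    "min": min(y1, y2),
    "max": max(y1, y2),
    "ave": (y1 + y2) / 2,
    "product": y1 * y2,
}

for name, value in fusion_rules.items():
    print(f"f_{name}: Y_fusion = {value:.3f}")
```

The minimum, maximum, and average reproduce the values 0.895, 0.923, and 0.909 quoted on the slide.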
Something about uncertainty ...
Uncertainty - noise
Consider the measurements (or observations) of a time-invariant variable (i.e., a parameter) $x$:
$$z(j) = x + w(j), \quad j = 1, \dots, k$$
Observations $z(j)$ are assumed to be made in the presence of disturbances/noise $w(j)$.
Parameter $x$ designates a variable (scalar or vector-valued) that is usually time invariant. However, with some abuse of language, when $x$ changes with time we can designate it as a “time-varying parameter”. But its time variation must be “slow” compared to the state variables of a system.
Estimation is the process of inferring the value of a quantity (variable/parameter) of interest from uncertain observations/measurements.
Estimation can be understood as a process for information/data extraction and enhancement, based on measurements
(observations) corrupted by noise/disturbance, with the purpose of maximizing the knowledge about a parameter, variable
or state.
Example – Batch estimation
An estimator can be defined as a function
$$\hat{x}(k) \triangleq \hat{x}[k, Z^k]$$
where the measurements are denoted in the compact form
$$Z^k \triangleq \{z(j)\}_{j=1}^{k}$$
with
$$z(j) = x + w(j), \quad j = 1, \dots, k, \quad k = 20$$
$z(j)$ → measurements with noise (red-cross points):
5.30 4.56 4.90 5.03 4.92 5.16 5.13 4.79 4.93 4.52 5.48 4.67 4.61 4.87 4.70 4.99 4.84 5.45 5.42 4.55
Example – Batch estimation
Simple batch estimator (sample mean):
$$\hat{x}(k+1) = \frac{1}{k+1} \sum_{i=1}^{k+1} z(i)$$
Some values:
$\hat{x}(1) = z(1) = 5.30$ (initial condition)
$\hat{x}(2) = \frac{1}{2}\,[z(1) + z(2)] = 4.93$
$\hat{x}(3) = \frac{1}{3}\,[z(1) + z(2) + z(3)] = 4.92$
$\hat{x}(4) = 4.95$
… $\hat{x}(20) = 4.94$
xest = 5.30 4.93 4.92 4.95 4.94 4.98 5.00 4.97 4.97 4.92
4.97 4.95 4.92 4.92 4.90 4.91 4.91 4.94 4.96 4.94
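The batch estimates above can be reproduced with a short sketch. The measurement values are copied from the slide; the function name `batch_estimate` and the use of NumPy are assumptions made for illustration.

```python
import numpy as np

# The 20 noisy measurements z(1), ..., z(20) listed on the slide
z = np.array([5.30, 4.56, 4.90, 5.03, 4.92, 5.16, 5.13, 4.79, 4.93, 4.52,
              5.48, 4.67, 4.61, 4.87, 4.70, 4.99, 4.84, 5.45, 5.42, 4.55])

def batch_estimate(measurements):
    """Sample mean of all measurements received so far."""
    return np.sum(measurements) / len(measurements)

# Re-run the batch estimator every time a new measurement arrives
xest = np.array([batch_estimate(z[:k]) for k in range(1, len(z) + 1)])
print(np.round(xest, 2))
```

Note that every call sums all stored measurements again, so both the cost per update and the memory grow with the number of measurements.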
Simple Recursive estimation
Simple batch estimator:
$$\hat{x}(k+1) = \frac{1}{k+1} \sum_{i=1}^{k+1} z(i)$$
Basic recursive estimator
Batch form:
$$\hat{x}(k+1) = \frac{1}{k+1} \sum_{i=1}^{k+1} z(i) \qquad (1)$$
Recursive form:
$$\hat{x}(k+1) = \hat{x}(k) + \frac{1}{k+1}\,[z(k+1) - \hat{x}(k)] \qquad (2)$$
Now assume the number of measurements is larger, say n = 100000; the summation in (1) will then cost more computation time.
Speed: in this example, the batch estimator takes about 143 times longer than the recursive implementation.
Less memory allocation: the recursive form does not need to store all the measurements for i = 1, 2, …, n.
Basic recursive estimator
$$\hat{x}(k+1) = \frac{1}{k+1}\,[k\,\hat{x}(k) + z(k+1)] = \frac{1}{k+1}\,[k\,\hat{x}(k) + z(k+1) - \hat{x}(k) + \hat{x}(k)] = \hat{x}(k) + \frac{1}{k+1}\,[z(k+1) - \hat{x}(k)] \qquad (2)$$
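A minimal sketch of the recursive form (2), checked against the batch sample mean on the 20 measurements of the earlier example. The function name `recursive_mean` is an assumption made for illustration; no timing claim is made here.

```python
import numpy as np

def recursive_mean(measurements):
    """Running estimate via x(k+1) = x(k) + [z(k+1) - x(k)] / (k+1)."""
    x_hat = 0.0
    history = []
    # k counts the measurements already absorbed into x_hat
    for k, z_new in enumerate(measurements):
        x_hat = x_hat + (z_new - x_hat) / (k + 1)
        history.append(x_hat)
    return np.array(history)

# Measurements from the batch-estimation example
z = np.array([5.30, 4.56, 4.90, 5.03, 4.92, 5.16, 5.13, 4.79, 4.93, 4.52,
              5.48, 4.67, 4.61, 4.87, 4.70, 4.99, 4.84, 5.45, 5.42, 4.55])

x_rec = recursive_mean(z)
# Batch estimates for comparison: cumulative sum divided by sample count
x_batch = np.cumsum(z) / np.arange(1, len(z) + 1)
print(np.allclose(x_rec, x_batch))
```

Only the previous estimate and the sample count are kept between updates, which is why the recursive form needs no measurement history.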
Bayes rule
The joint probability of two events A and B is given by $P(A, B)$, which is equivalent to $P(A \text{ and } B)$, i.e., events A and B occur simultaneously.
It is easy to see that
$$P(A, B) = P(B, A) \qquad (1)$$
The conditional probability of A given B is denoted by $P(A \mid B)$.
The expressions relating the joint and the conditional probabilities are
$$P(A, B) = P(A \mid B)\,P(B) \qquad (2)$$
$$P(B, A) = P(B \mid A)\,P(A) \qquad (3)$$
Based on eq. (1), equations (2) and (3) yield
$$P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \quad \therefore \quad P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Bayes' formula (Bayes' rule) can be expressed as
$$\text{Posterior} = \frac{\text{Likelihood} \cdot \text{Prior}}{\text{Evidence}}$$
Bayes rule - Example
In our example, we want to identify (using Bayesian inference) a target aircraft. We can express Bayes' rule as
$$P(x_j \mid \mathrm{data}) = \frac{P(\mathrm{data} \mid x_j)\,P(x_j)}{P(\mathrm{data})}$$
If there are many events of interest, then the denominator acts as a normalization to guarantee $\sum_i P(x_i \mid \mathrm{data}) = 1$.
Therefore, the posterior becomes
$$P(x_j \mid \mathrm{data}) = \frac{P(\mathrm{data} \mid x_j)\,P(x_j)}{\sum_i P(\mathrm{data} \mid x_i)\,P(x_i)}$$
Suppose the sensors can supply data/information about the target aircraft we want to identify. The possible jet fighters the sensors can identify are: F-22, F-35, Su-57, F/A-18, MiG-X. So, we have n = 5 variables of interest:
$x_1$ = “F-22”, $x_2$ = “F-35”, …, $x_5$ = “MiG”
The sensors give, for the current time instant, the following information concerning the target-type conditional probabilities:
$P(\mathrm{data} \mid x_1) = 0.21$, $P(\mathrm{data} \mid x_2) = 0.08$, $P(\mathrm{data} \mid x_3) = 0.53$, $P(\mathrm{data} \mid x_4) = 0.15$, $P(\mathrm{data} \mid x_5) = 0.03$
Bayes rule - Example
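Some intel provided the following a-priori probabilities for the jet fighters:
$P(x_1) = 0.15$, $P(x_2) = 0.25$, $P(x_3) = 0.25$, $P(x_4) = 0.15$, $P(x_5) =$ _______
By combining all the data/information we have, the posterior probabilities can be calculated as follows:
$$P(x_1 \mid \mathrm{data}) = \frac{P(\mathrm{data} \mid x_1)\,P(x_1)}{\sum_{i=1}^{n} P(\mathrm{data} \mid x_i)\,P(x_i)}$$
$P(x_2 \mid \mathrm{data}) =$ _______
$P(x_3 \mid \mathrm{data}) =$ _______
$P(x_4 \mid \mathrm{data}) =$ _______
$P(x_5 \mid \mathrm{data}) =$ _______
Example in Python:

    import numpy as np

    # Likelihoods P(data | x_i) reported by the sensors
    Pdx = np.array([0.21, 0.08, 0.53, 0.15, 0.03])
    # Prior probabilities P(x_i); they must sum to 1
    Prior = np.array([0.15, 0.25, 0.25, 0.15, 0.2])
    # Evidence P(data) = sum_i P(data | x_i) P(x_i)
    Px = np.sum(Pdx * Prior)
    print(Px)
    # Posterior P(x_i | data) for all n = 5 target types at once
    Post = Pdx * Prior / Px
    print(Post)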
Basic Bayes fusion rule - example
Let's say $Y_1 = 0.923$ and $Y_2 = 0.895$ correspond to probabilities (degree of confidence = likelihood) that the object/target is a pedestrian (“ped”).
Our problem involves 2 categories (ped. vs. non-ped.) and, because the total probability must be 1, the probability of being a non-pedestrian is simply 1 – P(pedestrian).
$$f_{Bayes} = \frac{0.923 \cdot 0.895}{0.923 \cdot 0.895 + 0.077 \cdot 0.105} = 0.9903$$
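The numerical example above can be written as a small function. This is a sketch under the assumptions implied by the slide's formula: the two scores are treated as independent and both classes are equally likely a priori. The name `bayes_fusion` is illustrative.

```python
def bayes_fusion(y1, y2):
    """Fuse two pedestrian scores, assuming independence and equal priors."""
    ped = y1 * y2                    # both sources support "pedestrian"
    non_ped = (1 - y1) * (1 - y2)    # both sources support "non-pedestrian"
    return ped / (ped + non_ped)     # normalize so the two classes sum to 1

y_fusion = bayes_fusion(0.923, 0.895)
print(round(y_fusion, 4))
```

With two confident and agreeing sources, the fused score (about 0.9903) is higher than either input, unlike the minimum, maximum, or average rules.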
Occupancy mapping - derivation
The basic motivation: representing a map of the environment as a set of cells (belonging to a stationary grid), where each cell is modelled as a ‘binary’ ({occupied, free}) random variable.
Let the r.v. $X = \{x, \bar{x}\}$ represent the state of a cell, where $x$: occupied, $\bar{x}$: free. The key idea is to calculate the probability that a cell is occupied or free given the measurements $z_{1:t}$.
Using the Bayesian formulation, and starting with the posterior for the cell being ‘occupied’ ($x$), yields:
$$p(x \mid z_{1:t}) = \frac{p(z_t \mid x)\,p(x \mid z_{1:t-1})}{p(z_t \mid z_{1:t-1})}$$
Occupancy mapping - derivation
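The equation can be rewritten, using Bayes' rule again (now applied to $p(z_t \mid x)$), as:
$$p(x \mid z_{1:t}) = \frac{p(x \mid z_t)\,p(z_t)\,p(x \mid z_{1:t-1})}{p(x)\,p(z_t \mid z_{1:t-1})}$$
Dividing by the analogous expression for $\bar{x}$ cancels the terms that do not depend on the cell state.
Finally, to mitigate numerical issues, the logarithm of this ratio is used, with Bel (belief) defined as $Bel_t(x) = \log \frac{p(x \mid z_{1:t})}{p(\bar{x} \mid z_{1:t})}$:
$$Bel_t(x) = \log \frac{p(x \mid z_t)}{p(\bar{x} \mid z_t)} + \log \frac{p(\bar{x})}{p(x)} + Bel_{t-1}(x)$$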
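A minimal sketch of the log-odds update for a single cell. The function names, the prior of 0.5, and the sequence of inverse sensor model outputs p(x | z_t) are hypothetical values chosen for illustration; they are not taken from the slides.

```python
import math

def log_odds(p):
    """Log of the ratio p / (1 - p)."""
    return math.log(p / (1.0 - p))

def update_belief(bel_prev, p_occ_given_z, p_prior=0.5):
    """Bel_t = log-odds of p(x|z_t) + log[p(not x)/p(x)] + Bel_{t-1}."""
    return log_odds(p_occ_given_z) - log_odds(p_prior) + bel_prev

def belief_to_probability(bel):
    """Recover p(x | z_1:t) from the log-odds belief."""
    return 1.0 - 1.0 / (1.0 + math.exp(bel))

bel = log_odds(0.5)              # Bel_0: prior log-odds (0 for p = 0.5)
for p_z in [0.7, 0.7, 0.4]:      # hypothetical values of p(x | z_t)
    bel = update_belief(bel, p_z)

p_occupied = belief_to_probability(bel)
print(round(p_occupied, 3))
```

Each update is a simple addition, which avoids repeated multiplication of small probabilities.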
Reliable AI-ML and robotic perception
Trustworthy AI and Robotics
+ Explainable AI: methods and techniques for making AI systems more transparent and
understandable to humans.
+ Ethical considerations in AI: addressing the ethical implications of AI, such as bias,
privacy, and autonomy.
+ Safety and security in AI and robotics: exploring the risks and challenges of AI and
robotics, and methods for mitigating them.
XAI
[Figure: accuracy vs. interpretability for different machine learning models, from [*].]
From [**]
• Interpretability and explainability have escaped a clear universal definition
• Other terms that are synonymous with interpretability: intelligibility and understandability
• More recently, explainability (XAI) has been closely tied to interpretability, and many authors do not differentiate between the two
• According to [***], interpretable ML focuses on designing models that are inherently interpretable, whereas explainable ML tries to provide post hoc explanations for existing black box models
[*] P. P. Angelov, E. A. Soares, R. Jiang, N. I. Arnold, and P. M. Atkinson. “Explainable artificial intelligence: an analytical review.” WIREs Data Mining and Knowledge Discovery, 2021.
[**] R. Marcinkevics, J. E. Vogt. “Interpretability and Explainability: A Machine Learning Zoo Mini-tour”. ArXiv, 2023.
[***] C. Rudin. “Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.” Nature Machine Intelligence, 2019.
Machine learning – DL: undesirable aspects
• Lack of proper Uncertainty quantification
Why is probability important in ML-based perception for the robotics domain?
• Most of the modern deep learning (DL) algorithms, and available software
packages, tend to lack explainability in terms of probability
[*] L. V. Jospin et al. “Hands-on Bayesian Neural Networks - a Tutorial for Deep Learning Users”. 2020. https://arxiv.org/abs/2007.06823
Reliable ML applied to robotic perception
Calibration of ML/DL models
“Real-world applications of machine learning (ML) systems require a thorough look into the reliability of the learning models and consequently to their uncertainty calibration (also referred as confidence calibration or simply calibration).” [*]
[*] P. Conde, C. Premebida (2022). “Adaptive-TTA: accuracy-consistent weighted test time augmentation method for the uncertainty calibration of deep learning classifiers”. In Proc. 33rd British Machine Vision Conference (BMVC).
Reliable ML applied to robotic perception
State-of-the-art (SOTA) methods for object recognition and detection use deep architectures; DNNs provide normalized prediction scores (the outputs) via a SoftMax or Sigmoid layer, i.e., the prediction values are in the interval [0, 1].
Usually, such models/architectures are implemented through deterministic neural networks; thus, the prediction itself does not consider uncertainty for the predicted class of an object during decision-making.
Therefore, evaluating the prediction confidence or uncertainty is crucial in decision-making, where an erroneous decision…
Calibration acts directly on the network output prediction (post-hoc calibration*), while regularization aims at penalizing network weights through a variety of methods, adding parameters or terms directly to the cost/loss function.
Intuitively, the idea of calibration can be formulated as follows: let $h$ be an ML model with $h(X) = (\hat{Y}, \hat{P})$, considering a distribution generated over the $K$ possible classes of the model for a given input $X$, where $\hat{Y}$ is the predicted class with an associated predicted confidence $\hat{P}$. Perfect calibration is then expressed as
$$P(\hat{Y} = Y \mid \hat{P} = p) = p, \quad \forall p \in [0, 1]$$
where $Y$ is the true class and the probability is over a joint distribution. The expression above can be better understood by a toy example [*]:
“given 100 predictions, each with confidence of 0.8, we expect that 80 should be correctly classified.”
Thus, for every subset of predicted samples of a given class with score values equal to $S$, the proportion of samples that actually belong to that class is $S$.
[*] Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. (2017). “On calibration of modern neural networks”. In ICML.
Reliable ML applied to robotic perception
“perfectly calibrated models are those for which the predicted confidence for each sample is equal to the model accuracy” … “an over-confident model tends to yield predicted confidences that are larger than its accuracy, whereas an underconfident model displays lower confidence than the model’s accuracy.” [*]
The calibration algorithm is an approximation process that depends on a calibration measure, which can be obtained by separating the predictions into multiple bins, as in a reliability diagram.
In reliability diagrams, the scores (predicted values) are grouped into M bins (histogram). Each example (classification score of an object) is allocated to a bin according to its maximum prediction value (prediction confidence).
[*] Liu, B., Ayed, I. B., Galdran, A., and Dolz, J. (2022). “The devil is in the margin: Margin-based label smoothing for network calibration”. In CVPR.
Reliable ML applied to robotic perception
Reliability Diagram
“Typically, post-calibration predictions are analysed in the form of reliability diagram representations, which illustrate the relationship of the model’s prediction scores regarding the true correctness likelihood/probability.” [*]
[*] G Melotti, C Premebida, JJ Bird, DR Faria, N Gonçalves (2022). “Reducing Overconfidence Predictions in Autonomous
Driving Perception”. IEEE Access.
Reliable ML applied to robotic perception
Reliability Diagram – toy example
i                   0    1    2    3    4    5    6    7    8    9
P(y_i = 0 | x_i)   0.1  0.8  0.3  0.6  0.2  0.9  0.8  0.2  0.5  0.1
P(y_i = 1 | x_i)   0.9  0.2  0.7  0.4  0.8  0.1  0.2  0.8  0.5  0.9

Partitioned sets (values of P(y_i = 1 | x_i)):
Set1 (i = 1, 5, 6) -> (0.2, 0.1, 0.2)
Set2 (i = 3, 8) -> (0.4, 0.5)
Set3 (i = 0, 2, 4, 7, 9) -> (0.9, 0.7, 0.8, 0.8, 0.9)
[Partially @Credits] Xiang Jiang (2020); “A brief introduction to uncertainty calibration and reliability diagrams”, online:
https://towardsdatascience.com/introduction-to-reliability-diagrams-for-probability-calibration-ed785b3f5d44
For each k-th subset, two estimates are computed: (a) the average of the predicted probabilities, (b) the relative frequency of positive examples (normally the accuracy in ML applications).

Set                     1     2     3
Average prediction      0.17  0.45  0.82
Relative freq. of “1”   1/3   0.50  0.80
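The two estimates per set can be computed with a short sketch. The predicted probabilities are taken from the toy example; the true labels `y` are hypothetical, chosen only so that the relative frequencies match the table (they are not given on the slides), and the three equal-width bins are an assumption consistent with the partition shown.

```python
import numpy as np

# P(y_i = 1 | x_i) for i = 0..9, from the toy example
p1 = np.array([0.9, 0.2, 0.7, 0.4, 0.8, 0.1, 0.2, 0.8, 0.5, 0.9])
# Hypothetical true labels, chosen to match the table's relative frequencies
y = np.array([1, 0, 1, 0, 1, 0, 1, 1, 1, 0])

# Three equal-width bins: [0, 1/3), [1/3, 2/3), [2/3, 1]
bin_ids = np.digitize(p1, [1 / 3, 2 / 3])

avg_pred, rel_freq = [], []
for m in range(3):
    in_bin = bin_ids == m
    avg_pred.append(p1[in_bin].mean())   # (a) average predicted probability
    rel_freq.append(y[in_bin].mean())    # (b) relative frequency of "1"
    print(f"Set{m + 1}: average = {avg_pred[m]:.2f}, rel. freq. = {rel_freq[m]:.2f}")
```

Plotting `rel_freq` against `avg_pred` gives the reliability diagram; points on the diagonal indicate a well-calibrated model.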
[@Credits] Xiang Jiang (2020); “A brief introduction to uncertainty calibration and reliability diagrams”, online:
https://towardsdatascience.com/introduction-to-reliability-diagrams-for-probability-calibration-ed785b3f5d44
Reliable ML applied to robotic perception
ECE – Expected Calibration Error
Expected Calibration Error, Overconfidence Error, Max. calib. Error, …
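Notations
Predictions/probabilities from a model are grouped into M interval bins of equal size; $B_m$ denotes the set of samples whose confidence falls into the m-th bin.
$y_i$ and $\hat{y}_i$ are the entries of the true label vector and the prediction vector for sample i, respectively.
$\hat{p}_i$ is the confidence/“probability” (winning score) of sample i.
The accuracy and confidence of $B_m$ are defined as
$$\mathrm{acc}(B_m) = \frac{1}{|B_m|} \sum_{i \in B_m} \mathbb{1}(\hat{y}_i = y_i), \qquad \mathrm{conf}(B_m) = \frac{1}{|B_m|} \sum_{i \in B_m} \hat{p}_i$$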
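ECE – Expected Calibration Error
The Expected Calibration Error (ECE) is then defined as:
$$\mathrm{ECE} = \sum_{m=1}^{M} \frac{|B_m|}{n}\,\bigl|\mathrm{acc}(B_m) - \mathrm{conf}(B_m)\bigr|$$
where $n$ is the total number of samples.
Maximum Calibration Error (MCE):
$$\mathrm{MCE} = \max_{m \in \{1, \dots, M\}} \bigl|\mathrm{acc}(B_m) - \mathrm{conf}(B_m)\bigr|$$
Overconfidence Error (OE), in its commonly used form:
$$\mathrm{OE} = \sum_{m=1}^{M} \frac{|B_m|}{n}\,\Bigl[\mathrm{conf}(B_m) \cdot \max\bigl(\mathrm{conf}(B_m) - \mathrm{acc}(B_m),\, 0\bigr)\Bigr]$$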
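A minimal sketch of how these calibration measures can be computed from per-sample confidences and correctness flags. The function name `calibration_errors`, the default of 10 bins, and the half-open bin convention [a, b) are assumptions made for illustration; the OE term follows one commonly used definition and may differ from the one intended on the slide.

```python
import numpy as np

def calibration_errors(confidences, correct, n_bins=10):
    """Return (ECE, MCE, OE) using equal-width confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index per sample; bins are [a, b), and 1.0 falls into the last bin
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)
    ece, mce, oe = 0.0, 0.0, 0.0
    for m in range(n_bins):
        in_bin = bin_ids == m
        if not np.any(in_bin):
            continue                       # empty bins contribute nothing
        acc = correct[in_bin].mean()       # acc(B_m)
        conf = confidences[in_bin].mean()  # conf(B_m)
        weight = in_bin.mean()             # |B_m| / n
        ece += weight * abs(acc - conf)
        mce = max(mce, abs(acc - conf))
        oe += weight * conf * max(conf - acc, 0.0)
    return ece, mce, oe

# 100 predictions with confidence 0.8, of which only 60 are correct
conf_scores = np.full(100, 0.8)
hits = np.array([1] * 60 + [0] * 40)
ece, mce, oe = calibration_errors(conf_scores, hits)
print(round(ece, 3), round(mce, 3), round(oe, 3))
```

In this over-confident example the gap between confidence (0.8) and accuracy (0.6) is 0.2; with 80 correct predictions instead of 60, all three measures would be approximately zero.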
P. Conde, C. Premebida. “Adaptive-TTA: accuracy-consistent weighted test time augmentation method for the uncertainty calibration of deep learning classifiers”. In BMVC, 2022.
THANK YOU
Questions?