Bayesian Networks
Bucket Elimination Algorithm
主講人:虞台文
大同大學資工所
智慧型多媒體研究室
Content
Basic Concept
Belief Updating
Most Probable Explanation (MPE)
Maximum A Posteriori (MAP)
Bayesian Networks
Bucket Elimination Algorithm
Basic Concept
大同大學資工所
智慧型多媒體研究室
Satisfiability
Given a statement of clauses (in disjunction normal form), t
he satisfiability problem is to determine whether there exist
s a truth assignment to make the statement true.
Examples:
1. ( A B C ) (C D) ( A B D) (A D)
A=True, B=True, C=False, D=False Satisfiable
2. ( A B C ) ( B C D) ( B C ) (A C ) (D)
Satisfiable?
Resolution
( q ) ( q) can be true if and only if
can be true.
q TURE ?
q
unsatisfiable
( A B C ) ( B C D) ( B C ) (A C ) (D)
Direct Resolution
Example: A B C
BC D
Given a set of clauses B C and an order d=ABCD
A C
D
Set initial buckets as follows:
A B C
B C BC D
A C D
BucketA BucketB BucketC BucketD
( A B C ) ( B C D) ( B C ) (A C ) (D)
Direct Resolution
Because no empty clause () is resulted, the statem
ent is satisfiable.
How to get a truth assignment?
BC
A B A B C
B B C BC D
A B A C D
BucketA BucketB BucketC BucketD
( A B C ) ( B C D) ( B C ) (A C ) (D)
Direct Resolution
True if A True
C
True/False if A False
A True or False B True D False
BC
A B A B C
B B C BC D
A B A C D
BucketA BucketB BucketC BucketD
Direct Resolution
Queries on Bayesian Networks
Belief updating
P ( X | E) ?
Finding the most probable explanation (mpe)
– Given evidence, finding a maximum probability assignment to t
he rest of variables.
Maximizing a posteriori hypothesis (map)
– Given evidence, finding an assignment to a subset of hypothesi
s variables that maximize their probability.
Maximizing the expected utility of the problem (meu)
– Given evidence and utility function, finding a subset of decision
variables that maximize the expected utility.
Bucket Elimination
The algorithm will be used as a framework for
various probabilistic inferences on Bayesian
Networks.
Preliminary – Elimination Functions
Given a function h defined over subset of
variables S, where X S,
min X h
Eliminate parameter X from h
max X h
mean X h Defined over U = S – {X}.
X h
Preliminary – Elimination Functions
Given a function h defined over subset of
variables S, where X S,
min X h
min X h (u) min X h( x, u)
max X h
max X h (u) max X h( x, u)
1
mean X h mean X h (u)
|X|
X h ( x, u )
X h
X h (u) X h( x, u)
Preliminary – Elimination Functions
Given function h1,…, hn defined over subset o
f variables S1,…, Sn, respectively,
j hj
Defined over U S j
j
j hj
Preliminary – Elimination Functions
Given function h1,…, hn defined over subset o
f variables S1,…, Sn, respectively,
j hj h (u)
j j j
h j (u S j
)
j hj h (u)
j j j
h j (u S j
)
Bayesian Networks
Bucket Elimination Algorithm
Belief Updating
大同大學資工所
智慧型多媒體研究室
Goal
P ( X x | E e) ?
P ( X x, E e)
Normalization
P ( E e) Factor
P(a | g 1) ? P(a, g 1) ?
Basic Concept of Variable Elimination
Example:
P( g , f , d , c, b, a ) P ( g | f ) P( f | b, c) P(d | a, b) P(b | a) P(c | a ) P (a)
A
B C
G
P(a | g 1) ? P(a, g 1) ?
Basic Concept of Variable Elimination
Example:
P( g , f , d , c, b, a ) P ( g | f ) P( f | b, c) P(d | a, b) P(b | a) P(c | a ) P (a)
P (a, g 1)
b , c , d , f , g 1
P ( a , b, c , d , f , g )
P (a )
b , c , d , f , g 1
P( g | f ) P( f | b, c) P (d | a, b) P(b | a) P(c | a)
P (a) P (c | a ) P(b | a) P ( f | b, c) P(d | a, b) P( g | f )
c b f d g 1
P(a | g 1) ? P(a, g 1) ?
Basic Concept of Variable Elimination
P (a, g 1)
P(a ) P(c | a) P (b | a ) P ( f | b, c) P(d | a, b) P ( g | f )
c b f d g 1
(f)
G
P( a) P(c | a ) P(b | a) P( f | b, c )G ( f ) P (d | a, b)
c b f d
(a, b)
D
P (a ) P(c | a) P (b | a )D (a, b) P ( f | b, c)G ( f )
c b f
(b, c)
F
P(a) P(c | a ) P(b | a)D (a, b)F (b, c) P(a) P(c | a)B (a, c)
c b c
B (a, c) (a)
C
P(a )C (a )
P (a, g 1)
P(a ) P (c | a) P (b | a ) P ( f | b, c) P (d | a, b) P ( g | f )
c b f d g 1
Basic Concept of Variable Elimination
BucketG G ( f ) P( g | f )
g 1
BucketD D (a, b) P(d | a, b)
d
BucketF F (b, c) P( f | b, c)G ( f )
f
BucketB B (a, c) P (b | a)D (a, b)F (b, c)
b
BucketC C (a) P(c | a)B (a, c)
c
BucketA P (a, g 1) P(a )C (a)
P (a, g 1)
P(a ) P (c | a) P (b | a ) P ( f | b, c) P (d | a, b) P ( g | f )
c b f d g 1
Basic Concept of Variable Elimination
BucketG P( g | f )
g 1
G ( f )
BucketD P ( d | a, b)
d
D ( a, b)
BucketF P ( f | b, c )
f
G ( f ) F (b, c )
BucketB P(b | a)
b
D (a, b) F (b, c) B (a, c)
BucketC P (c | a )
c
B (a, c) C (a )
BucketA P (a ) C (a ) P (a, g 1)
Basic Concept of Variable Elimination
G ( f ) P( g | f )
g 1
f G(f )
+ 0.1
0.7
f G(f )
+ 0.1
0.7
Basic Concept of Variable Elimination
D ( a , b ) P ( d | a , b )
d
a b D(a,
b)
0 0 1
0 1 1
1 0 1
1 1 1
a b D(a,
b) f G(f )
0 0 1 + 0.1
0 1 1 0.7
1 0 1
1 1 1
Basic Concept of Variable Elimination
F (b, c) P( f | b, c)G ( f )
f
b c F(b, c)
0 0 0.701
0 1 0.610
1 0 0.400
1 1 0.340
0.7
0.1
0.7
0.1
0.7
0.1
0.7
0.1
b c F(b, c) a b D(a,
b) f G(f )
0 0 0.701
0 0 1 + 0.1
0 1 0.610
0 1 1 0.7
1 0 0.400
1 0 1
1 1 0.340
1 1 1
Basic Concept of Variable Elimination
B (a, c) P(b | a )D (a, b)F (b, c)
b
a c B(a, c)
0 0 0.90.701+0.1 0.400=0.6709
0 1 0.90.610+0.1 0.340=0.5830
1 0 0.60.701+0.4 0.400=0.5806
1 1 0.60.610+0.4 0.340=0.5020
a c B(a, c) b c F(b, c) a b D(a, f G(f )
b)
0 0 0.6709 0 0 0.701 + 0.1
0 0 1
0 1 0.5830 0 1 0.610 0.7
0 1 1
1 0 0.5806 1 0 0.400
1 0 1
1 1 0.5020 1 1 0.340
1 1 1
Basic Concept of Variable Elimination
C (a ) P (c | a )B (a, c)
c
a C(a )
1 0.67 0.5806+0.33 0.5020=0.554662
0 0.75 0.6709+0.25 0.5830=0.648925
a C(a ) a c B(a, c) b c F(b, c) a b D(a, f G(f )
b)
1 0.554662 0 0 0.6709 0 0 0.701 + 0.1
0 0 1
0 0.648925 0 1 0.5830 0 1 0.610 0.7
0 1 1
1 0 0.5806 1 0 0.400
1 0 1
1 1 0.5020 1 1 0.340
1 1 1
Basic Concept of Variable Elimination
P (a, g 1) P(a )C (a )
a P(a, g=1)
1 0.30.554662=0.1663986
0 0.70.648925=0.4542475
P(a, g 1)
P(a | g 1)
P ( g 1)
P ( g 1) 0.1663986 0.4542475
0.6206461
a P(a | g=1)
1 0.1663986/0.6206461=0.26811
0 0.4542475/0.6206461=0.73189
Bucket Elimination Algorithm
Complexity
The BuckElim Algorithm can be applied to any ordering.
The arity of the function recorded in a bucket
– the numbers of variables appearing in the processed bucked, excl
uding the bucket’s variable.
Time and Space complexity is exponentially grow with a f
unction of arity r.
The arity is dependent on the ordering.
How many possible orderings for BN’s variables?
Consider the ordering AFDCBG.
A
Determination of the Arity B C
F
D
BucketG P( g | f ) G ( f ) G
g 1
BucketB P(b | a)P(d | a, b) P( f | b, c) B (a, c, d , f ) G
1
b
4
BucketC P(c | a) B (a, c, d , f ) C (a, d , f ) B
c
1 ,3
BucketD
C
C (a, d , f ) D ( a , f )
d 0 ,2
D
BucketF G ( f )D (a, f ) F (a) F
0 ,1
f
BucketA P (a ) F (a) P (a, g 1) A
0
Given the ordering, e.g., AFDCBG.
d
Determination of the Arity
The
Thewidth
widthof
ofaagraph
graphisisthe
the
maximum
maximumwidth
widthofofits
itsnodes.
nodes. w(d) = 4 w*(d) = 4
w(d):
w(d):width
widthof
ofinitial
initialgraph
graph Width of Width of
for node node
forordering
orderingd.d. 1 1
w*(d):
w*(d):width
widthof
ofinduced
inducedgraph
graph G G
for
forordering
orderingd.d. 4 4
B B
A 1 3
C C
B C
0 2
D D
Initial Induced
F Graph
Graph 0 1
D F F
G 0 0
A A
Definition of Tree-Width
Goal: Finding an ordering with smallest induced width.
Greedy heuristic and Approximation methods
NP-Hard
Are available.
Summary
The complexity of BuckElim algorithm is domi
nated by the time and space needed to proce
ss a bucket.
It is time and space is exponential in number
of bucket variables.
Induced width bounds the arity of bucket func
tions.
Exercises
Use BuckElim to evaluate P(a|b=1) with the foll
owing two ordering:
A
B C
1. d1=ACBFDG
2. d2=AFDCBG F
Give the details and make some conclusion. G
How to improve the algorithm?
Bayesian Networks
Bucket Elimination Algorithm
Most Probable Exp
lanation (MPE)
大同大學資工所
智慧型多媒體研究室
MPE
Goal:
x* arg max P( X x | E e) ?
x
evidence
x ( x1 , , xn )
MPE
Goal:
x* arg max P( X x | E e) ?
x
x* arg max P(x, e) ?
x
x* arg max P (x, e)
x
Notations
i
xi
Fi i
x* arg max P (x, e)
x
MPE
PX (x) P(x, e) i 1 P( xi , e | i )
n
Let x n ( x1 , , xn )
P(x*) max PX (x) max PX (x n )
x xn
max i 1 P( xi , e | i )
n
xn
max i 1 P( xi , e | i )
n
x n1 , xn
x* arg max P (x, e)
x
MPE
P (x*) max i 1 P( xi , e | i )
n
x n1 , xn
Some terms involve xn,
some terms not.
n Xn is conditioned by its parents.
P ( xn , e | n )
Xn
Xn conditions its children.
n P( xk , e | xn ,), xk n
x* arg max P (x, e)
x
MPE
P (x*) max i 1 P( xi , e | i )
n
x n1 , xn
max X X F P( xi , e | i ) max P( xn , e | n ) X P( xi , e | i )
x n1 i n xn i n
Not conditioned by x n Itself Conditioned by x n
n
x nappears in these CPT’s
Xn
Fn
n
x* arg max P (x, e)
x
MPE
P (x*) max i 1 P( xi , e | i )
n
x n1 , xn
max X X F P( xi , e | i ) max P( xn , e | n ) X P( xi , e | i )
x n1 i n xn i n
hn ( xUn )
Eliminate variable xn at Bucketn.
max X X F P( xi , e | i ) hn ( xUn )
xn1 i n
Process the next bucket recursively.
max P(a, b, c, d , f , g 1) ?
a ,b , c , d , f
Example
P(a, b, c, d , f , g 1) P( g 1 | f ) P( f | b, c) P( d | a, b) P(b | a) P(c | a ) P(a)
B C
G
max P(a, b, c, d , f , g 1) ?
a ,b , c , d , f
Example Consider ordering ACBFDG
A
BucketG max P ( g | f ) hG ( f )
g 1 B C
BucketD max P (d | a, b) hD (a, b)
d
F
D
BucketF max P ( f | b, c) h ( f ) h (b, c) G
f G F
BucketB max P (b | a )h (a, b) h (b, c) h (a, c)
b D F B
BucketC max P (c | a ) (a, c) (a )
c B C
BucketA max P (a ) hC (a) max P (a, b, c, d , f , g 1)
a a ,b , c , d , f
Bucket Elimination Algorithm
max P(a, b, c, d , f , g 1) ?
a ,b , c , d , f
Exercise Consider ordering ACBFDG
Bayesian Networks
Bucket Elimination Algorithm
Maximum
A Posteriori (MAP)
大同大學資工所
智慧型多媒體研究室
MAP
Given a belief network, a subset of hypothesized
variables A=(A1, …, Ak), and evidence E=e, the
goal is to determine
a* arg max P( A a | E e) ?
a
a* arg max P(a, e) ?
a
Example
A
Hypothesis (Decision)
Variables B C
G
g=1
(b*, c*) arg max P(b, c | g 1) ?
b ,c
MAP
Ordering
d X 1 , X k , X k 1 , , X n
A1 , Ak , X k 1 , , X n
Ak , X n
k 1
Some of them may be observed
d Ak , X n
k 1
MAP
a arg max P(a k | e)
*
k
ak
P (a k , e)
arg max
ak P (e)
arg max P(a k , e)
ak
d Ak , X n
k 1
MAP
a arg max P(a k , e)
*
k
ak
P(a k , e) xn P(a k , x n
k 1 , e)
k 1
xn
n
i 1
P ( xi , e | i )
k 1
n
P (a ) max ak
*
k x nk1 i 1
P ( xi , e | i )
d Ak , X n
k 1
MAP
a arg max P(a k , e)
*
k
ak
Bucket
BucketElimination
Elimination Bucket
BucketElimination
Elimination
for
forMPE
MPE for
forbelief
beliefupdating
updating
n
P (a ) max ak
*
k x nk1 i 1
P ( xi , e | i )
Bucket Elimination Algorithm
Consider ordering CBAFDG
(b*, c*) arg max P(b, c | g 1) ?
Example b ,c
BucketG P( g | f )
g 1
G (f) A
P ( d | a, b)
B C
BucketD D ( a, b )
d
F
BucketF P ( f | b, c )
f
G ( f ) F (b, c )
D
G g=1
BucketA P(c | a) P(b | a) P(a)
a
D (a, b) A (b, c )
BucketB max F (b, c) A (b, c ) b (c)
b
BucketC max b (c) max P (b, c | g 1)
c b ,c
Consider ordering CBAFDG
(b*, c*) arg max P(b, c | g 1) ?
Exercise b ,c
BucketG P( g | f )
g 1
G (f) A
P ( d | a, b)
B C
BucketD D ( a, b )
d
F
BucketF P ( f | b, c )
f
G ( f ) F (b, c )
D
G g=1
BucketA P(c | a) P(b | a) P(a)
a
D (a, b) A (b, c )
BucketB Give
Give the
max F (b, c) A (b, c ) b (c)
b
the
max (c) max P (b, c | g 1)detail
BucketC c b
b ,c detail