
[Answers] Expectation Maximization Algorithm for simple naive Bayesian network

Problem Detail: 

I am trying to understand the following network: A has two children, B and C (the "common cause" structure).

All the variables are binary and can be either 0 or 1. In the data, values are missing only for some records, and the only values that are ever missing are those of A.

What does the Expectation step look like in this example? Is it computing P(A|B,C) and then calculating the counts for, e.g., P(A=0 | B=0, C=0)? I am confused by this.

What does the Maximization step look like in this example?

Asked By : Tesla

Answered By : Tesla

I finally found the solution. The Expectation step is as follows: for each data point where A is missing, calculate P(A|B,C) under the current parameters, which we break down as

P(A|B,C) = P(A,B,C) / P(B,C)

= (P(B|C,A) * P(C|A) * P(A)) / (sum of P(A,B,C) over all values of A)

Since B and C are independent once A is known, P(B|C,A) = P(B|A), therefore

= (P(B|A) * P(C|A) * P(A)) / (sum of P(B|A) * P(C|A) * P(A) over all values of A)

With 3 binary variables there are 8 cases (2^3) to consider. Let's say A = 1, B = 0, C = 1:

(P(B=0|A=1) * P(C=1|A=1) * P(A=1)) / ( (P(B=0|A=1) * P(C=1|A=1) * P(A=1)) + (P(B=0|A=0) * P(C=1|A=0) * P(A=0)) )

= P(A=1 | B=0, C=1), let's call this value val101
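The formula for val101 can be checked numerically with a minimal sketch like the one below. The function name `posterior_a1` and all parameter values are made up for illustration; they are not from the original question.

```python
def posterior_a1(b, c, p_a1, p_b1_given_a, p_c1_given_a):
    """Return P(A=1 | B=b, C=c) for the network B <- A -> C."""
    def joint(a):
        # P(A=a) * P(B=b | A=a) * P(C=c | A=a)
        p_a = p_a1 if a == 1 else 1.0 - p_a1
        p_b = p_b1_given_a[a] if b == 1 else 1.0 - p_b1_given_a[a]
        p_c = p_c1_given_a[a] if c == 1 else 1.0 - p_c1_given_a[a]
        return p_a * p_b * p_c

    # Numerator is the A=1 term, denominator sums over both values of A.
    return joint(1) / (joint(0) + joint(1))


# Hypothetical current parameters (assumed values, for illustration only).
p_a1 = 0.5                        # P(A=1)
p_b1_given_a = {0: 0.2, 1: 0.6}   # P(B=1 | A=a)
p_c1_given_a = {0: 0.3, 1: 0.9}   # P(C=1 | A=a)

val101 = posterior_a1(0, 1, p_a1, p_b1_given_a, p_c1_given_a)
print(val101)  # approximately 0.6, i.e. 0.18 / (0.18 + 0.12)
```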

We use this value (val101) as the probability (weight) of the A = 1 completion of each data point that has A missing, B = 0 and C = 1. The A = 0 completion of the same data points gets the weight (1 - val101). Data points where A is observed keep a weight of 1.

We do that for all possible combinations of A, B and C: 000 001 010 011 100 101 110 111
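As a rough sketch of this weighting, the E step can be written as a function that turns every record into one or two weighted rows. The record layout `(a, b, c)` with `None` for a missing A, the name `e_step`, and the parameter values are assumptions made for this example.

```python
def e_step(records, p_a1, p_b1_given_a, p_c1_given_a):
    """Return rows (a, b, c, weight); a missing A is split into A=0 and A=1."""
    def joint(a, b, c):
        p_a = p_a1 if a == 1 else 1.0 - p_a1
        p_b = p_b1_given_a[a] if b == 1 else 1.0 - p_b1_given_a[a]
        p_c = p_c1_given_a[a] if c == 1 else 1.0 - p_c1_given_a[a]
        return p_a * p_b * p_c

    weighted = []
    for a, b, c in records:
        if a is not None:
            # A is observed: the record counts fully for its own value of A.
            weighted.append((a, b, c, 1.0))
        else:
            # A is missing: weight each completion by P(A=a | B=b, C=c).
            j0, j1 = joint(0, b, c), joint(1, b, c)
            w1 = j1 / (j0 + j1)
            weighted.append((0, b, c, 1.0 - w1))
            weighted.append((1, b, c, w1))
    return weighted


# Hypothetical data: one complete record and one record with A missing.
records = [(1, 0, 1), (None, 0, 1)]
weighted = e_step(records, 0.5, {0: 0.2, 1: 0.6}, {0: 0.3, 1: 0.9})
for row in weighted:
    print(row)
```

The weights of the two rows produced for a record with missing A always sum to 1, so the total weight equals the number of original data points.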

(I think what follows is already the M step, Maximization, but here it goes.)

To calculate the parameter theta for P(A=0), we take the count (the sum of the probabilities assigned in the E step) of all data points where A = 0 and divide it by the original number of data points. We do the same for A = 1.

Then we update the conditional parameters. To update the parameter theta for P(B=1|A=0), we sum the probabilities of all data points where A = 0 and B = 1 and divide by the sum of the probabilities of all data points where A = 0, so it's as follows:

#(A=0,B=1) / #(A=0) = theta P(B=1|A=0)

We do that for all the conditional probabilities, which are as follows:

P(B=1|A=0) P(B=0|A=0) P(C=1|A=0) P(C=0|A=0) P(B=1|A=1) P(B=0|A=1) P(C=1|A=1) P(C=0|A=1)

Then one iteration is done. Repeat the E and M steps with the updated parameters until the values stop changing.
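Putting the two steps together, a minimal sketch of the whole loop could look like this. The function name `run_em`, the record layout `(a, b, c)` with `None` for a missing A, the starting guesses, the fixed iteration count and the toy data are all assumptions for illustration. There is no smoothing, so the sketch assumes every value of A keeps a nonzero weighted count.

```python
def run_em(records, n_iter=50):
    """Estimate P(A=1), P(B=1|A) and P(C=1|A) when A is sometimes missing."""
    # Arbitrary starting guesses (assumed, not from the original post).
    p_a1 = 0.5
    p_b1 = {0: 0.3, 1: 0.7}  # P(B=1 | A=a)
    p_c1 = {0: 0.4, 1: 0.6}  # P(C=1 | A=a)

    for _ in range(n_iter):
        # E step: build weighted rows (a, b, c, weight).
        rows = []
        for a, b, c in records:
            if a is not None:
                rows.append((a, b, c, 1.0))
                continue
            joint = {}
            for k in (0, 1):
                pa = p_a1 if k == 1 else 1.0 - p_a1
                pb = p_b1[k] if b == 1 else 1.0 - p_b1[k]
                pc = p_c1[k] if c == 1 else 1.0 - p_c1[k]
                joint[k] = pa * pb * pc
            total = joint[0] + joint[1]
            rows.append((0, b, c, joint[0] / total))
            rows.append((1, b, c, joint[1] / total))

        # M step: weighted counts divided by the matching totals.
        n = len(records)
        count_a = {k: sum(w for a, _, _, w in rows if a == k) for k in (0, 1)}
        p_a1 = count_a[1] / n
        p_b1 = {k: sum(w for a, b, _, w in rows if a == k and b == 1) / count_a[k]
                for k in (0, 1)}
        p_c1 = {k: sum(w for a, _, c, w in rows if a == k and c == 1) / count_a[k]
                for k in (0, 1)}

    return p_a1, p_b1, p_c1


# Hypothetical toy data: six complete records and three with A missing.
records = [
    (1, 1, 1), (1, 1, 0), (1, 0, 1),
    (0, 0, 0), (0, 0, 1), (0, 1, 0),
    (None, 1, 1), (None, 0, 0), (None, 1, 0),
]
p_a1, p_b1, p_c1 = run_em(records)
print(p_a1, p_b1, p_c1)
```

P(B=0|A=a) and P(C=0|A=a) are not stored separately because they are just 1 minus the corresponding B=1 and C=1 values. In practice the loop would usually stop once the parameters change by less than some tolerance rather than after a fixed number of iterations.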

Best Answer from StackOverflow

