Information Theory, Inference, and Learning Algorithms in J - Ensembles

02 Feb 2015

Unlike in the previous post on 27x27 letter bigrams, where we made a joint
probability matrix by counting, ensembles are usually defined by a set of
conditional and marginal probabilities. To get an intuition for this, let’s
write out the simple example given in Example 2.3 (p. 25) in J. But first,
here is the definition of an ensemble.

An ensemble X is a triple $(x,A_x,P_x)$ where x is an outcome taking on
values from $A_x = \{a_1, \ldots, a_I\}$, with associated probabilities
$P_x = \{p_1, \ldots, p_I\}$.

A joint ensemble XY is an ensemble where each outcome is an ordered pair
(x,y) (also written xy), where $x \in A_x = \{a_1,\ldots,a_I\}$, $y \in A_y = \{b_1,\ldots,b_J\}$,
and $P(x,y)$ is called the joint probability of x and y.

Jo wakes up not feeling well and the doctor orders a test for a
disease. The test is 95% reliable, and 1% of people of Jo’s age and
background have the disease. If the test is positive, what is the
probability that Jo has the disease?

If we define variables disease and test as

disease=0 => Jo doesn’t have the disease

disease=1 => Jo has the disease

test=0 => the test is negative for the disease

test=1 => the test is positive for the disease

then the probabilities given are

$P(test=1 | disease=0) = 5\%$

$P(test=1 | disease=1) = 95\%$

$P(disease=0) = 99\%$

$P(disease=1) = 1\%$

To start, we represent $P(test=j|disease=i) = P_{i,j}$ as a matrix
ptest_disease where rows represent disease and columns represent
test:
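A minimal sketch of that matrix in J. It assumes "95% reliable" means both a 95% true-positive rate and a 95% true-negative rate, so the two entries not stated explicitly are filled in by complement:

```j
NB. Rows index disease (0, 1); columns index test (0, 1).
NB. Entry (i, j) holds P(test=j | disease=i).
NB. Assumption: the test is 95% reliable for healthy and diseased patients alike.
ptest_disease =: 2 2 $ 0.95 0.05 0.05 0.95

NB. Each row is a conditional distribution over test, so each row sums to 1.
+/"1 ptest_disease
```

The row-sum check is only a sanity test: every row is a distribution over the test outcome for one fixed value of disease.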

The marginal probability $P(disease=i)$ is represented as a vector, indexed by disease.
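In J this could be a plain two-element list. The name pdisease is an assumption made for this sketch:

```j
NB. pdisease[i] = P(disease=i): 99% of people are healthy, 1% have the disease.
NB. The name pdisease is hypothetical, chosen for illustration.
pdisease =: 0.99 0.01

NB. A marginal distribution should sum to 1.
+/ pdisease
```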

Then we can compute the joint probability, joint, by multiplying the two,
since $P(test,disease) = P(test|disease)\,P(disease)$.
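A sketch of that product in J, restating both arrays so the snippet runs on its own. The name pdisease is assumed, and the 5% off-diagonal entries assume the test is 95% reliable in both directions. It relies on J pairing each row of the matrix with the matching element of the vector when the two are multiplied:

```j
NB. P(test=j | disease=i): rows are disease, columns are test.
ptest_disease =: 2 2 $ 0.95 0.05 0.05 0.95
NB. P(disease=i); hypothetical name.
pdisease =: 0.99 0.01

NB. Dyadic * matches the leading axis: row i is scaled by element i of pdisease,
NB. giving joint[i, j] = P(test=j | disease=i) * P(disease=i).
joint =: ptest_disease * pdisease
joint
```

The entries should come to 0.9405 and 0.0495 in the first row and 0.0005 and 0.0095 in the second, which together sum to 1 as a joint distribution must.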

Now that we have the joint probability, we can calculate any probability that
we are interested in. To answer the original question, what is
$P(disease=1|test=1)$, we divide each column of joint by its sum, since each
column sum is the marginal $P(test)$ and
$P(disease|test) = \frac{P(disease,test)}{P(test)}$
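One way to write the column normalisation in J, again as a self-contained sketch. The names pdisease, ptest and pdisease_test are assumptions for illustration, as is the reading of "95% reliable" as applying in both directions:

```j
ptest_disease =: 2 2 $ 0.95 0.05 0.05 0.95   NB. P(test=j | disease=i)
pdisease =: 0.99 0.01                        NB. P(disease=i); hypothetical name
joint =: ptest_disease * pdisease            NB. P(disease=i, test=j)

NB. Summing over the rows (disease) leaves the column sums, i.e. P(test=j).
ptest =: +/ joint

NB. Divide every row elementwise by the column sums, so each column is
NB. normalised to sum to 1: pdisease_test[i, j] = P(disease=i | test=j).
pdisease_test =: joint %"1 ptest

NB. P(disease=1 | test=1)
(<1 1) { pdisease_test
```

The selected entry is 0.0095 divided by 0.059, roughly 0.161.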

We see that $P(disease=1|test=1)$ is about 16%. So even though the test is
95% reliable, the disease is rare enough that a positive result is more
likely to be a false positive than a sign that Jo has the disease.
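As a check on the arithmetic, the same number falls out of Bayes' theorem written out by hand, taking the test to be 95% reliable for both diseased and healthy patients:

$$
P(disease=1|test=1)
= \frac{P(test=1|disease=1)\,P(disease=1)}{\sum_i P(test=1|disease=i)\,P(disease=i)}
= \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.05 \times 0.99}
= \frac{0.0095}{0.059} \approx 0.16
$$

The false-positive term $0.05 \times 0.99 = 0.0495$ is about five times larger than the true-positive term $0.0095$, which is why a positive result is still more likely to be wrong than right.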
