Introduction
- Author: Allen B. Downey
- Publisher: Oreilly & Associates Inc
- Release date: 2013/10/04
- Media: Paperback
The book "Think Bayes" solves various problems with Bayesian inference without using mathematical (analytic) calculations. Instead of mathematics, it uses python codes to achieve the result. It gives a good insight on what's happening behind the mathematics. However, for those who have a mathematical background, analytic expressions may give yet another insight and help to understand. So in this article, I'll show analytic calculations on some example problems in the book.
Chapter 3 - The locomotive problem
The prior distribution is:

$$P(N) = \frac{1}{N_{\rm max}} \quad (1 \le N \le N_{\rm max})$$

where $N$ is the number of locomotives the railroad owns, and $N_{\rm max}$ is the upper bound of the possible numbers.

If you observed a locomotive with the number $m$, the likelihood is:

$$P(m \mid N) = \begin{cases} 1/N & (N \ge m) \\ 0 & (N < m) \end{cases}$$

Then the posterior distribution becomes:

$$P(N \mid m) = \frac{1}{Z}\,\frac{1}{N} \quad (m \le N \le N_{\rm max})$$

The normalization constant is:

$$Z = \sum_{N=m}^{N_{\rm max}} \frac{1}{N}$$

Hence,

$$P(N \mid m) = \frac{1/N}{\sum_{N'=m}^{N_{\rm max}} 1/N'} \quad (m \le N \le N_{\rm max})$$

Especially, when $m = 60$ and $N_{\rm max} = 1000$,

$$P(N \mid 60) = \frac{1/N}{\sum_{N'=60}^{1000} 1/N'} \quad (60 \le N \le 1000)$$

This matches the result of Figure 3-1.
As another case, if you observed three locomotives with the numbers $m_1, m_2, m_3$ (write $m = \max(m_1, m_2, m_3)$), the likelihood is:

$$P(m_1, m_2, m_3 \mid N) = \begin{cases} 1/N^3 & (N \ge m) \\ 0 & (N < m) \end{cases}$$

Then the posterior distribution becomes:

$$P(N \mid m_1, m_2, m_3) = \frac{1}{Z}\,\frac{1}{N^3} \quad (m \le N \le N_{\rm max})$$

The normalization constant is:

$$Z = \sum_{N=m}^{N_{\rm max}} \frac{1}{N^3}$$

Hence,

$$P(N \mid m_1, m_2, m_3) = \frac{1/N^3}{\sum_{N'=m}^{N_{\rm max}} 1/N'^3} \quad (m \le N \le N_{\rm max})$$

Now, the mean of the posterior is:

$$E[N] = \sum_{N=m}^{N_{\rm max}} N\, P(N \mid m_1, m_2, m_3) = \frac{\sum_{N=m}^{N_{\rm max}} 1/N^2}{\sum_{N=m}^{N_{\rm max}} 1/N^3}$$

Especially, when $m = 90$ and $N_{\rm max} = 1000$,

$$E[N] \approx 164$$

This matches the result in the book.

From this calculation, you can see that the posterior depends only on the largest number you observed.
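As a quick numeric check (my own sketch, not code from the book), we can evaluate the posterior mean above for $m = 90$ and $N_{\rm max} = 1000$:

import numpy as np

N_max = 1000
m = 90    # the largest number among the three observations

Ns = np.arange(m, N_max + 1)
posterior = 1.0 / Ns**3        # posterior is proportional to 1/N^3
posterior /= posterior.sum()   # normalize

print(np.sum(Ns * posterior))  # prints approximately 164

This agrees with the analytic expression for $E[N]$.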
Chapter 6 - The Price is Right problem
We will calculate from Player1's viewpoint with the following notation.

$p$: Actual price

$g$: Guessed price

$b$: Bid price

$p'$: Actual price of the opponent (Player2)

$b'$: Bid price of the opponent (Player2)

We have the following data from past results.

$f_1(p)$: Distribution of actual prices for Player1

$f_2(p')$: Distribution of actual prices for Player2

$D_1$: Set of price differences $p - b$ for Player1

$D_2$: Set of price differences $p' - b'$ for Player2
Now we make some assumptions.
First, Player1's guess contains some error around the actual price. We assume a Gaussian distribution with mean $p$ and the same variance as the past price differences. That is:

$$P(g \mid p) = \mathcal{N}(g \mid p, \sigma_1^2)$$

where $\sigma_1^2$ is the sample variance of $D_1$ and $\mathcal{N}(x \mid \mu, \sigma^2)$ is a Gaussian distribution.

Then, using $f_1(p)$ as a prior distribution of the actual price, and using the guessed price $g$ as the observed data, the posterior distribution becomes:

$$P(p \mid g) = \frac{1}{Z}\, \mathcal{N}(g \mid p, \sigma_1^2)\, f_1(p)$$

where $Z$ is a normalization constant. Based on this posterior distribution of the actual price, Player1 needs some strategy to decide the bid price, because there are the following trade-offs. (It's not wise to simply bid the maximum likelihood price.)
1. If $b$ is too large, the probability of an overbid ($b > p$) becomes large.
2. If $b$ is too small, the probability of losing to Player2 (Player2 makes a more accurate bid) becomes large.
So, assuming that Player1's bid is $b \le p$, we calculate the probability of winning as:

$$P({\rm win} \mid p, b) = P_{\rm over} + P_{\rm worse}(p - b)$$

where $P_{\rm over}$ is the probability that Player2 overbids ($b' > p'$), which is a constant inferred from $D_2$. $P_{\rm worse}(d)$ is the probability that Player2 makes a less accurate bid, that is, $p' - b' > d$. This is also inferred from $D_2$ for a fixed value of $d = p - b$.

Hence the expected gain with a bid $b$ for some fixed actual price $p$ can be calculated as:

$$G(b \mid p) = \begin{cases} p \times P({\rm win} \mid p, b) & (b \le p) \\ 0 & (b > p) \end{cases}$$

Finally, using the posterior distribution of the actual price, the expected gain (without knowing the true price) is:

$$E[G(b)] = \int G(b \mid p)\, P(p \mid g)\, dp$$
As an example, we'll make a rough estimation for $g = 20000$ and $b \le 5000$. (According to the result in the book (Figure 6-4), the expected gain in this range is roughly constant around 7500.)

From the numeric result for the posterior (Figure 6-3 in the book), it has a peak around $p = 25000$. Hence,

$$E[p] \approx 25000$$

From the inferred probability of Player2's error (Figure 6-2 in the book), we can see that $P_{\rm over} \approx 0.3$ and $P_{\rm worse}(p - b) \approx 0$ (because Player2 doesn't make an error larger than 20000, and our assumption implies $p - b \gtrsim 20000$). As a result:

$$E[G(b)] \approx P_{\rm over} \times E[p] \approx 0.3 \times 25000 = 7500$$

This matches the expected result. Yeah!
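The whole chain above can also be checked numerically. Below is a minimal sketch (mine, not the book's code): the prior of the actual price, the error deviation sigma1, the guess g, and the samples of D2 are all hypothetical stand-ins chosen just to make the script self-contained; with the real data from the book you would recover Figure 6-4.

import numpy as np

np.random.seed(1)

# --- Hypothetical inputs (stand-ins, not the book's data) ---
prior_mu, prior_sigma = 30000, 10000    # stand-in for the prior f1(p)
sigma1 = 5000                           # stand-in sample std dev of D1
g = 20000                               # Player1's guessed price
D2 = np.random.normal(0, 4000, 1000)    # stand-in samples of p' - b' (D2)

# Posterior of the actual price on a grid: P(p|g) ~ N(g|p, sigma1^2) * f1(p)
ps = np.linspace(0, 60000, 601)
prior = np.exp(-(ps - prior_mu)**2 / (2 * prior_sigma**2))
likelihood = np.exp(-(g - ps)**2 / (2 * sigma1**2))
post = prior * likelihood
post /= post.sum()

p_over = np.mean(D2 < 0)    # P_over: probability that Player2 overbids

def prob_win(diff):
    # P(win | diff) = P_over + P_worse(diff)
    return p_over + np.mean(D2 > diff)

def expected_gain(b):
    # G(b|p) = p * P(win|p,b) for b <= p, 0 for b > p, averaged over P(p|g)
    gains = np.where(ps >= b,
                     ps * np.array([prob_win(p - b) for p in ps]),
                     0.0)
    return np.sum(gains * post)

bids = np.arange(0, 40000, 1000)
best = max(bids, key=expected_gain)
print(best, expected_gain(best))

Running this prints the bid that maximizes the expected gain under the assumed inputs; sweeping expected_gain over all bids reproduces the shape of the gain curve in Figure 6-4.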
Chapter 12 - Interpreting SAT scores
Before going into the analysis, let's recap some basics of the beta distribution. The beta distribution gives the probability distribution of a variable $p$ within the range $0 \le p \le 1$, controlled by two parameters $\alpha, \beta$. Suppose that we repeat some independent experiment which gives one of two results A or B on each trial (like a coin flip). The probability of getting A in a single trial is unknown, but if the number of A's is $n_A$ and the number of B's is $n_B$, the distribution of $p$ can be estimated as ${\rm Beta}(\alpha = n_A + 1,\ \beta = n_B + 1)$.

For example, if $n_A = n_B = 1$, the distribution of $p$ is ${\rm Beta}(2, 2)$, which has a smooth mountain-like shape with a peak at $p = 0.5$.

The mathematical definition is given as:

$${\rm Beta}(p \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha - 1} (1 - p)^{\beta - 1}$$

If we use the beta distribution ${\rm Beta}(\alpha, \beta)$ as a prior for the variable $p$, and the likelihood is given by the binomial form $p^k (1-p)^{m-k}$, the posterior becomes a beta distribution again. This can be easily understood from the following calculation:

$$P(p \mid k) \propto p^k (1-p)^{m-k} \times p^{\alpha - 1} (1-p)^{\beta - 1} = p^{(\alpha + k) - 1} (1-p)^{(\beta + m - k) - 1}$$

This indicates that the posterior is given by ${\rm Beta}(\alpha + k,\ \beta + m - k)$.
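As a quick numeric confirmation of this conjugacy (my own sketch, not from the book), we can compare a brute-force grid update against the closed-form beta posterior:

import numpy as np
from scipy.stats import beta

a0, b0 = 2, 2    # prior Beta(2, 2)
m, k = 10, 8     # 8 correct answers out of 10 questions

ps = np.linspace(0.001, 0.999, 999)
# Grid update: prior times binomial likelihood, normalized numerically
grid_post = beta(a0, b0).pdf(ps) * ps**k * (1 - ps)**(m - k)
grid_post /= np.trapz(grid_post, ps)

# Closed form: Beta(a0 + k, b0 + m - k)
closed_form = beta(a0 + k, b0 + m - k).pdf(ps)
print(np.max(np.abs(grid_post - closed_form)))  # should be close to zero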
Now, let's go ahead with the problem. We'll evaluate the probability $p$ that a student gives a correct answer to one question. From the past results (Figure 12-1 in the book), the mean of $p$ is around 0.5. So we assume that the prior of $p$ is given by ${\rm Beta}(2, 2)$.

If a student has the ability $p$, what is the probability that he/she gives correct answers to $k$ questions out of $m$ questions in total? It is given by the binomial distribution, that is, $P(k \mid p) = \binom{m}{k} p^k (1-p)^{m-k}$.

Hence, from the discussion about the beta distribution above, the posterior of $p$ is given by ${\rm Beta}(2 + k,\ 2 + m - k)$.
Based on this result, we can compare the abilities of two students with the following data.

Total number of questions: $m$

The number of correct answers for Student1: $k_1$

The number of correct answers for Student2: $k_2$

The joint probability density of Student1's ability $p_1$ and Student2's ability $p_2$ is given by:

$$P(p_1, p_2) = {\rm Beta}(p_1 \mid 2 + k_1,\ 2 + m - k_1) \times {\rm Beta}(p_2 \mid 2 + k_2,\ 2 + m - k_2)$$
We can draw a heatmap (like Figure 12-5 in the book) using Python for some specific cases. We take the following three cases:

$(m, k_1, k_2) = (10, 8, 6),\ (50, 40, 30),\ (100, 80, 60)$

The ratio of correct answers (the score of the exam) is the same for all cases. But you can see a difference in the resulting heatmaps:

P1 and P2 are the ratios of the total probability in the lower and upper triangles of the heatmap. P1 is the probability that Student1 gets a better score than Student2 in the next exam. It shows that we have more confidence in their abilities when the students answer more questions.

Here's the Python code to draw the heatmap.
import matplotlib.pyplot as plt
from pandas import DataFrame
import numpy as np
from scipy.stats import beta

# Grid resolution
R = 60
# (Total questions, correct answers of Student1, correct answers of Student2)
Dataset = [(10, 8, 6), (50, 40, 30), (100, 80, 60)]

def post(m, k, p):
    # Posterior of the ability: Beta(2+k, 2+m-k)
    return beta(a=2+k, b=2+m-k).pdf(p)

def drawmap(d, subplot):
    M, N1, N2 = d
    X = np.linspace(0, 1, R)
    Y = np.linspace(0, 1, R)
    field = DataFrame(np.zeros(shape=(len(Y), len(X))))
    p1 = 0.0
    p2 = 0.0
    for x, xval in enumerate(X):
        for y, yval in enumerate(Y):
            p = post(M, N1, xval) * post(M, N2, yval)
            field.iloc[y, x] = p
            if x == y:
                p1 += p * 0.5
                p2 += p * 0.5
            elif x > y:
                p1 += p
            else:
                p2 += p
    z = p1 + p2
    p1 = p1 * 100 / z
    p2 = p2 * 100 / z
    subplot.set_title("Number of questions: %d" % M)
    subplot.set_xlabel("Student1")
    subplot.set_ylabel("Student2")
    subplot.imshow(field.values, extent=(0, 1, 0, 1),
                   origin='lower', interpolation='nearest')
    subplot.plot([0, 1], color='white')
    subplot.text(0.1, 0.9, 'P2 = %2.0f%%' % p2,
                 horizontalalignment='left', verticalalignment='top',
                 bbox={'facecolor': 'white', 'alpha': 1, 'pad': 10})
    subplot.text(0.9, 0.1, 'P1 = %2.0f%%' % p1,
                 horizontalalignment='right', verticalalignment='bottom',
                 bbox={'facecolor': 'white', 'alpha': 1, 'pad': 10})

# Main
if __name__ == '__main__':
    fig = plt.figure()
    for i, d in enumerate(Dataset):
        subplot = fig.add_subplot(1, 3, i + 1)
        drawmap(d, subplot)
    plt.show()
Chapter 14 - The Geiger counter problem
First we calculate the posterior of the emission count $n$ under a fixed emission rate $r$. In other words, all probabilities in this part are conditioned on $r$.

The prior of $n$ is given by the Poisson distribution:

$$P(n \mid r) = \frac{r^n}{n!}\, e^{-r}$$ --- (1)

The likelihood of counting $k$ particles, when the counter detects each particle with probability $f$ (in the book's setting, $f = 0.1$ and the observed count is $k = 15$), is given by the binomial distribution:

$$P(k \mid n) = \binom{n}{k} f^k (1-f)^{n-k}$$ --- (2)

Hence, the posterior of $n$ becomes:

$$P(n \mid k, r) = \frac{1}{Z} \binom{n}{k} f^k (1-f)^{n-k}\, \frac{r^n}{n!}\, e^{-r}$$ --- (3)

where $Z$ is a normalization constant.
Then we calculate the posterior of $r$.

The prior of $r$ is a uniform distribution over $1 \le r \le 500$.

The likelihood of counting $k$ particles is given by marginalizing over $n$:

$$P(k \mid r) = \sum_{n=k}^{\infty} P(k \mid n)\, P(n \mid r) = \sum_{n=k}^{\infty} \binom{n}{k} f^k (1-f)^{n-k}\, \frac{r^n}{n!}\, e^{-r} \quad \text{(from (1), (2))}$$

Hence, the posterior of $r$ becomes:

$$P(r \mid k) = \frac{1}{Z} \sum_{n=k}^{500} \binom{n}{k} f^k (1-f)^{n-k}\, \frac{r^n}{n!}\, e^{-r} \quad (1 \le r \le 500)$$ --- (4)

where we cut off the summation at $n = 500$ for the sake of the numeric calculation performed later.

Finally, the posterior of $n$, marginalized over $r$, is:

$$P(n \mid k) = \sum_{r=1}^{500} P(n \mid k, r)\, P(r \mid k)$$ --- (5)
The results (3), (4) and (5) can be evaluated numerically using Python as below.

Here's the Python code to draw them.
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import binom, poisson

f = 0.1   # detection probability of the counter
k = 15    # number of detected particles

def postn_conditioned_r(ns, r):
    # Posterior of n under a fixed r --- (3)
    result = poisson(r).pmf(ns) * binom(p=f, n=ns).pmf(k)
    result /= result.sum()
    return result

def calc_postr(rs):
    # Posterior of r --- (4)
    result = np.zeros(len(rs))
    for n in range(1, 501):
        result += poisson(rs).pmf(n) * binom(p=f, n=n).pmf(k)
    result /= result.sum()
    return result

def calc_postn(ns, post_r):
    # Posterior of n marginalized over r --- (5)
    result = np.zeros(len(ns))
    for r in range(1, 501):
        result += postn_conditioned_r(ns, r) * post_r[r-1]
    result /= result.sum()
    return result

# Main
if __name__ == '__main__':
    fig = plt.figure()
    subplot = fig.add_subplot(1, 1, 1)
    rs = np.arange(1, 501)
    post_r = calc_postr(rs)
    subplot.plot(rs, post_r, label="Posterior r")
    ns = np.arange(1, 501)
    post_n = calc_postn(ns, post_r)
    subplot.plot(ns, post_n, label="Posterior n")
    subplot.legend(loc=0)
    plt.show()
Chapter 15 - Lions and tigers and bears
First, we calculate the prevalences conditioned on the number of species $n$.

$\boldsymbol{p} = (p_1, \dots, p_n)$: Prevalence of each animal,

where

$$\sum_{i=1}^{n} p_i = 1, \quad 0 \le p_i \le 1$$

The prior of $\boldsymbol{p}$ is given by the Dirichlet distribution with parameters $\alpha_1 = \cdots = \alpha_n = 1$, which is constant on the simplex:

$$P(\boldsymbol{p} \mid n) = \Gamma(n) = (n-1)!$$

The likelihood of finding $N = 6$ animals, say, 3 lions, 2 tigers and 1 bear, is:

$$P(D \mid \boldsymbol{p}, n) = C \sum_{(i,j,k)} p_i^3\, p_j^2\, p_k$$

where $C = \dfrac{6!}{3!\,2!\,1!}$ is the multinomial coefficient and the summation runs over all triples of distinct indices $(i, j, k)$ taken from $\{1, \dots, n\}$.

The tricky summation comes from the fact that we don't know which of the $n$ species correspond to what we found (lions, tigers and bears). Hence we count all the possible combinations.

Then the posterior becomes:

$$P(\boldsymbol{p} \mid D, n) = \frac{1}{Z(n)} \sum_{(i,j,k)} p_i^3\, p_j^2\, p_k$$ --- (1)

where $Z(n)$ is a normalization constant depending on $n$. Explicitly:

$$Z(n) = \int \sum_{(i,j,k)} p_i^3\, p_j^2\, p_k \, d\boldsymbol{p} = n(n-1)(n-2) \int p_1^3\, p_2^2\, p_3 \, d\boldsymbol{p}$$ --- (2)

For the last part, I used the fact that the integrand is symmetric under permutations of the $p_i$.
Now we calculate the distribution of the hyperparameter $n$.

The prior is a uniform distribution over $3 \le n \le 30$.

The likelihood marginalized over $\boldsymbol{p}$ is:

$$P(D \mid n) = \int P(D \mid \boldsymbol{p}, n)\, P(\boldsymbol{p} \mid n)\, d\boldsymbol{p} = C\, (n-1)!\; n(n-1)(n-2) \int p_1^3\, p_2^2\, p_3\, d\boldsymbol{p}$$

For the last part, I used the fact that the integrand is symmetric under permutations of the $p_i$.

Now, the posterior of $n$ becomes:

$$P(n \mid D) = \frac{P(D \mid n)}{\sum_{n'=3}^{30} P(D \mid n')}$$ --- (3)

Since the analytic evaluation of this integral is difficult, we evaluate it by numeric sampling with the Python code shown later. Assume for the moment that we have calculated (3).
Finally, we evaluate the posterior of the prevalence of each animal by marginalizing over $n$. Since we don't know which of $p_1, \dots, p_n$ corresponds to each animal, I'll sum up all the possibilities. In the case of the lion:

$$P(p_{\rm lion} = p \mid D, n) = \sum_{i=1}^{n} q_i \int \delta(p_i - p)\, P(\boldsymbol{p} \mid D, n, i \leftrightarrow {\rm lion})\, d\boldsymbol{p}$$

where $\delta$ is the Dirac delta function. The integration over $\boldsymbol{p}$ is to marginalize over the prevalences of the other animals.

And $q_i$ is the probability that $p_i$ corresponds to the lion. Using (1), it becomes:

$$q_i = \frac{1}{Z(n)} \sum_{(j,k):\, j \ne k,\; j,k \ne i} \int p_i^3\, p_j^2\, p_k\, d\boldsymbol{p} = \frac{(n-1)(n-2)}{n(n-1)(n-2)} = \frac{1}{n}$$

Putting them together, we have:

$$P(p_{\rm lion} = p \mid D, n) = \frac{n(n-1)(n-2)}{Z(n)} \int \delta(p_1 - p)\, p_1^3\, p_2^2\, p_3\, dp_2 \cdots dp_n$$

where we used the symmetry of the integrand again. Finally, by substituting (2), this is the marginal of the Dirichlet distribution with parameters $(1+3, 1+2, 1+1, 1, \dots, 1)$, and marginalizing over $n$ with (3):

$$P(p_{\rm lion} = p \mid D) = \sum_{n} P(n \mid D)\; {\rm Beta}(p \mid 1 + 3,\; n + N - 1 - 3) = \sum_{n} P(n \mid D)\; {\rm Beta}(p \mid 4,\; n + 2)$$ --- (4)
We've done it! Here's the Python code to calculate (3) and (4).
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import beta
from scipy.special import binom as binom_coef

d = np.array([3, 2, 1])   # observed: 3 lions, 2 tigers, 1 bear
N = d.sum()
Iter_num = 4000           # number of Monte Carlo samples per n

def calc_post_n(ns):
    # Posterior of the number of species n --- (3)
    total = np.zeros(len(ns))
    for n in ns:
        if n < len(d):
            continue
        for i in range(Iter_num):
            # Sample prevalences from the flat Dirichlet prior
            ps = np.random.gamma(shape=1, scale=1, size=n)
            ps /= ps.sum()
            total[n] += binom_coef(n, len(d)) * (ps[:len(d)] ** d).prod()
    total /= total.sum()
    return total

def calc_post_p(ps, i, post_n):
    # Posterior prevalence of animal i, marginalized over n --- (4)
    total = 0.0
    for n in range(len(d), 31):
        xs = np.ones(n, dtype=int)
        for j in range(len(d)):
            xs[j] += d[j]
        total += post_n[n] * beta(a=xs[i], b=(xs.sum() - xs[i])).pdf(ps)
    return total

# Main
if __name__ == '__main__':
    fig = plt.figure()
    subplot1 = fig.add_subplot(1, 2, 1)
    subplot2 = fig.add_subplot(1, 2, 2)
    ns = np.arange(0, 31)
    post_n = calc_post_n(ns)
    subplot1.plot(ns[len(d):], post_n[len(d):], label="Posterior n")
    subplot1.set_xlim(0, 30)
    subplot1.legend(loc=0)
    ps = np.arange(0, 1, 0.01)
    post_ps = {}
    for i in range(len(d)):
        post_ps[i] = calc_post_p(ps, i, post_n)
        subplot2.plot(ps, post_ps[i], label="Posterior p%d (%d)" % (i, d[i]))
    subplot2.legend(loc=0)
    plt.show()
The result for the observed data (3 lions, 2 tigers and 1 bear) is as below:

The posterior of $n$ is the same as Figure 15-2 in the book.

What happens if we find more animals with the same ratio? The following is the result when the observed counts are scaled up in the same proportion.

We have more confidence in the number of species now.