めもめも

The content of this blog reflects my personal views and does not necessarily represent the positions, strategies, or opinions of the organization I belong to.

Analytic description of "Think Bayes"

Introduction

The book "Think Bayes" solves various problems with Bayesian inference without using mathematical (analytic) calculations. Instead of mathematics, it uses python codes to achieve the result. It gives a good insight on what's happening behind the mathematics. However, for those who have a mathematical background, analytic expressions may give yet another insight and help to understand. So in this article, I'll show analytic calculations on some example problems in the book.

Chapter 3 - The locomotive problem

The prior distribution is:

P(N)=\frac{1}{N_\max}\,\,(N=1,\cdots,N_{\max})

where N is the number of locomotives the railroad owns, and N_{\max} is the upper bound of the possible numbers.

If you observed a locomotive with the number D, the likelihood is:


P(D\mid N)=\begin{cases}
    \frac{1}{N} & (D \le N) \\
    0 & (D>N)
\end{cases}

Then the posterior distribution becomes:

P(N\mid D) \propto P(D\mid N)P(N) = \begin{cases}
    \frac{1}{NN_\max} & (D \le N) \\
    0 & (D>N)
\end{cases}

The normalization constant is:

Z=\sum_{N=D}^{N_\max}\frac{1}{NN_\max} \sim \frac{1}{N_\max}\int_D^{N_\max}\frac{1}{x}\,dx = \frac{1}{N_\max}\log\frac{N_\max}{D}

Hence,

P(N\mid D)=\frac{P(D\mid N)P(N)}{Z} \sim \begin{cases}
    \frac{1}{\log \frac{N_\max}{D}}\times\frac{1}{N} & (N \ge D) \\
    0 & (N \lt D)
\end{cases}

In particular, when N_\max = 1000, D=60, and N=60,

P(N\mid D) \sim \frac{1}{\log\frac{1000}{60}}\times\frac{1}{60} \fallingdotseq 0.0059

This matches the result in Figure 3-1.
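
As a quick sanity check (this code is mine, not from the book), the exact normalized posterior can be computed by direct summation and compared with the logarithmic approximation above.

import numpy as np

N_max, D = 1000, 60

# Exact posterior: proportional to 1/N for N = D, ..., N_max, normalized by direct summation
ns = np.arange(D, N_max + 1)
post = 1.0 / ns
post /= post.sum()

print(post[0])                          # exact P(N=60 | D=60)
print(1.0 / (np.log(N_max / D) * D))    # analytic approximation above, roughly 0.0059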

As another case, if you observed three locomotives with the numbers D_1, D_2, D_3\,\,(D_1\le D_2\le D_3), the likelihood is:


P(D_1,D_2,D_3\mid N)=\begin{cases}
    \frac{1}{N^3} & (D_3 \le N) \\
    0 & (D_3>N)
\end{cases}

Then the posterior distribution becomes:

P(N\mid D_1,D_2,D_3) \propto P(D_1,D_2,D_3\mid N)P(N) = \begin{cases}
    \frac{1}{N^3N_\max} & (D_3 \le N) \\
    0 & (D_3>N)
\end{cases}

The normalization constant is:

Z=\sum_{N=D_3}^{N_\max}\frac{1}{N^3N_\max} \sim \frac{1}{N_\max}\int_{D_3}^{N_\max}\frac{1}{x^3}\,dx

 = \frac{1}{2N_\max}\left(\frac{1}{D_3^2}-\frac{1}{N_\max^2}\right)
= \frac{1}{2N_\max}\frac{N_\max^2-D_3^2}{N_\max^2D_3^2}

Hence,

P(N\mid D_1,D_2,D_3)=\frac{P(D_1,D_2,D_3\mid N)P(N)}{Z} \sim \begin{cases}
    \frac{2N_\max^2D_3^2}{N_\max^2-D_3^2}\times\frac{1}{N^3} & (N \ge D_3) \\
    0 & (N \lt D_3)
\end{cases}

Now, the mean of the posterior is:

\sum_{N=D_3}^{N_\max}\left(N\times\frac{2N_\max^2D_3^2}{N_\max^2-D_3^2}\frac{1}{N^3}\right)
\sim\frac{2N_\max^2D_3^2}{N_\max^2-D_3^2}\int_{D_3}^{N_\max}\frac{1}{x^2}\,dx

 =\frac{2N_\max^2D_3^2}{N_\max^2-D_3^2}\left(\frac{1}{D_3}-\frac{1}{N_\max}\right)
=\frac{2N_\max D_3}{N_\max+D_3}

In particular, when N_\max = 500, 1000, 2000 and D_3=90,

N_\max=500:\, \frac{2\times 500\times 90}{500+90}  \fallingdotseq 152.5

N_\max=1000:\, \frac{2\times 1000\times 90}{1000+90}  \fallingdotseq 165.1

N_\max=2000:\, \frac{2\times 2000\times 90}{2000+90}  \fallingdotseq 172.2

This matches the results in the book.

From this calculation, you can see that the posterior depends only on the largest number you observed.
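
Similarly, the following sketch of mine compares the exact posterior mean, computed by direct summation, with the approximation 2N_\max D_3/(N_\max+D_3).

import numpy as np

D3 = 90

for N_max in (500, 1000, 2000):
    ns = np.arange(D3, N_max + 1, dtype=float)
    post = 1.0 / ns**3                              # unnormalized posterior after three observations
    post /= post.sum()
    exact_mean = np.sum(ns * post)
    approx_mean = 2.0 * N_max * D3 / (N_max + D3)   # analytic approximation above
    print(N_max, exact_mean, approx_mean)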

Chapter 6 - The Price is Right problem

We will calculate from Player1's viewpoint with the following notation.

x_t : Actual price
x_g : Guessed price
x_b : Bid price
x^{(2)}_t : Actual price of the opponent (Player2)
x^{(2)}_b : Bid price of the opponent (Player2)

We have the following data from past results.

p_1(x_t) : Distribution of actual price for Player1
p_2(x^{(2)}_t) : Distribution of actual price for Player2
D_1 = \{x_t-x_b\}: Set of price differences for Player1
D_2 = \{x^{(2)}_t-x^{(2)}_b\}: Set of price differences for Player2

Now we make some assumptions.

First, Player1's guess contains some error around the actual price. We assume a Gaussian distribution with mean x_t and the same variance \sigma^2 as the past price differences. That is:

p(x_g\mid x_t) = N(x_g\mid x_t, \sigma^2)=N(x_t-x_g\mid 0, \sigma^2)

where \sigma^2 is the sample variance of D_1 and N(x\mid\mu,\sigma^2) is a Gaussian distribution.

Then, using p_1(x_t) as a prior distribution of the actual price and the guessed price as observed data, the posterior distribution becomes:

p_1(x_t\mid x_g) = \frac{1}{Z} N(x_t-x_g\mid 0, \sigma^2)p_1(x_t)

where Z is a normalization constant. Based on this posterior distribution of the actual price, Player1 needs a strategy to decide the bid price x_b because of the following trade-offs. (It's not wise to simply bid the maximum-likelihood price.)

1. If x_b \sim x_t, the probability of overbidding becomes large.

2. If x_b \ll x_t, the probability of losing to Player2 (Player2 makes a more accurate bid) becomes large.

So, assuming that Player1's bid is x_b, we calculate the probability of winning as:

P_{\rm win} = P_0 + P_1(x_t-x_b)

where P_0 is the probability that Player2 overbids, a constant inferred from D_2, and P_1(x_t-x_b) is the probability that Player2 makes a less accurate bid, that is, x^{(2)}_t-x^{(2)}_b > x_t - x_b. This is also inferred from D_2 for a fixed value of x_t-x_b, as illustrated in the sketch below.
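
For illustration, both quantities can be estimated as empirical frequencies over D_2. The sample values below are made up; in the book they come from the recorded showcase data.

import numpy as np

# Made-up sample of Player2's past differences D_2 = x_t^(2) - x_b^(2)
D2 = np.array([-1200.0, -300.0, 800.0, 1500.0, 2500.0,
               4000.0, 6000.0, 9000.0, 13000.0, 18000.0])

# P_0: fraction of past games in which Player2 overbid (negative difference)
P0 = np.mean(D2 < 0)

# P_1(diff): fraction of past games in which Player2's error exceeded diff = x_t - x_b
def P1(diff):
    return np.mean(D2 > diff)

print(P0, P1(5000))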

Hence the expected gain with a bid x_b for some fixed actual price x_t can be calculated as:

g(x_b\mid x_t) =  \begin{cases}
    0 & (x_t < x_b) \\
    x_t \times \left\{P_0+P_1(x_t-x_b)\right\} & (x_t \ge x_b)
\end{cases}

Finally, using the posterior distribution of the actual price, the expected gain (without knowing the true price) is:

G(x_b) = \int_0^\infty g(x_b\mid x_t)p(x_t\mid x_g)\,dx_t

 = \int_{x_b}^\infty x_t \left\{P_0+P_1(x_t-x_b)\right\}\frac{1}{Z}N(x_t-x_g\mid 0, \sigma^2)p_1(x_t)\,dx_t

As an example, we'll make a rough estimate for x_g = 20000 and 0 < x_b < 5000. (According to the result in the book (Figure 6-4), the expected gain over this range is roughly constant around 7500.)

From the numeric result of the posterior N(x_t-20000\mid 0, \sigma^2)p_1(x_t) (Figure 6-3 in the book), this has a peak around x_t=25000. Hence,

G(x_b) \sim 25000 \times \left\{P_0+P_1(25000-x_b)\right\}

From the inferred distribution of Player2's error (Figure 6-2 in the book), we can see that P_0 = 0.3 and P_1 \sim 0 (because Player2 doesn't make an error larger than 20000, and in our range 25000-x_b>20000). As a result:

G(x_b) \sim 25000 \times 0.3 = 7500

This matches the expected result. Yeah!
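
For reference, here is a minimal numerical sketch of G(x_b) on a grid. The prior p_1(x_t) and Player2's error distribution are stand-in Gaussians with made-up parameters (the book infers both from past showcase data), so only the shape of the calculation, not the numbers, should be taken seriously.

import numpy as np
from scipy.stats import norm

# Made-up stand-ins for quantities the book infers from past data
sigma = 3000.0                                   # sample std of D_1 (Player1's guess error)
prior = norm(loc=27000.0, scale=9000.0)          # stand-in for p_1(x_t)
P0 = 0.3                                         # probability that Player2 overbids
player2_error = norm(loc=3000.0, scale=5000.0)   # stand-in for Player2's error distribution

x_g = 20000.0                                    # Player1's guessed price
xs = np.linspace(0.0, 60000.0, 601)              # grid over the actual price x_t

# Posterior p(x_t | x_g) ~ N(x_t - x_g | 0, sigma^2) * p_1(x_t), normalized on the grid
post = norm(loc=x_g, scale=sigma).pdf(xs) * prior.pdf(xs)
post /= post.sum()

def P1(diff):
    # Probability that Player2's error exceeds diff (i.e. Player2 is less accurate)
    return player2_error.sf(diff)

def expected_gain(x_b):
    # g(x_b | x_t) = x_t * {P0 + P1(x_t - x_b)} for x_t >= x_b, and 0 otherwise
    g = np.where(xs >= x_b, xs * (P0 + P1(xs - x_b)), 0.0)
    return np.sum(g * post)

for x_b in (2000.0, 10000.0, 20000.0, 26000.0):
    print(x_b, expected_gain(x_b))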

Chapter 12 - Interpreting SAT scores

Before going into the analysis, let's recap some basics of the beta distribution. The beta distribution gives the probability distribution of a variable \mu within the range 0\le \mu\le 1 with two parameters a, b. Suppose that we repeat some independent experiment which gives one of two results A or B in each trial (like a coin flip). The probability \mu of obtaining A in a single trial is unknown, but if the number of A is m and the number of B is l, the distribution of \mu can be estimated as {\rm Beta}(\mu\mid a=m+1,b=l+1).

For example, if m=l=1, the distribution of \mu is {\rm Beta}(\mu\mid a=2,b=2), which has a smooth, mountain-like shape with a peak at \mu=0.5.

The mathematical definition is given as:

{\rm Beta}(\mu\mid a,b)=\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\mu^{a-1}(1-\mu)^{b-1}

If we use the beta distribution as a prior for \mu and the likelihood is given by the binomial form L \propto \mu^n(1-\mu)^{M-n}, the posterior becomes a beta distribution again. This can be easily seen from the following calculation.

L \times {\rm Beta}(\mu\mid a,b) \propto \mu^{a+n-1}(1-\mu)^{b+(M-n)-1}

This indicates that the posterior is given by {\rm Beta}(\mu\mid a+n,b+(M-n))
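
As a quick numerical confirmation (my own sketch, not from the book), we can multiply a binomial likelihood by a Beta prior on a grid and check that the normalized result matches {\rm Beta}(\mu\mid a+n, b+(M-n)).

import numpy as np
from scipy.stats import beta, binom

a, b = 2, 2          # prior Beta(2, 2)
M, n = 10, 8         # n correct answers out of M questions

mus = np.linspace(0.001, 0.999, 999)

# Grid posterior: binomial likelihood times the Beta prior, normalized numerically
unnorm = binom(n=M, p=mus).pmf(n) * beta(a=a, b=b).pdf(mus)
grid_post = unnorm / np.trapz(unnorm, mus)

# Analytic posterior: Beta(a + n, b + (M - n))
analytic = beta(a=a + n, b=b + (M - n)).pdf(mus)

print(np.max(np.abs(grid_post - analytic)))   # should be close to zero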

Now, let's go ahead with the problem. We'll evaluate the probability \mu that a student gives a correct answer to a single question. From the past results (Figure 12-1 in the book), the mean of \mu is around 0.5. So we assume that the prior of \mu is given by {\rm Beta}(\mu\mid a=2,b=2), whose mean is 0.5.

If a student has ability \mu, what is the probability L that he/she gives correct answers to n out of M questions in total? It is given by the binomial distribution, that is, L \propto \mu^n(1-\mu)^{M-n}.

Hence, from the discussion about the beta distribution, the posterior of \mu is given by {\rm Beta}(\mu\mid a=2+n,b=2+(M-n)).

Based on this result, we can compare the ability of two students with the following data.

 Total number of questions: M

 The number of correct answers for Student1: n=N_1

 The number of correct answers for Student2: n=N_2

The joint probability density of Student1's ability \mu_1 and Student2's ability \mu_2 is given by:

P(\mu_1, \mu_2) = {\rm Beta}(\mu_1\mid a=2+N_1,b=2+(M-N_1))\times{\rm Beta}(\mu_2\mid a=2+N_2,b=2+(M-N_2)).

We can draw a heatmap (like Figure 12-5 in the book) using Python for some specific cases. We take the following three cases:

(M, N_1, N_2) = (10,8,6),\,(50,40,30),\,(100,80,60)

The ratio of correct answers (score of the exam) is the same for all cases. But you can see a difference from the actual heatmap below:

P1 and P2 are the ratios of the total probability in each triangle. P1 is the probability that Student1 gets a better score than Student2 in the next exam. It shows that we have more confidence in their abilities when students answer more questions.

Here's the Python code to draw the heatmap.

import matplotlib.pyplot as plt
from pandas import DataFrame
import numpy as np
from scipy.stats import beta

# Resolution
R = 60

# (Max, Student1, Student2)
Dataset = [(10,8,6), (50,40,30), (100,80,60)]

def post(m, k, p):
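    # Posterior Beta(a=2+k, b=2+(m-k)) of ability p, given k correct answers out of m questions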
    return beta(a=2+k,b=2+m-k).pdf(p)

def drawmap(d, subplot):
    M,N1,N2 = d
    X = np.linspace(0,1,R)
    Y = np.linspace(0,1,R)
    field = DataFrame(np.zeros(shape=(len(X),len(Y))))
    p1 = 0.0
    p2 = 0.0
    for x, xval in enumerate(X):
        for y, yval in enumerate(Y):
            p = post(M, N1, xval)*post(M, N2, yval)
            field.iloc[y, x] = p
            if x == y:
                p1 += p * 0.5
                p2 += p * 0.5
            elif x > y:
                p1 += p
            else:
                p2 += p
    z = p1 + p2
    p1 = p1 * 100 / z
    p2 = p2 * 100 / z

    subplot.set_title("Number of questions: %d" % M)
    subplot.set_xlabel("Student1")
    subplot.set_ylabel("Student2")
    subplot.imshow(field.values, 
        extent=(0,1,0,1), origin='lower', interpolation='nearest')
    subplot.plot([0,1], [0,1], color='white')
    subplot.text(0.1, 0.9, 'P2 = %2.0f%%' % p2,
        horizontalalignment='left', verticalalignment='top',
        bbox={'facecolor':'white', 'alpha':1, 'pad':10})
    subplot.text(0.9, 0.1, 'P1 = %2.0f%%' % p1,
        horizontalalignment='right', verticalalignment='bottom',
        bbox={'facecolor':'white', 'alpha':1, 'pad':10})

# Main
if __name__ == '__main__':
    fig = plt.figure()
    for i, d in enumerate(Dataset):
        subplot = fig.add_subplot(1,3,i+1)
        drawmap(d, subplot)
    fig.show()

Chapter 14 - The Geiger counter problem

First, we calculate the posterior of the emission count n under a fixed emission rate r. In other words, all probabilities are conditioned on r.

The prior of n is given by the Poisson distribution:

P(n\mid r) = \frac{r^ne^{-r}}{n!} --- (1)

The likelihood of counting k particles is given by the Binomial distribution:

P(k\mid n,r) = {}_nC_kf^k(1-f)^{n-k} --- (2)

Hence, the posterior of n becomes:

P(n\mid k,r) \propto P(k\mid n,r)P(n\mid r) = \frac{r^ne^{-r}}{n!}{}_nC_kf^k(1-f)^{n-k} --- (3)

Then we calculate the posterior of r.

The prior of r is a uniform distribution over 0 \lt r \lt 500.

P(r) = \frac{1}{500}\,\,(0\lt r \lt 500)

The likelihood of counting k particles is given by marginalizing over n:

P(k\mid r)=\sum_{n=0}^\infty P(k\mid n,r)P(n\mid r)
= \sum_{n=0}^\infty \frac{r^ne^{-r}}{n!}{}_nC_kf^k(1-f)^{n-k} (from (1)(2))

Hence, the posterior of r becomes:

P(r\mid k) \propto P(k\mid r)P(r)
=\frac{1}{500}\sum_{n=0}^\infty \frac{r^ne^{-r}}{n!}{}_nC_kf^k(1-f)^{n-k} (for 0 \lt r \lt 500)

 \sim \frac{1}{500}\sum_{n=0}^{500}\frac{r^ne^{-r}}{n!}{}_nC_kf^k(1-f)^{n-k} --- (4)

where we cut off the summation at n=500 for the sake of the numerical calculation performed later.

Finally, the posterior of n marginalized over r is:

P(n\mid k) =\int_0^{500} P(n\mid k,r)P(r\mid k)\,dr \sim \sum_{r=1}^{500} P(n\mid k,r)P(r\mid k) --- (5)

Results (3), (4), and (5) can be evaluated numerically using Python, as shown below.

Here's the Python code to draw it.

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import binom, poisson

f = 0.1
k = 15

def postn_conditioned_r(ns, r): # (3)
    result = poisson(r).pmf(ns) * binom(p=f, n=ns).pmf(k)
    result /= result.sum()
    return result
    
def calc_postr(rs):             # (4)
    result = np.zeros(500)
    for n in range(1, 501):
        result += poisson(rs).pmf(n) * binom(p=f,n=n).pmf(k)
    result /= result.sum()
    return result

def calc_postn(ns, post_r):     # (5)
    result = np.zeros(500)
    for r in range(1, 501): 
        result += postn_conditioned_r(ns, r) * post_r[r-1]
    result /= result.sum()
    return result

# Main
if __name__ == '__main__':
    fig = plt.figure()
    subplot = fig.add_subplot(1,1,1)

    rs = np.arange(1, 501)
    post_r = calc_postr(rs)
    subplot.plot(rs, post_r, label="Posterior r")

    ns = np.arange(1, 501)
    post_n = calc_postn(ns, post_r)
    subplot.plot(ns, post_n, label="Posterior n")

    subplot.legend(loc=0)
    fig.show()

Chapter 15 - Lions and tigers and bears

First, we calculate the prevalence conditioned on n, the number of species.

{\mathbf p}=(p_1,\cdots,p_n) : Prevalence of each animal.

where \sum_{i=1}^np_i=1

The prior of {\mathbf p} is given by the Dirichlet distribution with parameter {\mathbf 1}_n = (1,\cdots,1):

P({\mathbf p}\mid n) = D({\mathbf p}, {\bf 1}_n)

Likelihood of finding {\mathbf d}=(d_1,d_2,d_3) animals, say, d_1 lions, d_2 tigers and d_3 bears, is:

P({\mathbf d}\mid{\mathbf p},n) = \begin{cases}
    \sum_{i,j,k(i\ne j\ne k)}\frac{N!}{d_1!d_2!d_3!}p_i^{d_1}p_j^{d_2}p_k^{d_3} & (n \ge 3) \\
    0 & (n \lt 3)
\end{cases}

where N=d_1+d_2+d_3

The tricky summation comes from the fact that we don't know which of the n species correspond to the ones we found (lions, tigers, and bears). Hence we sum over all possible assignments.

Then the posterior becomes:

P({\mathbf p}\mid {\mathbf d},n) = \frac{1}{Z}P({\mathbf d}\mid{\mathbf p},n)P({\mathbf p}\mid n)
=\frac{1}{Z}\sum_{i\ne j\ne k}\frac{N!}{d_1!d_2!d_3!}p_i^{d_1}p_j^{d_2}p_k^{d_3}D({\mathbf p}, {\bf 1}_n)\,\,(n \ge 3) --- (1)

where Z is a normalization constant depending on {\mathbf d},n. Explicitly:

Z=\int\sum_{i\ne j\ne k}\frac{N!}{d_1!d_2!d_3!}p_i^{d_1}p_j^{d_2}p_k^{d_3}D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}
=\sum_{i\ne j\ne k}\frac{N!}{d_1!d_2!d_3!}\int p_i^{d_1}p_j^{d_2}p_k^{d_3}D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}

 =3!{}_nC_3\frac{N!}{d_1!d_2!d_3!}\int p_1^{d_1}p_2^{d_2}p_3^{d_3}D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p} --- (2)

For the last part, I used the fact that the integrand is symmetric with permutations of {\mathbf p}.

Now we calculate the distribution of hyperparameter n.

The prior of n is a uniform distribution over 1\le n\le 30.

P(n)=\frac{1}{30}\,\,(1\le n\le 30)

The likelihood marginalized over {\mathbf p} is:

P({\mathbf d}\mid n) = \int P({\mathbf d}\mid {\mathbf p}, n)P({\mathbf p}\mid n)\,d{\mathbf p}
=\int \sum_{i\ne j\ne k}\frac{N!}{d_1!d_2!d_3!}p_i^{d_1}p_j^{d_2}p_k^{d_3} D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}

 =\frac{N!}{d_1!d_2!d_3!}\sum_{i\ne j\ne k}\int p_i^{d_1}p_j^{d_2}p_k^{d_3} D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}
=\frac{N!}{d_1!d_2!d_3!}3!{}_nC_3\int p_1^{d_1}p_2^{d_2}p_3^{d_3} D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}\,\,(n \ge 3)

For the last part, I used the fact that the integrand is symmetric with permutations of {\mathbf p}.

Now, the posterior of n becomes:

P(n\mid {\mathbf d}) \propto P({\mathbf d}\mid n)P(n)
=\frac{N!}{d_1!d_2!d_3!}3!{}_nC_3\int p_1^{d_1}p_2^{d_2}p_3^{d_3} D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}\times \frac{1}{30}

 \propto {}_nC_3\int p_1^{d_1}p_2^{d_2}p_3^{d_3} D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}\,\,(3\le n\le 30) --- (3)

Since analytic evaluation of this integral is difficult, we use numerical sampling with the Python code shown later. For now, we assume that (3) has been calculated.

Finally, we evaluate the posterior of the prevalence for each animal by marginalizing over n. Since we don't know which component of {\mathbf p} corresponds to each animal, I'll sum over all possibilities. In the case of the lion:

P(p_{\rm lion}\mid {\mathbf d}) = \sum_n P(n\mid {\mathbf d}) \sum_i \int P(p_i, {\mathbf p}'\mid {\mathbf d},n)|_{p_i=p_{\rm lion}} \,d{\mathbf p}'

where {\mathbf p}' = {\mathbf p} \setminus p_i. The integration over {\mathbf p}' marginalizes out the prevalences of the other animals.

Here, P(p_i, {\mathbf p}'\mid {\mathbf d},n) is the probability for the case where p_i corresponds to d_1. Using (1), it becomes:

P(p_i, {\mathbf p}'\mid {\mathbf d},n)=\frac{1}{Z}\sum_{j,k(i\ne j\ne k)}\frac{N!}{d_1!d_2!d_3!}p_i^{d_1}p_j^{d_2}p_k^{d_3}D({\mathbf p}, {\bf 1}_n)

Putting them together, we have:

P(p_{\rm lion}\mid {\mathbf d}) = \sum_n P(n\mid {\mathbf d}) \int \frac{1}{Z}\sum_{i,j,k(i\ne j\ne k)}\frac{N!}{d_1!d_2!d_3!}p_i^{d_1}p_j^{d_2}p_k^{d_3}|_{p_i=p_{\rm lion}}D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}'

 =\sum_n P(n\mid {\mathbf d}) \frac{1}{Z}\sum_{i,j,k(i\ne j\ne k)}\frac{N!}{d_1!d_2!d_3!}\int p_i^{d_1}p_j^{d_2}p_k^{d_3}|_{p_i=p_{\rm lion}}D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}'

 =\sum_n P(n\mid {\mathbf d}) \frac{1}{Z}3!{}_nC_3\frac{N!}{d_1!d_2!d_3!}\int p_1^{d_1}p_2^{d_2}p_3^{d_3}|_{p_1=p_{\rm lion}}D({\mathbf p}, {\bf 1}_n)\,dp_2\cdots dp_n

where we used the symmetry of the integrand again. Finally, by substituting (2):

P(p_{\rm lion}\mid {\mathbf d}) = \sum_n P(n\mid {\mathbf d})\left.\frac{\int p_1^{d_1}p_2^{d_2}p_3^{d_3}D({\mathbf p}, {\bf 1}_n)\,dp_2\cdots dp_n}
{\int p_1^{d_1}p_2^{d_2}p_3^{d_3}D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}}\right|_{p_1=p_{\rm lion}}

 =\sum_n P(n\mid {\mathbf d})\left.\int\frac{p_1^{d_1}p_2^{d_2}p_3^{d_3}D({\mathbf p}, {\bf 1}_n)}
{\int p_1^{d_1}p_2^{d_2}p_3^{d_3}D({\mathbf p}, {\bf 1}_n)\,d{\mathbf p}}\,dp_2\cdots dp_n\right|_{p_1=p_{\rm lion}}

 =\sum_n P(n\mid {\mathbf d}) \left.\int D({\mathbf p}, (1+d_1, 1+d_2, 1+d_3, 1,\cdots,1))\,dp_2\cdots dp_n\right|_{p_1=p_{\rm lion}}

 =\sum_n P(n\mid {\mathbf d})\, {\rm Beta}(p_{\rm lion}\mid a=1+d_1, b=(1+d_2)+(1+d_3)+(n-3)) --- (4)

We've done it! Here's the Python code to calculate (3) and (4).

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import beta
from scipy.special import binom as binom_coef

d = np.array([3,2,1])
N = d.sum()

Iter_num = 4000

def calc_post_n(ns):
    total = np.array([0.0]*len(ns))
    for n in ns:
        if n < len(d):
            continue
        for i in range(Iter_num):
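            # Draw ps ~ Dirichlet(1,...,1) by normalizing unit-scale gamma samples,
            # then accumulate the Monte Carlo estimate of the integral in (3)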
            ps = np.random.gamma(shape=1, scale=1, size=n)
            ps /= ps.sum()
            total[n] += binom_coef(n,len(d)) * (ps[:len(d)] ** d).prod()
    total /= total.sum()
    return total

def calc_post_p(ps, i, post_n):
    total = 0.0
    for n in range(len(d), 31):
        xs = np.ones(n, dtype=int)
        for j in range(len(d)):
            xs[j] += d[j]
        total += post_n[n] * beta(a=xs[i],b=(xs.sum()-xs[i])).pdf(ps)
    return total

# Main
if __name__ == '__main__':
    fig = plt.figure()
    subplot1 = fig.add_subplot(1,2,1)
    subplot2 = fig.add_subplot(1,2,2)

    ns = np.arange(0, 31)
    post_n = calc_post_n(ns)
    subplot1.plot(ns[len(d):], post_n[len(d):], label="Posterior n")
    subplot1.set_xlim(0,30)
    subplot1.legend(loc=0)

    ps = np.arange(0, 1, 0.01)
    post_ps = {}
    for i in range(len(d)):
        post_ps[i] = calc_post_p(ps, i, post_n)
        subplot2.plot(ps, post_ps[i],
            label="Posterior p%d (%d)" % (i, d[i]))
    subplot2.legend(loc=0)

    fig.show()

The result for (d_1,d_2,d_3)=(3,2,1) is as below:

The posterior of n is the same as Figure 15-2 in the book.

What happens if we find more animals with the same ratio? The following is the result for (d_1,d_2,d_3)=(9,6,3).

We have more confidence in the number of species now.