# A Fantastic Introduction to the Concept of Bayesian Statistics

## All you need to know to solve this Cambridge University entrance problem is Bayes’ Theorem

I recently came across this very hard-looking and creatively worded problem in an old entrance examination paper for Cambridge University. When I first looked at it, it seemed very complex. But one thing I have learned from years of practising Math is that if you don’t immediately know what to do, take the time to write down the information you have. Often, in the course of doing so, a method will reveal itself.

Let’s start with the question. Here it is. It gave me a little giggle to think about how writing a question like this is probably one of the rare opportunities the writer had to try to be creative 😂

## Writing down the information

There is a lot of information given here, so I decided to write it all down in a useful format before even tackling the question asked. Since the question primarily deals with combinations of ties and trousers, a simple table seemed enough to capture the information. I created one with the trouser choices in the rows and the tie choices in the columns. Each cell in the table is the probability of the tie choice given the trouser choice.

Let’s look at how I put this table together. First, we are told that if I wear Brown trousers, I am equally likely to wear any tie, except that I will never wear the Cravat. This explains the first row. The second row is similar. For the third row, we are told that if I wear Jeans on a weekday, half the time I will not wear any tie (probability 1/2), and the other half I will choose among the four ties with equal probability (1/4 each), so the probability of any given tie is 1/2 x 1/4 = 1/8. The fact that this row adds up to 1/2 confirms that I will only wear a tie half the time. Finally, if I wear Jeans on a Sunday, I will wear any tie with equal probability.
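If you prefer to see the table as data, here is a minimal Python sketch of the rows described above. It is only an illustration under a few assumptions: the text names just the Black tie and the Cravat, so `tie_3` and `tie_4` are placeholder labels for the other two ties, and `uniform_row` is a helper I made up for this sketch. The Grey trousers row is left out because it is only described as "similar".

```python
from fractions import Fraction

# Only "black" and "cravat" are named in the text; "tie_3" and "tie_4"
# are placeholder labels (an assumption) for the other two ties.
TIES = ["black", "cravat", "tie_3", "tie_4"]


def uniform_row(allowed, p_tie=Fraction(1)):
    """Spread the probability of wearing a tie evenly over the allowed ties."""
    share = p_tie / len(allowed)
    return {tie: (share if tie in allowed else Fraction(0)) for tie in TIES}


table = {
    # Brown trousers: any tie with equal probability, but never the Cravat.
    "brown": uniform_row(["black", "tie_3", "tie_4"]),
    # Jeans on a weekday: no tie half the time, otherwise any tie equally.
    "jeans_weekday": uniform_row(TIES, p_tie=Fraction(1, 2)),
    # Jeans on a Sunday: any tie with equal probability.
    "jeans_sunday": uniform_row(TIES),
}

for trousers, row in table.items():
    cells = {tie: str(p) for tie, p in row.items()}
    print(trousers, cells, "row sum:", sum(row.values()))
```

The row sums are a quick sanity check: every row adds up to 1 except the weekday Jeans row, which adds up to 1/2 because half the time no tie is worn.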

## Bayes’ Theorem

Now that I have written down my information, what am I actually being asked to do? I am told that I have been photographed wearing a black tie, and I am being asked to find the probability that the photo was taken on a Sunday. So I am looking to calculate the probability that it was a Sunday *given that* I was wearing a Black tie.

Any use of the term ‘*given that*’ invites the use of Bayes’ Theorem. While classical probability and statistics deal with the likelihood of an event without any real prior knowledge of the context, Bayesian statistics deals with the likelihood of an event based on some prior knowledge of the context. What we believe about the event before seeing the new information is known as the *prior*, and the updated probability of the event once that information is taken into account is known as the *posterior*.

Here, the probabilities we need can be worked out from the table above, and we are being asked to calculate the posterior probability that it is a Sunday given that I am wearing a black tie. Bayes’ Theorem states that, for two events A and B, the probability of A occurring given that B occurred is:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

In our case, event A is ‘it is a Sunday’ and event B is ‘wearing a black tie’. So we need to calculate three things:

- **P(Black Tie | Sunday)** can be calculated from our table above. If it is a Sunday, then I will always wear a tie. There is an equal probability of me wearing Brown or Grey trousers or Jeans. So using my table, I can say that the probability of me wearing a Black tie on a Sunday is (1/3 x 1/3) + (1/3 x 1/3) + (1/3 x 1/4) = 2/9 + 1/12 = 33/108 = 11/36.
- **P(Sunday)** is clearly 1/7.
- **P(Black Tie)** again can be calculated from our table. The probability of wearing Brown Trousers and a Black Tie is 1/3 x 1/3 = 1/9. The same holds for Grey Trousers and a Black Tie. The probability I am wearing Jeans on a weekday with a Black Tie is 6/7 x 1/3 x 1/8 = 1/28, and finally the probability I am wearing Jeans on a Sunday with a Black Tie is 1/7 x 1/3 x 1/4 = 1/84. So the total probability of me wearing a black tie is 2/9 + 1/28 + 1/84 = 2/9 + 4/84 = 2/9 + 1/21 = 51/189 = 17/63.
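These three quantities are easy to get wrong by hand, so here is a small Python sketch that recomputes them with exact fractions. The variable names are my own, and the numbers in `p_black` are simply the black tie column of the table as described in the text.

```python
from fractions import Fraction as F

# Black tie column of the table: P(Black Tie | trousers, type of day).
p_black = {
    "brown": F(1, 3),
    "grey": F(1, 3),
    "jeans_weekday": F(1, 8),
    "jeans_sunday": F(1, 4),
}

p_trousers = F(1, 3)        # each trouser choice is equally likely
p_sunday = F(1, 7)
p_weekday = 1 - p_sunday    # 6/7

# P(Black Tie | Sunday): average the Sunday black tie chances over trousers.
p_black_given_sunday = p_trousers * (
    p_black["brown"] + p_black["grey"] + p_black["jeans_sunday"]
)

# P(Black Tie): add up every way of ending up in a black tie.
p_black_tie = (
    p_trousers * p_black["brown"]
    + p_trousers * p_black["grey"]
    + p_weekday * p_trousers * p_black["jeans_weekday"]
    + p_sunday * p_trousers * p_black["jeans_sunday"]
)

print(p_black_given_sunday)  # 11/36
print(p_sunday)              # 1/7
print(p_black_tie)           # 17/63
```

Using `Fraction` rather than floats keeps the results in the same form as the hand calculation, which makes them easy to compare.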

Putting all these together into our Bayesian formula above, we have that the probability of it being a Sunday given I am wearing a black tie is 11/36 x 1/7 x 63/17 = 11/68 as required.
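As a quick check of that last step, here is a minimal sketch of Bayes’ Theorem as a Python function. The function name `bayes_posterior` and its argument names are my own choices for illustration; the three inputs are the values calculated above.

```python
from fractions import Fraction


def bayes_posterior(likelihood, prior, evidence):
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    if evidence == 0:
        raise ValueError("P(B) must be non-zero")
    return likelihood * prior / evidence


posterior = bayes_posterior(
    likelihood=Fraction(11, 36),  # P(Black Tie | Sunday)
    prior=Fraction(1, 7),         # P(Sunday)
    evidence=Fraction(17, 63),    # P(Black Tie)
)

print(posterior)         # 11/68
print(float(posterior))  # roughly 0.16
```

So seeing a black tie nudges the chance of a Sunday up a little from 1/7 (about 0.14) to 11/68 (about 0.16).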

## Updating the priors

The whole point of the final part of this question is to introduce us to the true value of Bayesian statistics over and above classical statistics. In real life, the information we have access to changes all the time. If our priors change, this will usually have an impact on the probability of the event we are interested in. A change in priors usually results in a change in posteriors.

Now in this situation, let’s determine which of these three quantities will change:

- **P(Black Tie | Sunday)** will not change, because the new information we have does not affect events on a Sunday.
- **P(Sunday)** obviously does not change.
- **P(Black Tie)** does change, because there is now a day when I am certain to wear jeans, which affects the likelihood that I am wearing a black tie.

We could therefore update our table as follows:

Each row combines a day and a trouser choice, which gives a probability that we then multiply by the corresponding entry in the black tie column. So, for example, the probability of it being Tuesday-Sunday and wearing Brown trousers is 6/7 x 1/3. Following this approach, we can update our probability of wearing a black tie to (6/7 x 1/3 x 1/3) + (6/7 x 1/3 x 1/3) + (1 x 1/7 x 1/8) + (1/3 x 5/7 x 1/8) + (1/7 x 1/3 x 1/4) = 2/21 + 2/21 + 1/56 + 5/168 + 1/84 = 4/21 + 8/168 + 1/84 = 40/168 + 1/84 = 5/21 + 1/84 = 21/84 = 1/4.

So putting this into the same calculation as above, but replacing our previous probability of wearing a black tie (17/63) with the new value (1/4), we have a new posterior of 11/36 x 1/7 x 4 = 11/63.
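The updated calculation can be sketched in Python in the same way. This is just a check of the arithmetic above under my reading of the problem: the scenario labels are my own, and I am assuming the day on which I always wear jeans is a Monday, judging by the ‘Tuesday-Sunday’ wording, and that it counts as a weekday for tie purposes.

```python
from fractions import Fraction as F

# Each scenario: (P(day type), P(trousers | day type), P(Black Tie | both)).
scenarios = {
    "brown, Tuesday-Sunday": (F(6, 7), F(1, 3), F(1, 3)),
    "grey, Tuesday-Sunday": (F(6, 7), F(1, 3), F(1, 3)),
    "jeans, always-jeans day": (F(1, 7), F(1), F(1, 8)),    # assumed Monday
    "jeans, other weekdays": (F(5, 7), F(1, 3), F(1, 8)),
    "jeans, Sunday": (F(1, 7), F(1, 3), F(1, 4)),
}

# Updated P(Black Tie): sum the contribution of every scenario.
p_black_tie = sum(day * trousers * tie for day, trousers, tie in scenarios.values())

# P(Black Tie | Sunday) and P(Sunday) are unchanged from the first part.
posterior = F(11, 36) * F(1, 7) / p_black_tie

print(p_black_tie)  # 1/4
print(posterior)    # 11/63
```

Only the denominator of Bayes’ Theorem moved, from 17/63 to 1/4, and that alone is enough to shift the answer from 11/68 to 11/63.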

## Why Bayesian statistics is so important

Although Bayesian statistics has been around for a long time, it has only recently started to be widely used. A big reason for this is advances in computational science. Classical statistical methods make wide-ranging simplifying assumptions about the context of the events being modelled. Often they amount to assuming a *uniform prior*, which treats every possible context as equally likely. This makes them computationally simpler, and so statisticians have historically preferred them because they offered realistic, practical routes to estimating likelihood.

You can see from this problem that Bayesian approaches are computationally more complex, which has prevented their practical use in the past because of the limited computational power available. Things are changing: we now have access to computational power that allows us to use Bayes’ Theorem to take prior context into account when estimating the likelihood of events, and to adjust that context and recalculate when we learn new information. This is how our brains work in general, since we adapt to new information all the time, so Bayesian methods are a real step forward in modeling our comprehension of the world.

*I hope you enjoyed this little introduction to the concept of Bayesian statistics. If you are interested in learning more, I highly recommend the book ‘Statistical Rethinking’ by Richard McElreath. Feel free to leave a comment with your reactions or thoughts.*