This is an elementary expositional piece explaining how the N choose K formula is derived and used to calculate Binomial probabilities. For the complete lecture on the Binomial Distribution, and how it is used in time-branching probability models, see http://bit.ly/rsia08d
Given a sequence of time periods from 1,2,…, N, we can fill them up with outcomes H=1 and T=0 in 2^N different ways, since each slot can be filled in two ways. How many ways are there to put exactly K heads into this sequence of N time-periods? It is easier to understand solutions to abstract algebraic problems like this by first solving them for particular values of K and N. This makes it easier to understand intuitively the logic behind the answer. So we will first solve this problem for the special case that N=10 and K=4. How many ways can we put exactly four 1’s into a sequence of 10 slots, each of which can contain 0 or 1?
Special Case: N=10 and K=4
To solve this problem, it is useful to break it down into two steps. The first step is to solve a simpler problem.
FIRST STEP: Instead of putting four 1’s, we put four DIFFERENT items into four slots within a sequence of 10 slots. Let us name the four items A,B,C,D. Now we have four different objects, and we want to count how many ways we can put these into 10 slots. Now the basic counting formula provides a simple answer. The first object A can be put into any of 10 positions. The second object B can be put into any of the 9 remaining positions. The third object C can be put into any one of the 8 remaining positions. The fourth object D can be put into any of the remaining 7 positions. So the answer to this question is 10 x 9 x 8 x 7. In mathematical notation, N! is the product of all integers from 1 to N, so 10! = 10 x 9 x 8 x … x 1. If we want to go from 10 to 7, then we can eliminate the factors 6, 5, 4, 3, 2 by dividing by 6!. Using this notation, we can say that the number of ways to put 4 different objects into a sequence of 10 slots is 10! / 6!
SECOND STEP: How much overcount? The above formula is not the answer to our original question because it has an overcount. To see why consider the following sequences: C0A0B0D0000, A0C0B0D0000, D0C0A0B0000. These are all sequence of 10, with 6 0’s and four places with A, B, C, and D in them. If we replace A,B,C,D by 1, then all three will be the SAME sequence: 10101010000. But our first step formula counts all three of these as separate sequences. So the question is, how much overcounting do we do when we count different arrangements of A,B,C,D as separate sequences? Let us consider just one sequence 1111000000, where all four ones occur in the initial four positions. How many times is this sequence counted when we replace four 1’s by four different objects A,B,C,D?
All the ways we can put ABCD into the first four slots are equivalent when we replace ABCD by 1111. The first A can be put into any of 4 positions. The second object B can be put into any of the remaining 3 positions. The third object C can be put into any of the remain 2 positions. The last object D will have only one remaining position into which it can go. So the total number of ways we can put ABCD into the first four positions is 4! = 4 x 3 x 2 x 1.
This same reasoning holds for ANY placement of the four 1’s into the sequence of ten 0’s and 1’s. Every such sequence will be overcounted 4! times. Thus we get the answer to the original question by dividing the answer of the first step by 4!. The number of ways to put exactly 4 1’s into a sequence of 10 0’s and 1’s is 10! / [6! x 4!].
The General Case for any N and K
We can now just repeat the same reasoning to get the general answer. Suppose we have a sequence of N 0’s and 1’s. We want to count how many ways we can put exactly K 1’s into this sequence.
FIRST STEP is to replace the identical objects “1” by distinct objects; let us name them 1,2,3,…,K. The first object can be placed into any one of N positions. The second one can be put into N-1. Continuing in this way, the K-th object can put into any one of N-(K-1) positions. Multiplying these choices, the answer at the first step is: N x (N-1) x … x (N-K+1) = N! / (N-K)!
SECOND STEP: How much overcounting do we do when we replace identical 1’s by different objects? The sequence of numbers 1,2,…,K can be re-arranged in K! different ways. We can put the 1 into any one of K positions, the 2 into any of the remaining K-1 positions, and so one. All of the K! ways are identical when we replace the different objects by the same object “1”. This means that there is an overcount of K! in the first step. Thus the solution to the original problem divides the answer of the first step by K! arriving at the N choose K formula: N! / [K! x (N-K)!]
We can now state the final result of all of these calculations as follows. Suppose we have a sequence of N independent random events which have two possible outcomes named “1” and “0”. Suppose the probability of “1” is p for each of the N events. Let X by the random variable which counts the number of “1”’s which occur in this random trail. Then we say that X is a Binomial Random Variable with N trials and success probability p. This random variable X can take any integer value from 0 to N. The probabilities of each of these possible outcomes, PRIOR to the occurrence of the N events, is given by the following formula:
This formula is obtained by noting that when there are K 1’s and N-K 0’s, the probability of this outcome is pK (1-p)N-K . The number of outcomes which have K 1’s and N-K 0’s is N choose K which is N! / [K! x (N-K)!]. Adding up the probability pK (1-p)N-K for all of these outcomes leads to the formula given above. The video below provides a similar derivation of the fundamental N choose K formula.