









These notes introduce the Bernoulli and binomial distributions, focusing on the probability of success in independent trials. They cover the definitions, probability mass functions, mean and variance, and provide examples and exercises. They come from a university course on Biostatistics, MATH1015, at SydU, during the first semester of the academic year 2013.
Typology: Study notes
The binomial distribution plays a very important role in many life science problems. To develop this distribution, we first look at a related distribution called the Bernoulli distribution.
Many life science experiments result in responses that have only two possible outcomes, “success” (S) and “failure” (F). Such responses are called dichotomous. For example, a doctor is interested to know whether a recent medical examination gives a “positive” or “negative” result for cancer for a patient.
Examples:
Definition: A random variable whose responses are dichotomous is called a Bernoulli random variable.
Note: In many problems, it is easy to use 1 for “success” (S) and 0 for “failure” (F).
In example 3, let X = 1 if the examination mark M is at least 50 and 0 otherwise. Then X is the result of dichotomising the random variable M such that

$$X = \begin{cases} 1 & \text{if } M \ge 50 \text{ (pass/success)}, \\ 0 & \text{if } M < 50 \text{ (fail/failure)}. \end{cases}$$
In general X = 1 denotes the event of a success (S).
Let p(r) = P(X = r), where p denotes the probability of a success. Then

$$p(r) = \begin{cases} p & \text{if } r = 1, \\ 1 - p & \text{if } r = 0. \end{cases}$$

Therefore, the probability distribution of a Bernoulli RV X can be given as

x           1     0
P(X = x)    p     1 − p
Then E(X) = 1 × p + 0 × (1 − p) = p.
Note: E(X^2) = 1^2 × p + 0^2 × (1 − p) = p and therefore Var(X) = E(X^2) − [E(X)]^2 = p − p^2 = p(1 − p).
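The mean and variance above can be checked numerically from the two-point distribution. The following is a minimal sketch; the function name `bernoulli_moments` and the value p = 0.3 are illustrative assumptions, not part of the notes.

```python
def bernoulli_moments(p):
    """Return (mean, variance) of a Bernoulli variable, computed from its pmf."""
    pmf = {1: p, 0: 1 - p}  # P(X = 1) = p, P(X = 0) = 1 - p
    mean = sum(x * prob for x, prob in pmf.items())               # E(X)
    second_moment = sum(x ** 2 * prob for x, prob in pmf.items())  # E(X^2)
    variance = second_moment - mean ** 2                           # E(X^2) - [E(X)]^2
    return mean, variance


# p = 0.3 is an arbitrary illustrative value (an assumption).
mean, variance = bernoulli_moments(0.3)
print(mean, variance)  # should agree with p and p(1 - p) up to rounding
```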
5.2.1 Probability mass function
Consider the following example to understand the probability distribution of the number of successes associated with X ∼ B(n, p).
Example: When n = 4, it is clear from a tree diagram that there are 2^4 = 16 possible outcomes altogether. Let X be the number of successes (S). All 16 outcomes are given below:
Outcome  x  Probability         Outcome  x  Probability         No. of cases
SSSS     4  p^4                 FFFF     0  (1 − p)^4           1 each
SSSF     3  p^3 (1 − p)         SFSS     3  p^3 (1 − p)
SSFS     3  p^3 (1 − p)         FSSS     3  p^3 (1 − p)         4
SSFF     2  p^2 (1 − p)^2       FFSS     2  p^2 (1 − p)^2
SFSF     2  p^2 (1 − p)^2       FSFS     2  p^2 (1 − p)^2
SFFS     2  p^2 (1 − p)^2       FSSF     2  p^2 (1 − p)^2       6
FFFS     1  p (1 − p)^3         FSFF     1  p (1 − p)^3
FFSF     1  p (1 − p)^3         SFFF     1  p (1 − p)^3         4
Then the probability distribution, or pmf (probability mass function), of X is

P(X = 4) = p^4
P(X = 3) = 4 p^3 (1 − p)
P(X = 2) = 6 p^2 (1 − p)^2
P(X = 1) = 4 p (1 − p)^3
P(X = 0) = (1 − p)^4
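The counts 1, 4, 6, 4, 1 in the pmf above can be reproduced by listing all 16 sequences and grouping them by the number of S's. This is a small sketch of that enumeration; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

n = 4
# Every sequence of S (success) and F (failure) of length n: 2^n in total.
outcomes = ["".join(seq) for seq in product("SF", repeat=n)]

# Group the sequences by x = number of successes.
counts = Counter(seq.count("S") for seq in outcomes)

print(len(outcomes))  # total number of outcomes
for x in range(n, -1, -1):
    # Each sequence with x successes has probability p^x (1 - p)^(n - x).
    print(f"x = {x}: {counts[x]} sequence(s)")
```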
Binomial Coefficients
The number of ways of selecting 2 items from 4 is denoted by $\binom{4}{2}$ and this value is 6. These are called the number of combinations or the binomial coefficients. That is, $\binom{4}{2} = 6$ gives the number of combinations of choosing 2 trials for “S” from 4 trials. This is also denoted by 4C2 and read as “4 choose 2”.
There are many ways to calculate $\binom{4}{2}$. On a calculator, look for the button nCr or $\binom{n}{r}$. Press 4, then nCr or $\binom{n}{r}$, followed by 2.
Exercise: Find 6C3, 7C2 and 9C7 using your calculator.
Answer: 20, 21 and 36.
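The same values can be checked without a calculator. The sketch below uses Python's built-in `math.comb` and, for comparison, a hypothetical helper `n_choose_r` based on the standard factorial identity nCr = n! / (r! (n − r)!), which is not stated in the notes and is included here as an assumption for illustration.

```python
from math import comb, factorial


def n_choose_r(n, r):
    """Number of ways to choose r items from n, via the factorial identity."""
    return factorial(n) // (factorial(r) * factorial(n - r))


# The example from the text and the three exercise values.
for n, r in [(4, 2), (6, 3), (7, 2), (9, 7)]:
    print(f"{n}C{r} =", comb(n, r), n_choose_r(n, r))
```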
Now return to any sequence of x “S” and n − x “F”, i.e.,

$$\underbrace{S\,S \cdots S}_{x\ S\text{'s}}\ \underbrace{F\,F \cdots F}_{(n-x)\ F\text{'s}}$$

Every such sequence occurs with the same probability $p^x (1-p)^{n-x}$. The number of combinations of choosing x trials for “S” from the total of n trials is $\binom{n}{x}$. Therefore, the probability of exactly x successes out of n independent trials is given by

$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \dots, n.$$
Clearly, a binomial random variable is a sum of n independent Bernoulli random variables.
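The binomial probability formula translates directly into code. The following is a minimal sketch; the function name `binom_pmf` is an illustrative choice, and n = 4, p = 0.5 are assumed values chosen so that the result can be compared with the 16-outcome example.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): (n choose x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# With n = 4 and p = 0.5 every sequence has probability 1/16,
# so the pmf is 1/16, 4/16, 6/16, 4/16, 1/16.
pmf = [binom_pmf(x, 4, 0.5) for x in range(5)]
print(pmf)
print(sum(pmf))  # probabilities over x = 0, ..., n add up to 1
```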
Example: A biologist estimates that the chance of germination for a type of bean seed is 0.7. A student was given 6 seeds. Let X be the number of seeds that germinate out of the 6 seeds. Assuming that the seeds germinate independently, explain why the distribution of X is binomial. What are the values of n and p? What are the probabilities that the student gets
(a) all seeds germinated,
(b) just one seed not germinated, and
(c) at most four seeds germinated?
Solution: Since the 6 seeds germinate independently and the outcome is binary, germinated or not, with the same probability of germination 0.7, the distribution of X is binomial, i.e. X ∼ B(6, 0.7) with n = 6 and p = 0.7.
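(a) P(X = 6) = $\binom{6}{6}$ (0.7)^6 = 0.1176.
(b) P(X = 5) = $\binom{6}{5}$ (0.7)^5 (0.3) = 0.3025.
(c) P(X ≤ 4) = 1 − P(X ≥ 5) = 1 − [P(X = 5) + P(X = 6)] = 1 − 0.4202 = 0.5798.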
[Figure: number line from 0 to 6 marking the values of X covered by parts (a), (b) and (c).]
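The three germination probabilities can be evaluated from the binomial formula with n = 6 and p = 0.7. This is a sketch under the example's assumptions; the helper `binom_pmf` and the variable names are illustrative.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 6, 0.7  # values taken from the germination example

p_all = binom_pmf(6, n, p)                # (a) all six seeds germinate
p_one_fail = binom_pmf(5, n, p)           # (b) exactly one seed fails
p_at_most_four = 1 - p_all - p_one_fail   # (c) complement of X >= 5

print(round(p_all, 4), round(p_one_fail, 4), round(p_at_most_four, 4))
# -> 0.1176 0.3025 0.5798
```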
Exercise: Book P.55, Q2.
A binomial distribution can be plotted using a bar chart. The following plots show different binomial distributions when p varies.

[Figure: bar charts of binomial probabilities (vertical axis: prob.) for a fixed n, with panels (a) p = 0.1, (b) p = 0.3, (c) p = 0.5, (d) p = 0.7 and (e) p = 0.9.]
The following gives another set of plots when p = 0.1 and n increases. Clearly when n is small, the distribution is skewed but it becomes more symmetric as n increases.
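The claim about skewness can be illustrated numerically. The sketch below measures skewness as the standardised third central moment computed directly from the pmf; this particular measure and the values n = 5, 20, 50, 100 are assumptions chosen for illustration, not taken from the notes.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def skewness(n, p):
    """Standardised third central moment of B(n, p), computed from the pmf."""
    pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(pmf))
    var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))
    third = sum((x - mean) ** 3 * q for x, q in enumerate(pmf))
    return third / var ** 1.5


# p = 0.1 as in the text; the n values are illustrative.
skews = {n: skewness(n, 0.1) for n in (5, 20, 50, 100)}
for n, s in skews.items():
    print(n, round(s, 3))  # positive, and shrinking towards 0 as n grows
```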
Example: A pharmaceutical firm has discovered a new diagnostic test that has a 90% chance of indicating a positive result for a patient who is infected with a certain disease. If it is tried on 5 infected patients, find the probability that exactly 4 will be detected.
Solution: Let X be the number of patients out of the 5 infected patients who are diagnosed by the test. Since the results are independent across infected patients and the outcome is binary with the same probability of a positive result, the distribution of X is binomial, i.e. X ∼ B(5, 0.9) with n = 5 and p = 0.9.
Clearly there are $\binom{5}{4} = 5$ ways of choosing 4 “S” from 5 patients. Therefore,

$$P(X = 4) = \binom{5}{4} (0.9)^4 (0.1) = 5 \times 0.6561 \times 0.1 \approx 0.328.$$
Use of computer: In R, pbinom(x,n,p) gives Pr(X ≤ x), and dbinom(x,n,p) gives Pr(X = x).
Attend the tutorial for the use of R.
Use of binomial table: It gives P(X ≤ x) for n = 2, ..., 12 and p = 0.1, 0.2, ..., 0.9. For example, using the binomial tables with n = 5, p = 0.4 and x = 1, 2, we have

[Figure: number line from 0 to 5 marking the ranges covered by P(X ≤ 1) and P(X ≤ 2).]

P(X = 2) = P(X ≤ 2) − P(X ≤ 1) = 0.6826 − 0.3370 = 0.3456.
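The table lookup can be mimicked by summing the pmf. The sketch below defines two hypothetical Python helpers, `binom_pmf` and `binom_cdf`, that play the same roles as the R functions dbinom and pbinom mentioned above; they are illustrative stand-ins, not the R functions themselves.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p), as tabulated in a cumulative binomial table."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


n, p = 5, 0.4  # values from the table example
cdf_1 = binom_cdf(1, n, p)
cdf_2 = binom_cdf(2, n, p)

print(round(cdf_1, 4), round(cdf_2, 4))  # table entries for x = 1 and x = 2
print(round(cdf_2 - cdf_1, 4))           # P(X = 2) as a difference of table entries
```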
Exercise: Let X ∼ B(5, 0.9). Find (a) P(X ≤ 4); (b) P(X = 4).
Solution: From the binomial table with n = 5, p = 0.9, x = 3, 4,
(a) P(X ≤ 4) = 0.4095.
(b) P(X = 4) = P(X ≤ 4) − P(X ≤ 3) = 0.4095 − 0.0815 = 0.3280.
Exercises:
Theorem: If X ∼ B(n, p), then the mean and variance of X are given by
μ = E(X) = np and σ^2 = Var(X) = np(1 − p)
because X = Y_1 + Y_2 + · · · + Y_n is the sum of n independent Bernoulli r.v.s Y_i, each with E(Y_i) = p and Var(Y_i) = p(1 − p).
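The theorem can be checked by computing the mean and variance directly from the binomial pmf and comparing them with np and np(1 − p). This is a sketch; the helper names and the values n = 10, p = 0.3 are illustrative assumptions.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def moments_from_pmf(n, p):
    """Mean and variance of B(n, p), computed by summing over the pmf."""
    pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(pmf))
    var = sum(x * x * q for x, q in enumerate(pmf)) - mean ** 2
    return mean, var


n, p = 10, 0.3  # arbitrary illustrative values (an assumption)
mean, var = moments_from_pmf(n, p)
print(mean, n * p)            # both should be close to np
print(var, n * p * (1 - p))   # both should be close to np(1 - p)
```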
Example: Let X ∼ B(8, 0.60). Find E(X), Var(X) and SD(X).
Solution: We have n = 8 and p = 0.6.
E(X) = np = 8 × 0.6 = 4.8.
Var(X) = np(1 − p) = 8 × 0.6 × 0.4 = 1.92.
SD(X) = √Var(X) = √1.92 ≈ 1.386.
[Figure: plot of p(1 − p) against p for 0 ≤ p ≤ 1, with maximum 0.25 at p = 0.5.]
Note that, for a given n, Var(X) = np(1 − p) increases as p rises from 0, attains its maximum at p = 0.5, and then decreases, since the uncertainty is greatest when success and failure are equally likely. Var(X) also increases with n for a given p.
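The location of the maximum can be confirmed by evaluating np(1 − p) on a grid of p values. The sketch below assumes n = 10 and a grid step of 0.01, both chosen only for illustration.

```python
n = 10  # illustrative value (an assumption)

# Evaluate Var(X) = n p (1 - p) for p = 0.00, 0.01, ..., 1.00.
grid = [i / 100 for i in range(101)]
variances = [n * p * (1 - p) for p in grid]

max_var = max(variances)
best_p = grid[variances.index(max_var)]
print(best_p, max_var)  # the maximum sits at p = 0.5, where Var(X) = n / 4
```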
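The sum of 2 binomial random variables:
If X_1 ∼ B(n_1, p) and X_2 ∼ B(n_2, p) are independent, then X = X_1 + X_2 ∼ B(n_1 + n_2, p), since

$$X = \underbrace{Y_1 + \cdots + Y_{n_1}}_{n_1 \text{ terms; sum} = X_1 \sim B(n_1, p)} + \underbrace{Y_{n_1+1} + \cdots + Y_{n_1+n_2}}_{n_2 \text{ terms; sum} = X_2 \sim B(n_2, p)}$$

is a sum of n_1 + n_2 independent Bernoulli r.v.s with the same p. Note that this result does NOT apply if p differs between X_1 and X_2.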
Example: Let X_1 ∼ B(5, 0.4), X_2 ∼ B(7, 0.4) and X_3 ∼ B(7, 0.2). Find the distributions of X_1 + X_2 and X_1 + X_3.
Solution: X_1 + X_2 ∼ B(5 + 7, 0.4) = B(12, 0.4), but the distribution of X_1 + X_3 cannot be obtained from this result because p differs (0.4 versus 0.2).
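The result for sums can be checked by convolving the two pmfs, which gives the distribution of a sum of independent variables. The sketch below assumes independence throughout; the helper names are illustrative, and the comparison of X_1 + X_3 against a B(12, p) with the same mean is an added illustration rather than part of the notes.

```python
from math import comb


def binom_pmf_list(n, p):
    """List of P(X = x), x = 0, ..., n, for X ~ B(n, p)."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]


def convolve(pmf_a, pmf_b):
    """pmf of A + B for independent A and B with the given pmfs."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out


# Same p: X_1 + X_2 should match B(12, 0.4).
same_p = convolve(binom_pmf_list(5, 0.4), binom_pmf_list(7, 0.4))
target = binom_pmf_list(12, 0.4)
diff_same = max(abs(a - b) for a, b in zip(same_p, target))

# Different p: compare X_1 + X_3 with the B(12, p) that has the same mean.
mixed = convolve(binom_pmf_list(5, 0.4), binom_pmf_list(7, 0.2))
mean_mixed = sum(x * q for x, q in enumerate(mixed))
candidate = binom_pmf_list(12, mean_mixed / 12)
diff_mixed = max(abs(a - b) for a, b in zip(mixed, candidate))

print(diff_same)   # essentially zero
print(diff_mixed)  # clearly non-zero: the mean-matched binomial does not fit
```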
Example: (Soft drinks) Two rival soft drinks, C and P, taste the same. In a blindfold test, 12 people are asked (independently) to state their preference for one or the other.
(a) What is the probability that the majority prefer P? (b) How many people out of the 12 would be expected to prefer P?
Solution: Let X denote the number of people who prefer P out of 12 people. Since the drinks taste the same, each person is equally likely to choose either one, so X ∼ B(12, 0.5) with n = 12 and p = 0.5.
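(a)
$$\begin{aligned}
P(X \ge 7) &= 1 - P(X \le 6) \\
&= 1 - \left[ \binom{12}{0} 0.5^{0}\, 0.5^{12} + \binom{12}{1} 0.5^{1}\, 0.5^{11} + \cdots + \binom{12}{6} 0.5^{6}\, 0.5^{6} \right] \\
&= 1 - 0.5^{12} \left[ \binom{12}{0} + \binom{12}{1} + \cdots + \binom{12}{6} \right] \\
&= 1 - 0.6128 = 0.3872.
\end{aligned}$$
(Table with n = 12, p = 0.5, x = 6)
(b) E(X) = np = 12 × 0.5 = 6.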
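The soft drinks calculation can be reproduced without the table by summing the binomial coefficients directly. This is a minimal sketch using the example's values n = 12 and p = 0.5; the variable names are illustrative.

```python
from math import comb

n, p = 12, 0.5  # values from the soft drinks example

# With p = 0.5 every sequence has probability 0.5^12, so P(X <= 6) is
# 0.5^12 times the sum of the binomial coefficients 12C0, ..., 12C6.
coefficient_sum = sum(comb(n, k) for k in range(7))
p_at_most_six = coefficient_sum * 0.5 ** n
p_majority = 1 - p_at_most_six   # (a) P(X >= 7)
expected = n * p                 # (b) E(X) = np

print(coefficient_sum)           # -> 2510
print(round(p_at_most_six, 4))   # -> 0.6128
print(round(p_majority, 4))      # -> 0.3872
print(expected)                  # -> 6.0
```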