4A: Probability Basics and the Binomial Distribution 2/14/07

Review Questions

  1. StatPrimer defines probability in three ways. What is the common thread among these? 
  2. Define "random variable."
  3. There are two distinct types of random variables. Name these. 
  4. Identify whether each of the following random variables is discrete or continuous: (a) number of female children in a family (b) number of successful treatments out of n (c) systolic blood pressure of an individual chosen at random.
  5. What is a Bernoulli trial?
  6. Binomial random variables have two parameters. Name these and describe what they are. 
  7. How many different ways can you choose 2 items out of 10?
  8. The area under a probability histogram sums to exactly ___ . 
  9. 0! = ?

Exercises

4A.1 Probability of survival. A patient is newly diagnosed with a certain type of cancer. The attending physician tells the patient that the probability of survival is 80%. Let’s assume this estimate is accurate. The patient, who has no statistical background, asks exactly what this means. Explain the meaning of this probability estimate to the patient in terms he or she will understand.

4A.2 Roll of the die. A standard die has six faces: one with one spot on it, one with two spots, and so on. By logic we say that the chance a die lands so that a "one" shows is 1 in 6. How would you design an experiment to estimate this probability?

4A.3 February birthdays. What is the probability of being born on: (A) Feb 28th? (B) Feb 29th? (C) Feb 28th or Feb 29th?

4A.4 Survival following treatment. In a clinical trial of a treatment for childhood leukemia, 475 of 601 patients survive at least 5 years following treatment.

(A) Based on this information, estimate the probability of surviving at least 5 years following treatment.
(B) Let's assume the estimated probability is accurate. What is the probability of not surviving 5 or more years?

4A.5 Sampling a small finite population, N = 26. Suppose a population has 26 members identified with the letters A through Z. 

(A) You randomly select one individual. What is the probability that you select individual A?
(B) Assume person A gets selected on an initial draw and we sample again with replacement. What is the probability of drawing person A again?
(C) Assume person A gets selected on the initial draw and we sample without replacement. What is the probability of drawing person A again?

4A.6 Autosomal recessive. An abnormal gene on one of the autosomal chromosomes from each parent is required to cause an autosomal recessive genetic disease. People with only one abnormal gene in the gene pair are carriers but will not exhibit the disease. (The normal gene of the pair can supply the function of the gene, so the abnormal gene is described as acting in a recessive manner.) Therefore, both parents must be carriers in order for a child to receive both defective genes and develop the disease. Assuming the contribution of autosomal genes from each parent is random, what is the probability of inheriting an autosomal recessive trait if both parents are carriers? [See http://ghr.nlm.nih.gov/handbook/illustrations/autorecessive.]

4A.7 Coin flipping. Flip or flick a coin 30 times. Count the number of tails. 

(A) Based on the 30 flips, estimate the probability of tails via the observed proportion. 
(B) It is unlikely you observed exactly 15 tails. (Only about 1 in 7 experiments will yield exactly 15 tails in 30 flips.) Why do most experiments fail to yield the expected number of tails based on a probability of 0.50?
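
The "about 1 in 7" figure quoted in (B) can be checked directly from the binomial formula. The following is a minimal Python sketch, assuming a fair coin (probability of tails 0.5) and independent flips; the variable names are illustrative and not part of the exercise.

```python
from math import comb

# Assumptions: fair coin, independent flips (illustrative names).
n_flips = 30
n_tails = 15
p_tails = 0.5

# Binomial formula: Pr(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
prob_exactly_15 = (
    comb(n_flips, n_tails)
    * p_tails**n_tails
    * (1 - p_tails) ** (n_flips - n_tails)
)

print(f"Pr(X = 15) = {prob_exactly_15:.4f}")
print(f"Roughly 1 in {1 / prob_exactly_15:.1f} experiments")
```

The result is close to 0.14, which is consistent with the "1 in 7" statement; it does not by itself answer the "why" asked in (B).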

4A.8 Continuous or discrete? Determine whether the following random variables are continuous or discrete.

(A) Number of successful treatments in 4 patients treated. 
(B) Average score on an exam.
(C) Number of patients visiting a clinic on a particular day.
(D) Mean birth weight in a random sample.

4A.9 Breast cancer. The lifetime risk of female breast cancer is approximately 1 in 10. You select 3 women at random and want to determine probabilities of occurrence.

(A) Build the probability distribution for the number of women in the sample who will ultimately develop breast cancer in the SRS of n = 3.
(B) How likely would it be to find a sample in which all three develop breast cancer? Provide three different explanations for such an observation. 

4A.10 Smoking on campus. The prevalence of smoking on a college campus is 20%. You select 2 students at random from the population. 

(A) Build the probability mass function for the number of smokers in your sample. 
(B) What percentage of samples will have 2 smokers?

4A.11 Childhood asthma. The prevalence of asthma in a pediatric population is 1 in 20. You select 20 children at random from this population.

(A) Use statistical notation to identify the probability mass function that describes the number of asthmatics in the sample. 
(B) What is the probability that no children in the sample will have asthma?
(C) What is the probability one child in the sample will have asthma?
(D) What is the probability one or fewer children will have asthma? [Note: Pr(X ≤ 1) = Pr(X = 0) + Pr(X = 1).]
(E) What is the probability at least 2 will have asthma? ["Probability of at least 2" = Pr(X ≥ 2) = 1 - Pr(X ≤ 1).]
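
The two notes in (D) and (E) describe a general pattern: add the point probabilities for the lower tail, then take the complement for the upper tail. A minimal Python sketch of that pattern follows. The helper name `binomial_pmf` and the parameters n = 5, p = 0.5 are hypothetical, chosen for illustration only, and are not the values from this exercise.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Pr(X = k) for X ~ b(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical illustration values, NOT the exercise's parameters.
n, p = 5, 0.5

# Lower tail: Pr(X <= 1) = Pr(X = 0) + Pr(X = 1)
pr_at_most_1 = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)

# Upper tail by the complement rule: Pr(X >= 2) = 1 - Pr(X <= 1)
pr_at_least_2 = 1 - pr_at_most_1

print(f"Pr(X <= 1) = {pr_at_most_1:.4f}")
print(f"Pr(X >= 2) = {pr_at_least_2:.4f}")
```

Substituting the n and p stated in a given problem reuses the same two steps.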

4A.12 Diabetes in a geriatric community. Suppose the prevalence of diabetes in a geriatric population is 10%. We take a simple random sample of n = 15 people from this population.

(A) Let X represent the number of diabetics in a random sample of n = 15. What is the likelihood of seeing no cases in your sample?
(B) What is the probability of seeing exactly one case?
(C) What is the probability of seeing 1 or fewer cases? [Pr(X ≤ 1) = Pr(X = 0) + Pr(X = 1).]
(D) What is the probability of at least two cases? [Pr(X ≥ 2) = 1 - Pr(X ≤ 1).]
(E) Would it be surprising to find a sample with 2 diabetics?

4A.13 All ten. The probability of spontaneous recovery is 95%. What is the probability 10 patients in a row recover spontaneously?

4A.14 Two failures. A treatment is 90% effective. What is the probability it fails in two of two patients selected independently for treatment? 

4A.15 Prevalence 77%. The prevalence of a characteristic in a population is 0.768. You select 10 individuals at random from this population. What is the probability 9 or more have the characteristic?

4A.16 X~b(13, 0.67). What is the probability a treatment that is 67% effective will work in at least 12 of 13 patients?

4A.17 X~b(3, 0.05). The lifetime risk of a disease is 5%. What is the probability 2 of 3 people selected at random will develop the disease?

4A.18 Personal expressions of probability. The probability associated with a proposition can be used to quantify the personal belief of a judgment. At one extreme, a probability statement of 0 represents the personal belief that the proposition can never happen. At the other extreme, a probability statement of 1 represents a personal belief that the proposition will always occur. Probabilities between these two extremes represent shades of gray. Match each of the percentages -- 95%, 80%, 50%, 20%, 5% -- with the narrative statement listed in (a) – (e) that most closely reflects its meaning.

(a) This seldom happens. It has a very low chance of occurring.
(b) This event is infrequent; it is unlikely.
(c) This happens as often as not; chances are even.
(d) This is very frequent and occurs with high probability.
(e) This event almost always occurs; it has a very high probability of occurrence.
