# Additional exercises for Chapters 5 & 6

Version: 7/21/06

## Confidence intervals for µ (population standard deviation is not known)

1. Blood pressure. A study found a mean systolic blood pressure of x-bar = 124.6 mm Hg in 35 individuals. The standard deviation was s = 10.3 mm Hg.
(A) Calculate the estimated standard error of the mean.
(B) How many people would you have to study to decrease the standard error of the mean to 1 mm Hg? [Recall that se = s / √n. Rearrange this formula to solve for n. Then plug in the values of se and s to derive the sample size requirement.]
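
The rearrangement described in the hint can be sketched in Python. This is only an illustration: the function names are made up for this sketch, and the input values are not the exercise data.

```python
import math

def standard_error(s, n):
    """Estimated standard error of the mean: se = s / sqrt(n)."""
    return s / math.sqrt(n)

def required_n(s, target_se):
    """Solve se = s / sqrt(n) for n, giving n = (s / se)^2, rounded up."""
    return math.ceil((s / target_se) ** 2)

# Illustrative values only (assumed, not taken from the exercise).
print(standard_error(8.0, 16))  # 2.0
print(required_n(8.0, 2.0))     # 16
```

Rounding up reflects that a sample size must be a whole number at least as large as the computed value.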

2. Published report. A study published in the American Journal of Public Health (Langenberg, 2005) addressed the statistical relation between tall stature, cardiovascular mortality, and employment grade. Results were reported in a table with the column heading “Mean Height, cm. (SE).” The table entry for “Stroke in the Low Employment Grade” was 173.2 (0.2) based on n = 1243. From this table, you are supposed to understand that x-bar = 173.2 and the standard error of the mean (sem) = 0.2. What is the standard deviation of the data in this sample? [Rearrange the formula for the sem to solve for s. Then plug the values of n and sem into the formula.]
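
Going the other way, from a reported standard error back to a standard deviation, might look like the following sketch. The function name `sd_from_sem` and the numbers are assumptions for illustration, not the values in the published table.

```python
import math

def sd_from_sem(sem, n):
    """Rearrange sem = s / sqrt(n) to recover s = sem * sqrt(n)."""
    return sem * math.sqrt(n)

# Illustrative values only: a reported SE of 0.5 based on n = 100.
print(sd_from_sem(0.5, 100))  # 5.0
```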

3. t curve. This exercise is intended to help you become familiar with the characteristics of t distributions.
(A) Sketch a t curve. To the eye, this curve will look like a z curve (i.e., have mean 0, points of inflection approximately 1 unit above and below the mean, and so on). Label the horizontal axis with tick marks at 1-unit intervals.
(B) Use the t table to determine the t quantile with 9 degrees of freedom and cumulative probability 0.90 (i.e., t9,.90). Place this value on the horizontal axis of the curve and shade the region under the curve to its right. The area in the right tail = 1 - 0.90 = 0.10.
(C) Use the symmetry of the t curve to determine the t quantile that cuts off the bottom 10% of the curve (i.e., t9,.10). Shade the region to the left of this point.
(D) What is the combined area of the shaded regions of the curve you just sketched?

4. t quantiles. Use your t table to determine the following t quantiles:
(A) t19,.95 [This is the t quantile with 19 degrees of freedom and cumulative probability 0.95.]
(B) t24,.975
(C) t35,.975
(D) t674,.99 [A t distribution with this many degrees of freedom is nearly the same as a  z distribution; use the row in the t table for z.]
(E) t19,.05  [Use your knowledge of the symmetry of the t distribution to determine the mirror image of  t19,.95.]
(F) t19,.025  [This is the mirror image of t19,.975.]

5. Approximating the areas beyond a t quantile. Sometimes you will need to determine the area under the curve to the right or left of a t quantile that does not appear in the body of the t table. For example, you may need to determine the area in the tail beyond a t statistic of 2.65 with 8 degrees of freedom. Even though this t quantile does not appear in the table, you can still derive its approximate probability by bracketing it between landmarks that are listed in the t table. In this case, a t statistic of 2.65 with 8 df is bracketed between t8,.975 (2.31) and t8,.99 (2.90). This shows it to have a cumulative probability that is a little bigger than 0.975 and a little smaller than 0.99.
(A) Sketch the t8 distribution curve (see exercise 3 for instruction), showing t8,.975 and t8,.99 on the horizontal axis of the curve. Wedge "2.65" between these landmarks.
(B) What is the approximate area under the curve to the right of 2.65?
(C) Use StaTable or other software to determine the exact area under the curve beyond 2.65 on a t distribution with 8 degrees of freedom.
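
If StaTable is not at hand, the tail area can also be approximated numerically. The sketch below is one possible approach, not the method the course prescribes: it integrates the t density with Simpson's rule using only the Python standard library. The function names and the `steps` parameter are assumptions made for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """Pr(T > t) for t >= 0, using Simpson's rule on [0, t] and symmetry.

    steps must be even for Simpson's rule.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        # Simpson weights: 4 at odd interior points, 2 at even ones.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # half the curve lies to the right of 0

tail = t_right_tail(2.65, 8)
print(round(tail, 4))
```

The result should fall between the right-tail areas of the two bracketing landmarks, 0.01 and 0.025, which is a useful check against the table. Note that `math.gamma` overflows for very large degrees of freedom, so this sketch suits small df only.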

6. More t probabilities.
(A) Sketch the probability (area under the curve) of observing a t quantile with 9 df that is greater than 2.82. Include t quantile landmarks on the horizontal axis of the sketch that bracket 2.82. What is Pr(T9 > 2.82)?
(B) What is the probability of seeing a t quantile with 9 df that is less than -2.98?

7. t critical values for a confidence interval. You have a SRS of n = 28 individuals. What value of the t quantile (critical value) would you use to calculate a 95% confidence interval for µ?

8. t for confidence. You have a SRS of n = 28 and want to calculate a 90% confidence interval. What t quantile would you use (from the t table) for your calculation?

9. Red wine (based on Nigdikar et al. 1998; Moore, 2003, pp. 416, 643). Drinking moderate amounts of wine may reduce the risk of coronary artery disease in some individuals. One possible reason for this is that red wine contains polyphenols, and polyphenols help improve serum cholesterol profiles. In an experiment involving 9 men, the subjects drank half a bottle of red wine each day for two weeks. Polyphenol levels in blood samples were measured at the beginning and end of the experiment. The percent changes in polyphenol levels were {3.5, 8.1, 7.4, 4.0, 0.7, 4.9, 8.4, 7.0, 5.5}. Calculate a 95% confidence interval for the mean percent change in polyphenols if all men drank this amount of red wine.

10. Calcium in sound teeth. A dental researcher measures the calcium content of sound teeth (% of tooth content that is calcium). A sample of 5 teeth shows the following values: {33.4, 36.2, 34.8, 35.2, 35.5}. Provide a 99% confidence interval for the mean percent calcium content of sound teeth. [You may use your calculator to find the mean and standard deviation. Please calculate the confidence interval by hand, showing all work.]

11. Boy height. A SRS of n = 26 boys between the ages of 13 and 14 reveals a mean height of 63.8 inches with a standard deviation of 3.1 inches. Assume height in the population varies according to a Normal distribution. Calculate a 95% CI for the mean height of all boys in this age range.
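
The arithmetic behind a t-based confidence interval, x-bar ± t · s / √n, can be sketched as follows. The function name and the input values are assumptions chosen for illustration (they are not the exercise data); the critical value t24,.975 = 2.064 is the standard table entry for 24 degrees of freedom.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar ± t_crit * s / sqrt(n).

    t_crit is the t quantile with n - 1 degrees of freedom, read from a t table.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Illustrative values only: xbar = 100, s = 10, n = 25, t(24, .975) = 2.064.
lower, upper = t_confidence_interval(100.0, 10.0, 25, 2.064)
print(round(lower, 2), round(upper, 2))  # 95.87 104.13
```

Passing the critical value in as an argument keeps the table lookup a separate, visible step, as it is when the interval is calculated by hand.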

12. Vector control in an African village. A study of insect vector control in an African village found that the mean sprayable surface area of 100 houses was 249 square feet with standard deviation =  39.82 square feet. (Data are fictitious but realistic; see Osborn, 1979, p. 6 for full data set.)
(A) Determine the 95% confidence interval for the mean sprayable surface of houses in the village.
(B) Would it be correct to say that 95% of all the houses in the village have sprayable surfaces between the lower confidence limit and upper confidence limit? Explain your response.

13. Respiratory function in furniture workers. Forced expiratory volume (FEV) is a measure of respiratory health in which you forcibly blow through a tube. The rate of air expelled (liters per second) is measured as an index of lung function. FEV values in seven workers at a furniture manufacturing plant are {3.94, 1.47, 2.06, 2.36, 3.74, 3.43, 3.78}. Calculate a 90% CI for the mean FEV for the population of furniture workers.

14. COPD (Rosner, 1990, p. 177). Skin-fold thickness taken at the triceps region averages 1.35 cm (standard deviation = 0.50 cm) in a sample of 40 healthy male controls with normal respiratory function. In 32 men with chronic obstructive pulmonary disease, skin-fold thickness at the triceps region averages 0.92 cm (standard deviation = 0.40 cm).
(A) Calculate a 95% confidence interval for the mean skin-fold thickness in the healthy population.
(B) Calculate a 95% confidence interval for the mean skin-fold thickness in the population of men with chronic obstructive pulmonary disease.
(C) Plot the above confidence intervals in side-by-side fashion on graph paper. Compare the intervals. Interpret your results.

15. Body weight, high school girls. Body weights, expressed as a percentage of ideal, in 9 high school girls are: {114, 100, 104, 94, 114, 105, 103, 105, 96}.
(A) Plot the data as a stemplot using split-stem values. Are there any major departures from Normality in the data?
(B) Assume these 9 girls represent a SRS from their school. Calculate a 95% confidence interval for the population mean µ of this variable in the school. Show all work.
(C) What is the margin of error of your estimate? (Numerical value.)
(D) How large a sample would be needed to reduce the margin of error of the 95% confidence interval to 3?
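
One common approximation for a sample-size question like part (D) treats the sample standard deviation as a stand-in for σ and uses a z quantile, giving n = (z · s / m)², where m is the desired margin of error. The sketch below assumes that approximation. It may not be the exact method your text uses, and the function name and input values are made up for illustration.

```python
import math
from statistics import NormalDist

def n_for_margin(s, m, confidence=0.95):
    """Approximate n so that z * s / sqrt(n) <= m, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return math.ceil((z * s / m) ** 2)

# Illustrative values only: s = 10 and a target margin of error of 2.
print(n_for_margin(10.0, 2.0))  # 97
```

Because the t critical value is slightly larger than z for modest sample sizes, the n obtained this way is a lower bound on what a t-based interval would need.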

16. Treatment of scrapie. Scrapie is a prion disease similar in pathology to bovine spongiform encephalopathy (mad cow disease) and new variant Creutzfeldt-Jakob disease. In a trial of a substance used to treat scrapie in hamsters, 10 scrapie-infected hamsters chosen at random were treated with the substance and 10 scrapie-infected hamsters were left untreated (Tagliavini, 1997).

(A) The mean time before the appearance of symptoms (induction time) in the treated group was 81.9 days (se = 2.2 days). Solve the formula for the standard error (se = s / √n) for the sample standard deviation s. Use this to determine the standard deviation in this group.
(B) The mean induction time in the control group was 102.8 days (se = 3.8 days). What was the standard deviation of the data in this group?

## Test of H0: µ = µ0 (population standard deviation not known)

17. P-value from tstat. A one-sample t statistic for H0: µ = 0  based on n = 16 is tstat = 2.44.
(A) How many degrees of freedom are associated with this test statistic?
(B) Provide the t quantiles from the t table that bracket the tstat.
(C) What are the right-tail probabilities for the bracketing t quantiles?
(D) What is the one-sided P-value for this problem?
(E) What is the two-sided P-value?
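
The bracketing logic in parts (B) through (E) can be sketched in code. To avoid giving away the answer, the example reuses the 8-df case from exercise 5 (tstat = 2.65) rather than this exercise's values. The function name and the list layout are assumptions made for this sketch; the quantiles are standard t-table entries for 8 degrees of freedom.

```python
def bracket_p_value(tstat, row):
    """Bracket the one-sided P-value of a positive tstat.

    row is a list of (t quantile, right-tail area) pairs for one df row
    of a t table. Returns (lower, upper) bounds on the right-tail area;
    None marks a side with no bracketing landmark.
    """
    at_or_below = [tail for q, tail in row if q <= tstat]
    above = [tail for q, tail in row if q > tstat]
    upper = min(at_or_below) if at_or_below else None
    lower = max(above) if above else None
    return lower, upper

# Table row for 8 df: (quantile, right-tail area).
T8_ROW = [(1.397, 0.10), (1.860, 0.05), (2.306, 0.025), (2.896, 0.01), (3.355, 0.005)]

lower, upper = bracket_p_value(2.65, T8_ROW)
print(lower, upper)          # 0.01 0.025  (one-sided P is between these)
print(2 * lower, 2 * upper)  # 0.02 0.05   (two-sided P doubles the bounds)
```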

18. Critical value. You take a SRS of n = 21 from a Normal population to test H0: µ = 0 versus H1: µ > 0. What values of the tstat will give a P-value that is less than or equal to 0.01?

19. Beware α = 0.05. Two trials looked at red wine consumption in lowering overall cholesterol levels in hypercholesterolemic men. These fictitious studies were done under identical conditions. In each trial, men consumed eight ounces of red wine per day.
(a) In trial A, 25 subjects lowered their cholesterol by an average of 5 percent. The standard deviation of the change was 11.9 percent. In testing H0: µ = 0, tstat = 2.10, df = 24, and P = 0.0464. Is this study significant at alpha = 0.05?
(b) In trial B, 25 different subjects lowered their cholesterol by 5% with standard deviation 12.2 percent. The tstat = 2.05, df = 24, and P = 0.0514. Is this result significant at alpha = 0.05?
(c) Is it reasonable to come to a different conclusion for trial A and trial B?

20. Red wine. Test the data in exercise 9 for a significant change in polyphenol levels. (Note: No mean change implies µ = 0.) Conduct a flexible significance test with a two-sided alternative. Show all testing steps (except for step B).

21. Menstrual cycle length. Menstrual cycle lengths (days) in a random sample of 9 women are {31, 28, 26, 24, 29, 33, 25, 26, 28}. Assume the population is Normal. Test whether the mean length of the menstrual cycle in this population equals a lunar month. (A lunar month is 29.5 days.) Show all hypothesis testing steps, including statements of the null and alternative hypotheses.
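
When a test starts from raw data, the one-sample t statistic is tstat = (x-bar − µ0) / (s / √n) with n − 1 degrees of freedom. A minimal sketch, using made-up data and a hypothetical function name rather than the exercise values:

```python
import math
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """Return (tstat, df) for testing H0: mu = mu0 from raw data."""
    n = len(data)
    se = stdev(data) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(data) - mu0) / se, n - 1

# Illustrative data only (not from any exercise).
tstat, df = one_sample_t([1, 2, 3, 4, 5], mu0=2)
print(round(tstat, 3), df)  # 1.414 4
```

The resulting tstat and df are then taken to the t table to bracket the P-value.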

22. Behavioral problems in stressed adolescents. Because there is evidence that stress in a child's life may lead to behavioral problems later in life, it might be expected that a sample of children who have been subjected to an unusual amount of stress would show an unusually high level of behavioral problems. On the other hand, children with high stress levels could feel that they have had enough going on in their lives without complicating matters further, so they might actually show an unusually low number of behavioral problems. To investigate this question, we select a SRS of 5 children, each of whom is under a high level of social stress. We examine these children and assign each a score, where a score of 50 represents an average amount of behavioral problems. Data are {48, 62, 53, 51, 58}. Test whether these scores are significantly different from the established norm of 50. Show all hypothesis testing steps.

23. Cholesterol in Asian immigrants (Rosner, 2000, p. 223). We want to compare fasting serum cholesterol levels of recent Asian immigrants to those of the overall U.S. population. Assume cholesterol levels in 20- to 39-year-old women in the United States are Normal with µ = 190 mg/dl. Blood tests are performed on 100 female Asian immigrants in this age range. The mean cholesterol level in this sample is 181.52 mg/dl (standard deviation = 40 mg/dl). Conduct a two-sided test to determine whether the recent immigrants have lower average cholesterol levels than their native counterparts. Show all hypothesis testing steps.

24. Lowering elevated heart rate. A new calcium channel blocking agent is tested in 9 patients with unstable angina. Resting heart rate after 48 hours of treatment decreased an average of 5 beats per minute (standard deviation = 2.5). Is this drop significant? (Use a two-sided alternative.)

25. SIDS. A sample of birth weights (in grams) of 10 infants who had died of Sudden Infant Death Syndrome (SIDS) in a large metropolitan area was {2998, 3740, 2031, 2804, 2454, 2780, 2203, 3803, 3948, 2144}. The mean weight of all births in this metropolitan area was 3300 grams. Is the mean birth weight of SIDS cases significantly different from that of the rest of the population? [Same as illustrative example in biostat-text.]

26. Everley's syndrome. Determination of plasma calcium concentration in 18 patients between the ages of 20 and 44 years with Everley's syndrome gave a mean of 3.2 mmol/l, with standard deviation 1.1. (Everley's syndrome is a rare congenital disorder. High calcium concentration is thought to provide a useful diagnostic sign and clues to the efficacy of treatment.) Published reports show that healthy people have plasma calcium concentrations that average 2.5 mmol/l. Is the mean calcium level in the Everley's syndrome patients abnormally high? (Source: http://bmj.bmjjournals.com/collections/statsbk/7.shtml).
