11: Variances and means Version: 2/8/07

Review Questions 

  1. Provide synonyms for the word variance. What symbol is used to denote the population variance? What symbol is used to denote the sample variance?
  2. Provide a synonym for the term standard deviation. What symbol is used to denote the population standard deviation? What symbol is used to denote the sample standard deviation?
  3. When will 95% of values lie within 2 standard deviations of the mean?
  4. What is the name of the rule that states "at least 75% of values lie within 2 standard deviations of the mean"?
  5. Name two measures of spread other than the variance and standard deviation. 
  6. How do you use a boxplot to assess variance? 
  7. The sum of squares is the sum of the squared distances of data points around the group's ___________.
  8. State the relationship between the sum of squares (SS) and the variance.
  9. What procedure tests for inequality of population variances?
  10. Why not pool variances when the F ratio test is significant?
  11. T/F: The standard error of the mean difference is a measure of spread.
  12. When pooling variances (for Student's t procedures), n1 = 11 and n2 = 10. Then, df1 = ____, df2 = ____, and df = ____.
  13. Suppose tstat = 2.96 with 16 degrees of freedom. Is the two-sided p value for this problem less than 0.05? Is it less than 0.01?
  14. What is the value of the t quantile used to calculate a 95% confidence interval for µ1 - µ2 when n1 = 10 and n2 = 8? (Equal variance method.)
  15. Using statistical notation, write the null and alternative hypotheses for the F-ratio test.
  16. In words, write the null and alternative hypotheses for the F ratio test.
  17. Using statistical notation, write the null and alternative hypotheses for a two-sided independent t test.
  18. What symbol is used to denote the independent mean difference in the population? . . . in the sample?
  19. List ways to compare group variability.
  20. List ways to compare group averages. 
  21. What is the name of the t test that makes no assumptions about the equality of population variances? 

Exercises

11.1 Comparing means depends on within-group variability. Whether an observed difference in means is surprising depends on the variance within groups: the more individuals within groups vary, the more likely it is that a difference of a given size will arise by chance. This is why we take variance into account when comparing means. Consider the stemplots below. In both comparisons, group 1 has a mean of 70 and group 2 has a mean of 50. However, we are confident the difference observed in Comparison B is real, while the observed difference in Comparison A might be due to chance fluctuation. Conduct t tests (for both comparisons) to confirm this suspicion.

Comparison A

Group 1       Group 2
        0|9|
        0|8|
        0|7|0
        0|6|0
        0|5|0
         |4|0
         |3|0
         |2|
         |1|
        (x10)

Comparison B

Group 1       Group 2
         |9|
        0|8|
      000|7|
        0|6|0
         |5|000
         |4|0
         |3|
         |2|
         |1|
        (x10)
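
The equal variance (pooled) t statistic for this exercise can be sketched in a few lines of Python. This is only an illustration, not the procedure prescribed by the text: the function name pooled_t is made up for this sketch, the data values are read off the stemplots above, and the critical value 2.306 (two-sided, alpha = 0.05, 8 df) is taken from a standard t table rather than computed.

```python
import math
from statistics import mean, variance

def pooled_t(group1, group2):
    """Equal-variance (pooled) t statistic for two independent samples."""
    n1, n2 = len(group1), len(group2)
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: df-weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    # Standard error of the mean difference.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, df

# Values read off the stemplots (each leaf 0 on stem k is the value k x 10).
comparison_a = ([90, 80, 70, 60, 50], [70, 60, 50, 40, 30])
comparison_b = ([80, 70, 70, 70, 60], [60, 50, 50, 50, 40])

T_CRIT_DF8 = 2.306  # assumption: two-sided 5% critical value for 8 df, from a t table

for label, (g1, g2) in (("A", comparison_a), ("B", comparison_b)):
    t_stat, df = pooled_t(g1, g2)
    print(f"Comparison {label}: t = {t_stat:.2f}, df = {df}, "
          f"exceeds critical value: {abs(t_stat) > T_CRIT_DF8}")
```

Both comparisons have the same mean difference (20), so any difference between the two t statistics comes entirely from the within-group variances.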

11.2 Leaves on a common stem. Plot the data sets listed below as side-by-side stemplots. (Create a separate side-by-side plot for each comparison.) Based on these plots, compare group means and variances. (Calculations not necessary.)

Comparison A:   Group 1: 90, 70, 50, 30, 10    Group 2: 70, 60, 50, 40, 30 
Comparison B:   Group 1: 90, 80, 70, 60, 50    Group 2: 70, 60, 50, 40, 30
Comparison C:   Group 1: 90, 70, 50, 30, 10    Group 2: 90, 80, 70, 60, 50

11.3 Linoleic acid and LDL cholesterol. A study tested the cholesterol-lowering potential of dietary linoleic acid in mildly hypercholesterolemic subjects (Rassias et al., 1990). Plasma cholesterol levels (mmol/m3) in twelve mildly hypercholesterolemic subjects were:

6.0 6.4 7.0 5.8 6.0 5.8 5.9 6.7 6.1  6.5 6.3  5.8

A different (fictitious) group had the following values:

6.4 5.4  5.6 5.0 4.0 4.5  6.0

Data are stored online in rassias.sav.

(A) Compare the groups with stemplots on a common stem. Discuss your findings.
(B) The mean of group 1 = 6.192 mmol/m3. Its standard deviation = 0.392 mmol/m3. Calculate by hand the mean and standard deviation of group 2.
(C) Test the variances for inequality with an F ratio test.
(D) Test the means with a t test.
(E) Summarize your analysis.

11.4 Particulate matter in air samples. In a study of air pollution, investigators measured suspended particulate matter (µg/m3) in air samples at two sites. Data are shown below and are stored online in airsamples.sav.

Site 1: 68   22   36   32   42   24   28   38
Site 2: 36   38   39   40   36   34   33   32

(A) Create stemplots on a common stem to compare these two distributions. Use split stem-values for your plot. Discuss your findings.
(B) Calculate the means and standard deviations of the data from the two sites. How do these summary statistics complement your stemplot analysis?  
(C) Test the variances for inequality with an F ratio test. Include a statement of the null hypothesis. 
(D) Would you use a pooled (equal variance) t procedure to test the means? Explain your response.
(E) Test the means for inequality. Show all hypothesis testing steps.
(F) What is the most important finding in this analysis? Was the test of means revealing or obscuring? 
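
As a rough companion to parts (B), (C), and (E), the sketch below computes the F ratio statistic and an unequal variance t statistic from the site data. It is a minimal sketch under stated assumptions: the function names f_ratio and welch_t are made up here, the F ratio follows the common convention of placing the larger sample variance in the numerator, and the degrees of freedom for the t statistic use the Welch-Satterthwaite approximation. P values are not computed; they would come from F and t tables or from statistical software.

```python
import math
from statistics import mean, variance

def f_ratio(x, y):
    """F ratio with the larger sample variance in the numerator.

    Returns (F, numerator df, denominator df).
    """
    var_x, var_y = variance(x), variance(y)
    if var_x >= var_y:
        return var_x / var_y, len(x) - 1, len(y) - 1
    return var_y / var_x, len(y) - 1, len(x) - 1

def welch_t(x, y):
    """Unequal-variance t statistic with Welch-Satterthwaite df."""
    a = variance(x) / len(x)  # squared standard error of mean, group x
    b = variance(y) / len(y)  # squared standard error of mean, group y
    t_stat = (mean(x) - mean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (len(x) - 1) + b ** 2 / (len(y) - 1))
    return t_stat, df

site1 = [68, 22, 36, 32, 42, 24, 28, 38]
site2 = [36, 38, 39, 40, 36, 34, 33, 32]

f_stat, df_num, df_den = f_ratio(site1, site2)
t_stat, df_welch = welch_t(site1, site2)
print(f"F = {f_stat:.2f} with {df_num} and {df_den} df")
print(f"t = {t_stat:.3f} with about {df_welch:.1f} df")
```

Comparing the size of the F statistic with the size of the t statistic is one way to approach part (F).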

11.5 Body weight and pituitary adenoma. The standard deviation of body weights in n = 12 patients with pituitary adenomas is 21.4 kg. The standard deviation in a control group (n = 5) is 12.4 kg (Daniel, 1999, p. 251). Calculate an F ratio statistic to test the variances for inequality. Show all hypothesis testing steps.

11.6 Anxiety during hemodialysis. Severe anxiety often accompanies chronic hemodialysis. To help counteract this anxiety, a set of progressive relaxation exercises was shown on videotape to a group of 38 hemodialysis patients. A control group of 23 patients viewed a set of neutral videotapes. Following these interventions, the State-Trait Anxiety Inventory questionnaire was administered to both groups. The treatment group had a mean anxiety score of 33.42 (standard deviation = 10.18). The control group had a mean score of 39.71 (standard deviation = 9.16) (Alarcon, 1982).

(A) Test the variances for a significant difference. Show all steps of the procedure. Remember to interpret your results.
(B) Now test the difference in the means for significance. Again, show all steps in the procedure. 
(C) For question B, did you use an equal variance or unequal variance t procedure? Justify use of the procedure that you did use.
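
When only summary statistics are reported, as in this exercise, the same test statistics can be computed from the group sizes, means, and standard deviations. The sketch below is illustrative only: the function names are made up for this example, the F ratio again places the larger variance in the numerator, and the pooled t statistic is shown without any claim that pooling is the right choice here (that judgment is the subject of part C).

```python
import math

def f_ratio_from_sd(s1, n1, s2, n2):
    """F ratio from two standard deviations, larger variance in the numerator.

    Returns (F, numerator df, denominator df).
    """
    if s1 >= s2:
        return (s1 / s2) ** 2, n1 - 1, n2 - 1
    return (s2 / s1) ** 2, n2 - 1, n1 - 1

def pooled_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Equal-variance t statistic from means, standard deviations, and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Summary statistics as reported in Exercise 11.6.
treatment = {"mean": 33.42, "sd": 10.18, "n": 38}
control = {"mean": 39.71, "sd": 9.16, "n": 23}

f_stat, df_num, df_den = f_ratio_from_sd(
    treatment["sd"], treatment["n"], control["sd"], control["n"])
t_stat, df = pooled_t_from_summary(
    treatment["mean"], treatment["sd"], treatment["n"],
    control["mean"], control["sd"], control["n"])
print(f"F = {f_stat:.3f} with {df_num} and {df_den} df")
print(f"t = {t_stat:.3f} with {df} df")
```

The same two functions apply to the other exercises in this set that report only n, mean, and standard deviation.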

11.7 Heart size and congestive heart failure. The total heart weights (grams) and body weights (kilograms) for two groups of cadavers are stored in heartwt.sav. Group 1 died of congestive heart failure. Group 2 died of other problems. Data are also shown below (Rosner, 1990, p. 35). We want to know whether group 1 demonstrates significantly greater variability than group 2. Compare the groups with side-by-side boxplots. Which group has higher values on average? Which has greater variability? What "pops out"? Test the variances for a significant difference. Test the difference in means for significance. Discuss your findings.

Total heart weight (grams)
Group 1 (heart failure) 450 760 325 495 285 450 460 375 310 615 425
Group 2 (controls) 245 350 340 300 310 270 300 360 405 290  

11.8 Body weights of cadavers with and without heart failure (Rosner, 1990, p. 35). Return to the "heart size" data set introduced in the prior exercise. Now consider body weights of subjects. Data are listed below and are stored in the variable BW in the file heartwt.sav.

(A) Using SPSS, explore the groups with side-by-side plots (of your choice).
(B) Test variances for equality using either an F ratio test or Levene's test. Show all hypothesis testing steps.
(C) Test H0: µ1 = µ2. Would you use an unequal variance or equal variance t procedure in this instance?

Body weight (kg)
Group 1 (heart failure) 54.6 73.5 50.3 44.6 58.1 61.3 75.3 41.1 51.5 41.7 59.7
Group 2 (controls) 40.8 67.4 53.3 62.2 65.5 47.5 51.2 74.9 59 40.5

11.9 Efficacy of echinacea in treating upper respiratory infections (severity of symptoms). A randomized, double-blind, placebo-controlled trial evaluated the herbal remedy Echinacea purpurea in treating upper respiratory tract infections in 2- to 11-year-old children. Each time a child had an upper respiratory tract infection, treatment with either echinacea or a placebo was given for the duration of the illness. Among the outcomes studied was "severity of symptoms." A severity rating based on 4 symptoms monitored by the parents of subjects was recorded for each of the subjects. The peak severity of symptoms in the 337 cases treated with echinacea averaged a score of 6.0 (standard deviation 2.3). The peak severity of symptoms in the placebo group (n2 = 370) averaged 6.1 (standard deviation 2.4) (Taylor et al., 2003). Test the mean difference for significance. Discuss your findings.

11.10 Efficacy of echinacea in treating upper respiratory infections (duration of symptoms). The echinacea study introduced in the prior exercise also measured the duration of peak symptoms in study subjects. The treatment group (n = 337) had a mean duration of 1.60 days (standard deviation 0.98 days). The control group (n = 370) had a mean duration of peak symptoms of 1.64 days (standard deviation 1.14 days) (Taylor et al., 2003). Test the mean difference for significance.

11.11 The effect of calcium supplementation on blood pressure (Lyle et al., 1987). A randomized, double-blind, placebo-controlled trial was conducted to examine the effects of calcium supplementation on blood pressure in normotensive men between the ages of 19 and 52 years. After a baseline blood pressure measurement was established, subjects were assigned to either a treatment group or a placebo group. The treatment group was supplemented with 1500 milligrams of calcium per day for a 12-week period. Systolic blood pressure was measured in the seated position at the end of the 12-week period. Data for the African-American men in the study are shown below and are stored online in Lyle1987.sav. Create side-by-side boxplots comparing the two groups. Discuss your findings.

GROUP: 1 = treatment, 2 = placebo. BEFORE and AFTER are in mm Hg. DELTA is the decrease (BEFORE − AFTER).

SUBJECT   GROUP   BEFORE   AFTER   DELTA
1         1       107      100     7
2         1       110      114     −4
3         1       123      105     18
4         1       129      112     17
5         1       112      115     −3
6         1       111      116     −5
7         1       107      106     1
8         1       112      102     10
9         1       136      125     11
10        1       102      104     −2
11        2       123      124     −1
12        2       109      97      12
13        2       112      113     −1
14        2       102      105     −3
15        2       98       95      3
16        2       114      119     −5
17        2       119      114     5
18        2       114      112     2
19        2       110      121     −11
20        2       117      118     −1
21        2       130      133     −3

11.12 The effect of calcium supplementation on blood pressure. For the data in Exercise 11.11, test the decreases in blood pressure for a significant difference. 

 
