8: Key, Odd
Review Questions
- Data points in independent samples are unrelated. You should think of
these as SRSs from separate populations. In contrast, each data point in a paired
sample is uniquely matched to a data point in the other paired sample.
- Use one variable to store data for the response (outcome) variable and a separate variable to store group
information.
- Stemplots on a common stem; side-by-side boxplots. [Techniques not
covered: mean ± standard deviation plot; mean ± standard error plot;
dotplots.]
-
1
-
2
- df = df1 + df2 = (n1 - 1) + (n2 - 1) = n1 + n2 - 2
- There is no difference in the population means. Or, the mean in population 1 is equal to the mean in population 2.
- H0: µ1 - µ2 = 0. Or, H0: µ1 = µ2
- (1) the standard deviation of the variable, (2) the difference worth
detecting, (3) the desired alpha, and (4) the desired power.
- independent samples, Normality, equal variance
- True. Populations do not need to be Normal. The sampling distribution of
the mean (SDM), however, needs to be Normal.
- The 95% confidence refers to the procedure used to construct the confidence
interval, not to the confidence interval once it is drawn. Once the confidence interval
is drawn, there is either a 100% or a 0% chance that it contains the true mean difference.
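The long-run meaning of "95% confidence" can be illustrated with a small simulation. The sketch below is not part of the answer key. The population means, standard deviation, sample sizes, and seed are made-up values chosen only for illustration. It uses the z value 1.96 with a pooled standard error, which is reasonable here only because the hypothetical samples are fairly large.

```python
import math
import random
import statistics

def coverage(reps=2000, n1=50, n2=50, mu1=100.0, mu2=95.0, sigma=10.0, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean difference.

    All parameter values are hypothetical; none come from the exercises.
    """
    rng = random.Random(seed)
    true_diff = mu1 - mu2
    hits = 0
    for _ in range(reps):
        x1 = [rng.gauss(mu1, sigma) for _ in range(n1)]
        x2 = [rng.gauss(mu2, sigma) for _ in range(n2)]
        # Pooled variance: weighted average of the two sample variances.
        s2p = ((n1 - 1) * statistics.variance(x1)
               + (n2 - 1) * statistics.variance(x2)) / (n1 + n2 - 2)
        se = math.sqrt(s2p * (1 / n1 + 1 / n2))
        diff = statistics.mean(x1) - statistics.mean(x2)
        # Each individual interval either contains true_diff or it does not.
        if diff - 1.96 * se <= true_diff <= diff + 1.96 * se:
            hits += 1
    return hits / reps

print(f"Proportion of intervals covering the true difference: {coverage():.3f}")
```

Any single interval in the loop either captures the true difference or misses it. Only the proportion across many repetitions should come out close to 95%.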
Exercises
8.1 Sampling designs.
(A) Autism. Independent samples.
(B) Husbands and wives. Paired samples.
(C) Nutritional knowledge. A single sample.
8.3 Facetious study
(A) Group 1 mean = 98. Group 2 mean = 108. Independent
samples.
(B) Mean change in group 1 = 4. Paired samples.
(C) Mean change within group 2 is equal to 1. This is a paired difference.
(D) Group 1 had a greater mean change (4 vs. 1). This is an independent
comparison of the two groups.
8.5 Large t statistic. When the sample contains more than just
several observations, the t distribution is almost identical to a z
distribution.
Based on the 68-95-99.7 rule, we know that we would almost never get a
statistic this far from 0 on a z distribution. Therefore, this t statistic falls far out in the right-hand
tail, and the P value will be very small (surely less than 0.01). This indicates
a highly significant result.
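The 68-95-99.7 rule used in this argument can be checked numerically. The sketch below uses the standard Normal curve from Python's `statistics` module as a stand-in for the t distribution, following the approximation described above. The function names are illustrative only.

```python
from statistics import NormalDist

def central_area(k):
    """Area under the standard Normal curve within k standard deviations of 0."""
    z = NormalDist()  # mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

def two_sided_tail(k):
    """Area beyond k standard deviations in both tails combined."""
    return 1 - central_area(k)

for k in (1, 2, 3):
    print(f"within {k} SD: {central_area(k):.4f}; outside: {two_sided_tail(k):.4f}")
```

A statistic more than 3 standard errors from 0 already leaves well under 1% of the curve in the two tails combined, which is why a large t statistic implies a very small P value.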
8.7 Bone density in newborns. Data are bone mineral density
measures in grams / centimeter³.
The 95% confidence interval for mean difference is based on:
- df1 = 77 - 1 = 76; df2 = 161 - 1 = 160; df = 76 + 160 = 236. With this many
df, we can use the z value of 1.96 for 95% confidence.
- s²p = [(76)(0.026²) + (160)(0.025²)] / 236 = 0.000641
- SExbar1-xbar2 = sqrt [.000641 × (1/77 + 1/161)] = 0.00351
- 95% confidence interval for µ1 - µ2 = (0.098 - 0.095) ± (1.96)(0.00351) =
0.003 ± 0.007 = (-0.004, 0.010)
- Interpretation: We can be 95% confident that the mean difference in the
population is between -0.004 and 0.010. Our confidence is in the
method used to calculate the interval, and not in this particular result as
such.
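The arithmetic for 8.7 can be reproduced from the summary statistics alone. This is a minimal sketch: the function name `pooled_ci` is made up for illustration, and the critical value 1.96 is the z approximation used in the answer above rather than an exact t value.

```python
import math

def pooled_ci(mean1, s1, n1, mean2, s2, n2, crit=1.96):
    """Equal-variance confidence interval for mu1 - mu2 from summary statistics."""
    df = (n1 - 1) + (n2 - 1)
    s2p = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    se = math.sqrt(s2p * (1 / n1 + 1 / n2))           # SE of the mean difference
    diff = mean1 - mean2
    return df, s2p, se, (diff - crit * se, diff + crit * se)

# Summary statistics as used in the answer to 8.7.
df, s2p, se, (lo, hi) = pooled_ci(0.098, 0.026, 77, 0.095, 0.025, 161)
print(f"df = {df}, s2p = {s2p:.6f}, SE = {se:.5f}")
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
```

Because the interval includes 0, the data are compatible with no difference between the two population means.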
8.9 Cytomegalovirus and coronary stenosis. Data
represent luminal diameter reduction (mm) of coronary arteries over 6 months
of follow-up.
- Step A (Hypotheses): H0: µ1 = µ2 vs. H1: µ1 ≠ µ2
- Step B (Statistics): df1 = 49 - 1 = 48, df2 = 26 - 1 = 25, df = 48 + 25 = 73;
s²p = [(48)(0.83²) + (25)(0.69²)] / 73 = 0.616;
SEmean dif = sqrt[0.616 × (1/49 + 1/26)] = 0.1904;
tstat = (1.24 - 0.68) / 0.1904 = 2.94
- Step C (P value): Two-sided P is between 0.002 and 0.01
- Step D ("Significance conclusion"): The difference is
highly significant (reject H0). Data support the theory that CMV plays a role in coronary
restenosis.
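Step B can be checked with a few lines of code. The sketch below computes the pooled t statistic from the summary statistics quoted in the answer. The helper name `pooled_t` is hypothetical. It stops short of the P value because the Python standard library has no t distribution.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Equal-variance (pooled) t statistic and its degrees of freedom."""
    df = (n1 - 1) + (n2 - 1)
    s2p = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    se = math.sqrt(s2p * (1 / n1 + 1 / n2))           # SE of the mean difference
    return (mean1 - mean2) / se, df

t_stat, df = pooled_t(1.24, 0.83, 49, 0.68, 0.69, 26)
print(f"t = {t_stat:.2f} with {df} df")
```

To bracket the P value as in Step C, compare the statistic with a t table, or use a statistics package (for example `scipy.stats.t.sf`, if SciPy is installed).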
8.11 Pregnancy-induced hypertension and aspirin
- H0: µ1 = µ2 vs. H1: µ1 ≠ µ2
- df1 = 22, df2 = 23, df = 45
- s²pooled = [(22)(8²) + (23)(8²)] / 45 = 64 [The pooled
estimate of variance is a weighted average of the two sample variances. Both samples have a variance of 8² = 64,
so the pooled variance is also 64.]
- semeandif = sqrt[64 × (1/23 + 1/24)] = 2.334
- tstat = (111 - 109) / 2.334 = 0.86
- The two-tailed P-value is between 0.30 and 0.40.
- The difference is not significant (do not reject H0)
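The bracketed remark about the pooled variance being a weighted average can be seen directly in code. In this sketch the first call uses the 8.11 values. The second call uses made-up standard deviations and sample sizes, purely to show how the weighting behaves when the variances differ.

```python
def pooled_variance(s1, n1, s2, n2):
    """Pooled variance: sample variances weighted by their degrees of freedom."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1**2 + df2 * s2**2) / (df1 + df2)

# 8.11: both standard deviations are 8, so the pooled variance is 8**2 = 64.
print(pooled_variance(8, 23, 8, 24))

# Hypothetical unequal SDs (not from the exercise): the result lies between
# the two sample variances (36 and 100), closer to the group with more df.
print(pooled_variance(6, 11, 10, 31))
```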
8.13 Risk Taking
Group 1 (Girls):
5-point summary: 72, 86, 95, 97, 125
IQR = 97 - 86 = 11
FU = 97 + (1.5)(11) = 113.5; 125 is outside; upper inside value is 99
FL = 86 - (1.5)(11) = 69.5; no lower outside values
Group 2 (Boys):
5-point summary: 89, 93, 105.5, 126, 130
IQR = 126 - 93 = 33
FU = 126 + (1.5)(33) = 175.5; no upper outside values
FL = 93 - (1.5)(33) = 43.5; no lower outside values
Interpretation:
- Location: Girls have lower scores on average.
- Shape: The small sample sizes make statements about shape tenuous.
- Group 1 has an upper outside value.
- Spread: Girls tend to have less variability (except for the outlier,
which may need further scrutiny).
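The fence calculations for 8.13 follow the same recipe for both groups, so they can be sketched as a small helper. Only the five-point summaries given above are used, since the raw data are not reproduced in this key. The function names are illustrative.

```python
def fences(q1, q3, k=1.5):
    """Lower and upper boxplot fences from the quartiles (hinges)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def outside(summary):
    """Which of the minimum and maximum fall outside the fences.

    summary is a five-point summary: (min, Q1, median, Q3, max).
    """
    low, q1, _, q3, high = summary
    fl, fu = fences(q1, q3)
    return {"lower": low < fl, "upper": high > fu}

girls = (72, 86, 95, 97, 125)
boys = (89, 93, 105.5, 126, 130)
for name, summary in (("Girls", girls), ("Boys", boys)):
    print(name, fences(summary[1], summary[3]), outside(summary))
```

Checking only the minimum and maximum is enough to tell whether any outside values exist. Finding the upper inside value (99 for the girls) would require the raw data.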
8.15 Efficacy of echinacea in treating upper respiratory
infections (severity of symptoms). H0: µ1 = µ2.
Pooled variance = 5.536; tstat = -0.5644 (705 df); P-value = 0.5727.
The test reveals no significant difference
between the herbal remedy (echinacea) and placebo.
Sample size questions
SS.1. Vegetarians and non-vegetarians.
n = [(16)(40²) / 10²] + 1 = 257
per group
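The sample size formula used in SS.1 is easy to wrap in a function. In the sketch below, `sigma` is the standard deviation (40) and `delta` is the difference worth detecting (10). Rounding up with `math.ceil` is an added convention for cases where the result is not a whole number. The constant 16 is commonly associated with a two-sided alpha of 0.05 and 80% power. Treat that as an assumption here, since the exercise text is not reproduced in this key.

```python
import math

def n_per_group(sigma, delta):
    """Per-group sample size from the rule n = 16 * sigma^2 / delta^2 + 1."""
    return math.ceil(16 * sigma**2 / delta**2 + 1)

print(n_per_group(40, 10))  # values used in SS.1
```

Because `delta` enters as a square, halving the difference worth detecting roughly quadruples the required sample size per group.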