Introduction | Confidence Interval | p Values | Power and Sample Size | Exercises
We consider two independent groups derived from either a cohort or a cross-sectional sample. One group is exposed to a factor while the other is nonexposed. Each individual is classified as diseased or not diseased according to defined criteria. Data are cross-tabulated with cells labeled as follows:
            | Disease +  Disease - | Total
  Exposed + |     a          b     |  n1
  Exposed - |     c          d     |  n2
Let p1 represent the proportion with disease in the exposed group (p1 = a / n1) and p2 represent the proportion with disease in the non-exposed group (p2 = c / n2). The ratio of p1 to p2, the proportion ratio, is often referred to as the risk ratio or relative risk:
rr = p1 / p2
Notation: Lower case acronyms denote estimators, while upper case represent parameters. Thus, rr represents the risk ratio estimate and RR denotes the risk ratio parameter.
Illustrative data (TOXIC.REC). As an example, we consider a cohort of cancer patients undergoing bone marrow ablation with the drug cytarabine (Jolson et al., 1992). One group is exposed to (i.e., treated with) a generic drug while the other group uses the innovator manufacturer's product (and is thus nonexposed). Exposure information is stored in the variable GENERIC (exposed: 1 = yes, 2 = no). Disease information, denoting cerebellar toxicity, is stored in the variable TOX (1 = yes, 2 = no). The first three records and the last record of the data set are:
REC GENERIC TOX
--- ------- -----------
1 1 1
2 1 2
3 1 2
| | |
59 1 2
Data are cross-tabulated with the command:
EPI6> TABLES <exposure> <disease>
where <exposure> and <disease> represent the names of the exposure and disease variables, respectively.
For example, to cross-tabulate the current data issue the command:
EPI6> TABLES GENERIC TOX
GENERIC |    1     2 | Total
      1 |   11    14 |    25
      2 |    3    31 |    34
  Total |   14    45 |    59
Thus, the incidence of toxicity in the exposed group (p1) = 11 / 25 = 0.440, the incidence in the unexposed group (p2) = 3 / 34 = 0.088, and rr = p1 / p2 = 4.99 (using the unrounded proportions) ≈ 5.0, indicating that toxicity was about 5 times as frequent in the exposed group as in the non-exposed group.
Risk ratio estimates are printed in the output below the 2-by-2 table. For the illustrative example:
RISK RATIO(RR)(Outcome:TOX=1; Exposure:GENERIC=1) 4.99
95% confidence limits for RR 1.55 < RR < 16.03
The confidence interval assumes data are free of biases. Since this is unrealistic, the confidence interval should be viewed as a rough estimate of the parameter.
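The estimate and confidence limits printed above can be checked by hand. The following is a minimal Python sketch, not Epi Info's own code. It assumes the limits come from the usual log-based (Taylor series) method, which the text does not state explicitly; the function name risk_ratio_ci and its arguments are hypothetical and follow the a, b, c, d cell labels of a 2-by-2 table (a, b = diseased and not diseased among the exposed; c, d = the same among the nonexposed).

```python
import math
from statistics import NormalDist


def risk_ratio_ci(a, b, c, d, conf=0.95):
    """Risk ratio with log-based (Taylor series) confidence limits.

    a, b = diseased / not diseased among the exposed
    c, d = diseased / not diseased among the nonexposed
    Requires a > 0 and c > 0 (otherwise rr or its limits are undefined).
    """
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    rr = p1 / p2
    # Approximate standard error of ln(rr); assumed method, see lead-in
    se_ln_rr = math.sqrt(b / (a * n1) + d / (c * n2))
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    lower = math.exp(math.log(rr) - z * se_ln_rr)
    upper = math.exp(math.log(rr) + z * se_ln_rr)
    return rr, lower, upper


# Cytarabine data: 11 of 25 exposed and 3 of 34 nonexposed had toxicity
rr, lower, upper = risk_ratio_ci(11, 14, 3, 31)
print(f"rr = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With the cytarabine counts, this sketch gives rr = 4.99 with limits of 1.55 and 16.03, in agreement with the Epi Info output.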
Epi Info calculates three different chi-squared statistics to test H0: RR = 1. These are (each statistic is followed by its p value):
Uncorrected: 9.85 0.00169835
Mantel-Haenszel: 9.68 0.00185979
Yates corrected: 8.00 0.00467202
Interpretation: Some statisticians use benchmarks to help interpret the p value; benchmarks of .05 or .01 are common. For example, if p < .01 the association is declared significant (i.e., not easily explained by chance). By either benchmark, each of the above p values provides evidence against H0. More importantly, the p value should NOT be interpreted in isolation; it should be interpreted in light of other evidence (Fisher, 1935).
Assumptions: Chi-square tests assume data are valid (no information bias, no selection bias, no confounding). They also assume sampling independence and expected frequencies greater than or equal to 5. When an expected frequency in the cross-tabulation is less than 5, Epi Info issues the warning: "An expected value is less than 5; recommend Fisher exact results."
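As a cross-check on the three chi-squared statistics and the expected-frequency rule, here is a small Python sketch using the standard 2-by-2 formulas. The function names chi_square_tests and min_expected are hypothetical, and the sketch is not taken from Epi Info itself. The p value for 1 degree of freedom is computed as erfc(sqrt(x / 2)).

```python
import math


def chi_square_tests(a, b, c, d):
    """Return {name: (statistic, p value)} for a 2-by-2 table (1 df)."""
    n1, n2 = a + b, c + d          # row totals (exposed, nonexposed)
    m1, m2 = a + c, b + d          # column totals (diseased, not diseased)
    n = n1 + n2
    denom = n1 * n2 * m1 * m2
    diff = abs(a * d - b * c)
    stats = {
        "uncorrected": n * diff ** 2 / denom,
        "mantel_haenszel": (n - 1) * diff ** 2 / denom,
        "yates": n * max(diff - n / 2, 0) ** 2 / denom,
    }
    # Upper-tail probability of a chi-square variable with 1 df
    return {name: (x, math.erfc(math.sqrt(x / 2))) for name, x in stats.items()}


def min_expected(a, b, c, d):
    """Smallest expected cell frequency under H0 (row total * column total / n)."""
    n1, n2 = a + b, c + d
    m1, m2 = a + c, b + d
    n = n1 + n2
    return min(row * col / n for row in (n1, n2) for col in (m1, m2))


results = chi_square_tests(11, 14, 3, 31)
for name, (stat, p) in results.items():
    print(f"{name:16s} {stat:6.2f} {p:.8f}")
print("smallest expected frequency:", round(min_expected(11, 14, 3, 31), 2))
```

For the cytarabine table, the smallest expected frequency is above 5, so the chi-square results are usable under the rule stated above.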
Fisher's exact test is based on summing exact probabilities for tables that are as extreme as or more extreme than the observed results, assuming the null hypothesis is true and the table's margins are fixed. (With all margins fixed, these are hypergeometric probabilities.) This procedure is explained in Rosner, 1995, p. 376.
Illustrative Data. To illustrate Fisher's test, let us consider a study performed to explore the relation between a drug called Kayexelate(R) and the occurrence of colonic necrosis in post-operative patients (Gerstman et al., 1992). This study compares colonic necrosis rates in exposed and non-exposed post-operative patients. Data are stored in KX-NECRO.ZIP in KX-NECRO.REC as variables KX (exposed to Kayexelate: Y/N) and NECRO (colonic necrosis: Y/N).
Data are processed with the commands:
EPI6> READ KX-NECRO
EPI6> TABLES KX NECRO
KX | + - | Total
+ | 2 115 | 117
- | 0 862 | 862
Total | 2 977 | 979
Results show that 2 of the 117 Kayexelate-exposed patients experienced colonic necrosis. In contrast, 0 of 862 non-exposed patients experienced colonic necrosis. Thus, p1 = 2 / 117 = 1.7% and p2 = 0 / 862 = 0.0%. The risk ratio, 1.7% / 0.0%, is undefined because of division by zero; it tends toward positive infinity.
Epi Info is unable to calculate a confidence interval for these data but tests H0: RR = 1 with Fisher's test:
Fisher exact: 1-tailed P-value: 0.0141750 <---
2-tailed P-value: 0.0141750 <---
Comment: Like all statistical tests, Fisher's procedure assumes perfect validity (no confounding, no information bias, no selection bias). It also assumes sampling independence.
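To make the fixed-margins calculation concrete, the sketch below sums hypergeometric probabilities directly. It is an illustration under stated assumptions, not Epi Info's implementation: the function name fisher_exact is hypothetical, the one-tailed value is taken as the upper tail (exposure increasing risk), and the two-tailed value sums every table whose probability is no larger than that of the observed table.

```python
from math import comb


def fisher_exact(a, b, c, d):
    """One- and two-tailed Fisher exact p values for a 2-by-2 table.

    a, b = diseased / not diseased among the exposed
    c, d = diseased / not diseased among the nonexposed
    """
    n1, n2 = a + b, c + d      # row totals
    m1 = a + c                 # total diseased
    n = n1 + n2

    def prob(x):
        # Probability of x exposed cases when all margins are fixed
        return comb(n1, x) * comb(n2, m1 - x) / comb(n, m1)

    lo, hi = max(0, m1 - n2), min(n1, m1)   # feasible values of cell a
    probs = {x: prob(x) for x in range(lo, hi + 1)}
    p_obs = probs[a]
    one_tailed = sum(p for x, p in probs.items() if x >= a)
    # Small tolerance guards against floating-point ties
    two_tailed = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return one_tailed, min(two_tailed, 1.0)


one_tailed, two_tailed = fisher_exact(2, 115, 0, 862)
print(f"1-tailed P-value: {one_tailed:.7f}")
print(f"2-tailed P-value: {two_tailed:.7f}")
```

For the Kayexelate table the observed table is already the most extreme one possible (both cases fall in the exposed group), which is why the one- and two-tailed values coincide.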
The power and precision of inferences depend on the number of subjects in the exposed group (n1), the ratio of non-exposed to exposed subjects (n2 / n1), the RR "worth detecting," the incidence in the non-exposed population (p2), and the alpha level of the inference (α). We may use the program EpiTable > Sample > Power calculation > Cohort Study to perform power computations (method based on Fleiss, 1981, pp. 44 - 45).
Illustrative example. A study has 100 exposed subjects, 100 unexposed subjects (allocation ratio = 100/100 = 1), an expected incidence of 10% in the unexposed group, an α level of 0.05, and an expected RR of 2. Based on these assumptions, EpiTable calculates power = 42.4%. This is considered inadequate. (Power should be at least 80%, preferably 90%.)
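The power figure in this example can be approximated with a normal-theory calculation. The sketch below is an assumption about how such a calculation might go, not a description of EpiTable's internals: it handles equal group sizes only, uses a two-sided test, and applies an approximate continuity adjustment that reduces the per-group size by 2 / |p1 - p2|. The function name cohort_power is hypothetical.

```python
import math
from statistics import NormalDist


def cohort_power(n, p2, rr, alpha=0.05):
    """Approximate power with n subjects in each of two equal groups.

    p2 = incidence in the nonexposed, rr = risk ratio worth detecting.
    """
    p1 = rr * p2
    diff = abs(p1 - p2)
    p_bar = (p1 + p2) / 2
    # Approximate continuity adjustment (assumed, see lead-in)
    n_adj = n - 2 / diff
    if n_adj <= 0:
        raise ValueError("sample too small for this approximation")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = (
        diff * math.sqrt(n_adj) - z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    ) / math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return NormalDist().cdf(z_beta)


power = cohort_power(n=100, p2=0.10, rr=2)
print(f"power = {power:.1%}")
```

Under these assumptions the sketch returns a power of about 42.4% for the example above.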
Comment: In determining sample size requirements, the investigator must have some idea of the order of magnitude of proportions he or she is looking for. This knowledge might come from previous research, from an accumulation of clinical experience, from small-scale pilot work, or from readily available sources of statistics (e.g., morbidity surveys). Given at least some information, the investigator can, using his or her imagination and expertise, come up with an estimate of a difference between two proportions that is scientifically or clinically important. Given no information, the investigator has no basis for designing the study intelligently and would be hard put to justify designing it at all (paraphrased from Fleiss, 1981, p. 34).
Sample size calculations can be viewed as "power calculations in reverse." Here, we specify the required power (or precision) to derive a reasonable estimate of the sample size required for a given study. We will use the program EpiTable > Sample > Sample Size > Cohort Study for sample size requirement calculations.
Illustrative example. To achieve 80% power to detect a RR of 2 in a study with an allocation ratio of 1:1 non-exposed to exposed subjects, and an expected incidence in the non-exposed group of 10%, with 1 - α = .95, EpiTable determines n1 = n2 = 219. To achieve 80% power to detect a RR of 3, we need n1 = n2 = 72.
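The sample sizes in this example can be approximated with a continuity-corrected formula for comparing two proportions. The sketch below assumes equal group sizes (allocation ratio 1:1) and a two-sided test; the function name cohort_sample_size is hypothetical, and the calculation is offered as one plausible reading of the method rather than EpiTable's actual code.

```python
import math
from statistics import NormalDist


def cohort_sample_size(p2, rr, power=0.80, alpha=0.05):
    """Approximate subjects needed per group, equal group sizes.

    p2 = incidence in the nonexposed, rr = risk ratio worth detecting.
    """
    p1 = rr * p2
    diff = abs(p1 - p2)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Size per group before continuity correction
    n_unc = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / diff ** 2
    # Continuity correction, then round up to a whole subject
    n_cc = n_unc / 4 * (1 + math.sqrt(1 + 4 / (n_unc * diff))) ** 2
    return math.ceil(n_cc)


for rr in (2, 3):
    print(f"RR = {rr}: n1 = n2 = {cohort_sample_size(p2=0.10, rr=rr)}")
```

With p2 = 10%, this sketch gives 219 per group for RR = 2 and 72 per group for RR = 3, matching the figures quoted above.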
(1) EAR.ZIP: Otitis Media Clinical Trial (Source of data: Rosner, 1990, p. 68). Data are from a clinical trial on the treatment of acute otitis media in children. Group 1 received a 14-day trial of cefaclor. Group 2 received a 14-day trial of amoxicillin. This information is contained in the variable called AB (1 = cefaclor, 2 = amoxicillin). A total of 278 infected ears were treated, with clearance of infection represented in variable CLEAR (1 = yes, 2 = no). Download the data set and then perform each of the following analyses:
(A) Calculate the incidence of clearance associated with each of the antibiotics. Include a 95% confidence interval for the RR.
(B) Test the risk ratio for significance. Report relevant hypotheses testing steps.
(C) Briefly summarize your findings.
(2) PRISON.ZIP: Human Immunodeficiency Virus Infection in a Women's Correctional Institution (Smith et al., 1991). A study of HIV infection in women entering the New York State Prison system cross-classified 465 inmates with respect to HIV sero-positivity (HIV) and history of intravenous drug use (IVDU). Download this data set and then calculate the prevalence of HIV in each exposure group. Calculate the prevalence ratio. Include a 95% confidence interval. Interpret your findings.
(3) LABOR.ZIP: Induction of Labor and Meconium Staining. Induced labor (by administering pitocin and other hormones) in near-term pregnancies is a common obstetrical procedure which is intended to reduce the risk of complications. Meconium staining during childbirth is a sign of fetal distress. Use the data LABOR.REC to determine whether there is an association between induction (INDUCE) and meconium staining (MECON). Include relevant descriptive and inferential statistics, and summarize your findings in plain English.
(4) OSWEGO.ZIP: Food Poisoning in Oswego, New York (Centers for Disease Control, 1992). Data from an outbreak of gastrointestinal illness following a church supper in upstate New York are reported in OSWEGO.REC. Variables in the data set are self-explanatory (use the VARIABLES command to see variable names). Based on these data, fill in the table below and determine the most likely source of agent.
|Food||Ate Food: ill||Ate Food: total||Ate Food: % ill||Did Not Eat Food: ill||Did Not Eat Food: total||Did Not Eat Food: % ill||Risk Ratio||95% conf. int.||p*|
|Baked Ham||29||46||63.0%||17||29||58.6%||1.1||0.7 - 1.6||.70|
|Van. ice cream||___||___||___||___||___||___||___||___||___|
|Choc. ice cream||___||___||___||___||___||___||___||___||___|
* uncorrected chi-square or Fisher's exact test, as appropriate.
(5) RESTENOS: Restenosis Following Coronary Atherectomy (Zhou et al., 1996). Each year, cardiologists open many clogged arteries only to have these same arteries restenose following surgery. A study sponsored by the NIH / Heart, Lung and Blood Institute was performed to determine whether silent infection with a common virus (cytomegalovirus) was predictive of the regrowth of arterial plaque. In 21 of the 49 patients with serologic evidence of cytomegalovirus infection, regrowth of arterial plaque was noted. In contrast, 2 of the 26 patients without serologic evidence of cytomegalovirus had plaque regrowth. Construct a 2-by-2 table for these data. Then calculate the risk ratio associated with cytomegalovirus infection. Include a 95% confidence interval. (You may use an epidemiological calculator such as EpiInfo > STATCALC for your calculation.) Do data support the theory that subclinical viral infections may play a role in arteriosclerosis?
(6) PHENFORM: Phenformin and Cardiovascular Death (Osborn, 1979). In a clinical trial, 26 out of 204 patients treated with phenformin died of cardiovascular disease. In contrast, 2 of 64 control patients died of cardiovascular disease. Calculate the incidence of cardiovascular death in each group and then calculate the risk ratio associated with phenformin. Include a 95% confidence interval for the RR. In plain English, interpret your results.
(7) SIZE-COH: Power and Sample Size Exercises.
(A) Suppose you want to complete a study with α = 0.05, power = 0.8, allocation ratio = 1:1, and a background rate (p2) of 25%. What size sample is needed to detect RR = 2? RR = 3? RR = 4?
(B) What is the power of a study looking for RR = 2, assuming n1 = 50, n2 = 100, p2 = 5%, and α = 0.05? What if the true RR = 2? What if RR = 3?
(8) BI-HELM1.ZIP: Bicycle Helmet Use in Two Northern California Counties (Perales et al., 1994). In 1991, 1491 bicyclists were hospitalized for head injuries in California. BI-HELM1.REC contains bicycle helmet use data for 1651 bicycle riders in two northern California counties. Data can be downloaded by clicking on the highlighted file name. A code book is included in the ZIP file. After downloading this data set, calculate the helmet-use rate in Santa Clara County and in Contra Costa County. (Report relevant counts and proportions.) Report the incidence ratio. Include a 95% confidence interval, and interpret your results.
(9) OC/MI. A study was conducted to look at the effects of oral contraceptive use (OC) on heart disease in women 40 to 44 years of age. Thirteen incident myocardial infarctions (MI) were found in 5000 current OC users during 3 years of observation. In contrast, 7 cases were seen in 10,000 non-users. Compare the incidence of MI in the groups using methods learned in this unit. Make certain to summarize your results in plain language.
Fisher, R. A. (1935). The logic of inductive inference. Journal of the Royal Statistical Society, 98, 39-54.
Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions (Second ed.). New York: John Wiley & Sons.
Greenland, S., & Robins, J. M. (1985). Estimation of a common effect parameter from sparse follow-up data. Biometrics, 41(1), 55-68.
Osborn, J. F. (1979). Statistical Exercises in Medical Research. New York: John Wiley & Sons.
Rosner, B. (1990). Fundamentals of Biostatistics (Third ed.). Belmont, CA: Duxbury Press.
Rothman, K. J., & Greenland, S. (1998). Modern Epidemiology (Second ed.). Philadelphia: Lippincott-Raven.
Smith, P. F., Mikl, J., Truman, B. I., Lessner, L., Lehman, J. S., Stevens, R. W., Lord, E. A., Broaddus, R. K., & Morse, D. L. (1991). HIV infection among women entering the New York State correctional system. Am J Public Health, 81 Suppl, 35-40.
Zhou, Y. F., Leon, M. B., Waclawiw, M. A., Popma, J. J., Yu, Z. X., Finkel, T., & Epstein, S. E. (1996). Association between prior cytomegalovirus infection and the risk of restenosis after coronary atherectomy. N Engl J Med, 335(9), 624-630.