Binary Outcome, Cohort and Cross-Sectional Samples (RRs)


Introduction

We consider two independent groups derived by either a cohort or cross-sectional sample. One group is exposed to a factor while the other is nonexposed. Each individual is classified as diseased or not diseased according to defined criteria. Data are cross-tabulated with cells labeled as follows:

 

              Disease+   Disease-   Total
Exposure +       a          b         n1
Exposure -       c          d         n2
Total            m1         m2        N

Let p1 represent the proportion in the exposed group (p1 = a / n1) and p2 represent the proportion in the non-exposed group (p2 = c / n2). The ratio of p1 and p2 -- the proportion ratio -- is often referred to as the risk ratio or relative risk:

rr = p1 / p2

Notation: Lower case acronyms denote estimators, while upper case represent parameters. Thus, rr represents the risk ratio estimate and RR denotes the risk ratio parameter.

Illustrative data (TOXIC.REC). As an example, we consider a cohort of cancer patients undergoing bone marrow ablation with the drug cytarabine (Jolson et al., 1992). One group is exposed to (i.e., treated with) a generic drug, while the other group uses the innovator manufacturer's product (and is thus nonexposed). Exposure information is stored in the variable GENERIC (exposed: 1 = yes, 2 = no). Disease information, here cerebellar toxicity, is stored in the variable TOX (1 = yes, 2 = no). The first three records and last record of the data set are:

REC  GENERIC        TOX
---  ------- -----------
  1        1           1
  2        1           2
  3        1           2
  |        |           |
 59        1           2

Data are cross-tabulated with the command:

EPI6> TABLES <exposure> <disease>

where <exposure> and <disease> represent the names of the exposure and disease variables, respectively.

For example, to cross-tabulate the current data issue the command:

EPI6> TABLES GENERIC TOX

Output is:

                           TOX
GENERIC    |          1          2        Total
-----------+-----------------------------+------
         1 |         11         14       | 25
         2 |          3         31       | 34
-----------+-----------------------------+------
     Total           14         45         59

Thus, the incidence of toxicity in the exposed group (p1) = 11 / 25 = 0.440, the incidence in the unexposed group (p2) = 3 / 34 = 0.088, and rr = p1 / p2 = 4.99 (computed from the unrounded proportions) ≈ 5.0, indicating that toxicity was about 5 times as frequent in the exposed group as in the non-exposed group.
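The arithmetic above can be checked outside Epi Info. The following is a minimal Python sketch, not part of the Epi Info workflow; the function name risk_ratio is hypothetical, and the cell labels a, b, c, d follow the 2-by-2 layout given in the Introduction.

```python
def risk_ratio(a, b, c, d):
    """Return (p1, p2, rr) for a 2-by-2 table with cells a, b (exposed row)
    and c, d (non-exposed row); column 1 holds the diseased subjects."""
    n1 = a + b          # exposed row total
    n2 = c + d          # non-exposed row total
    p1 = a / n1         # incidence proportion, exposed
    p2 = c / n2         # incidence proportion, non-exposed
    rr = p1 / p2        # raises ZeroDivisionError when c == 0 (rr undefined)
    return p1, p2, rr


# Cytarabine toxicity table: a = 11, b = 14, c = 3, d = 31
p1, p2, rr = risk_ratio(11, 14, 3, 31)
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, rr = {rr:.2f}")
```

Because the ratio is taken before rounding, the result is 4.99 rather than the 5.00 obtained by dividing the rounded proportions.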

Confidence Interval for RRs

Risk ratio estimates are printed in the output below the 2-by-2 table. For the illustrative example: 

RISK RATIO(RR)(Outcome:TOX=1;   Exposure:GENERIC=1)                     4.99
95% confidence limits for RR                              1.55 < RR <  16.03

Thus, the point estimate is 5.0 (95% CI: 1.6, 16.0). This interval locates the RR parameter with 95% confidence. (See pp. 237 - 243 of Epidemiology Kept Simple for additional information.)
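The text does not say which formula Epi Info uses for these limits. As a sketch, and assuming the common log-scale (Taylor series) standard error, SE(ln rr) = sqrt(1/a - 1/n1 + 1/c - 1/n2), the limits can be reproduced in Python. The function name is hypothetical; the formula is an assumption that happens to match the printed output for this table.

```python
import math
from statistics import NormalDist


def rr_confidence_interval(a, b, c, d, level=0.95):
    """Risk ratio with log-scale (Taylor series) confidence limits.

    Assumes all of a, c, n1, n2 are positive; the method is an assumption,
    not something the Epi Info output states.
    """
    n1, n2 = a + b, c + d
    rr = (a / n1) / (c / n2)
    # Standard error of ln(rr)
    se_ln_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    # Two-sided standard normal quantile (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(rr) - z * se_ln_rr)
    upper = math.exp(math.log(rr) + z * se_ln_rr)
    return rr, lower, upper


rr, lower, upper = rr_confidence_interval(11, 14, 3, 31)
print(f"rr = {rr:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

The interval is symmetric on the log scale, which is why the upper limit lies much farther from the point estimate than the lower limit.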

The confidence interval assumes data are free of biases. Since this is unrealistic, the confidence interval should be viewed as a rough estimate of the parameter.

p Values

Chi-Square Test

Epi Info calculates three different chi-squared statistics to test H0: RR = 1. These are:

                         Chi-Squares   P-values
                         -----------   --------
        Uncorrected:         9.85     0.00169835
        Mantel-Haenszel:     9.68     0.00185979
        Yates corrected:     8.00     0.00467202

Each of the above chi-square statistics is associated with 1 degree of freedom. Statisticians do not agree on which of the above chi-square statistics is superior. In general, the Yates corrected chi-square is the most conservative (i.e., provides the highest p value) and the uncorrected is most liberal (i.e., provides the lowest p value). I recommend that the p values be reported to two significant digits (e.g., p = .0017 via the uncorrected χ²).
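For readers who want to see where the three statistics come from, here is a short Python sketch using the standard 2-by-2 shortcut formulas. The function name is hypothetical, and the sketch is a check on the output above, not a description of Epi Info's internal code. The 1-degree-of-freedom p value is obtained from the identity P(X > x) = erfc(sqrt(x / 2)).

```python
import math


def chi_square_tests(a, b, c, d):
    """Return {name: (chi_square, p_value)} for a 2-by-2 table, 1 df each."""
    n1, n2 = a + b, c + d            # row totals
    m1, m2 = a + c, b + d            # column totals
    n = n1 + n2
    cross = abs(a * d - b * c)       # |ad - bc|
    denom = n1 * n2 * m1 * m2
    stats = {
        "uncorrected": n * cross ** 2 / denom,
        "mantel_haenszel": (n - 1) * cross ** 2 / denom,
        # Yates: shrink |ad - bc| by N/2 (not below zero) before squaring
        "yates": n * max(cross - n / 2, 0) ** 2 / denom,
    }
    # Upper-tail probability of a chi-square variate with 1 df
    return {name: (x2, math.erfc(math.sqrt(x2 / 2))) for name, x2 in stats.items()}


for name, (x2, p) in chi_square_tests(11, 14, 3, 31).items():
    print(f"{name:16s} {x2:6.2f}  p = {p:.4f}")
```

The three statistics differ only in the multiplier (N versus N - 1) and in whether the continuity correction is applied, which explains their ordering.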

Interpretation: Some statisticians use benchmarks to help interpret the p value. Benchmarks of .05 or .01 are common. Thus, if p < .01 the association is declared significant (i.e., not easily explained by chance). Thus, each of the above p values provides evidence against H0. More importantly, the p value should NOT be interpreted in isolation -- it should be interpreted in light of other evidence (Fisher, 1935).

Assumptions: Chi-square tests assume data are valid (no information bias, no selection bias, no confounding). They also assume sampling independence and expected frequencies greater than or equal to 5. When an expected frequency in the cross-tabulation is less than 5, Epi Info issues the warning: "An expected value is less than 5; recommend Fisher exact results."
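The expected-frequency condition can be checked by hand: the expected count in each cell is (row total x column total) / N. A minimal Python sketch follows; the function names are hypothetical, and the threshold of 5 is the one stated above.

```python
def expected_counts(a, b, c, d):
    """Expected cell counts under H0 for a 2-by-2 table: row total * column total / N."""
    n1, n2 = a + b, c + d
    m1, m2 = a + c, b + d
    n = n1 + n2
    return [[n1 * m1 / n, n1 * m2 / n],
            [n2 * m1 / n, n2 * m2 / n]]


def needs_exact_test(a, b, c, d, threshold=5):
    """True when any expected count falls below the threshold."""
    return min(min(row) for row in expected_counts(a, b, c, d)) < threshold


# Cytarabine toxicity table: smallest expected count is 25 * 14 / 59, about 5.9
print(needs_exact_test(11, 14, 3, 31))
# Kayexelate table (Fisher's Exact Test section): 117 * 2 / 979 is about 0.24
print(needs_exact_test(2, 115, 0, 862))
```

Note that the rule concerns expected counts, not observed counts: the toxicity table has an observed cell of 3 yet still satisfies the condition.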

Fisher's Exact Test

Fisher's exact test is based on summing exact (hypergeometric) probabilities for the tables that are as extreme as or more extreme than the observed table, assuming the null hypothesis is true and the table's margins are fixed. This procedure is explained in Rosner, 1995, p. 376.

Illustrative Data. To illustrate Fisher's test, let us consider a study performed to explore the relation between a drug called Kayexelate(R) and the occurrence of colonic necrosis in post-operative patients (Gerstman et al., 1992). This study compares colonic necrosis rates in exposed and non-exposed postoperative patients. Data are stored in KX-NECRO.REC (inside KX-NECRO.ZIP) as variables KX (exposed to Kayexelate: Y/N) and NECRO (colonic necrosis: Y/N).

Data are processed with the commands:

EPI6> READ KX-NECRO
EPI6> TABLES KX NECRO

Output is:

                  NECRO
KX         |     +     - | Total
-----------+-------------+------
         + |     2   115 |   117
         - |     0   862 |   862
-----------+-------------+------
     Total |     2   977 |   979

Results show 2 of the 117 Kayexelate-exposed patients experienced colonic necrosis. In contrast, 0 of 862 non-exposed patients experienced colonic necrosis. Thus, p1 = 2 / 117 = 1.7% and p2 = 0 / 862 = 0.0%. The risk ratio = 1.7% / 0.0% is undefined (division by zero); the estimate tends toward positive infinity.

Epi Info is unable to calculate a confidence interval for these data but tests H0: RR = 1 with Fisher's test:

        Fisher exact: 1-tailed P-value: 0.0141750 <---
                      2-tailed P-value: 0.0141750 <---

Comment: Like all statistical tests, Fisher's procedure assumes perfect validity (no confounding, no information bias, no selection bias). It also assumes sampling independence.
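The Fisher p values reported above can be reproduced from first principles. With the margins fixed, the count in the exposed-diseased cell follows a hypergeometric distribution under H0. The Python sketch below makes two assumptions that the Epi Info output does not spell out: the one-tailed p value is taken in the direction of excess disease among the exposed, and the two-tailed p value sums the probabilities of all tables no more probable than the observed one. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact(a, b, c, d):
    """Return (one_tailed, two_tailed) Fisher exact p values for a 2-by-2 table."""
    n1, n2 = a + b, c + d            # row totals (exposed, non-exposed)
    m1 = a + c                       # total number diseased
    total = comb(n1 + n2, m1)

    def prob(k):
        # Hypergeometric probability that k of the m1 cases fall in the exposed row
        return Fraction(comb(n1, k) * comb(n2, m1 - k), total)

    k_min, k_max = max(0, m1 - n2), min(n1, m1)
    p_obs = prob(a)
    # Upper tail: tables with at least as many exposed cases as observed
    one_tailed = sum(prob(k) for k in range(a, k_max + 1))
    # Two-tailed (assumed definition): tables no more probable than the observed one
    two_tailed = sum(p for p in map(prob, range(k_min, k_max + 1)) if p <= p_obs)
    return float(one_tailed), float(two_tailed)


one, two = fisher_exact(2, 115, 0, 862)
print(f"1-tailed p = {one:.7f}, 2-tailed p = {two:.7f}")
```

For these data the observed table is already the most extreme one possible (both cases among the exposed), so the two p values coincide, as in the Epi Info output. Exact fractions are used so that the "no more probable than observed" comparison is not affected by floating-point rounding.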

Power and Sample Size

Power

The power and precision of inferences depend on the number of subjects in the exposed group (n1), the ratio of non-exposed to exposed subjects (n2 / n1), the RR "worth detecting," the incidence in the non-exposed population (p2), and the alpha level of the inference (α). We may use the program EpiTable > Sample > Power calculation > Cohort Study to perform power computations (method based on Fleiss, 1981, pp. 44 - 45).

Illustrative example. A study has 100 exposed subjects, 100 unexposed subjects (allocation ratio = 100/100 = 1), an expected incidence of 10% in the unexposed group, an α level of 0.05, and an expected RR of 2. Based on these assumptions, EpiTable calculates power = 42.4%. This is considered inadequate. (Power should be at least 80%, preferably 90%.)
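EpiTable does not display its formula. The Python sketch below is one reading of a continuity-corrected normal approximation for comparing two proportions, restricted to equal group sizes; treat the formula as an assumption, although it does reproduce the 42.4% figure for this example. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def cohort_power_equal_groups(n, p2, rr, alpha=0.05):
    """Approximate power for two groups of n subjects each (allocation ratio 1).

    Assumed method: two-sided normal approximation with a continuity
    correction that reduces the effective group size by 2 / |p1 - p2|.
    """
    p1 = rr * p2                          # incidence expected in the exposed
    diff = abs(p1 - p2)
    n_adj = n - 2 / diff                  # continuity-corrected group size
    if n_adj <= 0:
        return 0.0
    p_bar = (p1 + p2) / 2                 # pooled proportion under H0
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = math.sqrt(2 * p_bar * (1 - p_bar))
    se_alt = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    z_beta = (diff * math.sqrt(n_adj) - z_alpha * se_null) / se_alt
    return NormalDist().cdf(z_beta)


print(f"power = {cohort_power_equal_groups(100, 0.10, 2):.1%}")
```

Without the continuity correction (n_adj = n) the same formula gives roughly 51%, so the correction matters at sample sizes this small.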

Comment: In determining sample size requirements, the investigator must have some idea of the order of magnitude of proportions he or she is looking for. This knowledge might come from previous research, from an accumulation of clinical experience, from small-scale pilot work, or from readily available sources of statistics (e.g., morbidity surveys). Given at least some information, the investigator can, using his or her imagination and expertise, come up with an estimate of a difference between two proportions that is scientifically or clinically important. Given no information, the investigator has no basis for designing the study intelligently and would be hard put to justify designing it at all (paraphrased from Fleiss, 1981, p. 34).

Sample Size Requirements

Sample size calculations can be viewed as "power calculations in reverse." Here, we specify the required power (or precision) to derive a reasonable estimate of the sample size required for a given study. We will use the program EpiTable > Sample > Sample Size > Cohort Study for sample size requirement calculations.

Illustrative example. To achieve 80% power to detect a RR of 2 in a study with an allocation ratio of 1:1 non-exposed to exposed subjects, and an expected incidence in the non-exposed group of 10%, with 1 - α = .95, EpiTable determines n1 = n2 = 219. To achieve 80% power to detect a RR of 3 under the same assumptions, we need n1 = n2 = 72.
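As a rough check on these figures, the Python sketch below applies a continuity-corrected normal-approximation sample size formula for two proportions with equal group sizes. The formula is an assumption about what EpiTable computes (the program does not display it), and the function name is hypothetical; it does reproduce both results quoted above.

```python
import math
from statistics import NormalDist


def cohort_sample_size_equal_groups(p2, rr, alpha=0.05, power=0.80):
    """Subjects needed per group (allocation ratio 1), rounded up.

    Assumed method: two-sided normal approximation followed by a
    continuity correction.
    """
    p1 = rr * p2                          # incidence expected in the exposed
    diff = abs(p1 - p2)
    p_bar = (p1 + p2) / 2                 # pooled proportion under H0
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Uncorrected group size
    n_unc = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
             + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    # Continuity correction for equal group sizes
    n_cc = n_unc / 4 * (1 + math.sqrt(1 + 4 / (n_unc * diff))) ** 2
    return math.ceil(n_cc)


for rr in (2, 3):
    print(f"RR = {rr}: n1 = n2 = {cohort_sample_size_equal_groups(0.10, rr)}")
```

The required sample size falls sharply as the RR worth detecting increases, because the difference between the two proportions enters the formula squared.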

Exercises

(1) EAR.ZIP: Otitis Media Clinical Trial (Source of data: Rosner, 1990, p. 68). Data are from a clinical trial on the treatment of acute otitis media in children. Group 1 received a 14-day trial of cefaclor. Group 2 received a 14-day trial of amoxicillin. This information is contained in the variable called AB (1 = cefaclor, 2 = amoxicillin). A total of 278 infected ears were treated, with clearance of infection represented in variable CLEAR (1 = yes, 2 = no). Download the data set and then perform each of the following analyses:
(A) Calculate the incidence of clearance associated with each of the antibiotics. Include a 95% confidence interval for the RR.
(B) Test the risk ratio for significance. Report relevant hypothesis-testing steps.
(C) Briefly summarize your findings.

(2) PRISON.ZIP: Human Immunodeficiency Virus Infection in a Women's Correctional Institution (Smith et al., 1991). A study of HIV infection in women entering the New York State Prison system cross-classified 465 inmates with respect to HIV sero-positivity (HIV) and history of intravenous drug use (IVDU). Download this data set and then calculate the prevalence of HIV in each exposure group. Calculate the prevalence ratio. Include a 95% confidence interval. Interpret your findings.

(3) LABOR.ZIP: Induction of Labor and Meconium Staining. Induced labor (by administering pitocin and other hormones) in near-term pregnancies is a common obstetrical procedure which is intended to reduce the risk of complications. Meconium staining during childbirth is a sign of fetal distress. Use the data LABOR.REC to determine whether there is an association between induction (INDUCE) and meconium staining (MECON). Include relevant descriptive and inferential statistics, and summarize your findings in plain English.

(4) OSWEGO.ZIP: Food Poisoning in Oswego, New York (Centers for Disease Control, 1992). Data from an outbreak of gastrointestinal illness following a church supper in upstate New York are reported in OSWEGO.REC. Variables in the data set are self-explanatory (use the VARIABLES command to see variable names). Based on these data, fill in the table below and determine the most likely source of agent.
                        Ate Food               Did Not Eat Food
Food                Ill   Total     %         Ill   Total     %       Risk Ratio   95% conf. int.    p*
---------------     ---   -----   -----       ---   -----   -----     ----------   --------------   ---
Baked Ham            29      46   63.0%        17      29   58.6%        1.1         0.7 - 1.6      .70
Spinach             ___     ___    ___        ___     ___    ___         ___            ___         ___
Mashed Potato       ___     ___    ___        ___     ___    ___         ___            ___         ___
Cabbage Salad       ___     ___    ___        ___     ___    ___         ___            ___         ___
Jell-O              ___     ___    ___        ___     ___    ___         ___            ___         ___
Rolls               ___     ___    ___        ___     ___    ___         ___            ___         ___
Brown bread         ___     ___    ___        ___     ___    ___         ___            ___         ___
Milk                ___     ___    ___        ___     ___    ___         ___            ___         ___
Coffee              ___     ___    ___        ___     ___    ___         ___            ___         ___
Water               ___     ___    ___        ___     ___    ___         ___            ___         ___
Cakes               ___     ___    ___        ___     ___    ___         ___            ___         ___
Van. ice cream      ___     ___    ___        ___     ___    ___         ___            ___         ___
Choc. ice cream     ___     ___    ___        ___     ___    ___         ___            ___         ___
Fruit salad         ___     ___    ___        ___     ___    ___         ___            ___         ___

* uncorrected chi-square or Fisher's exact test, as appropriate.

(5) RESTENOS: Restenosis Following Coronary Atherectomy (Zhou et al., 1996). Each year, cardiologists open many clogged arteries only to have these same arteries restenose following surgery. A study sponsored by the NIH / Heart, Lung and Blood Institute was performed to determine whether silent infection with a common virus (cytomegalovirus) was predictive of the regrowth of arterial plaque. In 21 of the 49 patients with serologic evidence of cytomegalovirus infection, regrowth of arterial plaque was noted. In contrast, 2 of the 26 patients without serologic evidence of cytomegalovirus had plaque regrowth. Construct a 2-by-2 table for these data. Then calculate the risk ratio associated with cytomegalovirus infection. Include a 95% confidence interval. (You may use an epidemiological calculator such as EpiInfo > STATCALC for your calculation.) Do data support the theory that subclinical viral infections may play a role in arteriosclerosis?

(6) PHENFORM: Phenformin and Cardiovascular Death (Osborn, 1979). In a clinical trial, 26 out of 204 patients treated with phenformin died of cardiovascular disease. In contrast, 2 of 64 control patients died of cardiovascular disease. Calculate the incidence of cardiovascular death in each group and then calculate the risk ratio associated with phenformin. Include a 95% confidence interval for the RR. In plain English, interpret your results.

(7) SIZE-COH: Power and Sample Size Exercises.
(A) Suppose you want to complete a study with α = 0.05; power = 0.8; allocation ratio = 1:1, and background rate (p2) of 25%. What size sample is needed to detect RR = 2? RR = 3? RR = 4?
(B) What is the power of a study looking for RR = 2, assuming n1 = 50, n2 = 100, p2 = 5%, and α = 0.05? What if the true RR = 2? What if RR = 3?

(8) BI-HELM1.ZIP: Bicycle Helmet Use in Two Northern California Counties (Perales et al., 1994). In 1991, 1491 bicyclists were hospitalized for head injuries in California. BI-HELM1.REC contains bicycle helmet use data for 1651 bicycle riders in two northern California counties. Data can be downloaded by clicking on the highlighted file name. A code book is included in the ZIP file. After downloading this data set, calculate the helmet-use rate in Santa Clara County and in Contra Costa County. (Report relevant counts and proportions.) Report the incidence ratio. Include a 95% confidence interval, and interpret your results.

(9) OC/MI. A study was conducted to look at the effects of oral contraceptive use (OC) on heart disease in women 40 to 44 years of age. Thirteen incident myocardial infarctions (MI) were found in 5000 current OC users during 3 years of observation. In contrast, 7 cases were seen in 10,000 non-users. Compare the incidence of MI in the groups using methods learned in this unit. Make certain to summarize your results in plain language.


References

Fisher, R. A. (1935). The logic of inductive inference. Journal of the Royal Statistical Society, 98, 39-54.

Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions (Second ed.). New York: John Wiley & Sons.

Greenland, S., & Robins, J. M. (1985). Estimation of a common effect parameter from sparse follow-up data. Biometrics, 41(1), 55-68.

Osborn, J. F. (1979). Statistical Exercises in Medical Research. New York: John Wiley & Sons.

Rosner, B. (1990). Fundamentals of Biostatistics (Third ed.). Belmont, CA: Duxbury Press.

Rothman, K. J., & Greenland, S. (1998). Modern Epidemiology (Second ed.). Philadelphia: Lippincott-Raven.

Smith, P. F., Mikl, J., Truman, B. I., Lessner, L., Lehman, J. S., Stevens, R. W., Lord, E. A., Broaddus, R. K., & Morse, D. L. (1991). HIV infection among women entering the New York State correctional system. Am J Public Health, 81 Suppl, 35-40.

Zhou, Y. F., Leon, M. B., Waclawiw, M. A., Popma, J. J., Yu, Z. X., Finkel, T., & Epstein, S. E. (1996). Association between prior cytomegalovirus infection and the risk of restenosis after coronary atherectomy. N Engl J Med, 335(9), 624-630.