16: Relative Risk 4/8/07

Review Questions

  1. Define the term risk. Define relative risk.
  2. The proportion of individuals in the exposed group that experience the outcome divided by the proportion in the nonexposed group is called the ____________  ___________ (two words).
  3. What symbol is used to represent the risk ratio estimator? What symbol is used to represent its parameter?
  4. The prevalence of hypertension in a Hispanic group of seniors is 10%. The prevalence in a non-Hispanic group is 5%. How much greater prevalence risk is there in the Hispanic group? 
  5. When will a prevalence ratio equal a risk ratio?
  6. Chi-square test statistics should not be used when an expected frequency in the cross-tabulation is _________________.
  7. What is the name of the statistician who invented the exact test for 2-by-2 tables?
  8. Which produces the larger P value: Pearson's uncorrected chi-square or Yates continuity-corrected chi-square test?
  9. Define confounding. 
  10. Explain how an extraneous factor can confound the results of an association.


16.1 Induction of labor [similar 9.5.6]. Meconium staining of the fetus during childbirth is a sign of fetal distress. In a randomized trial, 11 pregnant women had elective induction of labor between 39 and 40 weeks of gestation and 117 control women were managed expectantly until 41 weeks of gestation. One case of meconium staining occurred in the treatment group. Thirteen (13) occurred in the control group (Osborn, 1979, p. 37 cites "M. Sc. Social Medicine, September 1975"; individual data records in labor.sav).

(A) Determine the risks of meconium staining in each group. 
(B) Express the association between induction and meconium staining as a risk ratio. Explain the meaning of the RR to a lay person.
(C) Calculate a 95% confidence interval for the RR. 

16.2 Joseph Lister and antiseptic surgery [similar to 9.5.13]. When Joseph Lister introduced the antiseptic method for surgical operations, he demonstrated that post-operative mortality dropped from 16 per 35 procedures to 6 per 40 procedures. Determine the risks of post-operative mortality in each group and determine whether the difference is statistically significant. 

16.3 HIV infection among women entering the New York State Correctional System (similar to lab exercise). A study by Smith et al. (1991) determined the prevalence of HIV-seropositivity in female prison inmates. Cross-tabulated results by intravenous drug use (IVDU) are shown below. Individual records are stored online in PRISON.SAV.

IVDU     HIV+   HIV-   Total
+         61     75     136
-         27    312     339
Total     88    387     475

(A) Calculate the prevalence of HIV in each group. 
(B) Calculate the prevalence ratio associated with intravenous drug use and then interpret your results in plain terms.  
(C) Calculate a 95% confidence interval for the prevalence ratio. Interpret your results. 
(D) Use a continuity-corrected (Yates's) chi-square test to derive a P value for the association. Show all hypothesis testing steps. Explain why Fisher's test is unnecessary. 
(E) Replicate the analysis in SPSS.
(F) Suppose you were to plan a study in a prison population to see if ethnic group is an independent risk factor for HIV. You want to achieve 90% power with alpha = 0.01 (two-sided). You will use an equal number of study participants in each ethnic group. Determine the number of study participants needed to detect a two-fold difference in prevalence. 
(G) Determine the sample size needed to detect a 50% increase in risk.
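
For parts (F) and (G), one common way to approximate the per-group sample size for comparing two proportions is sketched below. This is a minimal sketch, not necessarily the method used by any particular software package: it uses the normal-approximation formula without a continuity correction, so utilities that apply a correction may report somewhat larger numbers. The function name n_per_group and the 10% baseline prevalence are assumptions made purely for illustration; substitute a baseline you can justify.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for comparing two independent
    proportions (two-sided test, equal group sizes, no continuity correction)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # value corresponding to power
    p_bar = (p1 + p2) / 2                          # average of the two proportions
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical baseline prevalence in the reference group (an assumption).
baseline = 0.10
print(n_per_group(baseline, 2.0 * baseline, alpha=0.01, power=0.90))  # two-fold difference
print(n_per_group(baseline, 1.5 * baseline, alpha=0.01, power=0.90))  # 50% increase
```

Note that the required sample size depends strongly on the assumed baseline, which is why the assumption should be stated explicitly in your answer.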

16.4 Treatment of acute otitis media [Similar to 9.5.15]. A trial on the treatment of otitis media studied clearance of infection within 14 days of treatment in two groups. Group 1 received cefaclor and group 2 received amoxicillin (Mandel et al., 1982; entire article). Cross-tabulated data are shown below. Individual records are stored online in the file EAR.SAV

  AB       1      2     TOTAL

  1       89      61      150
  2       56      72      128
TOTAL    145     133      278

(A) Calculate the incidence of the clearance of infection in each group. [The unit of observation in this analysis is each ear, not each patient. This is open to criticism, but let's go with this for now.]
(B) Calculate the incidence proportion ratio associated with cefaclor. Interpret this statistic.
(C) Calculate a 95% confidence interval for the incidence proportion ratio. Interpret your results.
(D) Conduct a chi-square test. Interpret your test results.
(E) You are planning a study of a new antibiotic with cefaclor as your control group. How large a sample is needed to detect a 25% increase in clearance with alpha = 0.01 with 80% power? You are going to use a 1:1 ratio of sample sizes. Be explicit in your assumptions.

16.5 Cytomegalovirus and coronary restenosis [similar to 9.5.5]. Each year cardiologists perform procedures on blocked coronary arteries only to have many of these repaired arteries re-clog (restenosis) afterwards. A study sponsored by the NIH Heart, Lung and Blood Institute was performed to determine whether prior infection with cytomegalovirus was predictive of arterial restenosis (Zhou et al., 1996). In 21 of the 49 patients with serologic evidence of cytomegalovirus infection, re-growth of arterial plaque was noted. In contrast, only 2 of the 26 seronegative patients had restenosis. 

(A) Calculate the risk ratio of restenosis associated with CMV infection. Include a 95% confidence interval. (Always pause to interpret results.)
(B) Conduct a chi-square test of H0: RR = 1.

16.6 UGDP [similar to 9.5.14]. The University Group Diabetes Program assessed the efficacy of various oral hypoglycemic treatments, insulin, and diet in the prevention of vascular complications in diabetics. Unexpectedly, it was found that 26 (13%) out of 204 patients treated with an oral hypoglycemic called phenformin died from cardiovascular disease. In contrast, 2 (3%) of 64 control patients in this arm of the trial died of cardiovascular disease. 

(A) Is an exact procedure (e.g., Fisher's) necessary to test these data, or can you use a chi-square test? Explain your reasoning.
(B) Calculate a P value for the problem and comment on your findings. 

16.7 Oral contraceptives and myocardial infarction [new]. A study was conducted to determine the effect of oral contraceptive use on heart disease risk in 40- to 44-year-old women. This study found 13 new cases among 5000 OC users over 3 years of follow-up. In contrast, among 10,000 non-users, 7 developed a first myocardial infarct. [Data are fictitious but realistic.]

(A) Show data in 2-by-2 cross-tabular form.
(B) Calculate the risk ratio. Include a 95% confidence interval. Show all work, and interpret your results.
(C) Conduct a statistical hypothesis test of association and discuss your results. 

16.8 OSWEGO: An outbreak of gastrointestinal illness following a church supper. Data in the file oswego.sav are from a foodborne disease outbreak case study used by CDC (1992). Briefly, the study involves an investigation by a local health officer in the village of Lycoming, Oswego County, New York. An outbreak of acute gastrointestinal illness involving 46 cases, all of whom had attended a church supper, was reported. Interviews about the church supper were completed on 75 of the 80 persons known to be present (including the 46 cases). [Optional: A full description of the case study can be downloaded by clicking here.] Download the dataset oswego.sav [right-click > Save as] and then open it in SPSS. Cross-tabulate the data by case status (variable ILL: 1 = yes, 2 = no) for each of the food item variables in the table below. Calculate the risk ratio and P value for each association, and tally the results in this table: 

                      Ate Food            Did Not Eat Food
Food              Ill  Total     %       Ill  Total     %      Risk Ratio   95% CI for RR    P*
Baked Ham          29     46   63.0%      17     29   58.6%       1.1         0.7 - 1.6      0.70
Spinach            26     43   ___        20     32   ___         ___         ___            ___
Mashed P.          23     37   ___        23     37   ___         ___         ___            ___
Cabbage Sal.       18     28   ___        28     47   ___         ___         ___            ___
Jell-O             16     23   ___        30     52   ___         ___         ___            ___
Rolls              21     37   ___        25     38   ___         ___         ___            ___
Brown bread        18     27   ___        28     48   ___         ___         ___            ___
Milk                2      4   ___        44     71   ___         ___         ___            ___
Coffee             19     31   ___        27     44   ___         ___         ___            ___
Water              13     24   ___        33     51   ___         ___         ___            ___
Cakes              27     40   ___        19     35   ___         ___         ___            ___
Van. ice cr.       43     54   ___         3     21   ___         ___         ___            ___
Choc. ice cr.      25     47   ___        20     27   ___         ___         ___            ___
Fruit salad         4      6   ___        42     69   ___         ___         ___            ___

* Pearson uncorrected chi-square test.
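
If you want to check your SPSS output by hand, the calculations for one row can be sketched in a few lines of Python. This is an illustrative sketch only: the function name risk_ratio_summary is hypothetical, the confidence interval uses the usual log-based (Wald) method, and the P value comes from the uncorrected Pearson chi-square with 1 degree of freedom. It is applied here to the Baked Ham row, which is already completed in the table above.

```python
import math


def risk_ratio_summary(ill_exp, n_exp, ill_unexp, n_unexp, z=1.96):
    """Risk ratio, log-based 95% CI, and Pearson (uncorrected) chi-square P value.

    Assumes no zero counts among the ill in either group.
    """
    rr = (ill_exp / n_exp) / (ill_unexp / n_unexp)
    # Standard error of ln(RR)
    se = math.sqrt(1 / ill_exp - 1 / n_exp + 1 / ill_unexp - 1 / n_unexp)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    # Cells of the 2-by-2 table: a, b = exposed ill / not ill; c, d = unexposed
    a, b = ill_exp, n_exp - ill_exp
    c, d = ill_unexp, n_unexp - ill_unexp
    n = n_exp + n_unexp
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 df
    p = math.erfc(math.sqrt(chi2 / 2))
    return rr, lower, upper, chi2, p


# Baked Ham row: 29 of 46 who ate it were ill; 17 of 29 who did not were ill.
rr, lower, upper, chi2, p = risk_ratio_summary(29, 46, 17, 29)
print(f"RR = {rr:.1f}, 95% CI {lower:.1f} - {upper:.1f}, P = {p:.2f}")
# prints: RR = 1.1, 95% CI 0.7 - 1.6, P = 0.70
```

The same function can be called once per food item; rows with small expected counts (for example, Milk and Fruit salad) may call for an exact test rather than the chi-square P value.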

16.9 Kayexelate and colonic necrosis. Data from the Kayexelate and colonic necrosis study as described in the StatPrimer notes are shown below and can be downloaded by clicking HERE

              Necrosis +   Necrosis -
Generic +          2           115
Generic -          0           862

(A) Determine the incidence of colonic necrosis in each group.
(B) Calculate expected cell counts. Which test would you use with these data? 
(C) Calculate and report an appropriate P value. 

16.10 Yates, 1934 (2-by-2). The following data from Hellman are reported in the classic article "Contingency tables involving small numbers and the χ² test" by Frank Yates, published in the Journal of the Royal Statistical Society Suppl., 1, 1934, 217-235 (p. 230). The frequency of dental malocclusion in infants is cross-tabulated by whether the infant was or wasn't breast-fed. Four of 20 in the breast-fed group (20%) had normal teeth, while only 1 of 22 (4.5%) in the non-breast-fed group had normal teeth:


                  Normal teeth   Malocclusion
Breast-fed              4             16
Not breast-fed          1             21

(A) Can you use a chi-square or z test with these data? Explain.
(B) Use WinPepi, www.OpenEpi.com, or some other software utility to calculate an exact test for these data. Is the difference in malocclusions statistically significant?
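
For readers curious about what an exact test computes, the following is a minimal sketch of a two-sided Fisher's exact test for a 2-by-2 table, using only the Python standard library. The function name fisher_exact_two_sided is hypothetical, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is only one of several conventions, so results may differ slightly from the doubled one-sided or mid-P values that some utilities report. The example table is the classic "lady tasting tea" layout, not data from these exercises.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2-by-2 table [[a, b], [c, d]].

    Holds the margins fixed and sums the hypergeometric probabilities of
    every table that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Probability that the upper-left cell equals k, given the margins
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties
    return sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )


# Illustrative table (not exercise data): 3 and 1 in row one, 1 and 3 in row two.
print(f"{fisher_exact_two_sided(3, 1, 1, 3):.4f}")  # prints 0.4857
```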

16.11 Yates, 1934 (3-by-2). The study considered in the prior exercise also included a third category: breast and bottle fed. The data for the 3-by-2 table are shown below. 

(A) Can you use a chi-square test of association on these data? Explain.
(B) Use WinPepi > Compare2.exe > Program F1 to calculate a Fisher's or mid-P exact P-value for the data. 

                  Normal teeth   Malocclusion
Breast fed              4             16
Bottle fed              1             21
Breast + bottle         3             47

16.12 Binge drinking on campus by gender (similar to Ex. 9.14). It has been estimated that, overall, 19.4% of students at 4-year U.S. colleges engage in frequent binge drinking (Wechsler et al., 1994), where "frequent binge drinking" is defined as having five or more drinks in a row three or more times in the prior two-week period. Data for men and women separately are:

            Freq. binge +   Freq. binge -
Men              1630            5550
Women            1684            8232

(A) Calculate the prevalence ratio ("RR") of binge drinking for males relative to females. Then, fill in this blank: Males have a ____% greater prevalence of binge drinking than females.
(B) Under what conditions will the prevalence ratio approximate the risk ratio?
(C) Calculate a 95% confidence interval for the prevalence ratio.

16.13 Don't sweat the small stuff or P = 0.05. Consider a study in which 40 of 320 individuals (12.5%) in the treatment group experience an outcome. In contrast, 26 of 336 (7.7%) in the control group experience the outcome. (Data are shown below.) Calculate chi-square statistics and P values for this problem using both Pearson's and the continuity-corrected (Yates's) methods. You may use a software utility such as WinPepi or www.OpenEpi.com for your calculations. Discuss the results of each test. Is it reasonable to derive different conclusions with the different tests?


              Adverse event +   Adverse event -   Total
Treatment           40               280           320
Control             26               310           336
Total               66               590           656
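
As an alternative to WinPepi or OpenEpi, both chi-square statistics can be computed with the shortcut formula for a 2-by-2 table. The sketch below is one possible way to do this in Python; the function name chi_square_2x2 is hypothetical, and the P value is the upper tail of a chi-square distribution with 1 degree of freedom. It is run on the counts given in this exercise so you can compare the output with your own calculations.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic and P value (1 df) for the table [[a, b], [c, d]].

    With yates=True, the continuity correction subtracts n/2 from |ad - bc|.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p


# Counts from this exercise: treatment 40 / 280, control 26 / 310.
for label, use_yates in (("Pearson", False), ("Yates", True)):
    chi2, p = chi_square_2x2(40, 280, 26, 310, yates=use_yates)
    print(f"{label}: chi-square = {chi2:.2f}, P = {p:.3f}")
```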

16.14. Tobacco use in high school students. Refer back to Exercise 10.1. (This exercise compared salivary cotinine levels in male and female students.)

(A) Calculate the RR of smoking for males. Include a 95% confidence interval. 
(B) Interpret your results and say how these results relate to the chi-square test produced for Exercise 10.1. 

16.15 Do seatbelt laws prevent injury? Refer back to Exercise 10.3. Reclassify the response to either "no injury" or "injury." Display the data in a 2-by-2 table, compare the incidence of "no injury" to "injury" in the form of a risk ratio, and interpret these results. Include a 95% confidence interval.

16.16 Efficacy of echinacea (reducing the severity of symptoms). Refer back to Exercise 10.10. Merge the moderate and severe responses. Then compare the incidence of mild symptoms to moderate/severe symptoms in the form of a risk ratio. Calculate a 95% confidence interval for the RR, and interpret your results. 

16.17 Anger and heart disease (hard outcome, hypertensives). Refer back to Exercise 10.12. Calculate risk ratios for 

(A) moderate-anger vs. low-anger 
(B) high-anger vs. low-anger. 
(C) Does a dose-response pattern emerge? 

16.18 Helicopter evacuation and survival following trauma. Accident victims may be transported to the hospital by helicopters or, more typically, by road ambulance. Does the use of helicopters actually save lives? This exercise compares survival rates in victims by evacuation method.

(A) The table that follows cross-tabulates data for all accidents. Calculate the crude relative risk of death associated with helicopter evacuation.

Table A: All accidents

(B) Data stratified by the seriousness of the accident are reported. Calculate the stratum-specific relative risks.


Table B: Serious accidents

Table C: Less serious accidents

(C) How do you explain the discrepancy between the crude results and stratum-specific results?

(D) Calculate the Mantel-Haenszel adjusted summary relative risk for helicopter evacuation and death while adjusting for the seriousness of the accident.
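
The Mantel-Haenszel summary risk ratio weights each stratum's contribution by its size: RR_MH = sum(a_i * n0_i / N_i) / sum(c_i * n1_i / N_i), where a_i and c_i are the numbers of deaths in the exposed and unexposed groups of stratum i, n1_i and n0_i are the corresponding group sizes, and N_i = n1_i + n0_i. A minimal Python sketch follows. The function names and the counts are hypothetical and chosen only to illustrate how a crude and an adjusted estimate can differ; they are not the data for this exercise.

```python
def crude_rr(strata):
    """Crude risk ratio after collapsing all strata into one 2-by-2 table."""
    a = sum(s[0] for s in strata)   # exposed cases
    n1 = sum(s[1] for s in strata)  # exposed total
    c = sum(s[2] for s in strata)   # unexposed cases
    n0 = sum(s[3] for s in strata)  # unexposed total
    return (a / n1) / (c / n0)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.

    Each stratum is a tuple (a, n1, c, n0):
    exposed cases, exposed total, unexposed cases, unexposed total.
    """
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Hypothetical strata (not the exercise data); each has a stratum-specific RR of 2.
hypothetical = [
    (20, 100, 10, 100),  # stratum 1: risks 0.20 vs 0.10
    (40, 50, 80, 200),   # stratum 2: risks 0.80 vs 0.40
]
print(f"Crude RR = {crude_rr(hypothetical):.2f}")               # prints Crude RR = 1.33
print(f"Mantel-Haenszel RR = {mantel_haenszel_rr(hypothetical):.2f}")  # prints Mantel-Haenszel RR = 2.00
```

In this made-up example the crude estimate understates the stratum-specific association because exposure is distributed unevenly across strata with very different baseline risks.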

