Binary Outcome, Matched-Pairs

Odds Ratio | Binomial Test | McNemar Test | Multiple Controls Per Case

Odds Ratio

Suppose you want to conduct a case-control study of disease D and exposure E in which cases and controls are closely matched on potential confounders. Although we might be tempted to use routine case-control procedures to analyze such data, a procedure that accounts for the matching is required. Matched pairs are classified as follows:

                                Control pair-member   Control pair-member
                                is Exposed            is Unexposed
Case pair-member is Exposed            a                     b
Case pair-member is Unexposed          c                     d

In this table, cells a and d contain counts of concordant pairs (the case and control have the same exposure status) and cells b and c contain counts of discordant pairs (the case and control differ in exposure status). The odds ratio estimate is the ratio of the number of discordant pairs in which the case is exposed (b) to the number of discordant pairs in which the control is exposed (c):

or = b / c

Illustrative example. Suppose that in studying 55 matched pairs (110 people total) we find:

                          Control pair-member   Control pair-member
                          is E+                 is E-
Case pair-member is E+           5                    30
Case pair-member is E-          10                    10

Therefore, or = 30 / 10 = 3.
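
The calculation above can be checked with a short Python sketch. The function name matched_pairs_or is illustrative only and is not part of Epi Info; the counts are those of the illustrative example.

```python
def matched_pairs_or(b, c):
    """Matched-pairs odds ratio estimate.

    b = discordant pairs with an exposed case and an unexposed control
    c = discordant pairs with an unexposed case and an exposed control
    """
    if c == 0:
        raise ValueError("odds ratio is undefined when c = 0")
    return b / c


# Cell counts from the illustrative example. The concordant cells
# (a and d) are listed for completeness but do not enter the estimate.
a, b, c, d = 5, 30, 10, 10
odds_ratio = matched_pairs_or(b, c)
print(odds_ratio)
```

Note that changing a or d leaves the estimate unchanged, which is the sense in which only discordant pairs are informative.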

Data processing. There are several ways to set up data files for this type of problem. One method is to set up the data file so that each record contains data for one matched pair, with separate variables for the exposure status of the case (CASE: 1,0) and the exposure status of the control (CONTROL: 1,0). Data for the illustrative example shown above are stored in MATCH-CC.ZIP. The first two records and the last record in this data set are:

          CASE   CONTROL
---  --------- ---------
  1        1         1
  2        1         1
 55        0         0

The following program is used to tally counts:


Output for the illustrative example is:

DELTA | Freq  Percent   Cum.
a     |  5    9.1%     9.1%
b     |  30   54.5%    63.6%
c     |  10   18.2%    81.8%
d     |  10   18.2%   100.0%
Total |  55  100.0%
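
For readers without Epi Info, the tallying step can be sketched in Python. This is an illustration under stated assumptions, not the Epi Info program: the function classify_pair and the list pairs are hypothetical names, and the records are rebuilt from the cell counts rather than read from MATCH-CC.ZIP. The cell labels mirror the a/b/c/d coding of DELTA in the output above.

```python
from collections import Counter


def classify_pair(case_exposed, control_exposed):
    """Return the table cell (a, b, c, or d) for one matched pair.

    Exposure is coded 1 = exposed, 0 = unexposed, as in CASE and CONTROL.
    """
    if case_exposed == 1 and control_exposed == 1:
        return "a"  # concordant, both exposed
    if case_exposed == 1 and control_exposed == 0:
        return "b"  # discordant, case exposed
    if case_exposed == 0 and control_exposed == 1:
        return "c"  # discordant, control exposed
    return "d"      # concordant, both unexposed


# Hypothetical records (CASE, CONTROL) rebuilt from the cell counts.
pairs = [(1, 1)] * 5 + [(1, 0)] * 30 + [(0, 1)] * 10 + [(0, 0)] * 10

tally = Counter(classify_pair(case, control) for case, control in pairs)
total = sum(tally.values())

for cell in "abcd":
    percent = 100 * tally[cell] / total
    print(f"{cell}     | {tally[cell]:3d}  {percent:5.1f}%")
print(f"Total | {total:3d}")
```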

Binomial Test

If there were no association between the exposure and disease, deviations from an equal number of discordant types would be random. To see this, let n' represent the number of discordant pairs (n' = b + c). Let p represent the proportion of positive discordancies (i.e., discordancies with exposed cases) observed in the sample.

p = b / n'

For the illustrative example, p = 30 / 40 = .75.

Let P represent the expected value of p. If there were no association between the exposure and disease, P would be equal to .5. Therefore, the one-sided null and alternative hypotheses looking for a positive association would be H0: P <= .5 vs. H1: P > .5, and an exact binomial test would quantify the probability of seeing at least b positive discordancies in n' trials assuming P = .5. This exact probability can be computed with EpiTable | Probability | Binomial: Proportion vs. Std. Values for b and n' are entered as the numerator and denominator, respectively, and the expected percentage under the null hypothesis (P0) is set to 50%.

Input for the illustrative data example is:

      Numerator                   30
      Total observations          40
      Expected percentage (%)     50

Expected percentage (%)       :     50.00
Observed percentage (%)       :     75.00

Output is:

Probability that the # of case
  < 30 = 0.9988892
 <= 30 = 0.9996602
  = 30 = 0.0007709
 >= 30 = 0.0011107
  > 30 = 0.0003397

Two-tailed p-value: 0.00145048

The one-sided p-value is the probability of observing 30 or more positive discordancies (the ">= 30" line), p = .0011. The two-sided p-value for the test is also shown. [The two-sided null and alternative hypotheses are H0: P = .5 vs. H1: P not = .5.]
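
The one-sided exact probability can also be reproduced outside EpiTable. The following Python sketch sums the binomial probabilities directly; the function name binomial_upper_tail is an assumption made for illustration. It computes only the one-sided tail and makes no attempt to reproduce EpiTable's two-tailed value, since the method behind that figure is not described here.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


b = 30             # positive discordancies (exposed case, unexposed control)
n_discordant = 40  # n' = b + c

p_one_sided = binomial_upper_tail(b, n_discordant)
print(f"P(X >= {b}) = {p_one_sided:.7f}")
```

The result should agree with the ">= 30" line of the EpiTable output to the precision shown there.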

McNemar's Test

McNemar's test is a normal approximation to the binomial test. You can calculate McNemar's test by clicking EpiTable > Study > Case-control > Matched 1:1. Data are input in the form of a discordancy/concordancy table. Output is:

      Ctrl+   Ctrl-
Cas+| 5      |  30    |
Cas-| 10     |  10    |

McNemar corrected  Chi square                9.03
P   value                              0.00266306
Odds ratio                     3.00, [1.41, 6.55]

Comments: (1) McNemar's test is a normal approximation to the binomial test with a correction for continuity. If the null hypothesis were true, the test statistic would vary randomly as a chi-square random variable with 1 degree of freedom. The probability of observing a chi-square random variable with 1 degree of freedom that is greater than or equal to 9.025 is .0027. (2) EpiTable calculates the odds ratio in the standard way: or = 3.0 (95% confidence interval: 1.4, 6.6).
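
The corrected chi-square statistic and its p-value can be verified by hand or with a few lines of Python. This is a minimal sketch assuming the usual continuity-corrected form (|b - c| - 1)^2 / (b + c); the function name mcnemar_corrected is illustrative. The p-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The confidence interval is not reproduced because the method EpiTable uses for it is not stated here.

```python
from math import erfc, sqrt


def mcnemar_corrected(b, c):
    """Continuity-corrected McNemar chi-square (1 df) and its p-value."""
    chi_sq = (abs(b - c) - 1) ** 2 / (b + c)
    # For 1 df: P(chi-square >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi_sq / 2))
    return chi_sq, p_value


chi_sq, p_value = mcnemar_corrected(b=30, c=10)
print(f"chi-square = {chi_sq:.3f}, p = {p_value:.5f}")
```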

Multiple Controls Per Case

The investigator may match multiple controls per case. This can be considered an extension of the more familiar 1:1 matching situation. To introduce this idea, let us consider 4:1 matched case-control data (4 controls per case), which are often displayed as follows:

                     Number of exposed controls
             4 of 4   3 of 4   2 of 4   1 of 4   0 of 4
Case is E+     a       b3/4     b2/4     b1/4     b0/4
Case is E-    c4/4     c3/4     c2/4     c1/4      d

In the above table, cells a and d represent concordant sets, and cells bx/4 and cx/4 represent discordant sets in which x of the 4 controls are exposed. Each case is compared to its own controls, almost as if 4 separate pairs were analyzed.

As was the case with matched pairs, concordant sets do not contribute to the odds ratio estimate, and discordant sets contribute in proportion to the number of discordancies they contain. For example, each set in cell b3/4 has 3 exposed controls and only 1 non-exposed control, and so contributes only 1 positive-case discordancy. In contrast, each set in cell b0/4 contributes 4 positive-case discordancies.

The odds ratio estimator is:

     no. of non-exposed controls matched with an exposed case
or = ---------------------------------------------------------
     no. of exposed controls matched with a non-exposed case

Illustrative example. Let us consider data from a consultation by Dicker (1999) in which rotavirus infection is related to intestinal intussusception. Data are stored in DICKER.ZIP. Epi Info can handle this analysis if the REC file is set up with variables for case or control status (e.g., CASE: Y/N), an indicator for the matched set (e.g., MATCHVAR: 1, 2, . . . ), and the exposure status of the subject (EXPOSURE: Y/N). The data file would look something like this:

  1  Y   1  Y
  2  N   1  N
 40  N  10  N

To compute statistics, issue the command:


Output for this illustrative data example is:

                          10 Matched Sets
                              N =    10
         4 controls per case
       Number of exposed controls
          4     3     2     1     0
C     +-----------------------------+
A Exp+|    0|    0|    0|    0|    1|
S     +-----+-----+-----+-----+-----|
E Exp-|    0|    0|    0|    1|    8|
S     +-----------------------------+

Data show four positive discordancies (one exposed case matched with four non-exposed controls) and one negative discordancy (one non-exposed case matched with an exposed control). Therefore, or = 4 / 1 = 4.0. Thus, the entire analysis hinges on a single exposed case and four non-exposed controls.
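
The counting of discordancies can be made explicit with a small Python sketch. The function matched_sets_or and the dictionary layout are assumptions made for illustration; each dictionary maps the number of exposed controls in a set to the number of sets with that pattern, copied from the output table above.

```python
def matched_sets_or(exposed_case_sets, unexposed_case_sets, controls_per_case):
    """Odds ratio estimate for 1:M matched sets.

    Each argument maps x (number of exposed controls in a set) to the
    number of matched sets showing that pattern.
    """
    # Positive discordancies: non-exposed controls matched with an exposed case.
    positive = sum(n * (controls_per_case - x) for x, n in exposed_case_sets.items())
    # Negative discordancies: exposed controls matched with a non-exposed case.
    negative = sum(n * x for x, n in unexposed_case_sets.items())
    if negative == 0:
        raise ValueError("odds ratio is undefined with no negative discordancies")
    return positive, negative, positive / negative


# Counts from the output table: key = number of exposed controls (4 to 0).
exposed_case_sets = {4: 0, 3: 0, 2: 0, 1: 0, 0: 1}
unexposed_case_sets = {4: 0, 3: 0, 2: 0, 1: 1, 0: 8}

positive, negative, odds_ratio = matched_sets_or(
    exposed_case_sets, unexposed_case_sets, controls_per_case=4
)
print(positive, negative, odds_ratio)
```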

Inferential statistics are reported in a summary section:

Mantel-Haenszel Summary Chi Square                                      0.03
       (Equivalent to McNemar Chi-Square, corrected)
P value                                                           0.85968380
Mantel-Haenszel Summary Chi Square (uncorrected)                        1.13
P value                                                           0.28884437

                     SUMMARY ODDS RATIO
Crude OR                                                                7.00
Mantel-Haenszel matched Odds ratio                                      4.00
Maximum likelihood estimate of OR (MLE)                                 4.00
Exact 95% confidence limits for MLE                       0.05 < OR < 313.99
Exact 95% Mid-P limits for MLE                            0.10 < OR < 156.00
Probability of MLE >=  4.00 if population OR = 1.0                0.36000000

Comments: (1) In this output, the crude odds ratio estimate should be ignored as irrelevant. (2) The Mantel-Haenszel statistics and MLE statistics are valid. (3) Because of the small sample size (n' = no. of discordancies = 5), the MLE statistics are preferred with these data. Thus, or = 4.0 (95% confidence interval for OR: 0.1, 156), p = .36. (4) Identical statistics can be computed with the TABLES EXPOSED CASE MATCHVAR command. Both commands work because the matched analysis is similar to a Mantel-Haenszel analysis with each matched set serving as its own stratum. In other words, the intussusception data could just as easily be analyzed as 10 strata (each with one case and four controls) using the standard Mantel-Haenszel procedures.
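
The stratified view of the matched analysis can be illustrated in Python by treating each matched set as a 2-by-2 stratum and applying the standard Mantel-Haenszel formulas. This is a sketch under stated assumptions, not Epi Info's implementation: the function mantel_haenszel and the tuple layout are hypothetical, and the 10 strata are rebuilt from the output table rather than read from DICKER.ZIP.

```python
from math import erfc, sqrt


def mantel_haenszel(strata, correction=0.0):
    """Mantel-Haenszel odds ratio and chi-square (1 df) over 2x2 strata.

    Each stratum is a tuple:
    (exposed cases, unexposed cases, exposed controls, unexposed controls).
    """
    or_num = or_den = observed = expected = variance = 0.0
    for case_e, case_u, ctrl_e, ctrl_u in strata:
        n = case_e + case_u + ctrl_e + ctrl_u
        cases, controls = case_e + case_u, ctrl_e + ctrl_u
        exposed, unexposed = case_e + ctrl_e, case_u + ctrl_u
        or_num += case_e * ctrl_u / n
        or_den += case_u * ctrl_e / n
        observed += case_e
        expected += cases * exposed / n
        variance += cases * controls * exposed * unexposed / (n**2 * (n - 1))
    chi_sq = (abs(observed - expected) - correction) ** 2 / variance
    p_value = erfc(sqrt(chi_sq / 2))  # upper tail of chi-square with 1 df
    return or_num / or_den, chi_sq, p_value


# Ten matched sets (1 case + 4 controls each), rebuilt from the output table.
strata = (
    [(1, 0, 0, 4)]        # exposed case, no exposed controls
    + [(0, 1, 1, 3)]      # unexposed case, one exposed control
    + [(0, 1, 0, 4)] * 8  # unexposed case, no exposed controls
)

mh_or, chi_uncorrected, p_uncorrected = mantel_haenszel(strata)
_, chi_corrected, p_corrected = mantel_haenszel(strata, correction=0.5)
print(f"MH OR = {mh_or:.2f}")
print(f"uncorrected chi-square = {chi_uncorrected:.2f}, p = {p_uncorrected:.4f}")
print(f"corrected chi-square   = {chi_corrected:.2f}, p = {p_corrected:.4f}")
```

The eight fully concordant sets add nothing to any of the sums, so the results are driven entirely by the two discordant sets.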


Dicker, R. C. (1999, December 22). Consultant's Corner - Matched Odds Ratio Formula Corrected. Unpublished.

Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions (2nd ed.). New York: Wiley.

Rosner, B. (1995). Fundamentals of Biostatistics (4th ed.). Belmont, CA: Duxbury Press.