## Problem: Contingency Tables with Sparsely Populated Cells

The usual approach to contingency tables is to compute the X² statistic, which sums a contribution from each cell of the table, where the expected value for each cell is calculated by the method described on the previous page. We have reported that "chi-square is suspect if expected values are less than 5". This may present problems because:

• In designing an experiment it is often impossible to estimate in advance what cell values will be produced.

• While one can say that more total cases and fewer categories should increase cell expected values and thus put the analysis on firm statistical ground, collecting more cases may make the experiment too expensive to perform, and merging categories may defeat the purpose of the experiment.

• Often the experimental protocol is not really modifiable; for example, it may be specified by instructors or duplicated from standard practice.

• The data was collected last week and the report is due tomorrow.

The Design of Experiments is an important topic: as scientists we really shouldn't be in the business of doing experiments that are unlikely to resolve issues. Nevertheless it is not a topic I've taken up on these pages; rather, my assumption is in line with the last problem stated: "The data was collected last week and the report is due tomorrow." So, given data that produces a table which has expected values less than 5, what can be properly said?

#### If the small-expectation cells are rare, maybe you can still use X²

First we note that the above restriction is considered by most statisticians to be too restrictive. Cochran [Biometrics 10 (1954) 417-51] gives the following rule:

> [For] contingency tables with more than one degree of freedom: If relatively few expectations are less than 5 (say in 1 cell out of 5, or 2 cells out of 10 or more), a minimum expectation of 1 is allowable in computing X².

Yarnold [Journal of the American Statistical Association 65 (1970) 864-886] endorses and expands the above rule, suggesting the minimum expectation may be even smaller than 1 if the fraction of small-expectation cells is itself sufficiently small:

> If the number of classes [cells] s is three or more, and if r denotes the number of expectations less than five, then the minimum expectation may be as small as 5r/s.

In their textbook Probability and Statistical Inference (1996), Hogg and Tanis suggest that while the usual rule is good advice for the beginner, greater leeway can be taken:

> The important thing to guard against is allowing some particular [cell expectation] to become so small that the corresponding term in [X²], namely [(xij − eij)²/eij], tends to dominate the others because of its small denominator.

Encouraging as the above might seem, very often if you have one small-expectation cell, you have lots of them.
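These rules are mechanical enough to check in a few lines. A minimal sketch in plain Python (the table is the 3×3 example used later on this page; the rule arithmetic comes from the quotations above):

```python
from fractions import Fraction

# Observed 3x3 table (the example reused later on this page).
obs = [[5, 3, 2],
       [2, 3, 4],
       [0, 2, 3]]

rows = [sum(r) for r in obs]           # row totals: 10, 9, 5
cols = [sum(c) for c in zip(*obs)]     # column totals: 7, 8, 9
N = sum(rows)                          # grand total: 24

# Expected count for each cell: (row total)(column total)/N.
expected = [[Fraction(ri * cj, N) for cj in cols] for ri in rows]

s = sum(len(r) for r in obs)                               # number of cells
r_small = sum(e < 5 for row in expected for e in row)      # cells with e < 5
min_e = min(e for row in expected for e in row)

print(f"{r_small} of {s} expectations are below 5; minimum is {float(min_e):.3f}")
# Yarnold: the minimum expectation may be as small as 5r/s.
print("Yarnold's rule satisfied?", min_e >= Fraction(5 * r_small, s))
```

Here every one of the nine expectations is below 5 (the smallest is 35/24 ≈ 1.458), so Yarnold's relaxed threshold 5r/s = 5 is not met either; this is exactly the situation the remaining sections address.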

#### Re-bin to fold the small-expectation cells into other cells

In many contingency tables the "outcomes" or "treatments" are related to each other and can be legitimately combined, folding rarely populated cases into similar but more commonly populated ones. For example, the outcomes might be various Likert-scale responses (Strongly Agree, Agree, Neither Agree nor Disagree, ...); if "Strongly Agree" is rarely selected (producing many small-expectation cells), it may be combined with "Agree", eliminating a row of small-expectation cells. The size of the table is thus reduced, with the eliminated cells' counts absorbed into other cells. In the example of the previous page a 3×3 table was reduced to 2×2 by such "re-binning".
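A sketch of such re-binning in plain Python. The Likert counts here are made up for illustration, not data from this page:

```python
# Hypothetical counts for two treatment groups over a 5-point Likert scale.
table = {
    "Strongly Agree":    [1, 2],    # rarely selected -> small expectations
    "Agree":             [14, 9],
    "Neither":           [10, 12],
    "Disagree":          [8, 11],
    "Strongly Disagree": [2, 1],    # also rare
}

def rebin(table, merge_into, merged):
    """Fold the `merged` category's counts into `merge_into`, dropping its row."""
    out = {k: list(v) for k, v in table.items() if k != merged}
    out[merge_into] = [a + b for a, b in zip(table[merge_into], table[merged])]
    return out

table = rebin(table, "Agree", "Strongly Agree")
table = rebin(table, "Disagree", "Strongly Disagree")
for category, counts in table.items():
    print(category, counts)
```

Note that the column (treatment) totals are unchanged by the merge; only rows are combined, so the re-binned table is still a valid contingency table for the same subjects.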

#### Use an exact method

In the exact method, we view the particular contingency table we found (by assigning treatments and observing outcomes) as embedded in a universe of similar tables: tables with the same overall outcome frequencies as ours (i.e., the same row totals) and the same distribution of treatments (i.e., the same column totals). In the re-binned example from the previous page we have treatments A = control (no treatment) and B = intervention (careful removal of clearly affected branches), and outcomes 1 = tree death within four years and 2 = tree alive after four years. We found this particular 2×2 table:
```
      A    B
1:    7   12   19
2:    0    5    5
      7   17   24
```
or more simply:
```
7 12
0  5
```
The universe of similar tables includes just six tables:
```
7 12   |   6 13   |   5 14   |   4 15   |   3 16   |   2 17
0  5   |   1  4   |   2  3   |   3  2   |   4  1   |   5  0
```
In an exact method you calculate the probability of each table, then sum the probabilities of our table and of every table more unusual than ours. If the total probability of these unusual tables is "small", we can reject the null hypothesis that the outcome is independent of the treatment.
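With all the margins held fixed, the probability of any particular 2×2 table with cells a, b, c, d, row totals r₁ = a + b and r₂ = c + d, and column totals c₁ = a + c and c₂ = b + d (all summing to N) is hypergeometric:

```latex
P(\text{table}) = \frac{r_1!\, r_2!\, c_1!\, c_2!}{N!\, a!\, b!\, c!\, d!}
```

The same formula, with the products extended over all rows, columns, and cells, gives the probability of a general r×c table; it is the "probability of each table" referred to above.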

Note that the universe of a 2×2 table can be arranged in the linear form shown above, so the term "more unusual than" is well defined without reference to any particular statistical measure of unusualness like X². For our particular table there is nothing to the left of it, so the universe contains nothing more unusual than our table. If our found table had been

```
3 16
4  1
```
we would need to sum the probability of that table and all tables to the right of it. (Note that we only sum the probabilities for tables to one side or the other of our found table. In this sense our Exact Test is "one-sided".)
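The six-table universe is easy to enumerate by machine. A minimal sketch in plain Python; only the margins (row totals 19 and 5, column totals 7 and 17) come from the table above, everything else is the standard hypergeometric calculation:

```python
from math import comb

# Margins of the found table.
r1, r2 = 19, 5     # row totals
c1, N = 7, 24      # first column total, grand total

# Each table in the universe is determined by its upper-left cell a,
# which ranges over the values the margins allow.
tables = {}
for a in range(max(0, c1 - r2), min(r1, c1) + 1):
    # P(a) = C(r1, a) * C(r2, c1 - a) / C(N, c1)
    tables[a] = comb(r1, a) * comb(r2, c1 - a) / comb(N, c1)

assert abs(sum(tables.values()) - 1.0) < 1e-12   # probabilities sum to 1

# One-sided exact p for the found table (a = 7): nothing lies to its left,
# so the tail is just the table itself.
print(f"p for the found table      = {tables[7]:.4f}")

# For the a = 3 table we would sum it and everything to the right of it.
p = sum(v for k, v in tables.items() if k <= 3)
print(f"one-sided p for a=3 table  = {p:.4f}")
```

For the found table this one-sided tail works out to about .146, nowhere near significance; for the a = 3 table it is about .014.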

The universe of tables grows rapidly with the size of the contingency table. For example, the 3×3 table discussed on the previous page:

```
5 3 2
2 3 4
0 2 3
```
inhabits a universe with 756 tables. The most likely (p=.025) table in this universe is
```
3 3 4
3 3 3
1 2 2
```
followed by six tables, each with p=.019:
```
2 4 4  |  3 3 4  |  3 3 4  |  3 4 3  |  3 4 3  |  4 3 3
3 3 3  |  2 3 4  |  2 4 3  |  2 3 4  |  3 2 4  |  2 3 4
2 1 2  |  2 2 1  |  2 1 2  |  2 1 2  |  1 2 2  |  1 2 2
```
Note that likely tables have, as much as possible, equal cell counts. Unlikely tables have evacuated some cells and maximized others. The six least likely tables have probabilities of (respectively) 5.3 × 10⁻⁹, 1.1 × 10⁻⁸, 1.4 × 10⁻⁸, 4.3 × 10⁻⁸, 4.3 × 10⁻⁸, and 4.8 × 10⁻⁸:
```
2 8 0  |  1 0 9  |  7 3 0  |  0 1 9  |  0 1 9  |  2 0 8
0 0 9  |  1 8 0  |  0 0 9  |  2 7 0  |  7 2 0  |  0 8 1
5 0 0  |  5 0 0  |  0 5 0  |  5 0 0  |  0 5 0  |  5 0 0
```
The given table:
```
5 3 2
2 3 4
0 2 3
```
turns out to be not particularly unusual. Its p is .0038; 689 of the 756 tables have probabilities equal to or smaller than that. The total probability of these "more unusual" tables is .385, so we do not come close to rejecting the null hypothesis. The given table:
```
6 3 1
1 4 4
0 1 4
```
has p=.0003. Now 443 of the 756 tables are considered unusual. The total probability of these tables is .034, which is commonly taken as small enough to reject the null hypothesis. Notice that disturbingly few switches were needed to convert "insignificant" results into "significant" results, and that "re-binning" this 3×3 to a 2×2 in the way described above also converts a "significant" result into an "insignificant" one.
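The full enumeration is small enough to reproduce directly. A sketch in plain Python, using brute-force enumeration (not the clever network algorithm mentioned below) and exact rational arithmetic:

```python
from fractions import Fraction
from math import factorial, prod

obs = [[5, 3, 2],
       [2, 3, 4],
       [0, 2, 3]]
R = [sum(r) for r in obs]            # row totals: 10, 9, 5
C = [sum(c) for c in zip(*obs)]      # column totals: 7, 8, 9
N = sum(R)                           # 24

def prob(t):
    """Exact hypergeometric probability of a table with these fixed margins."""
    num = prod(factorial(x) for x in R) * prod(factorial(x) for x in C)
    den = factorial(N) * prod(factorial(x) for row in t for x in row)
    return Fraction(num, den)

# Enumerate every 3x3 table with the same margins: choosing the four
# upper-left cells (a, b, d, e) determines the rest.
universe = []
for a in range(min(R[0], C[0]) + 1):
    for b in range(min(R[0] - a, C[1]) + 1):
        c = R[0] - a - b
        if c > C[2]:
            continue
        for d in range(min(R[1], C[0] - a) + 1):
            for e in range(min(R[1] - d, C[1] - b) + 1):
                f = R[1] - d - e
                if f > C[2] - c:
                    continue
                g, h, i = C[0] - a - d, C[1] - b - e, C[2] - c - f
                universe.append([[a, b, c], [d, e, f], [g, h, i]])

print(len(universe))                             # 756 tables, as stated above

p_obs = prob(obs)
unusual = [t for t in universe if prob(t) <= p_obs]
print(len(unusual), float(sum(prob(t) for t in unusual)))
```

Exact rational arithmetic (`Fraction`) makes the "probability equal to or smaller" comparison unambiguous; with floating point, ties among equally probable tables could be miscounted.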


The number of tables in the universe grows very rapidly with the size of the contingency table, and the computational problem of enumerating many billions of tables can overwhelm existing computers. This was long thought to be a show-stopper, until Mehta and Patel [J. Am. Stat. Assoc. 78 (1983) 427-434] found a clever recursive method of summing the probability over the required tables. Mehta and Patel's method was released as Fortran 77 code: Algorithm 643 FEXACT in ACM Transactions on Mathematical Software. The version used on this server is due to Clarkson, Fan, and Joe [ACM Trans. Math. Softw. 19 (1993) 484-488].

In the case of 2×2 tables the universe has an intrinsic linear ordering, so "more unusual than the given table" is well defined and "one-sided" probabilities can be calculated. There are no well-defined "sides" in larger contingency tables, so the on-line calculator below is "two-sided" even when applied to a 2×2 table.
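For 2×2 tables the exact test is widely available in statistical libraries; for instance, SciPy's `scipy.stats.fisher_exact` (assuming SciPy is installed) reproduces both forms of the calculation for the table found above:

```python
from scipy.stats import fisher_exact

table = [[7, 12],
         [0, 5]]

# One-sided: our table is the most extreme in its direction, so the
# tail contains only the table itself.
_, p_one = fisher_exact(table, alternative="greater")

# Two-sided: sum the probabilities of all tables no more probable than ours.
_, p_two = fisher_exact(table, alternative="two-sided")

print(f"one-sided p = {p_one:.4f}")
print(f"two-sided p = {p_two:.4f}")
```

The one-sided p is about .146 and the two-sided p about .272; neither approaches significance, consistent with the discussion of this table above.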