- In designing an experiment it is often impossible to estimate in advance what cell values will be produced.
- While one can say that more total cases and fewer categories should increase cell expected values and thus make the experiment statistically correct, these possibilities may make the experiment too expensive to perform or eliminate the purpose of the experiment.
- Often the experimental protocol is not really modifiable; for example, it may be specified by instructors or duplicated from standard practice.
- The data was collected last week and the report is due tomorrow.

The *Design of Experiments* is an important topic:
as scientists we really shouldn't be in the business of doing
experiments that are unlikely to resolve issues. Nevertheless
it is not a topic I've taken up on these pages; rather
my assumption is in line with the last stated
problem: "The data was collected last week and the report is due tomorrow."
So, given data that produces a table with expected values less than 5, what can
properly be said?

First we note that the above restriction is considered by
most statisticians to be too restrictive. Cochran [*Biometrics*
10 (1954) 417-51] gives the following rule:

[For] contingency tables with more than one degree of freedom: If relatively few expectations are less than 5 (say in 1 cell out of 5, or 2 cells out of 10 or more), a minimum expectation of 1 is allowable in computing *X*^{2}.

Yarnold says:

If the number of classes [cells] *s* is three or more, and if *r* denotes the number of expectations less than five, then the minimum expectation may be as small as 5*r*/*s*.

Textbook treatments add:

The important thing to guard against is allowing some particular [cell expectation] to become so small that the corresponding term in [X^{2}], namely [ (x_{ij}-e_{ij})^{2}/e_{ij}] tends to dominate the others because of its small denominator.
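The rules quoted above can be checked mechanically once the expected values are in hand. The following is a minimal sketch, not a definitive implementation: the function names are illustrative, the 20% cutoff is my reading of Cochran's "1 cell out of 5, or 2 cells out of 10 or more", and the sample counts are the 3×3 table discussed further down this page.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def cochran_allows(expected, max_small_fraction=0.2):
    """Cochran's rule as read here (assumption): at most 20% of the
    expectations below 5, and none below 1."""
    cells = [e for row in expected for e in row]
    small = sum(1 for e in cells if e < 5)
    return small <= max_small_fraction * len(cells) and min(cells) >= 1


def yarnold_allows(expected):
    """Yarnold's rule: with s >= 3 cells, r of them below 5,
    the smallest expectation must be at least 5r/s."""
    cells = [e for row in expected for e in row]
    s = len(cells)
    r = sum(1 for e in cells if e < 5)
    return s >= 3 and min(cells) >= 5 * r / s


observed = [[5, 3, 2], [2, 3, 4], [0, 2, 3]]
expected = expected_counts(observed)
print(max(max(row) for row in expected))                    # largest expectation
print(cochran_allows(expected), yarnold_allows(expected))   # does either rule rescue X^2?
```

For this table every one of the nine expectations is below 5 (the largest is 3.75), so neither rule applies.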

Encouraging as the above might seem, very often if you have one small-expectation cell, you have lots of them.

Consider the following table of counts, shown with its row and column totals:

         A   B
    1:   7  12   19
    2:   0   5    5
         7  17   24

or more simply:

    7 12
    0  5

The universe of similar tables (those with the same row and column totals) includes just six tables:

    7 12 | 6 13 | 5 14 | 4 15 | 3 16 | 2 17
    0  5 | 1  4 | 2  3 | 3  2 | 4  1 | 5  0

In an exact method you calculate the probability of each table and then sum the probability of our table and every other table even more unusual than our table. If the total probability of such unusual tables is "small" we can reject the null hypothesis that the outcome is independent of the treatment.

Note that the universe
of a 2×2 table can be arranged in the linear form shown above
so the term: "more unusual than" is well defined without
reference to any particular statistical measure of "unusualness" like *X*^{2}.
For our particular table, there is nothing to the left of it, so the
universe contains nothing more unusual than our table. If our found table was

    3 16
    4  1

we would need to sum the probability of that table and all tables to the right of it. (Note that we only sum the probabilities for tables to one side or the other of our found table. In this sense our Exact Test is "one-sided".)
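The 2×2 procedure described above is small enough to write out in full. This is a sketch under stated assumptions rather than the code behind this page: the function names are illustrative, each table's probability is taken to be the usual hypergeometric probability given fixed row and column totals, and "the side our table is on" is taken to be whichever tail has the smaller total.

```python
from fractions import Fraction
from math import comb


def universe_2x2(table):
    """All 2x2 tables sharing the margins of `table`, ordered by the lower-left count."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    tables = []
    for k in range(max(0, col1 - row1), min(row2, col1) + 1):
        tables.append(((col1 - k, row1 - col1 + k), (k, row2 - k)))
    return tables


def probability_2x2(table):
    """Hypergeometric probability of a 2x2 table given its row and column totals."""
    (a, b), (c, d) = table
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(a + b + c + d, a + c))


def one_sided_p(table):
    """Sum the found table and every table beyond it on its own side.

    Assumption: the relevant side is the one with the smaller total probability.
    """
    table = tuple(tuple(row) for row in table)
    tables = universe_2x2(table)
    probs = [probability_2x2(t) for t in tables]
    i = tables.index(table)
    left, right = sum(probs[: i + 1]), sum(probs[i:])
    return min(left, right)


found = ((7, 12), (0, 5))
for t in universe_2x2(found):
    print(t, float(probability_2x2(t)))
print(float(one_sided_p(found)))                # the found table alone
print(float(one_sided_p(((3, 16), (4, 1)))))    # that table plus the one to its right
```

The found table sits at the left end of its universe, so its one-sided probability is just its own probability, about 0.146. For the table with 4 in the lower-left cell the sum over that table and the one to its right is about 0.0145.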

The universe of tables grows rapidly with the size of the contingency table. For example, the 3×3 discussed on the previous page:

    5 3 2
    2 3 4
    0 2 3

inhabits a universe with 756 tables. The most likely table is

    3 3 4
    3 3 3
    1 2 2

followed by six tables, each with the same, slightly lower probability:

    2 4 4 | 3 3 4 | 3 3 4 | 3 4 3 | 3 4 3 | 4 3 3
    3 3 3 | 2 3 4 | 2 4 3 | 2 3 4 | 3 2 4 | 2 3 4
    2 1 2 | 2 2 1 | 2 1 2 | 2 1 2 | 1 2 2 | 1 2 2

Note that likely tables have, as much as possible, equal cell counts. Unlikely tables have evacuated some cells and maximized others. The six least likely tables are:

    2 8 0 | 1 0 9 | 7 3 0 | 0 1 9 | 0 1 9 | 2 0 8
    0 0 9 | 1 8 0 | 0 0 9 | 2 7 0 | 7 2 0 | 0 8 1
    5 0 0 | 5 0 0 | 0 5 0 | 5 0 0 | 0 5 0 | 5 0 0

The given table:

    5 3 2
    2 3 4
    0 2 3

turns out to be not particularly unusual. By contrast, the table

    6 3 1
    1 4 4
    0 1 4

from the same universe is considerably less probable.

In the unlikely event you want to look at all 756 tables and their probabilities, you can click here.
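The full listing can also be regenerated locally. The sketch below is a brute-force enumeration, not the method used on this server; the function names are illustrative, and it assumes the row totals (10, 9, 5) and column totals (7, 8, 9) read off the 3×3 table above, with each table's probability given by the product of the marginal factorials divided by N! times the product of the cell factorials.

```python
from fractions import Fraction
from math import factorial


def bounded_rows(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for x in range(min(total, caps[0]) + 1):
        for rest in bounded_rows(total - x, caps[1:]):
            yield (x,) + rest


def universe(row_totals, col_totals):
    """Yield every table of non-negative counts with the given margins."""
    if len(row_totals) == 1:
        yield (tuple(col_totals),)   # the last row is forced by what is left
        return
    for first in bounded_rows(row_totals[0], col_totals):
        remaining = tuple(c - x for c, x in zip(col_totals, first))
        for rest in universe(row_totals[1:], remaining):
            yield (first,) + rest


def probability(table):
    """Probability of a table given its margins, as an exact fraction."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    numerator = 1
    for total in rows + cols:
        numerator *= factorial(total)
    denominator = factorial(sum(rows))
    for row in table:
        for x in row:
            denominator *= factorial(x)
    return Fraction(numerator, denominator)


tables = sorted(universe((10, 9, 5), (7, 8, 9)), key=probability, reverse=True)
print(len(tables))                                  # size of the universe
print(tables[0], float(probability(tables[0])))     # most likely table
print(tables[-1], float(probability(tables[-1])))   # least likely table
```

Run as written, this should report 756 tables, matching the count quoted above. Brute force is fine at this size but becomes hopeless quickly as the table grows.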

The number of tables in the
universe grows very rapidly with the size of
the contingency table. The computational problems of
enumerating many billions of tables can overwhelm existing
computers. This problem was long thought to be
a show-stopper, until Mehta and Patel [*J. Am. Stat. Assoc.*
78 (1983) 427-434] found a clever recursive method of
summing the probability in the required tables. Mehta and Patel's
methods have been released as F77 code: Algorithm 643 `FEXACT`
in
*ACM Transactions on Mathematical Software*. The version
used on this server
is due to Clarkson, Fan, and Joe [19 (1993) 484-488].

In the case of 2×2 tables there is an intrinsically defined meaning to the set of tables more unusual than the given table, which allows "one-sided" probabilities to be calculated. There are no well-defined "sides" in more general contingency tables so the below on-line calculator is "two-sided" even if applied to a 2×2 table.
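To see how much the choice of sides matters, here is a small sketch of a two-sided calculation for a 2×2 table. It assumes one common convention, namely that "at least as unusual" means "no more probable than the found table"; whether the on-line calculator uses exactly this rule is an assumption on my part, and the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def lower_left_distribution(table):
    """Map each possible lower-left count k to its hypergeometric probability."""
    (a, b), (c, d) = table
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    return {
        k: Fraction(comb(row2, k) * comb(row1, col1 - k), comb(n, col1))
        for k in range(max(0, col1 - row1), min(row2, col1) + 1)
    }


def two_sided_p(table):
    """Sum every table that is no more probable than the found one (assumed convention)."""
    dist = lower_left_distribution(table)
    p_found = dist[table[1][0]]
    return sum(p for p in dist.values() if p <= p_found)


found = ((7, 12), (0, 5))
print(float(lower_left_distribution(found)[0]))   # the found table alone (one-sided)
print(float(two_sided_p(found)))                  # two-sided, under the assumed convention
```

For the table 7 12 / 0 5 the one-sided probability is about 0.146, while the two-sided sum also picks up the three tables at the far right of the universe and comes to about 0.272.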

For additional mathematical details of the exact test, click here.

Click here for exact, two-sided analysis of up to 6×6 contingency tables.