Ushered in by Tycho Brahe (1546-1601), the era of accurate astrometry (measurement of positions of things in the sky) was well underway by the 1700s. In 1740 the astronomer Jacques Cassini compiled the list below of observations of the tilt of the Earth's equator compared to the plane of the Earth's orbit about the Sun (the "obliquity of the ecliptic"). The list seems to demonstrate that the obliquity has slowly decreased from almost 24° to about 23½° during the nearly 2,000 years covered.

| Date | Obliquity |
|------|-----------|
| 140 B.C. | 23.853° |
| 140 A.D. | 23.856° |
| 390 A.D. | 23.500° |
| 880 | 23.583° |
| 1070 | 23.567° |
| 1300 | 23.533° |
| 1460 | 23.500° |
| 1500 | 23.473° |
| 1500 | 23.488° |
| 1570 | 23.499° |
| 1570 | 23.525° |
| 1600 | 23.517° |
| 1656 | 23.484° |
| 1672 | 23.482° |
| 1738 | 23.472° |

Data like this presented a problem for the scientists of the day.

On one hand, given the well known difficulties of accurate measurement, it came as no surprise that different astronomers found slightly different results for the same quantity. For example, Cassini's list has two obliquity measurements for 1570: Tycho's and Danti's; they differ by 0.026°.

On the other hand, mathematically working with inexact numbers
is a problem. It is well known that given two distinct data points
(*x₁*, *y₁*) and (*x₂*, *y₂*), one and only one line passes through both;
but with more than two inexact data points, generally no line passes
exactly through every point.

Clearly it is impossible (or at best a waste of effort) to find mathematical equations that exactly reproduce inexact results. Thus folks tried to find a "good" equation that came "close" to the necessarily inexact measurements (much as an average value comes "close" to as much as possible of the data). A solution was published in 1805 by Adrien-Marie Legendre in his book on determining the orbits of comets. He correctly noted that there is arbitrariness in the choice of the best equation, but proposed a solution:

Of all the principles that can be proposed for this purpose, I think there is none more general, more exact, or easier to apply, than that which we have used in this work; it consists of making the sum of the squares of the errors [deviation of measurement from equation] a minimum. By this method, a kind of equilibrium is established among the errors which, since it prevents the extremes from dominating, is appropriate for revealing the state of the system which most nearly approaches the truth.

Thus when we can't make

"error" = *yᵢ* − (*m xᵢ* + *b*)

zero for all the data, we compromise so that some "errors" are positive, others negative, but the sum of the squares of the "errors" for all the data pairs is as small as possible.
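In modern terms Legendre's recipe is a small computation. Here is a minimal sketch (assuming Python with numpy; np.polyfit solves the least-squares problem for a line) applied to a subset of Cassini's data:

```python
import numpy as np

# A subset of Cassini's data: year (negative = B.C.) and obliquity (degrees).
x = np.array([-140, 140, 390, 1070, 1460, 1656, 1738], dtype=float)
y = np.array([23.853, 23.856, 23.500, 23.567, 23.500, 23.484, 23.472])

# Ordinary least squares: choose slope m and intercept b that minimize
# the sum of the squared "errors"  y_i - (m*x_i + b).
m, b = np.polyfit(x, y, 1)

errors = y - (m * x + b)
print(f"slope = {m:.3e} deg/yr, intercept = {b:.3f} deg")
print(f"sum of squared errors = {np.sum(errors**2):.5f}")
```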

In his 1809 book on planetary orbits Carl Friedrich Gauss published the method of least squares and in addition provided a hint as to why least-squares is preferred if errors are distributed "normally" (i.e., with a Gaussian distribution). (He also reported that he had been using least squares since 1795, thus initiating a priority dispute with Legendre.)

By 1810, another great mathematician/physicist, Pierre-Simon Laplace, had endorsed the least squares method, and within the next decade this simple method became the standard method of analyzing data in astronomy. (Interestingly, it took nearly a century for the technique to find widespread use in biology and the social sciences.)

While the method of least squares is simple and in many cases the proper
method to apply, it is by no means the only proper method to fit data to lines.
For example, why focus on minimizing the *square* of the "error"?
Instead, one might minimize the absolute value of the "error".
The method of least squares puts *x* and *y* on decidedly
unequal footings: "error" means vertical (*y*) deviation rather
than horizontal (*x*) deviation. This is appropriate if the *x*
are exactly set while the *y* have measurement error. However, in the
Cassini data the *y* measurement error seems to be about 0.01°
whereas the date (*x*) of the early measurements is uncertain by decades.
Which is bigger: 10 years or 0.01°?
(If you provided an answer to my rhetorical question, would your answer
change if I made the comparison to 1 decade and 36 arc-secs?)
If we don't have the problem of
different units for *x* and *y*, one reasonable compromise would
be to count as "error" the shortest distance from the data point to the line.
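(For concreteness: for a line *y* = *mx* + *b*, that shortest distance from the point (*xᵢ*, *yᵢ*) works out to be the vertical deviation reduced by a slope-dependent factor: *dᵢ* = |*yᵢ* − (*m xᵢ* + *b*)| / √(1 + *m*²).)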

Thus one might distinguish a long list of possible fitting methods:

- "ordinary" least squares (where "error" is just the
*y*deviation) - least squares, but where "error" is just the
*x*deviation - the "average" of the above two: the line that bisects the above two lines
(this only makes sense if
*x*and*y*have the same units) - least squares, but where "error" is the shortest distance between the point and
the line (this only makes sense if
*x*and*y*have the same units) - "ordinary" least absolute deviation (where we minimize the sum
of the absolute value of the "error", i.e., the
*y*deviation) - least absolute deviation where "error" is just the
*x*deviation - the "average" of the above two

Click here to do all the above fits and display four trendlines.
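As a rough offline sketch of several of these methods (assuming Python with numpy/scipy; method numbers refer to the list above), one might write:

```python
import numpy as np
from scipy.optimize import minimize

# Cassini's data: year (negative = B.C.) and obliquity (degrees).
x = np.array([-140, 140, 390, 880, 1070, 1300, 1460, 1500, 1500,
              1570, 1570, 1600, 1656, 1672, 1738], dtype=float)
y = np.array([23.853, 23.856, 23.500, 23.583, 23.567, 23.533, 23.500,
              23.473, 23.488, 23.499, 23.525, 23.517, 23.484, 23.482, 23.472])

# Method 1: ordinary least squares ("error" = y deviation).
m1, b1 = np.polyfit(x, y, 1)

# Method 2: "error" = x deviation: regress x on y (x = m'y + b'),
# then invert to get y = (x - b')/m'.
mp, bp = np.polyfit(y, x, 1)
m2, b2 = 1.0 / mp, -bp / mp

# Method 4: "error" = perpendicular distance (note: x and y have
# different units here, so this one is for illustration only).
def perp_ss(p):
    m, b = p
    return np.sum((y - (m * x + b))**2) / (1 + m**2)
m4, b4 = minimize(perp_ss, x0=[m1, b1]).x

# Method 5: least absolute deviation ("error" = y deviation).
def abs_dev(p):
    m, b = p
    return np.sum(np.abs(y - (m * x + b)))
m5, b5 = minimize(abs_dev, x0=[m1, b1], method='Nelder-Mead').x

for name, m, b in [("OLS(y)", m1, b1), ("OLS(x)", m2, b2),
                   ("perpendicular", m4, b4), ("LAD(y)", m5, b5)]:
    print(f"{name:14s} slope = {m:+.3e} deg/yr, intercept = {b:.3f} deg")
```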

Here are PostScript, PDF, and text files showing the results of these fits to Cassini's data. Our current understanding of how obliquity changed (along with Cassini's data) is displayed here. Upshot: the slope is shallower than all of the fits suggest, but essentially within the errors of OLS(*y*).

In more recent history (ApJ **364** p.104 1990 & **397** p.55 1992),
statisticians and astronomers have provided detailed guidance as to which
method to choose, which I much oversimplify below:

- if "most" of the "error" is in one quantity, put that quantity on the
*y*-axis and use ordinary least squares - if the two quantities seem equally (or unknowably) prone to "error" and have the same units use 3
- if the methods deviate "significantly" understand why
- if possible know your measurement errors

In discussing orbit calculations for the comet of 1680, Newton caught the gist of the truth. He wrote:

"From all this it is plain that these observations agree with theory, so far as they agree with one another."If different folks (correctly) measure the same quantity and get different results, the (standard) deviation of their results tells us the accuracy of the measurement. If theory lies within a few standard deviations of all the measurements, we must count that as a confirmation of theory.

Now we come to the actual use Cassini made of his data. He argued (1) from
his own experience measuring obliquity (in 1656) and (2) from the near
consistency of measurements made at nearly the same time, that the measurement
errors for obliquity were in fact much smaller than the deviation between the
points and the line. Thus, while the line has a very small *P*, it misses
the data points by unbelievably large amounts: many times the measurement errors.
Hence, he argued, the obliquity is not decreasing uniformly.

The main point here is that without some idea of measurement errors, we cannot
say whether or not the line comes adequately close to the data points.
A small *P* is no guarantee that a least square line represents the actual
behavior.

Unfortunately the best way to know your measurement errors is to repeat the experiment 10 times and take the standard deviation of the resulting measurements. Thus experiments that properly tell the uncertainty of the results are 10 times harder than those that don't. Hence the old statement that physicists are judged more for their uncertainties than for their values.

Equipment manufacturers' specifications often report expected errors of measurements; these calibration errors are almost never relevant to the statistical measurement errors discussed here. Sorry: no shortcuts in real life. Elementary lab periods rarely have the time to allow 10 repeats, so they often use extremely crude estimates of measurement errors (like equipment manufacturers' specifications). Sorry: follow the methods your physics instructor uses when he actually does physics, not what he preaches in introductory lab.

So if we really want to get a meaningful *P*, in addition to our
(*xᵢ*, *yᵢ*) data pairs we need to know the measurement error *σᵢ* of each
point, so that each "error" can be judged in units of its expected size
(i.e., we minimize the sum of the squares of "error"ᵢ/*σᵢ*).
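A sketch of such a weighted fit (assuming Python with numpy/scipy, an assumed 0.01° error on every point, and taking *P* to be the usual chi-square tail probability):

```python
import numpy as np
from scipy.stats import chi2

# Subset of Cassini's data with an assumed error of 0.01 deg per point.
x = np.array([-140, 140, 390, 1070, 1460, 1656, 1738], dtype=float)
y = np.array([23.853, 23.856, 23.500, 23.567, 23.500, 23.484, 23.472])
sigma = np.full_like(y, 0.01)

# Weighted least squares: np.polyfit takes weights w_i = 1/sigma_i.
m, b = np.polyfit(x, y, 1, w=1/sigma)

# Chi-square: squared "errors" measured in units of the measurement error.
chisq = np.sum(((y - (m * x + b)) / sigma)**2)
dof = len(x) - 2  # two fitted parameters: slope and intercept
P = chi2.sf(chisq, dof)  # probability of so large a chi-square by chance
print(f"chi-square = {chisq:.1f} for {dof} dof, P = {P:.2g}")
```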

Click here for more on this sort of fit

Finally, if we really want to do a good job, we should recognize that there are
errors in both *x* and *y*.
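One modern way to handle errors in both coordinates is orthogonal distance regression; here is a sketch (with entirely assumed error bars, assuming Python with scipy.odr):

```python
import numpy as np
from scipy.odr import ODR, Model, RealData

def line(beta, x):
    """Straight line: beta[0] is the slope, beta[1] the intercept."""
    return beta[0] * x + beta[1]

# Subset of Cassini's data with assumed errors: decades in the early
# dates (sx), about 0.01 deg in each obliquity (sy).
x = np.array([-140, 140, 390, 1070, 1460, 1656, 1738], dtype=float)
y = np.array([23.853, 23.856, 23.500, 23.567, 23.500, 23.484, 23.472])
sx = np.array([30, 30, 30, 10, 5, 1, 1], dtype=float)
sy = np.full_like(y, 0.01)

# ODR minimizes a weighted sum of squared x and y deviations.
data = RealData(x, y, sx=sx, sy=sy)
fit = ODR(data, Model(line), beta0=[-2e-4, 23.8]).run()
print(f"slope = {fit.beta[0]:.3e} deg/yr, intercept = {fit.beta[1]:.3f} deg")
```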