Destruction of the Normal Paradigm

Mon Sep 27 15:07:52 EDT 2010

Quentin,

Thank you for the compliment.  It is a relief to receive it from a
statistician of your caliber.  I am not a statistician but a historian
writing a history of statistics.

I want you to know that I have your work scheduled for about a half chapter
in the book.  You will be there after the breakthrough made by D. J.
Urquhart and his son and daughter-in-law, John and Norma, who first
developed the compound Poisson or negative binomial as the model for the
library use of scientific journals in a study of usage at the NLST.
I'll cover your presentation to the Royal Statistical Society and
subsequent dispute in J. Doc.  If you are interested, I have already written
most of it, and it is posted on the Garfield Web site at the following
URL:

http://www.garfield.library.upenn.edu/bensman/bensman.html

The Urquhart part is covered best in the article entitled "Urquhart's
Law," and your work and that of Jean Teague are discussed in the article
entitled "Scientific and Technical Serials Holdings Observations."   I
did not know about Urquhart when I wrote the latter.  In the book I will
be putting all the material together.  If you want, you can go over what I
wrote about your work, and I will incorporate your suggestions and
objections.  There is no hurry, because I am months away from your
chapter.

Again, thank you for the compliment.

Stephen J. Bensman

LSU Libraries

Louisiana State University

Baton Rouge, LA   70803

USA

notsjb at lsu.edu

From: ASIS&T Special Interest Group on Metrics
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of Quentin Burrell
Sent: Monday, September 27, 2010 1:10 PM
To: SIGMETRICS at LISTSERV.UTK.EDU
Subject: Re: [SIGMETRICS] Destruction of the Normal Paradigm

Stephen

Thanks for this. 

Throughout my teaching career, I always referred to the Normal
distribution - the capital letter signifying that it is a technical term
rather than its common usage interpretation. I just wish that this was
more widely adopted.

I recall my first interview for an academic post, in which I was asked
"Mathematicians believe in the Normal distribution because it is an
established physical fact; Physicists believe in the Normal distribution
because it is an established mathematical theorem. What do you think?"

(I think that the interviewer did give an attribution for the source of
the question but I was too nervous to remember it. Maybe someone can
enlighten me after all these years?)

BW

Quentin

On 27 Sep 2010, at 17:14, Stephen J Bensman wrote:

End of Chapter 5 of my book.  I am really having fun with these things.

Stephen J. Bensman

LSU Libraries

Louisiana State University

With his memoirs on skew variation in homogeneous materials Pearson
accomplished the destruction of the normal paradigm.   In his obituary
of Pearson, Yule (1936) declared, "I should count it one of Pearson's
greatest contributions in this field...that he enforced attention to the
extraordinary variety of distributions met with in practice,
illustrating the thesis with example on example and creating in this way
little less than a revolution in the outlook of the ordinary
statistician" (p. 81).   Although Pearson's system of curves is seldom
used today, Eisenhart (1974, p. 451) noted that these curves played an
important role in the development of statistical theory and practice
with the discovery that the sampling distributions of many statistical
test functions appropriate to analyses of small samples from normal,
binomial, and Poisson distributions, such as chi-squared and t, are
represented by particular families of Pearson curves either directly or
through simple transformation.  Moreover, according to Eisenhart, the
fitting of Pearson curves to observational data was extensively
practiced by biologists and social scientists in the decades that
followed these memoirs, and he observed, "The results did much to dispel
the almost religious acceptance of the normal distribution as the
mathematical model of variation of biological, physical, and social
phenomena" (p. 451).

From the perspective of this book's topic, one of the most
interesting, if controversial, examples of this is the work of Cyril
Burt on the distribution of human intelligence.  Burt (Mazumdar, 2004;
Mcloughlin, 2000; Vernon, 2001) was the preeminent British professional
psychologist from 1930 to 1950, being knighted in
1946.  He worked for the London County Council as Britain's first
educational psychologist.  Burt stemmed from the same intellectual
tradition as Galton and Pearson, being a member of the Eugenics Society,
and in 1932 he succeeded Spearman as professor of psychology at
University College London, continuing that institution's Galton-Pearson
statistical tradition and pioneering the integration of biometric
techniques into psychology.  Like Galton, Burt believed that IQ was a
function of nature, not nurture.  In a paper on the distribution of
intelligence Burt (1957) defined intelligence "in the technical sense
given to it, explicitly or implicitly, in the work of Spencer, Galton,
Binet, and their followers, namely, 'the innate general factor
underlying all cognitive activities'" (p. 173), and he hypothesized that
it should follow a moderately asymmetrical distribution.  The reason for
this was that he postulated this type of distribution as a function of
two possible genetic modes of inheritance: 1) in certain cases, the
deviation studied may act as a recognizable trait dependent on a single,
major gene; and 2) in other cases, it is apparently determined by the
joint action of a large number of genes.   He summarized the effect of
these two interacting genetic modes in the following manner:

   ...The 'major genes' seem comparatively rare, but each will produce
   effects that are large and...for the most part detrimental; the
   'polygenes' must be much more numerous, but their effects will be too
   small to be identified individually.  With this double assumption, the
   resulting distribution would take the form, not of the normal curve,
   but of an asymmetrical, bell-shaped curve of unlimited range in either
   direction.  p. 166.

Burt identified this curve with the Pearson Type IV, citing Pearson's
first memoir on skew variation to the effect that this was the type
prevailing in zoological and anthropological material.  In this memoir
Pearson (1895)
described the Type IV as having "Unlimited range in both directions and
skewness" (p. 360) and speculated that the reason for its prevalence in
zoological measures was due to the "inter-dependence of the
'contributory' causes" (p. 412).  Burt (1957) found that empirical
evidence derived from intelligence tests produced statistical constants
implying curves, slightly leptokurtic and negatively
asymmetric, consistent with his twofold genetic hypothesis.  In a
follow-up paper Burt (1963) tested frequency distributions obtained from
applying IQ tests to large samples of the school population and found
that the distributions actually observed were more asymmetrical with
longer tails than predicted by the normal curve.  The best fit to the
data was the Pearson Type IV.  According to him, the assumption of
normality led to a gross underestimate of the number of highly gifted
individuals in England and Wales: 31.7 persons with IQs above 160
predicted by the normal curve as against the 342.3 such persons
predicted by the Type IV.  
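
The normal-curve figure quoted above can be checked with a short
calculation.  The sketch below assumes an IQ scale with mean 100 and
standard deviation 15 and a base of one million persons; neither
assumption is stated in the text above, but together they reproduce the
31.7 figure.  It does not attempt the Type IV fit, which would need
Burt's fitted constants.

```python
import math

def normal_upper_tail(x, mean, sd):
    """P(X > x) for a normal distribution, via the complementary error function."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Assumed scale (not stated in the text): mean 100, SD 15, per million persons.
tail = normal_upper_tail(160, mean=100, sd=15)
per_million = tail * 1_000_000
print(f"Expected persons above IQ 160 per million: {per_million:.1f}")
```

On the same assumed base, the 342.3 persons predicted by the Type IV is
roughly eleven times the normal figure.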

After his death in 1971 Burt came under assault for shoddy
research methods, falsification of data, and supposedly fictitious
research assistants.  As a result of these attacks, the British
Psychological Society Council found that Burt was a "scientific fraud"
in 1979.  The assault produced a reaction, whereby the original
assaulters themselves came under assault.  Mazumdar (2004, p. 6) points
out that it is difficult to separate the question of Burt's science from
politics.  Burt was formed in an era when hereditarianism and eugenics
were the norm, and in the egalitarian atmosphere of post-war Britain
such views were considered antiquated and unjust.  The assault on Burt
was led largely by psychologists who were passionate environmentalists.
As a result of the battle, Burt was partially rehabilitated.  In 1992
the British Psychological Society Council (1992) resolved that no
universally accepted agreement was possible on this matter, declaring,
"The British Psychological Society no longer has a corporate view on the
truth of the allegations concerning Burt" (p. 147).  In a book of
readings on the measurement of intelligence Eysenck (1973), himself a
highly influential but controversial British psychologist of German
descent, included Burt's 1963 article proving that the Pearson Type IV
best fitted the distribution of human intelligence, stating that Burt's
view on the applicability of the Pearson Type IV to the distribution of
IQ is "probably correct" (p. 37).  Evaluating Burt's findings, he stated
that the normal and Type IV curves are not very dissimilar, but, as Burt
pointed out, there are marked differences at the extremes.  Referring to
these marked differences at the extremes, Eysenck stated that "from the
social point of view these may be very important indeed" (p. 37).  For
example, at the upper IQ extreme they increase the probability of
persons capable of doing high-level science in the population.  Eysenck's book
has the following dedication:  "To the Memory of Cyril Burt, who taught
me."    

Pearson did not entirely dethrone the normal distribution, which still
plays a central role in statistical theory.  Snedecor and Cochran (1989,
p. 40) list four reasons for this.  First, the distributions of many
variables such as heights of people, the lengths of ears of corn, and
many linear dimensions of manufactured articles are approximately
normal.  These authors state that in fact any variable whose expression
results from the additive contributions of many small effects will tend
to be normally distributed.  The second reason listed by Snedecor and
Cochran is that for measurements whose distributions are not normal, a
simple transformation of the scale of measurement may induce approximate
normality.  Two such transformations, the square root and the
logarithmic, are indicated by them as being often employed.  According to
Elliott (1977, p. 33), the Poisson is made to approximate normality by
the square root transformation, whereas most distributions in
scientometrics and information science require some form of the
logarithmic transformation, converting them into the lognormal
distribution.  The third reason listed by Snedecor and Cochran is that the
normal distribution is relatively easy to work with mathematically, and
their fourth reason is that even if the distribution in the original
population is far from normal, the distribution of sample means tends to
become normal under random sampling as the size of the sample increases.
This contradiction between the importance of the normal distribution in
statistical theory and its relative infrequency in reality creates a
tension, which caused Geary (1947) to emphasize the importance of
testing for normality and to recommend that the following warning be
printed in bold type in every statistical textbook: "Normality is a
myth; there never was, and never will be, a normal distribution" (p.
241).  The tension between importance and infrequency caused George Box
(1976), R.A. Fisher's son-in-law, to compare the role of the normal
distribution in statistics to the general role of the mathematical model
in science as a whole thus:

   In applying mathematics to subjects such as physics or statistics we
   make tentative assumptions about the real world which we know are
   false but which we believe may be useful nonetheless.  The physicist
   knows that particles have mass and yet certain results, approximating
   what really happens, may be derived from the assumption that they
   do not.  Equally, the statistician knows, for example, that in nature
   there never was a normal distribution, there never was a straight line,
   yet with normal and linear assumptions, known to be false, he can
   often derive results which match, to a useful approximation, those
   found in the real world.  p. 792.

What Pearson accomplished can be easily deduced from the above.  By
proving that most reality is not random and additive but causal and
multiplicative, he converted the normal distribution from a universal
descriptor of reality into a mathematical, mental construct for the
distribution of error, against which to test reality.  Given that much
of reality is multiplicative, whereas error is additive, and that the
logarithmic transformation converts data from multiplicative to
additive, the Galton-McAlister law of the geometric mean, which Pearson
rejected as a descriptor of reality because it was still based upon
Gaussian axioms, now has an important role as a law of error in
statistical tests of significance.
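
The contrast between multiplicative reality and additive error can be
sketched by simulation.  The example below is purely illustrative: the
uniform factors on (0.5, 1.5), the 20 factors per value, the sample
size, and the seed are all arbitrary assumptions, not anything taken
from Pearson or Galton.  It builds each value as a product of
independent positive factors and compares the sample skewness before
and after taking logarithms.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

random.seed(0)  # arbitrary seed, for repeatability only

# Each value is the product of 20 independent positive factors (assumed uniform).
products = [
    math.prod(random.uniform(0.5, 1.5) for _ in range(20))
    for _ in range(5000)
]
logs = [math.log(p) for p in products]  # the log turns the product into a sum

skew_raw = skewness(products)
skew_log = skewness(logs)
print(f"skewness of products: {skew_raw:.2f}")
print(f"skewness of logs:     {skew_log:.2f}")
```

Under these assumptions the raw products are strongly right-skewed,
while their logarithms are close to symmetric, since the logarithm of a
product is a sum of many small terms.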
