ABS: Rousseau, LOTKA: A program to fit a power law distribution to observed frequency data

Mark Newman mark at SANTAFE.EDU
Fri Jan 26 15:31:24 EST 2001

> Eric Archambault wrote:
> My contribution aimed to inform the people on the list on an additional way to calculate power law
> distributions. I believe that Rousseau's contribution is useful since it uses a maximum likelihood
> approach. It would be interesting to compare the extent of the difference between this
> method and a least-squares fit.

As has been pointed out by many people before (though perhaps
not on this list), performing least-squares fits to data in
order to extract a power-law exponent is fraught with danger.  The principal
objection to this method is that, with logarithmic fits, the
statistical fluctuations in the logarithms of the data are
greater in the downward direction than in the upward one, for
obvious reasons.  This effect is more pronounced in the tail
of the power law, and this has the result that there is a
systematic tendency for least-squares fits to overestimate the
slope of the power law.  How much they overestimate depends on
the size of the statistical fluctuations, and is therefore
rather hard to control for.  For this reason simple least-squares
fits are to be avoided.
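As a sketch of the problem, one can draw samples from a continuous power law with a known exponent (here 2.5, an assumed example value, generated by inverse-transform sampling), histogram them, and do a naive least-squares fit in log-log space; the fitted slope generally differs from the true exponent, with a bias that depends on the fluctuations, just as described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Inverse-transform sampling from p(x) ~ x^(-alpha) for x >= 1.
# alpha = 2.5 is an illustrative choice, not anything from the original post.
alpha = 2.5
n = 10_000
x = (1 - rng.random(n)) ** (-1 / (alpha - 1))

# Naive approach: ordinary histogram, then least squares on the log-log plot.
counts, edges = np.histogram(x, bins=50)
centers = 0.5 * (edges[:-1] + edges[1:])
mask = counts > 0  # zero-count bins have no logarithm and must be dropped
slope, intercept = np.polyfit(np.log(centers[mask]), np.log(counts[mask]), 1)

print(f"true exponent: {-alpha:.2f}, least-squares slope: {slope:.2f}")
```

Because the size of the discrepancy depends on the sample size and binning, no single run pins it down; re-running with different seeds shows how unstable the naive estimate is.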

Two common methods are used to circumvent this problem, neither
of which is perfect: (1) One calculates a backward cumulated
histogram of one's data (also called a rank/frequency plot).
This much improves the statistical fluctuations, but has the
undesirable property that successive data points become
correlated, making the simple statistical estimate of error
on the fit invalid.  (2) One performs logarithmic binning of
the data, i.e., binning where the widths of adjacent bins are
a constant ratio, and normalizes by bin width.  This reduces
the effects of the fluctuations, but for power laws with slope
greater than -1 it does not eliminate them altogether.  (This
latter is my favored method.)
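Method (2) above can be sketched in a few lines: bins whose edges grow by a constant ratio, with counts normalized by bin width (and sample size) to give a density estimate. The function name and bin count are illustrative choices:

```python
import numpy as np

def log_binned_density(x, nbins=20):
    """Histogram data into logarithmic bins (adjacent bin widths in a
    constant ratio) and normalize counts by bin width to estimate p(x)."""
    x = np.asarray(x, dtype=float)
    edges = np.logspace(np.log10(x.min()), np.log10(x.max()), nbins + 1)
    counts, _ = np.histogram(x, bins=edges)
    widths = np.diff(edges)
    centers = np.sqrt(edges[:-1] * edges[1:])   # geometric bin centers
    density = counts / (widths * len(x))        # normalize by width and N
    return centers, density

# Demo on synthetic power-law data (exponent 2.5, an assumed example value)
rng = np.random.default_rng(1)
samples = (1 - rng.random(5_000)) ** (-1 / 1.5)
centers, density = log_binned_density(samples)
```

A least-squares line through the nonzero bins of (log centers, log density) then estimates the exponent, with much smaller tail fluctuations than linear binning gives.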

Ronald Rousseau proposes a further method based on maximization
of likelihood.  This is also a good method to use, but is also
not perfect since, like all maximum likelihood methods, it
implicitly assumes that the probability of the model given the
data is equal to the probability of the data given the model,
which is only strictly true if the prior probabilities of both
model and data are uniform, which in general they are not.
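Rousseau's LOTKA program addresses the discrete (Lotka) distribution; for the continuous case the maximum-likelihood estimator has a simple closed form, shown here as an illustrative sketch (the exponent 2.5 and cutoff xmin = 1 are assumed example values, not anything from LOTKA itself):

```python
import numpy as np

def mle_exponent(x, xmin=1.0):
    """Closed-form maximum-likelihood estimate of alpha for a continuous
    power law p(x) ~ x^(-alpha), valid for x >= xmin."""
    x = np.asarray(x, dtype=float)
    x = x[x >= xmin]
    return 1.0 + len(x) / np.log(x / xmin).sum()

# Check against synthetic data with a known exponent (alpha = 2.5, assumed)
rng = np.random.default_rng(2)
samples = (1 - rng.random(10_000)) ** (-1 / 1.5)  # inverse-transform sample
alpha_hat = mle_exponent(samples)
```

Unlike a least-squares fit, this estimator involves no binning at all, which is one reason likelihood-based methods avoid the fluctuation problems described earlier in the post.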

The ultimate correct way of doing it is to use maximum entropy,
given the correct prior on the model.  The trouble is, we
rarely know what the correct prior is, which is why maximum
likelihood is popular.

Mark Newman.

Prof. M. E. J. Newman
Santa Fe Institute
Santa Fe, New Mexico

More information about the SIGMETRICS mailing list