More than 100 years ago it was predicted that the first digits of real-world observations would not be uniformly distributed, but would instead follow a trend in which measurements with lower first digits (1, 2, ...) occur more frequently than those with higher first digits (..., 8, 9). The idea was first described in 1881 by the astronomer Simon Newcomb, who noticed that the pages of logarithm tables were more thumbed for low digits than for high ones. He argued that this was because scientists more often needed to look up logarithms of numbers with a small first digit than with a large one, and he produced a mathematical formula predicting the distribution of first digits. The result was long regarded as a mere mathematical curiosity and largely ignored across the sciences. It was rediscovered in 1938 by an engineer called Benford and, despite a subsequent waning of interest, it is his name that is now associated with the first digit law.
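For reference, the first digit law is usually written as P(d) = log10(1 + 1/d) for d = 1, ..., 9. The short Python sketch below tabulates those predicted frequencies; the names `benford_probability` and `benford_table` are ours, chosen only for illustration.

```python
import math


def benford_probability(d: int) -> float:
    """Predicted frequency of leading digit d (1-9) under the first digit law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Tabulate the predicted frequency of each possible leading digit.
benford_table = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford_table.items():
    print(f"{d}: {p:.3f}")
```

Under this formula about 30.1% of values are expected to begin with 1, and only about 4.6% with 9.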
Our new study shows that Benford's first digit rule is a natural phenomenon that is likely to hold universally. We test 15 sets of modern observations drawn from the fields of physics, astronomy, geophysics, chemistry, engineering and mathematics, and show that Benford's law holds for them all. The data sets used in our study comprise more than 750 000 values, which vary over 19 orders of magnitude and differ in origin, type and physical dimension. They include the rotation frequencies of pulsars, greenhouse gas emissions, the masses of exoplanets, and the numbers of infectious diseases reported to the World Health Organization.
Figure 1 shows the occurrence frequencies of first digits predicted by Benford's law, together with the digit distributions of three of our data sets. A particular focus of our study has been Earth science observations, and here we show that the first digit rule applies to the strength and timing of reversals of the Earth's geomagnetic field, to seismic tomographic models of the Earth's elastic properties, and to the depth distribution of earthquakes.
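A comparison of this kind can be sketched in a few lines of Python. The example below does not use any of the study's data sets: it draws a synthetic sample spread evenly in logarithm over six orders of magnitude (an assumption made purely for illustration) and prints the observed first digit frequencies next to the predicted ones. The helper names `first_digit` and `digit_frequencies` are hypothetical.

```python
import math
import random
from collections import Counter


def first_digit(x: float) -> int:
    """Leading non-zero digit of a non-zero number, ignoring sign and scale."""
    # Scientific notation puts the leading digit first, e.g. '3.47...e-03'.
    return int(f"{abs(x):.16e}"[0])


def digit_frequencies(values):
    """Observed relative frequency of each leading digit 1-9 (zeros are skipped)."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


# Synthetic stand-in for a data set with large dynamic range (not real data).
rng = random.Random(0)
samples = [10 ** rng.uniform(0, 6) for _ in range(20000)]

observed = digit_frequencies(samples)
for d in range(1, 10):
    predicted = math.log10(1 + 1 / d)
    print(f"{d}: observed {observed[d]:.3f}  predicted {predicted:.3f}")
```

Replacing `samples` with any list of measured values gives the corresponding comparison for real observations.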
Our results suggest that Benford's law is a universal feature of data sets with sufficient dynamic range, which raises the question of how it might be exploited. Use in a forensic mode, e.g. to detect fraud or rounding errors, is possible simply by looking for departures from the predicted frequencies of individual digits, and there have been previous applications of this type to financial data. A more intriguing question is whether the law can be used to distinguish signals from background noise, e.g. in time series data such as seismic signals.
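One common way to quantify such departures is a chi-square goodness-of-fit statistic on the nine digit counts. The sketch below is an illustration under that assumption, not a procedure taken from the study. The two test sequences are hypothetical: successive powers of 2, which are well known to follow the first digit law closely, and the evenly spread values 100 to 999, standing in for figures whose first digits are uniform.

```python
import math
from collections import Counter

# Tabulated upper 5% point of the chi-square distribution, 8 degrees of freedom.
CHI2_CRITICAL_5PCT_DF8 = 15.507


def first_digit(x: float) -> int:
    """Leading non-zero digit of a non-zero number, ignoring sign and scale."""
    return int(f"{abs(x):.16e}"[0])


def benford_chi_square(values) -> float:
    """Chi-square misfit between observed digit counts and the predicted counts."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    n = sum(counts.values())
    statistic = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic


conforming = [2.0 ** k for k in range(1000)]   # spans about 300 orders of magnitude
suspicious = list(range(100, 1000))            # every first digit equally common

for name, data in (("conforming", conforming), ("suspicious", suspicious)):
    stat = benford_chi_square(data)
    flagged = stat > CHI2_CRITICAL_5PCT_DF8
    print(f"{name}: chi-square = {stat:.1f}, flagged = {flagged}")
```

A large statistic only indicates that the digits depart from the prediction; it does not by itself say why, so a flagged data set is a prompt for closer inspection rather than proof of fraud or error.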
Figure 2 shows an example of how seismic energy from an earthquake follows Benford's law, which means that earthquakes can be detected automatically from just the first digit distribution of displacement counts on a seismometer. Our study led to the first ever detection of an anomalous seismic disturbance (assumed to be a small earthquake local to Canberra) using first digit information alone.
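The idea can be illustrated with a toy calculation, which should not be read as the detection procedure used in the study. The sketch builds a synthetic trace (background counts with little dynamic range, followed by a decaying oscillation that sweeps through about three orders of magnitude), splits it into fixed windows, and measures in each window how far the first digit frequencies lie from the predicted ones. The trace, the window length and the sum-of-squares misfit are all assumptions made for illustration.

```python
import math
import random

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x: float) -> int:
    """Leading non-zero digit of a non-zero number, ignoring sign and scale."""
    return int(f"{abs(x):.16e}"[0])


def benford_misfit(window) -> float:
    """Sum of squared differences between observed and predicted digit frequencies."""
    digits = [first_digit(v) for v in window if v != 0]
    if not digits:
        return float("inf")
    n = len(digits)
    return sum((digits.count(d) / n - BENFORD[d - 1]) ** 2 for d in range(1, 10))


# Synthetic seismometer counts (hypothetical, not recorded data).
rng = random.Random(1)
n_samples, onset, window = 2000, 800, 400
trace = []
for i in range(n_samples):
    count = 50 + rng.gauss(0, 3)  # background: narrow range of values
    if i >= onset:
        t = i - onset
        # Decaying oscillation standing in for the arrival of seismic energy.
        count += 1e6 * math.exp(-t / 58) * math.sin(0.7 * t)
    trace.append(round(count))

starts = list(range(0, n_samples - window + 1, window))
misfits = [benford_misfit(trace[s:s + window]) for s in starts]

# The window whose digits lie closest to the prediction marks the candidate event.
best_start = starts[misfits.index(min(misfits))]
for s, m in zip(starts, misfits):
    print(f"samples {s}-{s + window - 1}: misfit = {m:.3f}")
print("best-fitting window starts at sample", best_start)
```

In this toy trace the background windows contain almost only the digits 4 and 5 and so fit the prediction poorly, while the window containing the event fits it well. Real seismic noise is less tidy, so the window length and any detection threshold would need to be tuned to the instrument.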
We have also extended the mathematical description of Benford's law to account for situations where the range of observables is arbitrary. As awareness of this phenomenon grows across the natural sciences, we expect new applications to appear. One possibility is in checking the realism of computer simulations of complex physical processes, such as those in the climate or oceans: if a natural process is known to possess the first digit property, then any computer simulation of that process should possess it too. Another is in the detection of rounding errors or other anomalous signals in data. We hope this work will encourage others to look at their digits more closely.