Abstract
We present empirical data on the frequency and pattern of misprints in citations to twelve high-profile papers. We find that the distribution of misprints, ranked by frequency of their repetition, follows Zipf’s law. We propose a stochastic model of the citation process that explains these findings and leads to the conclusion that about 70-90% of scientific citations are copied from the lists of references used in other papers.
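The copying mechanism summarized above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' exact model: the function name `simulate_misprints` and the parameters `p_copy` (probability that a citer copies the reference from a randomly chosen earlier citing paper) and `p_new_misprint` (probability that a citer introduces a fresh misprint) are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def simulate_misprints(n_citations, p_copy, p_new_misprint, seed=0):
    """Simulate citations to one paper and return misprint repetition counts.

    Each citation is stored as 0 (correct) or as a positive integer that
    identifies one distinct misprint. The return value lists how many times
    each distinct misprint occurs, sorted from most to least frequent.
    Illustrative assumption only; not the model specified in the paper.
    """
    rng = random.Random(seed)
    citations = [0]  # assume the first citer cites correctly
    next_id = 1
    for _ in range(n_citations - 1):
        if rng.random() < p_copy:
            # Copy the reference, misprint included, from a random earlier citer.
            entry = rng.choice(citations)
        else:
            # Take the reference from the original paper.
            entry = 0
        if rng.random() < p_new_misprint:
            # The citer makes a new, previously unseen misprint.
            entry = next_id
            next_id += 1
        citations.append(entry)
    counts = Counter(c for c in citations if c != 0)
    return sorted(counts.values(), reverse=True)


if __name__ == "__main__":
    ranked = simulate_misprints(5000, p_copy=0.8, p_new_misprint=0.02, seed=1)
    # Print rank and repetition count for the ten most frequent misprints.
    for rank, count in enumerate(ranked[:10], start=1):
        print(rank, count)
```

Plotting the returned counts against rank on log-log axes gives a quick visual check of whether the simulated distribution is roughly Zipf-like. When `p_copy` is zero, no misprint can be repeated, so every count equals one; repeated misprints arise only through copying.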
References
D. De S. Price, Networks of scientific papers, Science, 149 (1965) 510.
J. R. Cole, S. Cole, The Ortega Hypothesis, Science, 178 (1972) 368.
D. De S. Price, A general theory of bibliometric and other cumulative advantage processes, Journal of the American Society for Information Science, 27 (1976) 292 (This is a pioneering paper on the subject, but, unfortunately, it contains mathematical inaccuracies starting with Eq. (6). We would recommend Ref. 18 to get familiar with the mathematical treatment of cumulative advantage models).
E. Garfield, Citation Indexing, John Wiley, New York, 1979.
L. Egghe, R. Rousseau, Introduction to Informetrics: Quantitative Methods in Library, Documentation and Information Science, Elsevier, Amsterdam, 1990.
Z. K. Silagadze, Citations and Zipf-Mandelbrot law, Complex Systems, 11 (1997) 487; http://arxiv.org/abs/physics/9901035.
S. Redner, How popular is your paper? An empirical study of citation distribution, Eur. Phys. J. B, 4 (1998) 131; http://arxiv.org/abs/cond-mat/9804163.
A. Vazquez, Statistics of citation networks, http://arxiv.org/abs/cond-mat/0105031.
M. V. Simkin, V. P. Roychowdhury, Read before you cite!, http://arxiv.org/abs/cond-mat/0212043; Complex Systems, 14 (2003) 269.
See, for example, the discussion “Scientists Don’t Read the Papers They Cite” on Slashdot: http://science.slashdot.org/article.pl?sid=02/12/14/0115243&mode=thread&tid=134
R. N. Broadus, An investigation of the validity of bibliographic citations, Journal of the American Society for Information Science, 34 (1983) 132.
H. F. Moed, M. Vriens, Possible inaccuracies occurring in citation analysis, Journal of Information Science, 15 (1989) 95.
H. L. Hoerman, C. E. Nowicke, Secondary and tertiary citing: A study of referencing behaviour in the literature of citation analyses deriving from the Ortega Hypothesis of Cole and Cole, Library Quarterly, 65 (1995) 415.
E. Garfield, Journal editors awaken to the impact of citation errors. How we control them at ISI, Essays of an Information Scientist, 13 (1990) 367.
S. Freud, Zur Psychopathologie des Alltagslebens, (1901).
G. K. Zipf, Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology, Addison-Wesley, Cambridge, MA, 1949.
H. A. Simon, Models of Man, Wiley, New York, 1957.
P. L. Krapivsky, S. Redner, Organization of growing random networks, Phys. Rev. E, 63 (2001) 066123; http://arxiv.org/abs/cond-mat/0011094.
P. L. Krapivsky, S. Redner, Finiteness and fluctuations in growing networks, J. Phys. A, 35 (2002) 9517; http://arxiv.org/abs/cond-mat/0207107.
W. H. Press, B. P. Flannery, S. A. Teukolsky, W. T. Vetterling, Numerical Recipes in FORTRAN: The Art of Scientific Computing, Cambridge University Press, Cambridge, 1992 (see Chapt. 14.3, pp. 617–620). Also available online: http://lib-www.lanl.gov/numerical/bookfpdf/f14-3.pdf
B. Simboli, http://listserv.nd.edu/cgi-bin/wa?A2=ind0305&L=pamnet&P=R2083
A. Smith, Erroneous error correction, New Library World, 84 (1983) 198.
SPIRES (http://www.slac.stanford.edu/spires/) data, compiled by H. Galic, and made available by S. Redner: http://physics.bu.edu/~redner/projects/citation
C. M. Steel, Read before you cite, The Lancet, 348 (1996) 144.
J. Kåhre, The Mathematical Theory of Information, Kluwer, Boston, 2002.
R. K. Merton, The Matthew Effect in science, Science, 159 (1968) 56.
R. Albert, A.-L. Barabási, Statistical mechanics of complex networks, Rev. Mod. Phys., 74 (2002) 47.
J. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, The Web as a Graph: Measurements, Models and Methods, Lecture Notes in Computer Science, vol. 1627, Springer-Verlag, Berlin, 1999.
S. N. Dorogovtsev, J. F. F. Mendes, Accelerated growth of networks, (see Chapt. 0.6.3) http://arxiv.org/abs/cond-mat/0204102.
A. Vazquez, Knowing a network by walking on it: emergence of scaling, http://arxiv.org/abs/cond-mat/0006132; Europhys. Lett., 54 (2001) 430.
M. V. Simkin, V. P. Roychowdhury, Copied citations create renowned papers?, http://arxiv.org/abs/cond-mat/0305150, to appear in Annals of Improbable Research.
R. A. Bentley, H. D. G. Maschner, A growing network of ideas, Fractals, 8 (2000) 227.
M. W. Hahn, R. A. Bentley, Drift as a mechanism for cultural change: an example from baby names, Proc. R. Soc. Lond. B (Suppl.), Biology Letters, DOI 10.1098/rsbl.2003.0045.
S. Turner, D. E. Chubin, Another appraisal of Ortega, the Coles, and science policy: Ecclesiastes hypothesis, Social Science Information, 15 (1976) 657.
Cite this article
Simkin, M., Roychowdhury, V. Stochastic modeling of citation slips. Scientometrics 62, 367–384 (2005). https://doi.org/10.1007/s11192-005-0028-2
DOI: https://doi.org/10.1007/s11192-005-0028-2