Random Character Sequences Do Not Follow Zipf's Law

This Demonstration shows that word frequencies in random character sequences and in real texts behave differently with respect to Zipf's law [1]. (In a random character sequence, a "word" is any run of characters delimited by blanks.) Data exhibiting Zipf-like behavior shows a roughly linear relationship between frequency and rank on a log-log plot.
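The comparison described above can be approximated outside the notebook. The following is a minimal Python sketch, not the Demonstration's own Wolfram Language code: it generates a random character sequence, splits it into words at blanks, and computes the rank-frequency list together with a least-squares slope on log-log axes. The function names, the four-letter alphabet, the blank probability of 0.2, and the seed are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def random_text(n_chars, alphabet="abcd", p_blank=0.2, seed=0):
    """Return n_chars random characters; each is a blank with probability p_blank.

    The alphabet, p_blank, and seed are illustrative assumptions,
    not settings taken from the Demonstration.
    """
    rng = random.Random(seed)
    chars = []
    for _ in range(n_chars):
        if rng.random() < p_blank:
            chars.append(" ")
        else:
            chars.append(rng.choice(alphabet))
    return "".join(chars)


def rank_frequency(text):
    """Split on blanks and return word counts sorted from most to least frequent."""
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


text = random_text(20000)
freqs = rank_frequency(text)
slope = loglog_slope(freqs)
print("distinct words:", len(freqs))
print("log-log slope:", round(slope, 3))
```

Passing the text of a real document to `rank_frequency` gives the other half of the comparison. A single fitted slope is only a coarse summary; plotting `freqs` against rank on log-log axes makes any departure from a straight line easier to see. With equally likely letters, words of the same length have the same probability, so the random-text curve tends to show plateaus rather than a smooth line.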
Contributed by: Osman Tuna Gökgöz (May 2010)
Suggested by: Ramon Ferrer i Cancho
Open content licensed under CC BY-NC-SA
Details
[1] R. Ferrer-i-Cancho and B. Elvevåg, "Random Texts Do Not Exhibit the Real Zipf's Law-Like Rank Distribution," PLoS ONE.
[2] W. Li, "Random Texts Exhibit Zipf's-Law-Like Word Frequency Distribution," IEEE Transactions on Information Theory, 38(6), 1992 pp. 1842–1845.
[3] G. Altmann, Comments from Quantitative Linguistics.
Permanent Citation
"Random Character Sequences Do Not Follow Zipf's Law"
http://demonstrations.wolfram.com/RandomCharacterSequencesDoNotFollowZipfsLaw/
Wolfram Demonstrations Project
Published: May 20 2010