Random Character Sequences Do Not Follow Zipf's Law
This Demonstration shows that, from the point of view of Zipf's law, word frequencies in random character sequences behave differently from those in real texts. (For random character sequences, a word means the smallest unit delimited by blanks.) Data exhibiting Zipf-like behavior shows a roughly linear relationship between frequency and rank on a log-log plot.
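As a rough illustration of what "roughly linear on a log-log plot" means, the following Python sketch ranks word counts and estimates the least-squares slope of log frequency against log rank. The function names `rank_frequency` and `loglog_slope` are hypothetical helpers introduced here for illustration; they are not part of the Demonstration, and a single fitted slope is only a crude summary of how a rank distribution behaves.

```python
import math
from collections import Counter


def rank_frequency(words):
    """Return word counts sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(words).values(), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Assumes at least two ranks and strictly positive frequencies.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Idealized Zipf-like data (an assumption for illustration):
# frequency proportional to 1 / rank, so the log-log slope is -1.
ideal = [1000 / rank for rank in range(1, 51)]
print(round(loglog_slope(ideal), 3))
```

Passing the output of `rank_frequency` for a tokenized text to `loglog_slope` gives a first impression of how close that text's rank distribution is to a straight line on log-log axes.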
We consider only one random sequence model: all characters, including the blank (space), are equally likely. This model is specified by a single parameter, the number of characters other than the space. A particular value of this parameter was used in one of the works listed in the references, as mentioned in another of them. In this Demonstration, you can select a value between 2 and 26.
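The random sequence model described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Demonstration's own code: the names `random_text_words` and `rank_frequency`, the use of lowercase Latin letters as the alphabet, the fixed seed, and the choice to discard empty words produced by consecutive spaces are all assumptions made here for illustration.

```python
import random
import string
from collections import Counter


def random_text_words(n_letters, length, seed=0):
    """Draw `length` characters uniformly from n_letters letters plus the
    space, then split the result into words at the spaces."""
    if not 2 <= n_letters <= 26:
        # Range taken from the Demonstration's control; raising an error
        # for other values is a choice made for this sketch.
        raise ValueError("n_letters must be between 2 and 26")
    alphabet = string.ascii_lowercase[:n_letters] + " "
    rng = random.Random(seed)
    text = "".join(rng.choice(alphabet) for _ in range(length))
    # str.split() with no argument drops the empty strings that runs of
    # consecutive spaces would otherwise produce (an assumption here).
    return text.split()


def rank_frequency(words):
    """Return word counts sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(words).values(), reverse=True)


words = random_text_words(n_letters=2, length=20000)
freqs = rank_frequency(words)
for rank, freq in enumerate(freqs[:10], start=1):
    print(rank, freq)
```

Under this model all words of the same length are equally probable, so their counts tend to cluster at similar values; plotting `freqs` against rank on log-log axes is one way to compare the result with the rank distribution of a real text.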
Contributed by: Osman Tuna Gökgöz (May 2010)
Suggested by: Ramon Ferrer i Cancho
Open content licensed under CC BY-NC-SA
R. Ferrer-i-Cancho and B. Elvevåg, "Random Texts Do Not Exhibit the Real Zipf's Law-Like Rank Distribution," PLoS ONE.
W. Li, "Random Texts Exhibit Zipf's-Law-Like Word Frequency Distribution," IEEE Transactions on Information Theory, 38(6), 1992 pp. 1842–1845.
G. Altmann, Comments from Quantitative Linguistics.
"Random Character Sequences Do Not Follow Zipf's Law," Wolfram Demonstrations Project. Published: May 20, 2010.