Time versus Size in Compressing Text

Compare different types of compression on different documents. Choose the original version, one-word replacement, or two-word replacement to see the trade-off between running time and digital space for a simpler and a more complex compression scheme. One-word replacement takes frequently appearing words and replaces each with a code word, while two-word replacement does the same for frequently appearing two-word phrases.
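The two modes can be sketched in a few lines of Python. This is only an illustration of the idea, not the Demonstration's own code: the symbol list `CODE_SYMBOLS`, the helper names, and the sample sentence are all assumptions made for the example, and the code symbols are assumed not to occur in the input text.

```python
import time
from collections import Counter

# Hypothetical code words: single symbols assumed absent from the input text.
CODE_SYMBOLS = ["~", "^", "|", "`"]

def top_ngrams(words, n, k):
    """Return the k most frequent n-word sequences in a list of words."""
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return [gram for gram, _ in Counter(grams).most_common(k)]

def compress(text, n):
    """Replace the most frequent n-word sequences with single symbols."""
    text = " ".join(text.split())  # normalize whitespace to single spaces
    phrases = top_ngrams(text.split(), n, len(CODE_SYMBOLS))
    table = dict(zip(CODE_SYMBOLS, phrases))
    for symbol, phrase in table.items():
        text = text.replace(phrase, symbol)
    return text, table

def decompress(text, table):
    """Undo compress() by expanding every symbol back to its phrase."""
    for symbol, phrase in table.items():
        text = text.replace(symbol, phrase)
    return text

sample = "the cat and the dog and the bird saw the cat and the dog"

for n, label in [(1, "one-word"), (2, "two-word")]:
    start = time.perf_counter()
    packed, table = compress(sample, n)
    elapsed = time.perf_counter() - start
    # Report size before and after, plus the time spent compressing.
    print(f"{label}: {len(sample)} -> {len(packed)} characters, {elapsed:.6f} s")
```

On a sentence this short the timing difference is negligible; the comparison becomes meaningful only on longer documents, where counting two-word phrases means tracking many more distinct candidates than counting single words.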


The algorithm searches the document for the words and phrases that are most efficient to compress, using a variation of Huffman coding to do so. Instead of binary codes, frequently appearing words are represented by code words drawn from a predetermined list of infrequently appearing symbols.
At times, a part of a word is replaced because it contains the frequently appearing word (such as the "and" in "hand"). This is considered a bonus in the compression because it saves more digital space without the need for more code words.
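A small Python sketch makes the size of this bonus concrete. It compares replacing a word only where it stands alone against replacing it wherever it occurs; the function names, the sample text, and the code symbol `~` are assumptions chosen for illustration, not taken from the Demonstration.

```python
import re

def replace_whole_words(text, word, symbol):
    """Replace the word only where it stands alone, bounded by non-word characters."""
    return re.sub(r"\b" + re.escape(word) + r"\b", symbol, text)

def replace_anywhere(text, word, symbol):
    """Replace every occurrence, including those inside longer words."""
    return text.replace(word, symbol)

text = "hand in hand and sand and land"

whole = replace_whole_words(text, "and", "~")
anywhere = replace_anywhere(text, "and", "~")

print(len(text), text)          # 30 hand in hand and sand and land
print(len(whole), whole)        # 26 hand in hand ~ sand ~ land
print(len(anywhere), anywhere)  # 18 h~ in h~ ~ s~ ~ l~
```

Both variants use the same single code word, but the substring version saves 12 characters instead of 4. Decompression is unaffected, provided the symbol does not occur in the original text: expanding every `~` back to "and" restores the input either way.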

Files require Wolfram CDF Player or Mathematica.