Tree of Strings
This Demonstration enumerates all possible strings, unbounded in both length and size of alphabet used. Click the buttons to move down to the left or right or to move up. The string's index is shown in decimal and binary notation to highlight the connections between the binary expansion of the index, the corresponding string, and its underlying integer composition. The "string list" mode shows the related enumeration of all possible ordered string lists.
The rank or index assigned to any particular string in the universal string enumeration can be thought of as its address in the enumeration tree. Since the enumeration tree is a binary tree, movement from one node to another can be accomplished in three ways: moving down to the left child node, down to the right child node, or up to the parent node. You can also move to a randomly selected point in the tree or reset the visible part of the tree to the root position, showing at the top the first string ("A"), whose index is 1. The tree may also be viewed in index and integer composition modes.
In integer composition mode, the left button appends a 1 to the end of the integer composition, while the right button increments the last element of the integer composition. The tree starts with the single integer composition of 1, and since both of these operations increase the sum of the elements by one, it follows immediately that the nodes on level of the tree are the integer compositions of , and therefore the tree provides a complete enumeration of all integer compositions.
In string mode, the left button adds an "A" to the end of the string and the right button replaces the last character of the string by the next character in the alphabet. Internally the integer compositions are turned into strings with 1 becoming "A", 2 becoming "B", etc. The sum of these character "weights" is the weight of the entire string. Lower-weight strings occur before higher-weight strings in the enumeration.
In index mode it is easily seen that the left and right children of node have indices and , respectively. All modes show the binary representation of the index code of the node at the top of the displayed subtree, starting with a 1 and continuing with precisely the bits (0 or 1) indicated by pushing the "left (0)" and "right (1)" buttons. The bits of the index thus provide a "breadcrumb trail" for the path down the tree. Moving back up the tree corresponds to lopping off the last bit, or dividing the index by 2, discarding the remainder.
An interesting feature of the string enumeration is that the initial 1 in the binary expansion can be thought of as a simple placeholder, a way of distinguishing between different lengths of strings with initial 0s, or as an offset, counting how many strings of lesser weight occur before those of the current weight (plus 1, to allow the iteration to begin with 1). This offset is then added to the reduced code (which includes only the bits selected). The first string in the enumeration consists of the initial 1 followed by no bits.
In the related enumeration of lists of strings (containing all possible strings of all possible lengths and all possible characters, in all possible orders), each node has three branches and the internal reduced code is base-3 (with the ternary digits representing end of string, end of character, and next character), the number of string lists with weight less than , plus 1, is , and this is the offset that must be added to the ternary code for a weight string list to recover the index. As in the enumeration of strings, the first entry in the enumeration has index 1, followed by no ternary digits (which displays as a subscripted 3).
All of these enumerations are 1-1 and onto, so the sets are countably infinite, having the same number of elements as the integers and the set of all rational numbers.