Sequence alignment is widely used in molecular biology to find similar DNA or protein sequences. Alignment algorithms generally fall into two categories: global, which align the sequences over their entire length, and local, which look only for highly similar subsequences. This Demonstration uses the Needleman–Wunsch (global) and Smith–Waterman (local) algorithms to align random English words. Gaps are shaded yellow, mismatches orange, and matches red (with a lighter shade for matches that do not appear in the final alignment).
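The difference between the two algorithms can be seen in a minimal Python sketch of their scoring recurrences. This is not the Demonstration's own code: the function names and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, and only the optimal score is computed, not the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score with a simple (linear) gap penalty.

    The scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # The global score is read from the bottom-right cell.
    return score[-1][-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment score: same recurrence, but cells never drop below 0."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            # The local score is the largest value anywhere in the table.
            best = max(best, score[i][j])
    return best
```

With these assumed values, aligning "XXGATTACAYY" against "GATTACA" gives a local score of 7 (the shared subsequence is matched and the flanking letters are ignored), whereas the global score is only 3, because the four unmatched flanking letters must each be paid for as a gap.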
This Demonstration uses a simple gap penalty, in which each insertion or deletion is scored the same. Because large gaps occur frequently in biological sequences, it is often better to use an affine gap penalty, which charges one value for opening a gap and a different (usually smaller) value for extending it, so that a single long gap costs less than several short ones.
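To make the contrast concrete, here is a small Python sketch of the two penalty schemes. The function names, the default values, and the convention that the opening cost covers the first gap position are all assumptions for illustration; other sources define the affine penalty as open + extend × length instead.

```python
def simple_gap_penalty(length, gap=2):
    """Simple (linear) penalty: every gap position costs the same."""
    return gap * length


def affine_gap_penalty(length, gap_open=5, gap_extend=1):
    """Affine penalty: one cost to open a gap, a smaller one to extend it.

    Assumed convention: the opening cost covers the first gap position,
    and each additional position adds the extension cost.
    """
    if length <= 0:
        return 0
    return gap_open + gap_extend * (length - 1)


# One gap of length 4 versus four separate gaps of length 1.
one_long_simple = simple_gap_penalty(4)
four_short_simple = 4 * simple_gap_penalty(1)
one_long_affine = affine_gap_penalty(4)
four_short_affine = 4 * affine_gap_penalty(1)
```

Under the simple penalty the two cases cost the same (8 and 8 with these values), so the algorithm has no reason to prefer one contiguous gap. Under the affine penalty the single long gap costs 8 while four separate gaps cost 20, which favors alignments that group insertions and deletions together.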
For a technical description of the algorithms used, please see the following references:
N. C. Jones and P. A. Pevzner, An Introduction to Bioinformatics Algorithms, Cambridge, MA: The MIT Press, 2004.
I. Korf, M. Yandell, and J. Bedell, BLAST, Sebastopol, CA: O'Reilly & Associates Inc., 2003.