Levenstein Distance

The Levenshtein distance between two strings is the minimum number of operations of the form:

  • Substituting a character,
  • Deleting a character, or
  • Inserting a character,

which transforms one string into the other. The function, by definition, is symmetric and distance(s, t) ≤ max(s.length(), t.length()) the longer string t can be transformed into the shorter string s by making s.length() switches and t.length() - s.length() deletions. Such a distance is said to be an edit distance: it gives the minimum number of edits necessary to change one string into the other.

The algorithm for computing the Levenshtein distance is a dynamic algorithm which runs in Θ(s.length()·t.length()) time and nominally the same amount of memory; however, an implementation using Θ(min(s.length(), t.length())) memory is also given.

References

  • wikipedia
  • V. I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics Doklady 10, 1966, pp.707-710.