Levenshtein Distance
The Levenshtein distance between two strings
a
and b
can be computed using the following algorithm:
(define (distance a b)
(define (d i j)
(cond ((= i 0)
j)
((= j 0)
i)
((char=? (string-ref a (- i 1))
(string-ref b (- j 1)))
(min (+ 1 (d (- i 1) j))
(+ 1 (d i (- j 1)))
(d (- i 1) (- j 1))))
(else
(min (+ 1 (d (- i 1) j))
(+ 1 (d i (- j 1)))
(+ 1 (d (- i 1) (- j 1)))))))
(d (string-length a) (string-length b)))
If the last characters of those two prefixes are the same, then the distance between the prefixes might be the distance between the prefixes without their last characters, which is
(d (- i 1) (- j 1))
On the other hand, the shortest sequence of edit operations might involve deleting the last character of one of the prefixes, so we need to try those possibilities as well. The distance between the two prefixes is therefore the minimum of the distances between the prefixes we get by exploring all three possibilities.
So long as both of the prefixes are non-empty, we don't need to consider using an insertion operation because inserting at the end of one prefix to make its last character match the last character of the other prefix is equivalent to deleting the last character of the other prefix.