Overview
String Distance is a package that offers two algorithms to calculate string distances: Levenshtein and Damerau-Levenshtein. The distance between two strings is the minimum number of operations needed to transform one string into the other. An operation is an insertion, deletion, or substitution of a single character or, for Damerau-Levenshtein only, a transposition of two adjacent characters.
String Distance is free software and uses the same license as Lua 5.1.
Algorithms
- Levenshtein distance
- This algorithm returns the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character.
- Damerau-Levenshtein distance
- The Damerau-Levenshtein distance is based on the Levenshtein algorithm. In addition to the three basic edit operations (insertion, deletion, and substitution), this algorithm also includes the transposition of two adjacent characters.
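The two definitions above can be illustrated with a minimal C sketch of the usual dynamic-programming table. This is not the code in stringdistance.c: the names edit_distance, lev_sketch, and dam_sketch are hypothetical, and the transposition step assumes the restricted ("optimal string alignment") variant, which may or may not be the variant the module implements.

```c
#include <stdlib.h>
#include <string.h>

/* Smallest of three values. */
static size_t min3(size_t a, size_t b, size_t c) {
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Hypothetical helper (not part of String Distance).
 * Fills a (la+1) x (lb+1) table where d[i][j] is the distance between
 * the first i bytes of a and the first j bytes of b.
 * When with_transpositions is non-zero, swapping two adjacent bytes
 * also counts as one operation (optimal string alignment variant).
 * Returns (size_t)-1 if the table cannot be allocated.
 */
static size_t edit_distance(const char *a, const char *b, int with_transpositions) {
    size_t la = strlen(a), lb = strlen(b);
    size_t cols = lb + 1;
    size_t i, j, result;
    size_t *d = malloc((la + 1) * cols * sizeof *d);

    if (d == NULL) return (size_t)-1;

    /* Turning a prefix into the empty string takes one deletion per byte. */
    for (i = 0; i <= la; i++) d[i * cols] = i;
    /* Building a prefix from the empty string takes one insertion per byte. */
    for (j = 0; j <= lb; j++) d[j] = j;

    for (i = 1; i <= la; i++) {
        for (j = 1; j <= lb; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = min3(d[(i - 1) * cols + j] + 1,         /* deletion */
                               d[i * cols + (j - 1)] + 1,         /* insertion */
                               d[(i - 1) * cols + (j - 1)] + cost /* substitution */);

            /* Transposition of two adjacent bytes. */
            if (with_transpositions && i > 1 && j > 1 &&
                a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                size_t t = d[(i - 2) * cols + (j - 2)] + 1;
                if (t < best) best = t;
            }
            d[i * cols + j] = best;
        }
    }

    result = d[la * cols + lb];
    free(d);
    return result;
}

/* Levenshtein: insertion, deletion, substitution. */
size_t lev_sketch(const char *a, const char *b) { return edit_distance(a, b, 0); }

/* Damerau-Levenshtein (restricted variant): also adjacent transposition. */
size_t dam_sketch(const char *a, const char *b) { return edit_distance(a, b, 1); }
```

With this sketch, lev_sketch("ca", "ac") is 2 (two substitutions), while dam_sketch("ca", "ac") is 1 (one transposition). The sketch compares bytes, so multi-byte encoded characters count as several characters.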
Status
The current version is 1.1.0. It was developed to work with Lua 5.0, 5.1 and 5.2.
Download
String Distance can be downloaded from its own download page.
Compiling
String Distance is distributed as a C source file: stringdistance.c. This library can be linked to the application or dynamically loaded. The initialization function is luaopen_stringdistance and it is a Lua open-library compatible function.
Installation
String Distance is a C module that must be compiled and installed in your LUA_CPATH. It is also distributed as a rock, so LuaRocks can be used to install it.
Reference
The String Distance module provides two functions, which implement the corresponding algorithms:
lev(str1, str2)
- Calculates the distance between the given strings (arguments str1 and str2) according to Levenshtein's algorithm. The function returns the distance (a number) between the two strings.
dam(str1, str2)
- Calculates the distance between the given strings (arguments str1 and str2) according to Damerau-Levenshtein's algorithm. The function returns the distance (a number) between the two strings considering transpositions.
History
- [21/12/2011] Version 1.1 released
  Reimplementation to add compatibility with all Lua 5 versions
- [19/01/2011] Version 1.0 released
Credits
String Distance was implemented by Tomás Guisasola, Marcelle Mota and Pablo Musa.
String Distance was developed for PUC-Rio, which holds its copyright.
References
The algorithms are implemented based on pseudocode available at Wikipedia.
The original paper about the Damerau-Levenshtein algorithm is the following:
Fred J. Damerau. 1964. A technique for computer detection and correction of spelling errors. Commun. ACM 7, 3 (March 1964), 171-176.
Contact us
For more information please contact us. Comments are welcome!