Help on class Approx in Apse:

Apse.Approx = class Approx
 |  Python extension for approximate matching (fuzzy matching)
 |
 |  Python port of the String::Approx module written for Perl
 |  by Jarkko Hietaniemi.
 |
 |  Apse.Approx lets you match strings approximately. With this you can emulate
 |  errors: typing errorrs, speling errors, closely related vocabularies
 |  (colour color), genetic mutations (GAG ACT), abbreviations (McScot, MacScot).
 |
 |  The measure of approximateness is the Levenshtein edit distance. It is the total
 |  number of edits, defined as:
 |
 |      insertions,     word   -> world
 |      deletions,      monkey -> money
 |      substitutions,  sun    -> fun
 |
 |  required to transform one string into another. For example, to
 |  transform lead into gold, you need three edits:
 |      lead -> gead -> goad -> gold
 |  The edit distance of lead and gold is therefore three.
 |
 |  Typical usage:
 |
 |      from Apse import Approx
 |
 |      # match entries at most one character away
 |      apx = Approx("python", edit=1)
 |
 |      # match strings
 |      apx.match("jython")    # will return the word "jython"
 |      apx.match("jjthon")    # will return None
 |
 |      # match lists
 |      test_words = ["jython", "jjthon"]
 |      apx.match(test_words)  # returns [ "jython" ]
 |
 |      # edit distances
 |      apx.dist("jython")     # will return 1
 |      apx.dist("jjthon")     # will return 2
 |
 |  Note: The goodness of an approximate match is defined as the
 |  number of steps required to bring the pattern to a form that is
 |  entirely contained in the string to which it is being matched. The matching
 |  is not commutative. The pattern that you instantiate the class with will be
 |  matched against the input. For example, the pattern "funky" can be made to
 |  match the word "funnybone" with an edit distance of one. However, using
 |  "funnybone" as the pattern to be matched against "funky", the distance
 |  becomes five.
 |
 |  Example:
 |
 |  >>> from Apse import Approx
 |  >>> a = Approx("funky")
 |  >>> a.dist("funnybone")
 |  1
 |  >>> a = Approx("funnybone")
 |  >>> a.dist("funky")
 |  5
 |
 |  Methods defined here:
 |
 |  __init__(self, pattern, **kwargs)
 |      Initializes the matching pattern and optional parameters.
 |      The following keywords may be used to control the matching
 |      process:
 |
 |          edit = number
 |          ins  = number
 |          sub  = number
 |          dlt  = number
 |
 |      They govern the maximum number of edits, insertions, substitutions
 |      and deletions, respectively. Each number must be either an integer,
 |      in which case it is interpreted as the absolute number of edits, or
 |      a fractional number less than 1, in which case it is interpreted as
 |      a fraction of the length of the pattern.
 |
 |          a = Apse.Approx("python", edit=2, ins=1)
 |
 |      will match only inputs that can be reached by at most two edits,
 |      only one of which may be an insertion.
 |
 |          a = Apse.Approx("python", edit=0.3)
 |
 |      will match only inputs that can be reached by changing 30%
 |      of the characters in the word python, that is, an edit
 |      distance of 2.
 |
 |      If the ignore_case keyword argument is present, matching
 |      ignores case.
 |
 |  dist(self, text)
 |      Returns the edit distance (the number of deletions, insertions and
 |      substitutions needed to match the input string to the pattern).
 |
 |  info(self)
 |      Returns a tuple containing the argument dictionary and the pattern.
 |
 |  init_args(self)
 |      Initializes the matching parameters.
 |
 |  match(self, text)
 |      Matches an input to the pattern. The input may be a string or
 |      a list containing strings. The method returns the matching entry(ies)
 |      either as a string or a list.
 |
 |  slice(self, text)
 |      Returns the starting and ending indices of the locations where
 |      the input string matches the pattern, plus the edit distance needed
 |      to perform the match.
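
The distances quoted in the help text can be reproduced without the extension. The sketch below is not part of Apse and makes no claim about how the extension is implemented. It is a plain dynamic-programming illustration of the two notions the docstring describes: the whole-string Levenshtein distance (lead -> gold) and the non-commutative "pattern contained in the input" distance (funky / funnybone). The function name edit_distance and the whole_string flag are hypothetical names chosen for this illustration.

```python
def edit_distance(pattern, text, whole_string=False):
    """Illustrative edit distance; NOT the Apse implementation.

    whole_string=True  -> classic Levenshtein distance between the two strings.
    whole_string=False -> fewest edits that turn `pattern` into some substring
                          of `text` (the match may start and end anywhere).
    """
    n = len(text)
    # Row 0: cost of matching an empty pattern against text[:j].
    # For substring matching, skipping leading text is free, so it is all zeros.
    prev = list(range(n + 1)) if whole_string else [0] * (n + 1)

    for i, pc in enumerate(pattern, start=1):
        curr = [i]  # i pattern characters against empty text: i deletions
        for j, tc in enumerate(text, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a pattern character
                curr[j - 1] + 1,           # insert a text character
                prev[j - 1] + (pc != tc),  # keep or substitute
            ))
        prev = curr

    # Substring matching may also stop anywhere in the text.
    return prev[-1] if whole_string else min(prev)


print(edit_distance("lead", "gold", whole_string=True))  # three edits
print(edit_distance("funky", "funnybone"))               # pattern fits inside the input
print(edit_distance("funnybone", "funky"))               # roles swapped
```

Swapping pattern and input changes the result because only the input may have unmatched characters before and after the match; every pattern character must be accounted for.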