Shortest path to transform one word into another

NEW ANSWER: Given the recent update, you could try A* with the Hamming distance as a heuristic. It is an admissible heuristic because it never overestimates the remaining distance.

OLD ANSWER: You can modify the dynamic program used to compute the Levenshtein distance so that it also yields the sequence of operations.

EDIT: If there are a constant number …
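The A* suggestion can be sketched as follows. This is a minimal illustration, not the original answerer's code: it assumes that each step changes exactly one letter, that every intermediate word must come from a given dictionary, and that words are lowercase ASCII. The names `hamming` and `word_ladder` are hypothetical.

```python
import heapq
from string import ascii_lowercase


def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def word_ladder(start, goal, words):
    """A* search for a shortest chain of one-letter changes (assumed move set).

    Returns the list of words from start to goal, or None if no chain exists.
    """
    if len(start) != len(goal):
        return None
    words = set(words)
    # Heap entries are (f, g, word, path) with f = g + heuristic.
    frontier = [(hamming(start, goal), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, word, path = heapq.heappop(frontier)
        if word == goal:
            return path
        if g > best_g.get(word, float("inf")):
            continue  # stale entry; a cheaper route to this word was found
        for i in range(len(word)):
            for c in ascii_lowercase:
                if c == word[i]:
                    continue
                nxt = word[:i] + c + word[i + 1:]
                if nxt not in words:
                    continue
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(
                        frontier,
                        (ng + hamming(nxt, goal), ng, nxt, path + [nxt]),
                    )
    return None


dictionary = {"cold", "cord", "card", "ward", "warm", "word", "worm", "corm"}
print(word_ladder("cold", "warm", dictionary))
```

Under these assumptions a single step can fix at most one mismatched position, so the Hamming distance is a lower bound on the number of remaining steps, which is what makes it admissible.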

String similarity metrics in Python [duplicate]

I realize it's not the same thing, but this is close enough:

    >>> import difflib
    >>> a = "Hello, All you people"
    >>> b = 'hello, all You peopl'
    >>> seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower())
    >>> seq.ratio()
    0.97560975609756095

You can wrap this in a function:

    def similar(seq1, seq2):
        return difflib.SequenceMatcher(a=seq1.lower(), b=seq2.lower()).ratio() > 0.9

    >>> similar(a, b)
    True
    >>> similar('Hello, world', …

Similarity scores based on string comparison in R (edit distance)

The function adist computes the Levenshtein edit distance between two strings. This can be transformed into a similarity metric as 1 - (Levenshtein edit distance / length of the longer string). The levenshteinSim function in the RecordLinkage package also does this directly, and might be faster than adist.

    library(RecordLinkage)
    > levenshteinSim("apple", "apple")
    [1] 1
    > levenshteinSim("apple", "aaple")
    …
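For readers working in Python rather than R, the similarity formula above can be sketched as follows. This is an illustrative reimplementation of the formula, not the RecordLinkage code; the function names `levenshtein` and `levenshtein_similarity` are hypothetical, and treating two empty strings as fully similar is an assumption made here to avoid dividing by zero.

```python
def levenshtein(a, b):
    """Levenshtein edit distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def levenshtein_similarity(a, b):
    """1 - (edit distance / length of the longer string)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # assumption: two empty strings count as identical
    return 1 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))           # 3
print(levenshtein_similarity("apple", "apple"))   # 1.0
```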