Typically this is accomplished by finding the Longest Common Subsequence (commonly called the LCS problem). This is how tools like diff work. Of course, diff is a line-oriented tool, and it sounds like your needs are somewhat different. However, I’m assuming that you’ve already constructed some way to compare words and sentences.
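
To make the idea concrete, here is a minimal sketch of the classic dynamic-programming LCS applied to word tokens rather than lines. The function name `lcs` and the `same` parameter are my own placeholders, not part of any library; `same` stands in for whatever word or sentence comparison you have already built, and defaults to plain equality. This is not how diff itself is implemented (real tools use faster algorithms), just the textbook approach.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def lcs(
    a: Sequence[T],
    b: Sequence[T],
    same: Callable[[T, T], bool] = lambda x, y: x == y,  # hypothetical comparison hook
) -> List[T]:
    """Return one longest common subsequence of a and b (elements taken from a)."""
    m, n = len(a), len(b)

    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if same(a[i], b[j]):
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to recover the subsequence itself.
    result: List[T] = []
    i = j = 0
    while i < m and j < n:
        if same(a[i], b[j]):
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # skipping a[i] loses nothing
        else:
            j += 1  # skipping b[j] loses nothing
    return result


old = "the quick brown fox".split()
new = "the slow brown dog fox".split()
print(lcs(old, new))  # ['the', 'brown', 'fox']
```

Anything in either input that is not part of the returned subsequence is what a diff would report as removed or added. This version uses O(m·n) time and memory, which is fine for sentences and paragraphs but may need a smarter algorithm for very long inputs.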