Quick implementation of character n-grams for a word
To generate bigrams:

In [8]: b = 'student'

In [9]: [b[i:i+2] for i in range(len(b)-1)]
Out[9]: ['st', 'tu', 'ud', 'de', 'en', 'nt']

To generalize to a different n:

In [10]: n = 4

In [11]: [b[i:i+n] for i in range(len(b)-n+1)]
Out[11]: ['stud', 'tude', 'uden', 'dent']
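The same slicing idea can be wrapped in a small reusable function. This is a minimal sketch: the name char_ngrams and the check on n are assumptions added for illustration, not part of the original session.

```python
def char_ngrams(word, n):
    """Return the character n-grams of `word`, in left-to-right order."""
    # Hypothetical guard: reject n < 1, which has no meaningful n-grams.
    if n < 1:
        raise ValueError("n must be at least 1")
    # A word of length L has L - n + 1 n-grams; when n > L the range is
    # empty, so the result is an empty list.
    return [word[i:i + n] for i in range(len(word) - n + 1)]


print(char_ngrams("student", 2))
print(char_ngrams("student", 4))
```

A word shorter than n simply yields an empty list, which is usually the convenient behavior when the function is mapped over a whole vocabulary.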