Quick implementation of character n-grams for word

To generate bigrams:

In [8]: b='student'

In [9]: [b[i:i+2] for i in range(len(b)-1)]
Out[9]: ['st', 'tu', 'ud', 'de', 'en', 'nt']

To generalize to a different n:

In [10]: n=4

In [11]: [b[i:i+n] for i in range(len(b)-n+1)]
Out[11]: ['stud', 'tude', 'uden', 'dent']

Leave a Comment