How do I tokenize a string sentence in NLTK?

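This example is on the main page of nltk.org. It uses nltk.word_tokenize, which needs the Punkt tokenizer data to be installed (it can be fetched with nltk.download):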

>>> import nltk
>>> sentence = """At eight o'clock on Thursday morning
... Arthur didn't feel very good."""
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']

