How to remove English words from a file containing Dari words?

You could install and use the nltk library. It provides a list of English words and a tokenizer that splits each line into words, so you can drop every token that appears in the English list:

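from nltk.tokenize import word_tokenize
from nltk.corpus import words

# Build a lowercase set so each membership test is fast and matches word.lower()
english = set(w.lower() for w in words.words())

# Assumes both files are UTF-8 encoded, which is typical for Dari text
with open('Dari.pos', encoding='utf-8') as f_input, open('DariNER.txt', 'w', encoding='utf-8') as f_output:
    for line in f_input:
        # Keep only the tokens that are not in the English word list
        kept = [word for word in word_tokenize(line) if word.lower() not in english]
        f_output.write(' '.join(kept) + '\n')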

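After installing nltk, you should run:

import nltk
nltk.download()

This opens the interactive downloader; use it to download the words corpus. Alternatively, nltk.download('words') fetches that corpus directly. word_tokenize also relies on the punkt tokenizer models, so download those as well if they are missing.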
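
The filtering step itself can be tried without any nltk data. The following is only a minimal sketch: the function name remove_english, the whitespace split (standing in for word_tokenize), and the tiny hand-made vocabulary (standing in for words.words()) are all assumptions made for illustration.

```python
def remove_english(line, english_words):
    """Return the line with every token found in english_words dropped."""
    # Simple whitespace split; a hypothetical stand-in for nltk's word_tokenize
    tokens = line.split()
    # Compare in lowercase so 'Book' and 'book' are treated the same
    kept = [t for t in tokens if t.lower() not in english_words]
    return ' '.join(kept)


# Hypothetical toy vocabulary; the answer above uses nltk's words corpus instead
english_words = {'book', 'the', 'is', 'good'}

sample = 'این کتاب book خوب است good'
print(remove_english(sample, english_words))
```

Using a set for the vocabulary keeps each lookup cheap, which matters once the toy set is replaced by the full English word list.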
