Word count from a txt file program

The funny symbols you’re seeing are a UTF-8 BOM (Byte Order Mark) at the start of the file. To get rid of them, open the file with the utf-8-sig encoding, which strips a leading BOM if one is present (I’m assuming you’re on Python 3):

file = open(r"D:\zzzz\names2.txt", "r", encoding="utf-8-sig")
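To see what the encoding changes, here is a minimal sketch that writes a small BOM-prefixed file to a temporary directory and reads it back with both codecs. The file name and its contents are made up for illustration and are not your actual data:

```python
import os
import tempfile

# Hypothetical sample: a UTF-8 file whose first three bytes are the BOM (EF BB BF).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbflion goat\n")

# Plain "utf-8" keeps the BOM as the character U+FEFF glued to the first word.
with open(path, "r", encoding="utf-8") as f:
    with_bom = f.read()

# "utf-8-sig" drops a leading BOM and otherwise decodes like "utf-8".
with open(path, "r", encoding="utf-8-sig") as f:
    without_bom = f.read()

print(repr(with_bom.split()[0]))     # '\ufefflion'
print(repr(without_bom.split()[0]))  # 'lion'
```

With the plain codec the first word would be counted as a different word from every other "lion" in the file, which is why the first entry in your output looks wrong.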

Furthermore, for counting, you can use collections.Counter:

from collections import Counter
wordcount = Counter(file.read().split())

Display them with:

>>> for item in wordcount.items(): print("{}\t{}".format(*item))
...
snake   1
lion    2
goat    2
horse   3
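
Putting the pieces together, one possible sketch looks like this. The helper name count_words is my own, and the sample file is made up to stand in for names2.txt; the with-block is an addition that closes the file for you, and most_common() is an optional extra that sorts by count:

```python
import os
import tempfile
from collections import Counter


def count_words(path):
    """Count whitespace-separated words in a UTF-8 text file (hypothetical helper)."""
    # utf-8-sig strips a leading BOM if there is one; the with-block closes the file.
    with open(path, "r", encoding="utf-8-sig") as f:
        return Counter(f.read().split())


# Made-up sample file standing in for names2.txt, written with a BOM.
path = os.path.join(tempfile.mkdtemp(), "names2.txt")
with open(path, "w", encoding="utf-8-sig") as f:
    f.write("snake lion goat horse\nlion goat horse horse\n")

wordcount = count_words(path)

# most_common() yields (word, count) pairs, highest count first.
for word, count in wordcount.most_common():
    print("{}\t{}".format(word, count))
```

For your own file you would call count_words(r"D:\zzzz\names2.txt") instead of creating the sample.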
