How to speed up Gensim Word2vec model load time?
In recent gensim versions, you can load a subset starting from the front of the file by passing the optional limit parameter to load_word2vec_format(). The GoogleNews vectors seem to be in roughly most- to least-frequent order, so the first N are usually the N-sized subset you'd want. So use limit=500000 to get the most-frequent 500,000 …
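As a minimal sketch of the limit approach described above: the helper name load_top_vectors and the file path in the usage comment are assumptions made for illustration, not part of gensim. Only KeyedVectors.load_word2vec_format() and its binary and limit arguments come from the library itself.

```python
def load_top_vectors(path, limit=500_000, binary=True):
    """Load only the first `limit` vectors from a word2vec-format file.

    `load_top_vectors` is a hypothetical wrapper for this example. If the
    file is ordered from most- to least-frequent word, the first `limit`
    entries are the `limit` most frequent words.
    """
    # Imported inside the function so the helper can be defined even in an
    # environment where gensim is not installed.
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(path, binary=binary, limit=limit)


# Usage (the path is an assumption; point it at your own vectors file):
# vectors = load_top_vectors("GoogleNews-vectors-negative300.bin", limit=500_000)
# print(vectors.most_similar("computer", topn=5))
```

Reading fewer vectors cuts both load time and memory roughly in proportion to the limit, at the cost of rarer words being absent from the loaded model.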