Scikit learn – fit_transform on the test set
You are not supposed to do fit_transform on your test data, but only transform. Otherwise, you will get different vectorization than the one used during training. For the memory issue, I recommend TfIdfVectorizer, which has numerous options of reducing the dimensionality (by removing rare unigrams etc.). UPDATE If the only problem is fitting test data, … Read more