It sounds to me like you just need to use vectorizer.transform for the test dataset, since the training dataset fixes the vocabulary (you cannot know the full vocabulary including the test set in advance, after all). Just to be clear, that's vectorizer.transform instead of vectorizer.fit_transform.
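
Here is a minimal sketch of what I mean, assuming your vectorizer is a scikit-learn CountVectorizer (the same pattern should apply to TfidfVectorizer). The documents and variable names are made up for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data, only for illustration.
train_docs = ["the cat sat on the mat", "the dog barked"]
test_docs = ["the cat barked at a bird"]

vectorizer = CountVectorizer()

# Learn the vocabulary from the training data and vectorize it.
X_train = vectorizer.fit_transform(train_docs)

# Reuse the learned vocabulary for the test data: transform only, no fit.
X_test = vectorizer.transform(test_docs)

# Both matrices have the same number of columns (one per training-vocabulary term).
print(X_train.shape, X_test.shape)
```

Words that appear only in the test documents (here "bird") are simply ignored, so the test matrix has the same columns as the training matrix and can be passed to a model fitted on X_train.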