sklearn.LabelEncoder with never seen before values

LabelEncoder is basically a dictionary. You can extract and use it for future encoding:

from sklearn.preprocessing import LabelEncoder

le = preprocessing.LabelEncoder()
le.fit(X)

le_dict = dict(zip(le.classes_, le.transform(le.classes_)))

Retrieve label for a single new item, if item is missing then set value as unknown

le_dict.get(new_item, '<Unknown>')

Retrieve labels for a Dataframe column:

df[your_col] = df[your_col].apply(lambda x: le_dict.get(x, <unknown_value>))

Leave a Comment