OneHotEncoder categorical_features deprecated, how to transform specific column

There are actually two warnings:

FutureWarning: The handling of integer data will change in version
0.22. Currently, the categories are determined based on the range [0, max(values)], while in the future they will be determined based on the
unique values. If you want the future behaviour and silence this
warning, you can specify "categories='auto'". In case you used a
LabelEncoder before this OneHotEncoder to convert the categories to
integers, then you can now use the OneHotEncoder directly.

And the second:

DeprecationWarning: The 'categorical_features' keyword is deprecated in version 0.20 and
will be removed in 0.22. You can use the ColumnTransformer instead.

The first message says that the way categories are inferred from integer data is going to change; pass categories='auto' to opt into the future behaviour and silence the warning. It also tells you that you can use OneHotEncoder directly, without running a LabelEncoder first.
The second message says that you should no longer select the columns inside OneHotEncoder through categorical_features. Use ColumnTransformer instead, which is like a Pipeline for column-wise transformations.
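
To illustrate the first message, here is a minimal sketch of OneHotEncoder applied directly to string categories, with no LabelEncoder step. The array `countries` and its values are made up for illustration, and scikit-learn 0.20 or later is assumed.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical single-column data; the values are invented for this sketch.
countries = np.array([["France"], ["Spain"], ["Germany"], ["Spain"]])

# categories="auto" derives the categories from the unique values in the data.
encoder = OneHotEncoder(categories="auto")

# fit_transform returns a sparse matrix by default, so convert it for display.
encoded = encoder.fit_transform(countries).toarray()

print(encoder.categories_)  # the learned categories, in sorted order
print(encoded)              # one column per category, one row per sample
```

The strings are encoded in one step, so the intermediate integer labels are never needed.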

Here is the equivalent code for your case:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# The last element of the tuple ([0]) is the list of columns you want to transform in this step
ct = ColumnTransformer([("Name_Of_Your_Step", OneHotEncoder(), [0])], remainder="passthrough")
ct.fit_transform(X)

For the example above, encoding the categorical data (that is, turning text such as the country name into numerical data) looks like this:

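from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer

# Encode the Country column (column 0) as integers.
# This LabelEncoder step is optional: as the first warning notes, OneHotEncoder can be used directly.
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])

# One-hot encode column 0 and pass the remaining columns through unchanged
ct = ColumnTransformer([("Country", OneHotEncoder(), [0])], remainder="passthrough")
X = ct.fit_transform(X)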
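
For a self-contained version you can run, here is a small sketch on toy data. The array contents are hypothetical, and `sparse_threshold=0` is an extra setting (not in the snippet above) used here only to force a dense result that is easy to inspect.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: column 0 is a country name, columns 1 and 2 are numeric.
X = np.array(
    [
        ["France", 44, 72000],
        ["Spain", 27, 48000],
        ["Germany", 30, 54000],
        ["Spain", 38, 61000],
    ],
    dtype=object,
)

ct = ColumnTransformer(
    [("Country", OneHotEncoder(), [0])],  # one-hot encode column 0 only
    remainder="passthrough",              # keep the other columns unchanged
    sparse_threshold=0,                   # assumption: always return a dense array
)
X_encoded = ct.fit_transform(X)

print(X_encoded.shape)
print(X_encoded[0])
```

The one-hot columns come first, one per country, followed by the passed-through columns in their original order.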
