
My initial guess is that the categorical features in your X_train don't contain every value found in the full dataset's categorical features, so one-hot encoding X_train produces fewer columns than one-hot encoding the entire dataset.
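Here is a minimal sketch of what I mean, assuming you encode with pandas `get_dummies`. The `color` column and its values are made up for illustration and are not from your data. Reindexing to the full dataset's columns is just one possible way to line the shapes up.

```python
import pandas as pd

# Hypothetical toy data: "color" takes three values in the full dataset.
dataset = pd.DataFrame({"color": ["red", "green", "blue", "red"]})

# Hypothetical split: these training rows happen to miss "blue".
X_train = dataset.iloc[[0, 1, 3]]

# Encode the full dataset and the training split separately.
full_encoded = pd.get_dummies(dataset, columns=["color"])
train_encoded = pd.get_dummies(X_train, columns=["color"])

# The training split ends up with fewer dummy columns.
print(full_encoded.columns.tolist())
print(train_encoded.columns.tolist())

# One possible fix: align the training columns to the full set,
# filling the dummy columns that were never seen with 0.
train_aligned = train_encoded.reindex(columns=full_encoded.columns, fill_value=0)
print(train_aligned.shape)
```

If you use scikit-learn instead, fitting the encoder once and reusing it for every split should avoid the mismatch in a similar way.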


Yannawut Kimnaruk