The short answer is that IEEE 754 defines NaN as a floating-point value, so an integer dtype has no way to represent it.
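A quick way to see this for yourself is sketched below. It assumes nothing beyond stock NumPy and pandas; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# np.nan is an ordinary Python float.
nan_is_float = isinstance(np.nan, float)

# An integer Series is upcast to float64 as soon as a NaN has to be stored.
ints = pd.Series([1, 2, 3])
with_gap = ints.reindex([0, 1, 2, 3])  # label 3 has no value, so it becomes NaN

print(nan_is_float)     # True
print(ints.dtype.kind)  # i (an integer dtype)
print(with_gap.dtype)   # float64
```

The upcast happens silently, which is why a column you expected to hold integers often turns out to be float64.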
As for converting a pd.Series to a specific numeric data type, I prefer to use pd.to_numeric where possible. The examples below demonstrate why.
import pandas as pd
import numpy as np
s = pd.Series([1, 2.5, 3, 4, 5.5]) # s.dtype = float64
s = s.astype(float) # s.dtype = float64
s = pd.to_numeric(s, downcast="float") # s.dtype = float32
t = pd.Series([1, np.nan, 3, 4, 5]) # t.dtype = float64
# t = t.astype(int) # raises ValueError: NaN cannot be converted to an integer
t = pd.to_numeric(t, downcast="integer") # t.dtype = float64 (unchanged, because NaN prevents the downcast)
u = pd.Series([1, 2, 3, 4, 5, 6]) # u.dtype = int64
u = u.astype(int) # u.dtype = platform default integer: int64 on most systems, int32 on Windows with NumPy < 2.0
u = pd.to_numeric(u, downcast="integer") # u.dtype = int8
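The contrast between the two approaches can also be put into a script that runs from top to bottom without stopping. This is only a sketch: try_astype_int is a hypothetical helper written for this illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

def try_astype_int(series):
    """Hypothetical helper: return the int-cast Series, or None if the cast fails."""
    try:
        return series.astype(int)
    except ValueError:
        # pandas refuses to cast NaN (or inf) to an integer dtype.
        return None

with_nan = pd.Series([1, np.nan, 3])
clean = pd.Series([1.0, 2.0, 3.0])

astype_result = try_astype_int(with_nan)
downcast_nan = pd.to_numeric(with_nan, downcast="integer")
downcast_clean = pd.to_numeric(clean, downcast="integer")

print(astype_result)         # None
print(downcast_nan.dtype)    # float64
print(downcast_clean.dtype)  # int8
```

pd.to_numeric never raises here: it downcasts when the values allow it and otherwise leaves the dtype alone, which is the behaviour I want most of the time.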