With pandas version 0.16.1
and up, there is now a DataFrame.sample
method built-in:
import pandas
df = pandas.DataFrame(pandas.np.random.random(100))
# Randomly sample 70% of your dataframe
df_percent = df.sample(frac=0.7)
# Randomly sample 7 elements from your dataframe
df_elements = df.sample(n=7)
For either approach above, you can get the rest of the rows by doing:
df_rest = df.loc[~df.index.isin(df_percent.index)]
Per Pedram
‘s comment, if you would like to get reproducible samples, pass the random_state
parameter.
df_percent = df.sample(frac=0.7, random_state=42)