How to read a Parquet file into Pandas DataFrame?

pandas 0.21 introduced built-in Parquet support: pd.read_parquet for reading and DataFrame.to_parquet for writing. Pick either engine when reading:

import pandas as pd
# Read with the pyarrow engine (requires the pyarrow package)
df = pd.read_parquet('example_pa.parquet', engine="pyarrow")

or

import pandas as pd
# Read with the fastparquet engine (requires the fastparquet package)
df = pd.read_parquet('example_fp.parquet', engine="fastparquet")

The pandas documentation explains:

These engines are very similar and should read/write nearly identical parquet format files. These libraries differ by having different underlying dependencies (fastparquet by using numba, while pyarrow uses a c-library).
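If you want to check this on your own machine, here is a minimal round-trip sketch. It assumes pandas and at least one of the two engines are installed. The helper names (available_engine, round_trip) and the file name example.parquet are made up for illustration and are not part of pandas. The columns argument restricts the read to the listed columns.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def available_engine():
    """Return the first installed Parquet engine name, or None if neither is installed."""
    for name in ("pyarrow", "fastparquet"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def round_trip(df, engine, columns=None):
    """Write df to a temporary Parquet file and read it back with the same engine."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.parquet")  # hypothetical file name
        df.to_parquet(path, engine=engine)
        # columns=None reads everything; a list reads only those columns
        return pd.read_parquet(path, engine=engine, columns=columns)


if __name__ == "__main__":
    engine = available_engine()
    if engine is None:
        print("Install pyarrow or fastparquet first")
    else:
        df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
        print(round_trip(df, engine))
        print(round_trip(df, engine, columns=["name"]))
```

If you omit engine entirely, pandas defaults to engine="auto", which tries pyarrow first and falls back to fastparquet.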
