How to read a Parquet file into a Pandas DataFrame?

pandas 0.21 introduced new functions for Parquet, including pd.read_parquet, which accepts either of two engines:

import pandas as pd

# Read with the pyarrow engine
df_pa = pd.read_parquet('example_pa.parquet', engine="pyarrow")

# Read with the fastparquet engine
df_fp = pd.read_parquet('example_fp.parquet', engine="fastparquet")

The pandas documentation explains the difference between the two engines:

These engines are very similar and should read/write nearly identical parquet format files. These libraries differ by having different underlying dependencies (fastparquet by using numba, while pyarrow uses a c-library).
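
Since either engine may or may not be installed, it can help to check before picking one. The following is a minimal sketch, not part of the original answer: the helper names, the sample data, and the file name example.parquet are all made up for illustration. It lists the engines that can be imported and round-trips a small DataFrame through each of them.

```python
import importlib.util
import os
import tempfile


def installed_parquet_engines():
    """Return the Parquet engines that can be imported in this environment."""
    candidates = ("pyarrow", "fastparquet")
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


def parquet_round_trip(engine):
    """Write a tiny DataFrame to Parquet with `engine`, then read it back."""
    # Imported here so the engine check above also runs without pandas.
    import pandas as pd

    df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.parquet")  # hypothetical file name
        df.to_parquet(path, engine=engine, index=False)
        # columns= restricts the read to the listed columns.
        return pd.read_parquet(path, engine=engine, columns=["id"])


if __name__ == "__main__":
    for engine in installed_parquet_engines():
        print(engine, parquet_round_trip(engine)["id"].tolist())
```

If neither engine is installed, the list is empty and pd.read_parquet would raise an ImportError; installing pyarrow or fastparquet resolves that.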
