Handling Variable Number of Columns with Pandas - Python

One approach that seems to work (tested in at least pandas 0.10.1 and 0.11.0.dev-fc8de6d) is to pass an explicit list of column names:

>>> !cat ragged.csv
1,2,3
1,2,3,4
1,2,3,4,5
1,2
1,2,3,4
>>> import pandas as pd
>>> my_cols = ["A", "B", "C", "D", "E"]
>>> pd.read_csv("ragged.csv", names=my_cols, engine="python")
   A  B   C   D   E
0  1  2   3 NaN NaN
1  1  2   3   4 NaN
2  1  2   3   4   5
3  1  2 NaN NaN NaN
4  1  2   3   4 NaN

Note that this approach requires you to supply names for the columns you want up front; rows with fewer fields are then padded with NaN, as in the output above. It is not as general as some other approaches, but it works well enough when it applies.
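
If the widest row is not known ahead of time, one possible workaround is to scan the file once to find the largest field count and generate placeholder names from it. The following is only a sketch under assumptions, not part of the original answer: the helper names ragged_column_names and read_ragged_csv and the "col" prefix are hypothetical, and it assumes the file is small enough to read twice.

```python
import csv


def ragged_column_names(path, prefix="col", delimiter=","):
    """Return one placeholder name per column of the widest row.

    Hypothetical helper: the "col0", "col1", ... naming is an arbitrary
    choice for illustration.
    """
    with open(path, newline="") as f:
        # First pass: the csv module handles quoted delimiters correctly.
        width = max((len(row) for row in csv.reader(f, delimiter=delimiter)),
                    default=0)
    return [f"{prefix}{i}" for i in range(width)]


def read_ragged_csv(path, delimiter=","):
    """Second pass: let pandas pad the short rows with NaN."""
    # Imported here so the name helper stays usable without pandas.
    import pandas as pd

    names = ragged_column_names(path, delimiter=delimiter)
    return pd.read_csv(path, names=names, sep=delimiter, engine="python")
```

Calling read_ragged_csv("ragged.csv") on the file above should give the same five-column frame as before, with generated names in place of A through E. The cost is reading the file twice, which is usually acceptable unless the file is very large.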
