Python Pandas read_csv skip rows but keep header

You can pass a list of row numbers to skiprows instead of an integer.

Passing the integer 10 simply skips the first 10 lines of the file, header line included.

To keep row 0 as the header and skip rows 1 through 9, so that the data starts at row 10, you can write:

pd.read_csv('test.csv', sep='|', skiprows=range(1, 10))
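
To check this behaviour without a file on disk, here is a minimal sketch that uses an in-memory buffer in place of test.csv. The column names `id` and `value` and the 12 data lines are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for test.csv: one header line followed by 12 data lines.
lines = ["id|value"] + [f"{i}|{i * 10}" for i in range(1, 13)]
csv_text = "\n".join(lines)

# Skip file lines 1 through 9; line 0 is kept and becomes the header.
df = pd.read_csv(io.StringIO(csv_text), sep="|", skiprows=range(1, 10))

print(df.columns.tolist())  # ['id', 'value']
print(df["id"].tolist())    # [10, 11, 12]
```

Note that `range(1, 10)` stops at 9, so the line at index 10 is the first data row that is read.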

Other ways to skip rows using `read_csv`

The two main ways to control which rows read_csv uses are the header and skiprows parameters.

Suppose we have the following CSV file with one column:

a
b
c
d
e
f

In each of the examples below, this file is f = io.StringIO("\n".join("abcdef")).
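
Because an `io.StringIO` buffer is consumed as it is read, each call needs a fresh buffer. One way to run the examples back to back is a small helper that rebuilds it. The name `make_file` is a hypothetical helper, not part of pandas.

```python
import io
import pandas as pd

def make_file():
    # Hypothetical helper: returns a fresh in-memory file with one letter per line.
    return io.StringIO("\n".join("abcdef"))

# Each call gets its own buffer, so earlier reads do not affect later ones.
df_all = pd.read_csv(make_file(), header=None)
df_hdr = pd.read_csv(make_file(), header=3)

print(df_all[0].tolist())       # ['a', 'b', 'c', 'd', 'e', 'f']
print(df_hdr.columns.tolist())  # ['d']
```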

Read all lines as values (no header; column names default to integers):

>>> pd.read_csv(f, header=None)
   0
0  a
1  b
2  c
3  d
4  e
5  f

Use a particular row as the header (skip all lines before that):
```
>>> pd.read_csv(f, header=3)
   d
0  e
1  f
```

Use multiple rows as the header, creating a MultiIndex (any line before the last specified header line that is not itself a header row is skipped):

>>> pd.read_csv(f, header=[2, 4])
   c
   e
0  f

Skip N rows from the start of the file (the first row that’s not skipped is the header):

>>> pd.read_csv(f, skiprows=3)
   d
0  e
1  f

Skip one or more rows by giving the row indices (the first row that’s not skipped is the header):

>>> pd.read_csv(f, skiprows=[2, 4])
   a
0  b
1  d
2  f
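
skiprows also accepts a callable, which read_csv evaluates against each row index and which returns True for rows to skip. As a minimal sketch on the same six-line file, the lambda below (chosen purely for illustration) skips every odd-numbered line:

```python
import io
import pandas as pd

f = io.StringIO("\n".join("abcdef"))

# Skip lines 1, 3 and 5; line 0 ('a') is the first line kept, so it becomes the header.
df_even = pd.read_csv(f, skiprows=lambda i: i % 2 == 1)

print(df_even.columns.tolist())  # ['a']
print(df_even["a"].tolist())     # ['c', 'e']
```

A callable is handy when the rows to skip follow a pattern rather than a fixed list of indices.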
