How to upload a CSV file in FastAPI and convert it into a pandas DataFrame?

Below are several options for converting a file uploaded to FastAPI into a pandas DataFrame. If you would also like to convert the DataFrame into JSON and return it to the client, have a look at this answer. If you would like to use an async def endpoint instead of def, please have a look at this answer on how to read the file contents asynchronously, as well as this answer to understand the difference between def and async def. It is also best to enclose the I/O operations in the examples below in a try-except-finally block (as shown here and here), so that you can catch and raise any exceptions and always close the file, which releases the underlying object from memory and avoids potential errors.

Related answers on how to upload and read a CSV file can be found here (gives examples using Jinja2 Templates), as well as here (converts the uploaded CSV file into JSON and returns it to the client) and here (provides solutions without using external libraries).

Option 1

Since pandas.read_csv() accepts a file-like object, you can pass the file-like object of UploadFile directly. UploadFile exposes an actual Python SpooledTemporaryFile, which you can get through the .file attribute. An example is given below. Note: pd.read_csv() is not an async method. Hence, if you are going to use an async def endpoint, it would be better to read the contents of the file using an async method, as described here, and then pass the contents to pd.read_csv() using one of the remaining options below. Alternatively, you can use Starlette's run_in_threadpool() (as described here), which runs pd.read_csv(file.file) in a separate thread so that the main thread (where coroutines are run) does not get blocked.

from fastapi import FastAPI, File, UploadFile
import pandas as pd

app = FastAPI()

@app.post("/upload")
def upload_file(file: UploadFile = File(...)):
    df = pd.read_csv(file.file)
    file.file.close()
    return {"filename": file.filename}

Option 2

Convert the bytes into a string and then load it into an in-memory text buffer (i.e., StringIO), which pandas can read into a DataFrame:

from fastapi import FastAPI, File, UploadFile
import pandas as pd
from io import StringIO

app = FastAPI()

@app.post("/upload")
def upload_file(file: UploadFile = File(...)):
    contents = file.file.read()
    s = contents.decode("utf-8")  # assumes the uploaded CSV is UTF-8 encoded
    data = StringIO(s)
    df = pd.read_csv(data)
    data.close()
    file.file.close()
    return {"filename": file.filename}

Option 3

Use an in-memory bytes buffer instead (i.e., BytesIO), thus saving you the step of converting the bytes into a string as shown in Option 2:

from fastapi import FastAPI, File, UploadFile
import pandas as pd
from io import BytesIO

app = FastAPI()

@app.post("/upload")
def upload_file(file: UploadFile = File(...)):
    contents = file.file.read()
    data = BytesIO(contents)
    df = pd.read_csv(data)
    data.close()
    file.file.close()
    return {"filename": file.filename}
