How to Export a DataFrame to CSV with Pandas in Python

Published Nov 10, 2021

How can we store a DataFrame object into a .csv file in Pandas?

We can achieve this with the to_csv() function, which provides plenty of parameters to suit our use cases.


Suppose we have a DataFrame object df that looks like this.

        Dog  Age
0     corgi    7
1  shih tzu    5

This to_csv() function will produce a CSV file with the following:

1,shih tzu,5

Write without index

By default, the first column of our CSV will contain the index of each row in df.

If we don’t want to include the index in our CSV, we can set the index parameter to be False.

df.to_csv("filename.csv", index=False)

Our CSV file would then look something like this.

shih tzu,5

All subsequent examples will assume index=False.

Use a different separator

The default separator is a comma ,.

We can change this using the sep parameter. For instance, we can delimit by tab or semicolon.

df.to_csv("filename.csv", sep='\t') # Tab
df.to_csv("filename.csv", sep=';')  # Semicolon

Here’s what the semicolon delimiter would look like.

shih tzu;5

Write without header

If we don’t want to include the header in our CSV file, we can set header to False.

df.to_csv("filename.csv", header=False)

Let’s take out that header.

shih tzu,5

Write specific columns

If we want to only write a subset of columns to our CSV, we can specify the columns as a list of strings in the columns parameter.

df.to_csv("filename.csv", columns=['Dog'])

Only dogs, no age.

shih tzu

Change file format encoding

To be safe, we can set the encoding parameter so other applications know how to read our CSV file.

df.to_csv("filename.csv", encoding='utf-8')

Compress a CSV

If we’re writing hundreds of thousands of rows into a CSV file, it might be best to compress the CSV.

The output zip or gzip file will be smaller, but the write and read times will involve compressing and decompressing, making the process take a bit longer.

df.to_csv("filename.gzip", compression='gzip')

Replace NaN with string

We can replace all instances of NaN in df with a string like "N/A" using the na_rep parameter.

df.to_csv("filename.csv", na_rep='N/A')

Format datetime columns

We can specify a format for all datetime columns using the date_format column.

df.to_csv("filename.csv", date_format='%m/%d/%Y')

More Python Articles