Working with CSV Files in Python
CSV (Comma-Separated Values) is one of the most common formats for storing and exchanging tabular data. It is a simple text-based format that both humans and machines can read easily, which has made it a de facto standard for exchanging data between applications such as spreadsheets (Excel, Google Sheets) and databases.
A typical CSV file looks like this:
Name,Age,City
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago
The first line is often a header row that describes the columns. Each subsequent line is a record or row.
The csv Module: Python’s Built-in Tool
While you could try to parse CSV files by manually splitting strings on commas, this approach is brittle and will fail with more complex data (e.g., if a data field itself contains a comma).
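To make the problem concrete, here is a minimal sketch comparing a naive split with the csv module on a single record. The sample line is made up for illustration; its second field contains a comma and is therefore quoted.

```python
import csv
import io

# A made-up record whose second field contains a comma, so it is quoted.
line = 'Alice,"New York, NY",30\n'

# Naive approach: splitting on commas breaks the quoted field apart.
naive_fields = line.strip().split(",")
print(naive_fields)   # ['Alice', '"New York', ' NY"', '30']

# csv.reader understands quoting and keeps the field intact.
parsed_fields = next(csv.reader(io.StringIO(line)))
print(parsed_fields)  # ['Alice', 'New York, NY', '30']
```

The naive split yields four pieces and leaves stray quote characters in the data, while csv.reader returns the three intended fields.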
Python’s built-in csv module is the robust and correct way to handle CSV data. It gracefully manages quoting, delimiters, and different dialects of CSV.
To use the module, you must first import it: import csv.
Reading CSV Files
The csv module provides two main ways to read data: as lists of strings or as dictionaries.
csv.reader: Reading Rows as Lists
The csv.reader object takes a file object and lets you iterate over the rows in the CSV file. Each row is returned as a list of strings.
Create a CSV file named `employees.csv` and then read it using `csv.reader`.
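One possible way to carry out this exercise is sketched below. The sample rows are an assumption, chosen so that the script reproduces the expected output; the variable name output_lines is only there to make the result easy to inspect.

```python
import csv

# Sample data (assumed values, chosen to match the expected output).
rows = [
    ["Name", "Department", "Salary"],
    ["John Doe", "Engineering", "90000"],
    ["Jane Smith", "Marketing", "85000"],
    ["Peter Jones", "Engineering", "92000"],
]

# Create the file first so the example is self-contained.
with open("employees.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Read it back: each row arrives as a list of strings.
output_lines = []
with open("employees.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    header = next(reader)  # consume the header row before looping
    output_lines.append(f"Header: {header}")
    output_lines.append("--- Employees ---")
    for name, department, salary in reader:
        output_lines.append(f"{name} in {department} earns ${salary}")

print("\n".join(output_lines))
```

Note that the salary comes back as the string "90000", not a number; csv.reader does no type conversion by default.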
Expected Output:
Header: ['Name', 'Department', 'Salary']
--- Employees ---
John Doe in Engineering earns $90000
Jane Smith in Marketing earns $85000
Peter Jones in Engineering earns $92000
The newline='' Argument
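When opening a CSV file, it is crucial to specify newline=''. The csv module handles line endings itself, so the file object should pass them through untranslated. If you omit the argument when writing on Windows, each \r\n row ending produced by the module is translated to \r\r\n, which shows up as a blank row after every record. When reading, line breaks inside quoted fields may not be interpreted correctly.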
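The blank-row symptom only appears when writing on Windows, so it is hard to reproduce everywhere. The sketch below instead shows the reading side of the same issue, which behaves the same on any platform. The file name notes.csv and its contents are made up for illustration.

```python
import csv

# Made-up file with Windows line endings and a quoted field
# that itself contains a line break.
with open("notes.csv", "wb") as f:
    f.write(b'id,note\r\n1,"line one\r\nline two"\r\n')

# With newline='', the csv module sees the raw line endings.
with open("notes.csv", newline="", encoding="utf-8") as f:
    rows_raw = list(csv.reader(f))

# Without it, Python translates "\r\n" to "\n" before csv sees the text.
with open("notes.csv", encoding="utf-8") as f:
    rows_translated = list(csv.reader(f))

print(repr(rows_raw[1][1]))         # 'line one\r\nline two'
print(repr(rows_translated[1][1]))  # 'line one\nline two'
```

Only the first variant returns the field exactly as it was stored in the file.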
csv.DictReader: Reading Rows as Dictionaries
A more convenient way to read CSV data is with csv.DictReader. By default it uses the values in the first row as the keys of a dictionary for every following row, so you access fields by column name instead of position. This makes your code more readable and robust against changes in column order.
Read the `employees.csv` file again, this time using `csv.DictReader`.
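A minimal sketch of this exercise follows. It recreates employees.csv first so that it runs on its own; the sample rows are assumed values chosen to match the expected output.

```python
import csv

# Recreate employees.csv so this example does not depend on earlier code.
with open("employees.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([
        ["Name", "Department", "Salary"],
        ["John Doe", "Engineering", "90000"],
        ["Jane Smith", "Marketing", "85000"],
        ["Peter Jones", "Engineering", "92000"],
    ])

detail_lines = ["--- Employee Details ---"]
with open("employees.csv", newline="", encoding="utf-8") as f:
    # DictReader consumes the header row and uses it as the dictionary keys.
    for row in csv.DictReader(f):
        detail_lines.append(
            f"{row['Name']} in {row['Department']} earns ${row['Salary']}"
        )

print("\n".join(detail_lines))
```

Because fields are looked up by name, the loop keeps working if the columns in the file are reordered.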
Expected Output:
--- Employee Details ---
John Doe in Engineering earns $90000
Jane Smith in Marketing earns $85000
Peter Jones in Engineering earns $92000
Writing CSV Files
The csv module also provides corresponding objects for writing data.
csv.writer: Writing from Lists
The csv.writer object allows you to write data row by row.
writerow(list): Writes a single row.
writerows(list_of_lists): Writes multiple rows at once.
Create a new CSV file `products.csv` and write a list of product data to it.
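The original product data is not shown, so the sketch below uses hypothetical column names and rows. It writes the header with writerow and the remaining rows with writerows.

```python
import csv

# Hypothetical product data; the column names and values are assumptions.
header = ["ProductID", "Name", "Price"]
products = [
    [101, "Laptop", 1200],
    [102, "Mouse", 25],
    [103, "Keyboard", 75],
]

with open("products.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(header)     # a single row
    writer.writerows(products)  # several rows at once

print("products.csv has been created.")
```

Non-string values such as the integers here are converted with str() as they are written.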
Expected Output:
products.csv has been created.
csv.DictWriter: Writing from Dictionaries
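For more structured writing, csv.DictWriter is ideal. It requires you to define the fieldnames, which set the column order and the header, and then you can write rows from a list of dictionaries. The header row is not written automatically; call writeheader() to emit it.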
Write a list of dictionaries to a new file `contacts.csv` using `DictWriter`.
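A minimal sketch of this exercise, assuming a hypothetical set of contact fields since the original data is not shown:

```python
import csv

# Hypothetical contact data; the field names and values are assumptions.
fieldnames = ["name", "email", "phone"]
contacts = [
    {"name": "Alice", "email": "alice@example.com", "phone": "555-0100"},
    {"name": "Bob", "email": "bob@example.com", "phone": "555-0101"},
]

with open("contacts.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()        # writes the fieldnames as the first row
    writer.writerows(contacts)  # one row per dictionary

print("contacts.csv has been created.")
```

The columns are written in the order given by fieldnames, regardless of the key order inside each dictionary.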
Expected Output:
contacts.csv has been created.
Handling Different CSV “Dialects”
Not all CSV files are separated by commas. Some use tabs (\t), semicolons (;), or pipes (|). The csv module handles this easily with the delimiter argument.
Create a TSV (Tab-Separated Values) file and read it by specifying the delimiter.
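One way to do this is sketched below. The file name fruits.tsv is an assumption; the rows are chosen to match the expected output.

```python
import csv

fruit_rows = [
    ["Fruit", "Color", "Taste"],
    ["Apple", "Red", "Sweet"],
    ["Lemon", "Yellow", "Sour"],
]

# Write the rows with a tab between fields instead of a comma.
with open("fruits.tsv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f, delimiter="\t").writerows(fruit_rows)

# Read them back, telling the reader which delimiter to expect.
with open("fruits.tsv", newline="", encoding="utf-8") as f:
    tsv_rows = list(csv.reader(f, delimiter="\t"))

for row in tsv_rows:
    print(row)
```

The same delimiter argument works for semicolons or pipes, for example delimiter=";".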
Expected Output:
['Fruit', 'Color', 'Taste']
['Apple', 'Red', 'Sweet']
['Lemon', 'Yellow', 'Sour']