Python offers several ways to read CSV files, depending on how complex the task is and which libraries are available. The built-in csv module is the most common choice; for more advanced operations, the third-party pandas library is also popular.
The csv module is part of Python's standard library and provides functionality for reading and writing CSV files. Let’s start by reading a simple CSV file, python.csv, with the following contents:

    Name,Age,City
    John,25,New York
    Anna,30,Paris
    Mike,22,London

The following script reads the file with csv.reader():

    import csv

    # Open the CSV file in read mode; newline='' lets the csv module handle line endings itself
    with open('python.csv', mode='r', newline='') as file:
        # Create a csv reader object
        csv_reader = csv.reader(file)
        # Skip the header row if necessary
        next(csv_reader)
        # Read the rest of the rows
        for row in csv_reader:
            print(f"Name: {row[0]}, Age: {row[1]}, City: {row[2]}")

• The csv.reader() function returns an iterator that yields each row of the CSV file as a list of strings.
• next(csv_reader) is used to skip the header row (optional, based on the CSV structure).
• We then iterate over the rows and print each row's data.
Output:

    Name: John, Age: 25, City: New York
    Name: Anna, Age: 30, City: Paris
    Name: Mike, Age: 22, City: London

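Because csv.reader() yields every field as a string, numeric columns usually need converting, and files that put a space after each comma need skipinitialspace=True to avoid leading blanks in the values. The following is a minimal sketch of both points, not a definitive implementation; SAMPLE and read_people are illustrative names, and an in-memory io.StringIO stands in for the file so the example runs on its own.

```python
import csv
import io

# Stand-in for a CSV file; note the space after each comma.
SAMPLE = (
    "Name, Age, City\n"
    "John, 25, New York\n"
    "Anna, 30, Paris\n"
    "Mike, 22, London\n"
)


def read_people(text):
    """Return (header, rows), with the Age column converted from str to int."""
    # skipinitialspace=True drops whitespace that directly follows a delimiter
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = next(reader)  # the first row holds the column names
    rows = [(name, int(age), city) for name, age, city in reader]
    return header, rows


header, rows = read_people(SAMPLE)
print(header)
for name, age, city in rows:
    print(f"{name} is {age} and lives in {city}")
```

With a real file, the same reader can be built from the object returned by open() instead of io.StringIO.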
If you prefer to access each row as a dictionary, with the column names as keys, you can use the csv.DictReader class.

    import csv

    # Open the CSV file; newline='' lets the csv module handle line endings itself
    with open('python.csv', mode='r', newline='') as file:
        # Create a DictReader object; the first row supplies the keys
        csv_dict_reader = csv.DictReader(file)
        # Iterate over the rows as dictionaries
        for row in csv_dict_reader:
            print(f"Name: {row['Name']}, Age: {row['Age']}, City: {row['City']}")

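• csv.DictReader() reads each row as a dictionary whose keys are the column names (taken from the header row) and whose values are the row's data. Since Python 3.8 each row is a regular dict; Python 3.6 and 3.7 returned an OrderedDict.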
Output:

    Name: John, Age: 25, City: New York
    Name: Anna, Age: 30, City: Paris
    Name: Mike, Age: 22, City: London

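Dictionary rows make simple calculations easy to read, since each value is looked up by column name rather than by position. Here is a small sketch under the assumption that the data fits in memory; SAMPLE, people, average_age and over_23 are names chosen for illustration, and io.StringIO again stands in for the file.

```python
import csv
import io

# Stand-in for python.csv
SAMPLE = (
    "Name,Age,City\n"
    "John,25,New York\n"
    "Anna,30,Paris\n"
    "Mike,22,London\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))
people = list(reader)           # one dictionary per data row
fieldnames = reader.fieldnames  # column names taken from the header row

# Values are strings, so convert them before doing arithmetic
ages = [int(person["Age"]) for person in people]
average_age = sum(ages) / len(ages)

# Select rows by column name rather than by index
over_23 = [person["Name"] for person in people if int(person["Age"]) > 23]

print(fieldnames)
print(f"Average age: {average_age:.1f}")
print(over_23)
```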
For more advanced data manipulation, you can use the pandas library, which makes working with tabular data simpler and more efficient.
First, install pandas if you don’t have it installed:

    pip install pandas

Then read the file into a DataFrame:

    import pandas as pd

    # Read the CSV file into a pandas DataFrame
    df = pd.read_csv('python.csv')
    # Display the DataFrame
    print(df)

Output:

       Name  Age      City
    0  John   25  New York
    1  Anna   30     Paris
    2  Mike   22    London

• pd.read_csv() reads the CSV file into a DataFrame, a powerful tabular data structure.
• This allows you to perform complex operations on the data such as filtering, grouping, and plotting.
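As a rough illustration of the filtering and grouping mentioned above, the sketch below loads the same sample data from an in-memory string (an assumption made so the example is self-contained) and applies a boolean filter, a column mean, and a groupby. The variable names are illustrative only.

```python
import io

import pandas as pd

# Stand-in for python.csv
SAMPLE = (
    "Name,Age,City\n"
    "John,25,New York\n"
    "Anna,30,Paris\n"
    "Mike,22,London\n"
)

df = pd.read_csv(io.StringIO(SAMPLE))

# Filtering: keep only the rows where Age is greater than 23
older = df[df["Age"] > 23]

# Aggregation: pandas parsed Age as integers, so no manual conversion is needed
mean_age = df["Age"].mean()

# Grouping: average age per city (one person per city in this tiny sample)
by_city = df.groupby("City")["Age"].mean()

print(older)
print(mean_age)
print(by_city)
```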
The following table summarizes the main functions and constants of the csv module:

| Function/Constant | Description |
|---|---|
| csv.reader | Reads the data from a CSV file. It returns an iterator object that can iterate over the lines in a given CSV file. |
| csv.writer | Writes data to a CSV file. It creates a writer object, which writes rows to the specified CSV file. |
| csv.field_size_limit | Returns the current maximum field size allowed by the parser. If a new limit is passed as an argument, it becomes the new maximum. It can be useful when working with very large fields. |
| csv.get_dialect | Returns the dialect associated with a given name. Dialects define various formatting rules for CSV files. |
| csv.list_dialects | Returns the names of all registered dialects. Dialects can be customized to handle different CSV formats. |
| csv.register_dialect | Associates a dialect with a name. The name must be a string; the dialect can be given as a Dialect subclass, as formatting keyword arguments, or both. |
| csv.unregister_dialect | Removes a dialect from the registry by its name. If the name is not found, an error is raised. |
| csv.QUOTE_ALL | Instructs writer objects to quote all fields when writing to a CSV file, regardless of their content. |
| csv.QUOTE_MINIMAL | Instructs writer objects to quote only those fields that contain special characters (like delimiters or quotes). |
| csv.QUOTE_NONNUMERIC | Instructs writer objects to quote all non-numeric fields; numeric fields remain unquoted. Instructs reader objects to convert all unquoted fields to float. |
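| csv.QUOTE_NONE | Instructs writer objects to never quote fields. Special characters in the data are preceded by the escapechar instead; if no escapechar is set, the writer raises an error when it meets a character that needs escaping. |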
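To make the quoting constants and the dialect registry more concrete, here is a short sketch that writes the same row under different settings. The helper write_row, the sample ROW, and the dialect name "pipes" are made up for this illustration; the output is written to an in-memory buffer with lineterminator="\n" so the results are easy to compare.

```python
import csv
import io

# One row containing a string, a number, and a field with an embedded comma
ROW = ["Anna", 30, "Paris, France"]


def write_row(row, **fmtparams):
    """Write a single row to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n", **fmtparams)
    writer.writerow(row)
    return buffer.getvalue()


minimal = write_row(ROW, quoting=csv.QUOTE_MINIMAL)        # Anna,30,"Paris, France"
all_quoted = write_row(ROW, quoting=csv.QUOTE_ALL)         # "Anna","30","Paris, France"
nonnumeric = write_row(ROW, quoting=csv.QUOTE_NONNUMERIC)  # "Anna",30,"Paris, France"

# Register a custom dialect under a hypothetical name, use it, then remove it
csv.register_dialect("pipes", delimiter="|")
registered = "pipes" in csv.list_dialects()
piped = write_row(ROW, dialect="pipes")                    # Anna|30|Paris, France
csv.unregister_dialect("pipes")

for text in (minimal, all_quoted, nonnumeric, piped):
    print(text, end="")
```

With the "pipes" dialect the embedded comma no longer needs quoting, because the comma is not the delimiter there.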