Over a million developers have joined DZone.
Platinum Partner

csvfilter - A Python Command-line Tool for Manipulating CSV Data

· Web Dev Zone

The Web Dev Zone is brought to you in partnership with Mendix.  Discover how IT departments looking for ways to keep up with demand for business apps has caused a new breed of developers to surface - the Rapid Application Developer.


You want a unix-like tool for manipulating CSV data from the command-line.

The standard tools cut and awk aren't always suitable as they don't handle quoting and escaping which are common in CSVs.


Use the CSV manipulation function csvfilter, a simple Python library I've put together.

Install with:

pip install csvfilter

Sample usage:

# Pluck columns 2, 5 and 6
cat in.csv | csvfilter -f 2,5,6 > out.csv

# Pluck all columns except 4
cat in.csv | csvfilter -f 4 -i > out.csv

# Skip header row
cat in.csv | csvfilter -s 1 > out.csv

# Work with pipe-separated data
cat in.csv | csvfilter -s 1,3 --delimiter="|" > out.csv

The above examples show csvfilter processing sys.STDIN but it can also act directly on a file:

csvfilter -f 2,5,6 in.csv  > out.csv


$ csvfilter --help
Usage: csvfilter [options]

-h, --help            show this help message and exit
-f FIELDS, --fields=FIELDS
                        Specify which fields to pluck
-s SKIP, --skip=SKIP  Number of rows to skip
                        Delimiter of incoming CSV data
-i, --inverse         Invert the filter - ie drop the selected fields
                        Delimiter to use for output
                        Quote character to use for output

There is also a simple python API that allows you to add validators to determine which rows are filtered out:

from csvfilter import Processor

def contains_cheese(row):
    return 'cheese' in row

processor = Processor(fields=[1,2,3])
generator = processor.process(sys.stdin)

for cheesy_row in generator:


It's possible to do basic CSV manipulation from the command-line using cut or awk - for example:

cat in.csv | cut -d "," -f 0,1,2 > out.csv

or :

cat in.csv | awk 'BEGIN {FS=","} {print $1,$2,$3}' > out.csv

However neither cut or awk make it easy to handle CSVs with escaped characters - hence the motivation for this tool.

I'm not the first to write such a utility - there are several others out there (although none had quite the API I was looking for):

Source available on Github.


The Web Dev Zone is brought to you in partnership with Mendix.  Learn more about The Essentials of Digital Innovation and how it needs to be at the heart of every organization.


Published at DZone with permission of David Winterbottom , DZone MVB .

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}