csvfilter - A Python Command-line Tool for Manipulating CSV Data
Join the DZone community and get the full member experience.
Join For FreeProblem
You want a unix-like tool for manipulating CSV data from the command-line.
The standard tools cut and awk aren't always suitable as they don't handle quoting and escaping which are common in CSVs.
Solution
Use the CSV manipulation function csvfilter, a simple Python library I've put together.
Install with:
pip install csvfilter
Sample usage:
# Pluck columns 2, 5 and 6 cat in.csv | csvfilter -f 2,5,6 > out.csv # Pluck all columns except 4 cat in.csv | csvfilter -f 4 -i > out.csv # Skip header row cat in.csv | csvfilter -s 1 > out.csv # Work with pipe-separated data cat in.csv | csvfilter -s 1,3 --delimiter="|" > out.csv
The above examples show csvfilter processing sys.STDIN but it can also act directly on a file:
csvfilter -f 2,5,6 in.csv > out.csv
Help:
$ csvfilter --help Usage: csvfilter [options] Options: -h, --help show this help message and exit -f FIELDS, --fields=FIELDS Specify which fields to pluck -s SKIP, --skip=SKIP Number of rows to skip -d DELIMITER, --delimiter=DELIMITER Delimiter of incoming CSV data -i, --inverse Invert the filter - ie drop the selected fields --out-delimiter=OUT_DELIMITER Delimiter to use for output --out-quotechar=OUT_QUOTECHAR Quote character to use for output
There is also a simple python API that allows you to add validators to determine which rows are filtered out:
from csvfilter import Processor def contains_cheese(row): return 'cheese' in row processor = Processor(fields=[1,2,3]) processor.add_validator(contains_cheese) generator = processor.process(sys.stdin) for cheesy_row in generator: do_something(cheesy_row)
Discussion
It's possible to do basic CSV manipulation from the command-line using cut or awk - for example:
cat in.csv | cut -d "," -f 0,1,2 > out.csv
or :
cat in.csv | awk 'BEGIN {FS=","} {print $1,$2,$3}' > out.csv
However neither cut or awk make it easy to handle CSVs with escaped characters - hence the motivation for this tool.
I'm not the first to write such a utility - there are several others out there (although none had quite the API I was looking for):
Published at DZone with permission of David Winterbottom, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments