Using Python to Extract Excel Spreadsheet Into CSV Files

I want to create a CSV file for each Excel sheet so that I can import the data set into Neo4j using the LOAD CSV command.

Mark Needham · Aug. 22, 15 · Tutorial

I’ve been playing around with the Road Safety open data set and the download comes with several CSV files and an Excel spreadsheet containing the legend.

There are 45 sheets in total and each of them looks like this:

[Image: screenshot of one of the legend sheets]

I wanted to create a CSV file for each sheet so that I can import the data set into Neo4j using the LOAD CSV command.

I came across the Python Excel website, which pointed me at the xlrd library since I’m working with a pre-2010 Excel file.

The main documentation is very extensive, but I found the GitHub example much easier to follow.

I ended up with the following script which iterates through all but the first two sheets in the spreadsheet – the first two sheets contain instructions rather than data:

import csv

from xlrd import open_workbook

wb = open_workbook('Road-Accident-Safety-Data-Guide-1979-2004.xls')

# Skip the first two sheets, which contain instructions rather than data.
for i in range(2, wb.nsheets):
    sheet = wb.sheet_by_index(i)
    print(sheet.name, sheet.ncols, sheet.nrows)

    # Drop spaces from the sheet name to build the file name.
    # The data/ directory must already exist.
    with open("data/%s.csv" % sheet.name.replace(" ", ""), "w", newline="") as csv_file:
        writer = csv.writer(csv_file, delimiter=",")

        # The first row of each sheet holds the column names.
        header = [cell.value for cell in sheet.row(0)]
        writer.writerow(header)

        for row_idx in range(1, sheet.nrows):
            # xlrd returns numeric cells as floats, so convert them to ints.
            row = [int(cell.value) if isinstance(cell.value, float) else cell.value
                   for cell in sheet.row(row_idx)]
            writer.writerow(row)

I’ve replaced spaces in the sheet name so that the file name on disk is a bit easier to work with. The numeric values all came back as floats, because Excel stores every number as a floating-point value and xlrd returns it that way; I wanted them as ints, so I had to apply that transformation explicitly.
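The int conversion in the script works for this spreadsheet because every numeric cell holds a whole-number code. As a minimal sketch of a more cautious variant, assuming a sheet might also contain fractional values, the two transformations could be pulled out into helpers. The names to_csv_value and csv_file_name, and the sample rows, are made up for illustration and stand in for what xlrd would return:

```python
import csv
import io


def to_csv_value(value):
    """Return whole-number floats as ints; leave everything else untouched."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


def csv_file_name(sheet_name):
    """Build a file name from a sheet name by dropping the spaces."""
    return sheet_name.replace(" ", "") + ".csv"


# Hypothetical cell values standing in for what xlrd would return.
rows = [
    ["code", "label"],
    [1.0, "Roundabout"],
    [-1.0, "Data missing or out of range"],
    [2.5, "made-up fractional value"],
]

# Write to an in-memory buffer so the sketch runs without touching the disk.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
for row in rows:
    writer.writerow([to_csv_value(value) for value in row])

print(csv_file_name("Road Type"))
print(buffer.getvalue())
```

Only floats that pass is_integer() are converted, so a value such as 2.5 is written unchanged rather than silently truncated to 2.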

Here are a few examples of what the CSV files look like:

$ cat data/1stPointofImpact.csv
code,label
0,Did not impact
1,Front
2,Back
3,Offside
4,Nearside
-1,Data missing or out of range

$ cat data/RoadType.csv
code,label
1,Roundabout
2,One way street
3,Dual carriageway
6,Single carriageway
7,Slip road
9,Unknown
12,One way street/Slip road
-1,Data missing or out of range

$ cat data/Weather.csv
code,label
1,Fine no high winds
2,Raining no high winds
3,Snowing no high winds
4,Fine + high winds
5,Raining + high winds
6,Snowing + high winds
7,Fog or mist
8,Other
9,Unknown
-1,Data missing or out of range
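
Since the goal is to feed these files to LOAD CSV, here is a rough sketch of how a Cypher statement for one of the two-column files might be generated. The helper name load_csv_statement, the choice of node label, and the property names are my assumptions rather than anything from the original import; the file:/// URL assumes the CSV file sits in Neo4j's import directory, and toInteger assumes a reasonably recent Neo4j version (older releases used toInt):

```python
def load_csv_statement(csv_name, label):
    """Build a Cypher LOAD CSV statement for a two-column code/label file.

    The label is wrapped in backticks so that names starting with a digit,
    such as 1stPointofImpact, are still valid in Cypher.
    """
    return (
        "LOAD CSV WITH HEADERS FROM 'file:///%s' AS row\n"
        "MERGE (n:`%s` {code: toInteger(row.code)})\n"
        "SET n.label = row.label" % (csv_name, label)
    )


# Hypothetical usage: one statement per generated CSV file.
statement = load_csv_statement("RoadType.csv", "RoadType")
print(statement)
```

The statement is only built as a string here; running it against a database is left out of the sketch.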

And that’s it. Not too difficult!

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.
