Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

CSV Tooling for Migrating to Couchbase From Relational

DZone's Guide to

CSV Tooling for Migrating to Couchbase From Relational

Learn how CSV can be exported from a relational database and then imported into Couchbase Server with the cbimport utility.

· Database Zone ·
Free Resource

Discover Tarantool's unique features which include powerful stored procedures, SQL support, smart cache, and the speed of 1 million ACID transactions on a single CPU core!

CSV (Comma-seperated values) is a file format that can be exported from a relational database (like Oracle or SQL Server). It can then be imported into Couchbase Server with the cbimport utility.

Note: cbimport comes with Couchbase Enterprise Edition. For Couchbase Community Edition, you can use the more limited cbtransfer tool or go with cbdocloader if JSON is an option.

A straight relational > CSV > Couchbase ETL probably isn't going to be the complete solution for data migration. In a later post, I'll write about data modeling decisions that you'll have to consider. But it's a starting point: consider this data as "staged."

Note: For this post, I'm using SQL Server and a Couchbase Server cluster, both installed locally. The steps will be similar for SQL Server, Oracle, MySQL, PostgreSQL, etc.

Export to CSV

The first thing you need to do is export to CSV. I have a relational database with two tables: Invoices and InvoiceItems.

I'm going to export the data from these two tables into two CSV files. With SQL Server Management Studio, this can be done a number of different ways. You can use sqlcmd or bcp at the command line. Or you can use Powershell's Invoke-Sqlcmd and pipe it through Export-Csv. You can also use the SQL Server Management Studio UI.

Other relational databases will have command line utilities, UI tools, etc to export CSV.

Here is an example of a CSV export from a table called Invoices:

Id,InvoiceNum,InvoiceDate,BillTo,ShipTo
1,ABC123,2018-01-15 00:00:00.000,Lynn Hess,"Herman Trisler, 4189 Oak Drive"
2,XYZ987,2017-06-23 00:00:00.000,Yvonne Pollak,"Clarence Burton, 1470 Cost Avenue"
3,FOO777,2018-01-02 00:00:00.000,Phillip Freeman,"Ronda Snell, 4685 Valley Lane"

Here's an export from a related table called InvoiceItems:

InvoiceId,Product,Quantity,Price
1,Tire,2,20.00
1,Steering Wheel,5,10.00
1,Engine Oil,10,15.00
1,Brake Pad,24,1000.00
2,Mouse pad,1,3.99
2,Mouse,1,14.99
2,Computer monitor,1,199.98
3,Cupcake,12,.99
3,Birthday candles,1,.99
3,Delivery,1,30.00

Load CSV Into Couchbase

Let's import these into a Couchbase bucket. I'll assume you've already created an empty bucket named Staging.

First, let's import invoices.csv.

Loading Invoices

C:\Program Files\Couchbase\Server\bin\cbimport csv -c localhost -u Administrator -p password -b staging -d file://invoices.csv --generate-key invoice::%Id%

Note: With Linux/Mac, instead of C:\Program Files\Couchbase\Server\bin, the path will be different.

Let's break this down:

  • cbimport: This is the command line utility you're using.
  • csv: We're importing from a CSV file. You can also import from JSON files.
  • -c localhost: The location of your Couchbase Server cluster.
  • -u Administrator -p password: Credentials for your cluster. Hopefully, you have more secure credentials than this example!
  • -b staging: The name of the Couchbase bucket you want the data to end up in.
  • -generate-key invoice::%Id% The template that will be used to create unique keys in Couchbase. Each line of the CSV will correspond to a single document. Each document needs a unique key. I decided to use the primary key (integer) with a prefix indicating that it's an invoice document.

The end result of importing a three-line file is three documents:

At this point, the staging bucket only contains invoice documents, so you may want to perform transformations now. I may do this in later modeling examples, but for now, let's move on to the next file.

Loading Invoice Items

C:\Program Files\Couchbase\Server\bin\cbimport csv -c localhost -u Administrator -p password -b staging -d file://invoice_items.csv --generate-key invoiceitem::#UUID#

This is nearly identical to the last import. One difference is that it's a new file (invoice_items.csv). But the most important difference is -generate-key. These records only contain foreign keys, but each document in Couchbase must have a unique key. Ultimately, we may decide to embed these records into their parent Invoice documents. But for now, I decided to use UUID to generate unique keys for the records.

The end result of importing this 10 line file is ten more documents:

What's Next?

Once you have a CSV file, it's very easy to get data into Couchbase. However, this sort of direct translation is often not going to be enough on its own. I've explored some aspects of data modeling in a previous blog post on migrating from SQL Server, but I will revisit this Invoices example in a refresher blog post soon.

In the meantime, be sure to check out How Couchbase Beats Oracle for more information on why companies are replacing Oracle for certain use cases. And also take a look at the Moving From Relational to NoSQL: How to Get Started white paper.

If you have any questions or comments, please feel free to leave them here, contact me on Twitter @mgroves or ask your question in the Couchbase Forums.

Discover Tarantool's unique features such as powerful stored procedures, SQL support, smart cache, and the speed of 1 million ACID transactions on a single CPU.

Topics:
nosql ,couchbase ,relational databases ,csv ,database ,tutorial

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}