Over a million developers have joined DZone.

Seven Databases in Seven Weeks: Hbase, Day 1

· Big Data Zone

Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.

Hbase is a columnar NoSQL database. The first day of Hbase was short and clear. Installing it was easy. No issues whatsoever. The examples simulated some wiki pages with revisions. It was fairly easy.

Installation

I found a really easy tutorial on how to install Hbase on Fedora:
http://tutorialforlinux.com/2014/03/18/how-to-getting-started-with-apache-hbase-on-fedora-19-20-21-3264bit-linux-easy-guide/

Hbase will usually work on several (many) servers. It is recommended to run it with at least 5 machines.

However, it’s possible to run it on a single machine for POC / learning purposes. I am using an old, weak laptop, and Hbase works just fine.

JRuby Script

Part of the learning consists of understanding JRuby, as some scripts and exercises use it.

To load a JRuby script into the Hbase shell, run something like:

/opt/hbase-latest/bin/hbase org.jruby.Main PATH-TO-SCRIPT

The example script: put_multiple_columns initially didn’t work. I think it’s due to different versions.

In the book’s forum I found a similar question and an answer for that problem:
http://forums.pragprog.com/forums/202/topics/11494

I uploaded the working script to GitHub: GitHub-put_multiple_columns.rb

Day 1 Material

Under GitHub, some links, material and homework answers.
https://github.com/eyalgo/seven-dbs-in-seven-weeks/tree/master/hbase/day_1

Day 1 Homework

The exercise is more of a JRuby / Ruby and less of Hbase.

put_many.rb
def put_many( table_name, row, column_values )
  import 'org.apache.hadoop.hbase.client.HTable'
  import 'org.apache.hadoop.hbase.client.Put'
  import 'org.apache.hadoop.hbase.HBaseConfiguration'
 
  def jbytes( *args )
    args.map { |arg| arg.to_s.to_java_bytes }
  end
 
  puts( @hbase )
  conf = HBaseConfiguration.new
  table = HTable.new( conf, table_name )
  p = Put.new( *jbytes( row ) )
   
  column_values.each do |key, value|
    (key_family, key_name) = key.split(':')
    key_name ||= ""
    p.add( *jbytes( key_family, key_name, value ))
  end
   
  table.put( p )
end

Day 2, working with big data looks really interesting…

Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

Topics:

Published at DZone with permission of Eyal Golan, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}