Understanding How CQL3 Maps to Cassandra's Internal Data Structure
Join the DZone community and get the full member experience.
Join For FreeThis is just a quick reference for now. Later I hope to flesh out the details and discuss some of the cooler features present here:
Maps
Defining:
cqlsh:test> CREATE TABLE phonelists (... user text PRIMARY KEY,... phoneNumbers map<text,text>);
Inserting:
cqlsh:test> INSERT INTO phonelists (user, phonenumbers)... VALUES ('john',{'patricia':'555-4326','doug':'555-1579'}); cqlsh:test> INSERT INTO phonelists (user, phonenumbers)... VALUES ('scott',{'bill':'555-7382','patricia':'555-4326','jane':'555-8743'});
Retrieving:
cqlsh:test> SELECT * FROM phonelists; user | phonenumbers -------+------------------------------------------------------ scott |{bill:555-7382, jane:555-8743, patricia:555-4326} john |{doug:555-1579, patricia:555-4326}
Updating:
cqlsh:test> UPDATE phonelists ... SET phonenumbers = phonenumbers +{'daniel':'555-0453'}... WHERE user='john';
The internal representation:
[default@test] list phonelists;-------------------RowKey: scott =>(column=, value=, timestamp=1374684062860000)=>(column=phonenumbers:bill, value='555-7382', timestamp=1374684062860000)=>(column=phonenumbers:jane, value='555-8743', timestamp=1374684062860000)=>(column=phonenumbers:patricia, value='555-4326', timestamp=1374684062860000)-------------------RowKey: john =>(column=, value=, timestamp=1374683971220000)=>(column=phonenumbers:doug, value='555-1579', timestamp=1374683971220000)=>(column=phonenumbers:patricia, value='555-4326', timestamp=1374683971220000)
Note that the above text
components, e.g. ‘doug’ and ’555-1579′, are returned by cassandra-cli in their hex encoding. I’ve taken the liberty to decode them so that the print out is more understandable. I will do this throughout the post.
Lists
Defining:
cqlsh:test> CREATE TABLE friendlists (... user text PRIMARY KEY,... friends list <text>...);
Inserting:
cqlsh:test> INSERT INTO friendlists (user, friends)... VALUES ('john',['doug','patricia','scott']); cqlsh:test> INSERT INTO friendlists (user, friends)... VALUES ('patricia',['john','lucifer']);
Retrieving:
cqlsh:test> SELECT * FROM friendlists; user | friends ----------+------------------------- john |[doug, patricia, scott] patricia |[john, lucifer]
Updating:
cqlsh:test> UPDATE friendlists ... SET friends = friends +['matt','eric']... WHERE user='john'; cqlsh:test> UPDATE friendlists ... SET friends = friends -['lucifer']... WHERE user='patricia';
The internal representation:
[default@test] list friendlists;Usingdefault limit of 100Usingdefault column limit of 100-------------------RowKey: john =>(column=, value=, timestamp=1374687324950000)=>(column=friends:26017c10f48711e2801fdf9895e5d0f8, value='doug', timestamp=1374687206993000)=>(column=friends:26017c11f48711e2801fdf9895e5d0f8, value='patricia', timestamp=1374687206993000)=>(column=friends:26017c12f48711e2801fdf9895e5d0f8, value='scott', timestamp=1374687206993000)=>(column=friends:6c504b60f48711e2801fdf9895e5d0f8, value='matt', timestamp=1374687324950000)=>(column=friends:6c504b61f48711e2801fdf9895e5d0f8, value='eric', timestamp=1374687324950000)-------------------RowKey: patricia =>(column=, value=, timestamp=1374687352290000)=>(column=friends:3b817b80f48711e2801fdf9895e5d0f8, value='john', timestamp=1374687243064000)
Here the internal column name is more complicated because a UUID is appended to the name of the CQL field “friend”. This is used to keep track of the order of items in the list.
To be determined: Does a list item delete take more time than a list insert? I suspect so – I don’t see how Cassandra can delete an element in the list without reading in all the elements in the list and then deleting column that has the value indicated in the delete.
Set
Defining:
cqlsh:test> CREATE TABLE friendsets (... user text PRIMARY KEY,... friends set<text>...);
Inserting:
cqlsh:test> INSERT INTO friendsets (user, friends)... VALUES ('john',{'doug','patricia','scott'}); cqlsh:test> INSERT INTO friendsets (user, friends)... VALUES ('patricia',{'john','lucifer'});
Retrieving:
cqlsh:test> SELECT * FROM friendsets; user | friends ----------+------------------------- john |{doug, patricia, scott} patricia |{john, lucifer}
Updating:
cqlsh:test> UPDATE friendsets ... SET friends = friends +{'matt','eric'}... WHERE user='john'; cqlsh:test> UPDATE friendsets ... SET friends = friends -{'lucifer'}... WHERE user='patricia';
The internal representation:
[default@test] list friendsets;Usingdefault limit of 100Usingdefault column limit of 100-------------------RowKey: john =>(column=, value=, timestamp=1374688135443000)=>(column=friends:'doug', value=, timestamp=1374688108307000)=>(column=friends:'eric', value=, timestamp=1374688135443000)=>(column=friends:'matt', value=, timestamp=1374688135443000)=>(column=friends:'patricia', value=, timestamp=1374688108307000)=>(column=friends:'scott', value=, timestamp=1374688108307000)-------------------RowKey: patricia =>(column=, value=, timestamp=1374688151386000)=>(column=friends:'john', value=, timestamp=1374688116595000)
- See more at: http://www.opensourceconnections.com/2013/07/24/understanding-how-cql3-maps-to-cassandras-internal-data-structure-sets-lists-and-maps/#sthash.6nKcPjFs.dpuf
Published at DZone with permission of John Berryman, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Trending
-
DevOps Pipeline and Its Essential Tools
-
You’ve Got Mail… and It’s a SPAM!
-
The Native Way To Configure Path Aliases in Frontend Projects
-
What Is JHipster?
Comments