Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Associative Data Modeling Demystified: Part II

DZone's Guide to

Associative Data Modeling Demystified: Part II

In this post, we demonstrate how Topic Map data model represents associations. In order to link the two, we continue with another SQL query from our relational database.

Free Resource

Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.

Association in Topic Map Data Model

Introduction

In the previous article of this series, we examined the association construct from the perspective of Entity-Relationship data model. In this post, we demonstrate how Topic Map data model represents associations. In order to link the two, we continue with another SQL query from our relational database.

SELECT suppliers.sid,
       suppliers.sname,
       suppliers.scountry,
       catalog.catcost,
       catalog.catqnt,
       parts.pid,
       parts.pname,
       parts.pcolor
FROM   suppliers
       INNER JOIN (parts
                   INNER JOIN [catalog]
                           ON parts.pid = catalog.catpid)
               ON suppliers.sid = catalog.catsid
WHERE  (( ( parts.pid ) = 998 ))
ORDER  BY catalog.catcost;

This will fetch all the rows of a result set where we are looking for the minimum catalog price of a Red Fire Hydrant Cap and who is the supplier that manufactures this part. The reader will notice that apart from the deficiency of the nested JOINs, (see here), we had to formalize our search in SQL language in order to get back our result. Wouldn’t be nice if we could engage the user in a codeless style of search, independent of the business case. Let us see the difference with the Topic Map data model first.

Associations in Topic Map Data Model

Perhaps there is not a better software tool out there to introduce you to Topic Maps than Wandora information management application, see how.

Tuples to Associations

Our first step is to build a Topic Map data model from the SQL result set above. With Wandora this is easy thanks to its powerful set of extractors. Here we use an Excel adjacency list extractor to convert each spreadsheet row of this Excel file to a Topic MapAssociation.

Tuples of a Relation - Wandora Associations

In the right panel of the screen capture, you may see that we have four associations of typeTuple. They are all sorted by the catcost column. This is the role that cells of this column play in the Tuple association. In our example each Tuple is an instance of the Excel class with a maximum of 8 members and each member plays a role in the association. You may agree that this Topic Map model view of data looks already very familiar to the user that is accustomed to tables.

But behind the scenes, Topic Map associations are notably different from the n-ary tuples of the relational model. In the left panel of our screen capture, you can see all the data that are extracted from the spreadsheet. Notice that no data value is repeated. Each association is constructed from single instance values and this also means that associations are sharing values among them. We can visualize the network of associations by switching to Wandora’s Graph topic panel. From the left panel, we select the minimum price of the part, which is 11.7 and then we expand this node on the Graph topic panel. This way the first association will be drawn that includes as members all the other values that this cell is associated with. One of them is USA and plays the scountry role. We can right-click on the value and expand again the nodes (associated members). Two associations are displayed now on the graph that share four common values between them.

Two Associations in the Graph Topic Panel of Wandora.

Brown labels indicate the type (Tuple) of association and the role (sname) of one of its members

SQL to Topic Map Filtering

Another important observation we should make at this point is that instead of writing any query to fetch those suppliers that are located in USA we simply filtered the table based on this value. We are able to do this because Topic Map data model works with single instance values that are linked bidirectionally. Accordingly, data is always normalized and the main operations of the relational algebra such as set operations, selection ,and projection, can be performed. For instance, filtering associations that have USA member is equivalent to selecting rows from SQL. Moreover, the user can traverse interactively the graph starting from any value without writing a single line of code.

Topic Map Serialization

To understand better the underlying structure of data in the previous example, we have serialized a Topic Map in LTM format. Dropping this LTM file into an empty topic panel, we invoke the import function of Wandora. Then we expand the topic tree and double click on the 998 cell. The following screen capture looks pretty much the same as the one we have generated from extracting the Excel spreadsheet above. The main difference is that now we have two association types, one for Catalogue tuples and another for Part tuples. Part 998participates in five associations (tuples) in total, four of them are from Catalogue table and one from Part table. We have also taken a minimum number of members, i.e. fields (columns), for our associations to keep it simple.


/* 1 Association of catalog part no 998 with "Red" and "Fire Hydrant Cap" */
Prt( prtName08:pname, prtID08:pid, prtColorRed:pcolor )

/* 4 Associations of catalog part no 998 with supplier Ids and catalog prices */
Cat( prtID08:catpid, supPrice18:catcost, supID18:catsid )
Cat( prtID08:catpid, supPrice14:catcost, supID14:catsid )
Cat( prtID08:catpid, supPrice16:catcost, supID16:catsid )
Cat( prtID08:catpid, supPrice12:catcost, supID12:catsid )
Associations of part no. 998

Because of the single instance feature of Topic Maps, If we switch to Wandora’s Graph topic panel we can visualize these associations.

Associations of part no. 998

R3DM Type System in Wandora

We expand our previous example with tuples from three tables and a rich type system also known as Ontology or database schema.

Associations of part no. 998, Supplier Associations and Symmetric Binary Predicates

We start again our exploration from the data item with the value 998. There is one association, labeled Part8, representing a tuple from Part table (998, “Fire Hydrant Cap”, Red, 7.2, lb) and four of them, labeled Cat08, Cat12, Cat13, Cat17, representing tuples from the Catalogue table (see result set above). Further on, USA suppliers are expanded with two more associations, labeled Supplier1 and Supplier2. The two previous graphs are combined into a single one.

We can also observe other associations with only two members, we call these symmetric binary predicates (SBP). SBPs are used to define a prototype for our R3DM type system. For example, the member value 7.2 of associationPart8 is of type Part Weight (Instantiation Predicate) and Part Weight is hyponym ofWeight (HyperHypo Predicate). The reader may compare this graph with this Entity-Relationship diagram and realize that we have both instances (values) and classes, i.e. Entities and Attribues, on the same graphics diagram. Not only that, but most important our type system is generic and we are able to merge and classify data from multiple datasets. Alternatively, Wandora’s Topics tree panel on the left part of the image displays a hierarchical structure for types and instances. Those that are expanded can also be seen on the graph panel.

This is a good place to introduce our R3DM type system. It is presented here for the first time and it will be fully analyzed in a forthcoming post of this series. In the next graph, we have expanded several nodes of our hierarchical type system. For instance, Catalogue,Catalogue Part, Catalog Supplier are represented as groups (Grouping Predicate) of Wandora class which is the Root of the Topic Tree. Each of these groups is also a type (Instantiation Predicate) for a set of instances that represent associations. We have Cat01-16 for Catalogue associations, Part1-9 for Part associations, and Supplier1-4 for Supplier associations.

R3DM Type System in Wandora

We have defined six SBPs in total, these are grouped under the Symmetric Binary Predicates group. Five of them have been configured with special symbols to be used in Wandora’s Topic tree. Each one of them is expanded on Wandora’s graph panel to show the pair of its association members. The Holonomy-Meronomy Predicate has been used to connect association members to association type.

Wandora class has also five group members, we have already seen four of them. The fifth one is a Variable Type group and it has four sub-categories, Categorical Variable Types,Meta Variable Types, Numeric Variable Types and Time Variable Types. Each of these specialized types has instances and each instance can be further specialized. For example, starting from bottom to top :

grisInstanceOfPart UnitisHyponymOfUnitisInstanceOfCategorical Variable TypeisHyponymOfVariable TypeisMemberOfWandora class

In this way, all values are single instances of some type and they are hyperlinked to build associations (hyperedges). Each association is also an instance of some type.

Summary

We conclude our discussion of Association in Topic Map data model. Topic map data model is profound and at the same time powerful and simple to start building higher constructs to model real things. Association in Topic Map is a first class citizen. Here is a list of interesting features we discussed in this post :

  1. We can define single instance Topics (and/or values) and then associate them.
  2. Association is a bidirectional n-ary relation with Type and Roles embedded.
  3. Because of 1. and 2. we can traverse the graph from any point to any other point simply by filtering nodes and edges.
  4. We can display both Topic/Association types and instances on the same graph
  5. Data from multiple data sets cab be merged with R3DM type system that is based on symmetric binary associations.

There are other powerful features of Topic Maps related to associations such as merging and information resources that are attached to topics, i.e. occurrences, we reserve time and space to highlight them in a forthcoming article.

We will continue our demonstration with the Catalogue-Part-Supplier example. This time we will examine how associations can be represented on the Graph Model of OrientDB DBMS.

Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

Topics:
data modeling ,database

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}