Oracle RDBMS + Neo4j: Migrate or Sync a Subset of Data
Learn two approaches of getting your data from Oracle RDBMS to Neo4j — syncing or migrating — and the benefits and hazards of each one.
Join the DZone community and get the full member experience.Join For Free
In this Neo4j and Oracle blog series, we’ll explore how these two database technologies work together in tandem to deliver the best bottom-line results for both enterprise architects and business teams alike. In previous weeks, we defined and introduced both Neo4j and Oracle RDBMS, and we covered three advantages of using Neo4j with Oracle.
This week, we’ll discuss the advantages of migrating or syncing a subset of your data from Oracle RDBMS to Neo4j and two different approaches you can take.
Approach 1: Migrate Relevant Data to Neo4jIf there are questions your current application can’t efficiently answer because of the depth and complexity of data relationships, you might choose to migrate relevant data to Neo4j, where Neo4j is the transactional ACID-compliant data store for that portion of the data.
For example, while the customer and product information can reside in Oracle, the identifiers for customers and products and the way they relate to each other can reside in Neo4j. The calling application can ask Neo4j for relationship information and then use the result set to get details from Oracle.
Approach 2: Sync a Portion of the Data With Neo4jAnother approach is to synchronize a portion of the data from the RDBMS to Neo4j. This was the case for Norwegian telecom Telenor. Telenor’s self-service web application was facing a key challenge: login times were increasing, which would ultimately result in losing customers. Logins needed to calculate resource authorization for each user, which, especially for large customers with many users, takes a long time.
At first, Telenor chose to use a stored procedure to pre-calculate resource authorizations for its largest customers in a nightly batch job and then cache the results. This would speed up login for the largest customers; for smaller customers, resource authorizations would be calculated on the fly.
The approach had some drawbacks: the cached data could be up to 24 hours old, so any changes (such as adding new users or deleting terminated employees) could take as much as a day to kick in, and it did nothing to solve the problem for smaller customers. Further looking ahead, Telenor could see that the pre-calculation approach would not ultimately scale as growth continued. The stored procedure alone was extremely complex, with about 1500 lines of SQL code.
Telenor looked to a NoSQL approach to augment their current RDBMS. They decided to sync their data with Solr/Lucene for search capabilities and their resource authorization data, which is highly connected, with Neo4j. Using Neo4j, Telenor could perform resource authorization in real time rather than pre-calculating it, providing users with millisecond-level login times, with changes reflected in near real-time.
Syncing data with Neo4j is handled through middleware such as Oracle GoldenGate and related adapters, including those from Oracle as well as open source adapters such as the Oracle GoldenGate and Apache Kafka adapter created by Monsanto and available on GitHub.
Migrating data into Neo4j is handled through CSV files. You export tables out of Oracle and then import them into Neo4j as CSV files using either a command-line tool or syntax inside the Cypher graph query language. This can be performed manually or you can use an ETL tool to automate the migration.
ConclusionIn the next few weeks, we’ll look at two other ways to use Neo4j alongside Oracle RDBMS, including a full sync between the two and a full data migration from one to the other.
Published at DZone with permission of Gabe Stanek, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.