Over a million developers have joined DZone.

How to keep your content repository and Solr in synch using Camel

DZone's Guide to

How to keep your content repository and Solr in synch using Camel

· Integration Zone
Free Resource

Migrating from On-Prem to Cloud Middleware? Here’s what Aberdeen Group says leading companies should be considering. Brought to you in partnershp with Liaison Technologies

With recent contributions to Camel, now camel-jcr component has a consumer which allows monitoring a Java Content Repository for changes. If your jcr supports OPTION_OBSERVATION_SUPPORTED then the consumer will register an EventListener and get notified for all kind of events. The chances are that you are not interested in all the events from the whole repository and in this case it is possible to narrow down the notifications to receive by further specifying the path of interest, event types, node uuids, nodeTypes and etc.

How can this consumer be useful? (hhmmm, you tell me) Lets say we have a CMS and we want to keep our external Solr index in synch with the content updates.So whenever a new node is added to the content repository all of its properties get indexed in Solr, and if the node is deleted from the content repository then corresponding document is removed from Solr.

Here is a Camel route that will listen for changes under /path/site folder and all its children. But this route will get notified only for two kind of events: NODE_ADDED and NODE_REMOVED, because the value of eventTypes option is a bit mask of the event types of interest (in this case 3 for masking 1 and 2 respectively).


   .when(script("beanshell", "request.getBody().getType() == 1"))

   .when(script("beanshell", "request.getBody().getType() == 2"))

   .log("Event type not recognized" + body().toString());

Then the route will split each event into a separate message and depending on the event type will send the node creation events to direct:index route and node deletion events to direct:delete route.

Delete route is a simple one: It sets the solr operation to delete_by_id in the message header
and the node identifier into message body which in our case represents also the uniqueKey in the solr schema. Followed by a solr commit.

   .setHeader(SolrConstants.OPERATION, constant(SolrConstants.OPERATION_DELETE_BY_ID))
   .setBody(script("beanshell", "request.getBody().getIdentifier()"))

   .log("Deleting node with id: ${body}")

   .setHeader("SolrOperation", constant("COMMIT"))

Indexing part consist of two routes, where the nodeRetriever route is actually getting the node from content repository using its identifier from the update event:

   .setHeader(JcrConstants.JCR_OPERATION, constant(JcrConstants.JCR_GET_BY_ID))
   .setBody(script("beanshell", "request.getBody().getIdentifier()"))

   .log("Reading node with id: ${body}")

After the node is retrieved from the repository using content enricher EIP, there is also a processor to extract node properties and set them into Camel message properties so that they get indexed as solr document fields.

   .enrich("direct:nodeRetriever", nodeEnricher)

   .log("Indexing node with id: ${body}")
   .setHeader("SolrOperation", constant("INSERT"))

You can find the complete working example on github. In case your CMS is not a JCR, but CMIS compliant, have a look at this cmis component on my github account.

Is iPaaS solving the right problems? Not knowing the fundamental difference between iPaaS and iPaaS+ could cost you down the road. Brought to you in partnership with Liaison Technologies.


Published at DZone with permission of Bilgin Ibryam, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.


Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.


{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}