How to keep your content repository and Solr in sync using Camel
How can this consumer be useful? Let's say we have a CMS and we want to keep an external Solr index in sync with the content updates. Whenever a new node is added to the content repository, all of its properties get indexed in Solr, and whenever a node is deleted from the content repository, the corresponding document is removed from Solr.
Here is a Camel route that listens for changes under the /path/site folder and all of its children. The route is notified only about two kinds of events, NODE_ADDED and NODE_REMOVED, because the eventTypes option is a bit mask of the event types of interest (in this case 3, which combines NODE_ADDED = 1 and NODE_REMOVED = 2).
from("jcr://username:password@repository/path/site?deep=true&eventTypes=3")
    .split(body())
    .choice()
        .when(script("beanshell", "request.getBody().getType() == 1"))
            .to("direct:index")
        .when(script("beanshell", "request.getBody().getType() == 2"))
            .to("direct:delete")
        .otherwise()
            .log("Event type not recognized: ${body}");
The route then splits each event into a separate message and, depending on the event type, sends node creation events to the direct:index route and node deletion events to the direct:delete route.
The delete route is a simple one: it sets the Solr operation to delete_by_id in the message header and puts the node identifier into the message body, which in our case also serves as the uniqueKey in the Solr schema. This is followed by a Solr commit.
from("direct:delete")
    .setHeader(SolrConstants.OPERATION, constant(SolrConstants.OPERATION_DELETE_BY_ID))
    .setBody(script("beanshell", "request.getBody().getIdentifier()"))
    .log("Deleting node with id: ${body}")
    .to(SOLR_URL)
    .setHeader("SolrOperation", constant("COMMIT"))
    .to(SOLR_URL);
The indexing part consists of two routes, where the nodeRetriever route actually fetches the node from the content repository using the identifier carried by the update event:
from("direct:nodeRetriever")
    .setHeader(JcrConstants.JCR_OPERATION, constant(JcrConstants.JCR_GET_BY_ID))
    .setBody(script("beanshell", "request.getBody().getIdentifier()"))
    .log("Reading node with id: ${body}")
    .to("jcr://admin:admin@repository");
After the node is retrieved from the repository using the Content Enricher EIP, a processor extracts the node properties and sets them on the Camel message so that they get indexed as Solr document fields.
from("direct:index")
    .enrich("direct:nodeRetriever", nodeEnricher)
    .process(jcrSolrPropertyMapper)
    .log("Indexing node with id: ${body}")
    .setHeader("SolrOperation", constant("INSERT"))
    .to(SOLR_URL);
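The jcrSolrPropertyMapper processor itself is not shown above; its core job is to turn each JCR property into a message header that the Solr component will pick up as a document field. Below is a minimal sketch of just that mapping step, assuming the camel-solr convention of prefixing field headers with "SolrField." (the method name and the sample property names are illustrative; the real processor would read the properties from the enriched Node in the message body and call exchange.getIn().setHeader(...)):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JcrSolrPropertyMapper {
    // camel-solr maps message headers whose names start with this
    // prefix to fields of the Solr document being inserted.
    static final String FIELD_PREFIX = "SolrField.";

    // Turn JCR property name/value pairs into Solr field headers.
    static Map<String, Object> toSolrHeaders(Map<String, Object> jcrProperties) {
        Map<String, Object> headers = new LinkedHashMap<>();
        for (Map.Entry<String, Object> p : jcrProperties.entrySet()) {
            headers.put(FIELD_PREFIX + p.getKey(), p.getValue());
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("id", "node-42");           // illustrative property values
        props.put("jcr:title", "Home page");
        System.out.println(toSolrHeaders(props));
    }
}
```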
You can find the complete working example on GitHub. In case your CMS is not JCR but CMIS compliant, have a look at the cmis component on my GitHub account.
Published at DZone with permission of Bilgin Ibryam, DZone MVB. See the original article here.