
Using Solr 4.9's New ChildDocTransformerFactory



Lucene and Solr 4.9 were released a couple of weeks ago and introduced a new result document transformer called ChildDocTransformerFactory.

The ChildDocTransformerFactory transformer is useful when we need to get child documents that were indexed as nested documents.

This transformer helps in the many use cases where we want to 'join' parent and child documents into a single response.

For instance, let's say we have the following nested documents:
<doc>
	<field name="id">1</field>
	<field name="name">I am the parent</field>
	<field name="cat">PARENT</field>
	<doc>
		<field name="id">1.1</field>
		<field name="name">I am the 1st child</field>
		<field name="cat">CHILD</field>
	</doc>
	<doc>
		<field name="id">1.2</field>
		<field name="name">I am the 2nd child</field>
		<field name="cat">CHILD</field>
		<doc>
			<field name="id">1.2.1</field>
			<field name="name">I am a grandchildren</field>
			<field name="cat">GRANDCHILD</field>
		</doc>
	</doc>
</doc>
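
As a rough sketch of how such a block could be generated from a script, the snippet below builds the same nested structure with Python's standard library. The helper name make_doc is hypothetical, and wrapping the document in an <add> element assumes the payload is meant for Solr's XML update handler; the article itself only shows the <doc> part.

```python
import xml.etree.ElementTree as ET

def make_doc(fields, children=()):
    """Build a <doc> element with <field> entries and nested child <doc>s.

    make_doc is a hypothetical helper, not part of Solr or SolrJ.
    """
    doc = ET.Element("doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value
    for child in children:
        doc.append(child)
    return doc

# Same documents as in the article: parent -> two children -> one grandchild.
grandchild = make_doc(
    {"id": "1.2.1", "name": "I am a grandchildren", "cat": "GRANDCHILD"})
first_child = make_doc(
    {"id": "1.1", "name": "I am the 1st child", "cat": "CHILD"})
second_child = make_doc(
    {"id": "1.2", "name": "I am the 2nd child", "cat": "CHILD"},
    children=[grandchild])
parent = make_doc(
    {"id": "1", "name": "I am the parent", "cat": "PARENT"},
    children=[first_child, second_child])

# Assumption: the block is sent inside an <add> update message.
add = ET.Element("add")
add.append(parent)
payload = ET.tostring(add, encoding="unicode")
print(payload)
```

The whole block has to be indexed in a single update so that Solr stores the parent and its descendants together.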

We can now use the following block join query to find the parents of all child documents whose name matches I am +child (the + makes the term child mandatory):

q={!parent which="cat:PARENT"}name:(I am +child)

The result of the above query contains only the parent document and looks like this:
{
  "responseHeader": {
    "status": 0,
    "QTime": 1,
    "params": {
      "indent": "true",
      "q": "{!parent which=\"cat:PARENT\"}name:(I am +child)",
      "_": "1403793883163",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {
        "id": "1",
        "name": "I am the parent",
        "cat": [
          "PARENT"
        ],
        "_version_": 1471982731634147300
      }
    ]
  }
}
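
To send a query like this from a script, the special characters in the q parameter ({, !, ", + and spaces) need to be URL-encoded. A minimal sketch follows; the host, port and core name (localhost:8983, collection1) and the function name build_parent_query_url are assumptions for illustration, not something taken from the article.

```python
from urllib.parse import urlencode

# Assumed endpoint: a local Solr instance with a core named "collection1".
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"

def build_parent_query_url(parent_filter, child_query, base_url=SOLR_SELECT_URL):
    """Return a select URL that runs a {!parent} block join query.

    parent_filter identifies ALL parent documents (the 'which' parameter);
    child_query is the query that is run against the child documents.
    """
    params = {
        "q": '{!parent which="%s"}%s' % (parent_filter, child_query),
        "wt": "json",
        "indent": "true",
    }
    # urlencode escapes the literal '+' as %2B and spaces as '+',
    # so the mandatory-term operator survives the trip to Solr.
    return base_url + "?" + urlencode(params)

url = build_parent_query_url("cat:PARENT", "name:(I am +child)")
print(url)
```

Building the URL by hand without encoding is a common source of errors here, because an unescaped + in a query string is decoded as a space.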


Now, let's get the parent document along with its child documents, but without the grandchild.
We do that by adding an fl (field list) parameter that looks like this:
fl=id,name,[child parentFilter=cat:PARENT childFilter=cat:CHILD]

The fl value above requests the regular fields id and name, plus a ChildDocTransformer that makes Solr return each matching parent together with its nested children.
Note the optional parameter childFilter=cat:CHILD, which filters the grandchildren out of the response.
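
The fl value can also be assembled programmatically. The sketch below is only an illustration: child_transformer is a hypothetical helper, and it handles only the two parameters used in this article (parentFilter and childFilter).

```python
from urllib.parse import urlencode

def child_transformer(parent_filter, child_filter=None):
    """Build a [child ...] entry for the fl parameter.

    Hypothetical helper: filter values are inserted as-is, so values
    containing spaces would need extra quoting that is not handled here.
    """
    parts = ["child", "parentFilter=" + parent_filter]
    if child_filter is not None:
        parts.append("childFilter=" + child_filter)
    return "[" + " ".join(parts) + "]"

# Regular fields plus the transformer, joined into one fl value.
fl = ",".join(["id", "name", child_transformer("cat:PARENT", "cat:CHILD")])

params = {
    "q": '{!parent which="cat:PARENT"}name:(I am +child)',
    "fl": fl,
    "wt": "xml",
    "indent": "true",
}
query_string = urlencode(params)
print(fl)
print(query_string)
```

Leaving out the child_filter argument produces a transformer without childFilter, in which case the grandchild is no longer filtered out of the response.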

After adding the fl parameter above, the output (in XML) is:
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">1</int>
  <lst name="params">
    <str name="fl">id,name,[child parentFilter=cat:PARENT childFilter=cat:CHILD]</str>
    <str name="indent">true</str>
    <str name="q">{!parent which="cat:PARENT"}name:(I am +child)</str>
    <str name="_">1403794541806</str>
    <str name="wt">xml</str>
  </lst>
</lst>
<result name="response" numFound="1" start="0">
  <doc>
    <str name="id">1</str>
    <str name="name">I am the parent</str>
    <doc>
      <str name="id">1.1</str>
      <str name="name">I am the 1st child</str>
      <arr name="cat">
        <str>CHILD</str>
      </arr>
    </doc>
    <doc>
      <str name="id">1.2</str>
      <str name="name">I am the 2nd child</str>
      <arr name="cat">
        <str>CHILD</str>
      </arr>
    </doc>
  </doc>
</result>
</response>
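
On the client side, a response shaped like the one above can be walked with any XML parser. The following is a minimal sketch using Python's standard library; the embedded RESPONSE_XML is a trimmed copy of the response above, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown in the article (header and cat omitted).
RESPONSE_XML = """<response>
<result name="response" numFound="1" start="0">
  <doc>
    <str name="id">1</str>
    <str name="name">I am the parent</str>
    <doc>
      <str name="id">1.1</str>
      <str name="name">I am the 1st child</str>
    </doc>
    <doc>
      <str name="id">1.2</str>
      <str name="name">I am the 2nd child</str>
    </doc>
  </doc>
</result>
</response>"""

def field(doc, name):
    """Return the text of the direct <str name="..."> child of a <doc>, or None."""
    for node in doc.findall("str"):
        if node.get("name") == name:
            return node.text
    return None

def parents_with_children(xml_text):
    """Return a list of (parent id, [child ids]) tuples from a Solr XML response."""
    root = ET.fromstring(xml_text)
    results = []
    for parent in root.findall("./result/doc"):
        # findall("doc") only looks at direct children, i.e. the nested docs.
        child_ids = [field(child, "id") for child in parent.findall("doc")]
        results.append((field(parent, "id"), child_ids))
    return results

print(parents_with_children(RESPONSE_XML))  # [('1', ['1.1', '1.2'])]
```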



Published at DZone with permission of Tomer Levi. See the original article here.

Opinions expressed by DZone contributors are their own.
