
Solr v2 API – Quick Look


Solr 6.5 introduces a new API that moves away from Solr's long-used original API. In this post we look at how it differs, what it means for you, and how to use it.


We are all used to the Solr API that has been present in Solr from its beginnings. We send data over HTTP, we include all parameters in the URL itself, and we are bound to that. Some people loved this, some not so much. Starting with Solr 6.5, we have a new, self-documenting API called v2. Let’s look at this new API, how to use it, and how it differs from the old-fashioned Solr API.

Introducing the New Solr API

Let’s start working with the new API right away; it’s probably the best way to learn about it. Here’s the most basic request we can execute against the new Solr API:

$ curl http://localhost:8983/v2

The first thing you’ll notice is that the new API is not available under the usual Solr context – there is no /solr in the URL. Instead, we talk to it using the /v2 URI path. This lets Solr have two separate sets of APIs in the same instance of Solr and have a space for new APIs introduced in the future. The response of the above call looks as follows:

{"responseHeader":{"status":0,"QTime":0},"collections":["gettingstarted"]} 

As we can see, the new API returns the same old standard response header and the list of collections that are present in the cluster. The call to the old API to get this same info looks like this:

$ curl 'http://localhost:8983/solr/admin/collections?action=LIST'

This time, the response is returned as XML, but the information is the same:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst><arr name="collections"><str>gettingstarted</str></arr>
</response>

Of course, in both cases we can pretty-print the results by adding indent=true to the request, like this:

$ curl 'http://localhost:8983/v2?indent=true'
{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "collections":["gettingstarted"]}

We can also change the response type when using the old API, so that the returned response is very similar:

$ curl 'http://localhost:8983/solr/admin/collections?action=LIST&wt=json&indent=true'
 {
   "responseHeader":{
     "status":0,
     "QTime":0},
   "collections":["gettingstarted"]}

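The comparison above can also be sketched in Python. This is a minimal sketch, not part of Solr or SolrJ: the helper names (v2_url, v1_list_collections_url, collections_from_response, fetch_json) are made up for illustration, and the base address assumes Solr listens on localhost:8983 as in the curl commands. Nothing is sent over the network unless fetch_json is called against a running instance.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost:8983"  # assumption: same host and port as the curl examples


def v2_url(path="", **params):
    """Build a v2 API URL; note there is no /solr context in the path."""
    url = BASE + "/v2" + path
    return url + "?" + urlencode(params) if params else url


def v1_list_collections_url(**params):
    """Build the equivalent old-style Collections API call."""
    query = {"action": "LIST", **params}
    return BASE + "/solr/admin/collections?" + urlencode(query)


def collections_from_response(body):
    """Extract collection names from the JSON body returned by either API."""
    return json.loads(body)["collections"]


def fetch_json(url):
    """Send a GET request and decode the JSON body (needs a running Solr)."""
    with urlopen(url) as resp:
        return json.load(resp)


# The JSON response shown earlier, used here instead of a live call.
sample = '{"responseHeader":{"status":0,"QTime":0},"collections":["gettingstarted"]}'
print(v2_url(indent="true"))
print(v1_list_collections_url(wt="json", indent="true"))
print(collections_from_response(sample))
```

With a live instance, fetch_json(v2_url()) and fetch_json(v1_list_collections_url(wt="json")) should both yield a dictionary with the same collections entry.
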
So, Why is That Different?

First things first – the new API is self-documenting. That means we can ask the API itself which operations and options it offers. By appending _introspect to any v2 API path, we get the list of operations available at that endpoint. For example:

$ curl 'http://localhost:8983/v2/collections/_introspect?indent=true'

Better still, we can use c instead of collections to shorten the call:

$ curl 'http://localhost:8983/v2/c/_introspect?indent=true'

The response returned by Solr is rather large, so we’ll just show a portion of that, but you can see that the API contains not only the response with the data we are looking for, but also some additional descriptions which make the API self-documenting:

{
  "responseHeader":{
    "status":0,
    "QTime":2},
  "spec":[{
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api6",
      "description":"Deletes a collection.",
      "methods":["DELETE"],
      "url":{"paths":["/collections/{collection}",
          "/c/{collection}"]}},
    {
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1",
      "description":"Create collections and collection aliases, backup or restore collections, and delete collections and aliases.",
      "methods":["POST"],
      "url":{"paths":["/collections",
          "/c"]},
.
.
.
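
Because the introspection output is plain JSON, it is easy to condense. The following is a minimal sketch that flattens the two spec entries from the excerpt above into one row per HTTP method and path. The summarize helper is hypothetical, and the dictionary is an abbreviated hand-made copy of the excerpt (the documentation links are left out), not a live response.

```python
# Abbreviated copy of the _introspect excerpt shown above (two spec entries).
response = {
    "spec": [
        {
            "description": "Deletes a collection.",
            "methods": ["DELETE"],
            "url": {"paths": ["/collections/{collection}", "/c/{collection}"]},
        },
        {
            "description": "Create collections and collection aliases, backup or "
                           "restore collections, and delete collections and aliases.",
            "methods": ["POST"],
            "url": {"paths": ["/collections", "/c"]},
        },
    ]
}


def summarize(spec_response):
    """Return (method, path, description) tuples, one per method and path."""
    rows = []
    for entry in spec_response.get("spec", []):
        for method in entry.get("methods", []):
            for path in entry["url"]["paths"]:
                rows.append((method, path, entry.get("description", "")))
    return rows


for method, path, description in summarize(response):
    print(f"{method:6} {path:28} {description}")
```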

As you can already tell, the new v2 API is more modern and most of the parameters are sent in the request body, instead of the URI. Once the new v2 API covers all the functionality of the old API, SolrJ and Solr admin will start using the new API and after that, it is expected that the old API will be deprecated and then removed. Because of that, it might be a good idea to start getting used to the new API right away, so that you have a gentler learning curve and faster adoption when you finally move to the new way of talking to Solr.

V2 Solr API Capabilities

The response returned by the commands that we’ve seen above is large, so I encourage you to check the response yourself. What I would like to do is provide you with a brief description of what can be done using the v2 API:

  • Creating, deleting and managing collections.
  • Creating aliases, backing up, and restoring collections.
  • Sending data.
  • Updating collection configuration.
  • Managing schema and managed resources.
  • Using request handlers – for example, running search requests.
  • Adding and removing replicas.
  • Managing cores.
  • Performing overseer operations.
  • Managing node roles.
  • Setting cluster properties.
  • Uploading and downloading blobs and metadata.

As you can see, we can already do lots of things with the new API, and because the API is self-documenting we can quickly, without searching for the documentation, see how to work with it. For example, if we wanted to see what we can do with shards, we could run a command like this (we’ll use one of the out-of-the-box collections that come with Solr called gettingstarted):

$ curl 'localhost:8983/v2/c/gettingstarted/shards/_introspect?indent=true'

The response shows us what we can do with the /shards API:

{
  "spec":[{
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api7",
      "description":"Deletes a shard by unloading all replicas of the shard, removing it from clusterstate.json, and by default deleting the instanceDir and dataDir. Only inactive shards or those which have no range for custom sharding will be deleted.",
      "methods":["DELETE"],
      "url":{
        "paths":["/collections/{collection}/shards/{shard}",
          "/c/{collection}/shards/{shard}"],
        "params":{
          "deleteInstanceDir":{
            "type":"boolean",
            "description":"By default Solr will delete the entire instanceDir of each replica that is deleted. Set this to false to prevent the instance directory from being deleted."},
          "deleteDataDir":{
            "type":"boolean",
            "description":"y default Solr will delete the dataDir of each replica that is deleted. Set this to false to prevent the data directory from being deleted."},
          "async":{
            "type":"string",
            "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined. This command can be long-running, so running it asynchronously is recommended."}}}},
    {
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API",
      "description":"Allows you to create a shard, split an existing shard or add a new replica.",
      "methods":["POST"],
      "url":{"paths":["/collections/{collection}/shards",
          "/c/{collection}/shards"]},
      "commands":{
        "split":{
          "type":"object",
          "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3",
          "description":"Splits an existing shard into two or more new shards. During this action, the existing shard will continue to contain the original data, but new data will be routed to the new shards once the split is complete. New shards will have as many replicas as the existing shards. A soft commit will be done automatically. An explicit commit request is not required because the index is automatically saved to disk during the split operation. New shards will use the original shard name as the basis for their names, adding an underscore and a number to differentiate the new shard. For example, 'shard1' would become 'shard1_0' and 'shard1_1'. Note that this operation can take a long time to complete.",
          "properties":{
            "shard":{
              "type":"string",
              "description":"The name of the shard to be split."},
            "ranges":{
              "description":"A comma-separated list of hexadecimal hash ranges that will be used to split the shard into new shards containing each defined range, e.g. ranges=0-1f4,1f5-3e8,3e9-5dc. This is the only option that allows splitting a single shard into more than 2 additional shards. If neither this parameter nor splitKey are defined, the shard will be split into two equal new shards.",
              "type":"string"},
            "splitKey":{
              "description":"A route key to use for splitting the index. If this is defined, the shard parameter is not required because the route key will identify the correct shard. A route key that spans more than a single shard is not supported. If neither this parameter nor ranges are defined, the shard will be split into two equal new shards.",
              "type":"string"},
            "coreProperties":{
              "type":"object",
              "documentation":"https://cwiki.apache.org/confluence/display/solr/Defining+core.properties",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set, the node name, the data directory, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined. This command can be long-running, so running it asynchronously is recommended."}}},
        "create":{
          "type":"object",
          "properties":{
            "nodeSet":{
              "description":"Defines nodes to spread the new collection across. If not provided, the collection will be spread across all live Solr nodes. The names to use are the 'node_name', which can be found by a request to the cluster/nodes endpoint.",
              "type":"array",
              "items":{"type":"string"}},
            "shard":{
              "description":"The name of the shard to be created.",
              "type":"string"},
            "coreProperties":{
              "type":"object",
              "documentation":"https://cwiki.apache.org/confluence/display/solr/Defining+core.properties",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set, the node name, the data directory, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined."}},
          "required":["shard"]},
        "add-replica":{
          "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api_addreplica",
          "description":"",
          "type":"object",
          "properties":{
            "shard":{
              "type":"string",
              "description":"The name of the shard in which this replica should be created. If this parameter is not specified, then '_route_' must be defined."},
            "_route_":{
              "type":"string",
              "description":"If the exact shard name is not known, users may pass the _route_ value and the system would identify the name of the shard. Ignored if the shard param is also specified. If the 'shard' parameter is also defined, this parameter will be ignored."},
            "node":{
              "type":"string",
              "description":"The name of the node where the replica should be created."},
            "instanceDir":{
              "type":"string",
              "description":"An optional custom instanceDir for this replica."},
            "dataDir":{
              "type":"string",
              "description":"An optional custom directory used to store index data for this replica."},
            "coreProperties":{
              "type":"object",
              "documentation":"https://cwiki.apache.org/confluence/display/solr/Defining+core.properties",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set and the node name, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined."}},
          "required":["shard"]}}},
    {
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1",
      "description":"Lists all collections, with details on shards and replicas in each collection.",
      "methods":["GET"],
      "url":{"paths":["/collections/{collection}",
          "/c/{collection}",
          "/collections/{collection}/shards",
          "/c/{collection}/shards",
          "/collections/{collection}/shards/{shard}",
          "/c/{collection}/shards/{shard}",
          "/collections/{collection}/shards/{shard}/{replica}",
          "/c/{collection}/shards/{shard}/{replica}"]}}],
  "WARNING":"This response format is experimental.  It is likely to change in the future.",
  "WARNING":"This response format is experimental.  It is likely to change in the future.",
  "WARNING":"This response format is experimental.  It is likely to change in the future.",
  "availableSubPaths":{
    "/c/gettingstarted/shards/{shard}/{replica}":["DELETE",
      "GET"],
    "/c/gettingstarted/shards/{shard}":["DELETE",
      "POST",
      "GET"]}}

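To see at a glance which commands the POST entry accepts and which properties each one requires, the spec can be walked programmatically. This is a minimal sketch: describe_commands is a hypothetical helper, and post_entry is a heavily abbreviated hand-made copy of the response above that keeps only the command names, property names, and "required" lists.

```python
# Abbreviated copy of the POST entry from the response above; descriptions,
# types, and documentation links are dropped to keep the sketch short.
post_entry = {
    "methods": ["POST"],
    "commands": {
        "split": {
            "properties": {"shard": {}, "ranges": {}, "splitKey": {},
                           "coreProperties": {}, "async": {}},
        },
        "create": {
            "properties": {"nodeSet": {}, "shard": {},
                           "coreProperties": {}, "async": {}},
            "required": ["shard"],
        },
        "add-replica": {
            "properties": {"shard": {}, "_route_": {}, "node": {},
                           "instanceDir": {}, "dataDir": {},
                           "coreProperties": {}, "async": {}},
            "required": ["shard"],
        },
    },
}


def describe_commands(entry):
    """Map each command name to its (required, optional) property names."""
    result = {}
    for name, command in entry.get("commands", {}).items():
        required = list(command.get("required", []))
        optional = [p for p in command.get("properties", {}) if p not in required]
        result[name] = (required, optional)
    return result


for name, (required, optional) in describe_commands(post_entry).items():
    print(f"{name}: required={required} optional={optional}")
```
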
As you can see, the API provides us with all the information about itself that we need: the HTTP verbs that we can use, the parameters that can be present, and their descriptions, so that we know what each parameter is for. We can also narrow the output to a given command and/or HTTP verb, for example:

$ curl 'http://localhost:8983/v2/c/gettingstarted/shards/shard2/_introspect?method=DELETE&indent=true'
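
Introspection URLs like the one above follow a simple pattern, which can be sketched as a small helper. The introspect_url function is hypothetical and only builds the URL string; it relies on the method and indent parameters used in the curl command and assumes the same host and port.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/v2"  # assumption: host and port from the curl examples


def introspect_url(path, method=None, indent=True):
    """Append /_introspect to a v2 path, optionally filtering by HTTP verb."""
    params = {}
    if method:
        params["method"] = method.upper()
    if indent:
        params["indent"] = "true"
    url = BASE + path.rstrip("/") + "/_introspect"
    return url + "?" + urlencode(params) if params else url


print(introspect_url("/c/gettingstarted/shards/shard2", method="delete"))
```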

Judging from the response further above, we could, for example, delete a replica by running the following command:

$ curl -XDELETE 'localhost:8983/v2/c/gettingstarted/shards/shard2/core_node3'

The response to the last command would look as follows:

{"responseHeader":{"status":0,"QTime":278},"success":{"192.168.1.15:7574_solr":{"responseHeader":{"status":0,"QTime":69}}}}

This means that the replica of shard2 has been removed, which can also be checked via the Solr admin panel:

[Image: Solr V2 - Solr admin panel]

To add a replica back, we can send a POST request with the add-replica command in the request body:

$ curl -XPOST 'localhost:8983/v2/c/gettingstarted/shards/' -H 'Content-type:application/json' -d '{
  "add-replica" : {
    "shard" : "shard2",
    "node" : "192.168.1.15:7574_solr"
  }
}'

We added a header identifying the content type of the body, and we provided the add-replica command along with two parameters: shard and node. The shard parameter specifies the shard in which the replica should be created, and the node parameter tells Solr on which Solr instance the replica should be created. Please note that the node name is not just the IP address; it also includes the port and the usual _solr suffix.
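
The same request can be assembled in Python with the standard library. This is a minimal sketch: add_replica_request is a hypothetical helper, the base address assumes the host and port from the curl command, and the request is only built, not sent.

```python
import json
from urllib.request import Request


def add_replica_request(collection, shard, node, base="http://localhost:8983/v2"):
    """Build (but do not send) a POST carrying the add-replica command."""
    body = {"add-replica": {"shard": shard, "node": node}}
    return Request(
        f"{base}/c/{collection}/shards",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = add_replica_request("gettingstarted", "shard2", "192.168.1.15:7574_solr")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually send it, pass req to urllib.request.urlopen while a Solr instance is running.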

The response would look as follows:

{"responseHeader":{"status":0,"QTime":1329},"success":{"192.168.1.15:7574_solr":{"responseHeader":{"status":0,"QTime":1318},"core":"gettingstarted_shard2_replica2"}}}

And would result in a new replica being added:

[Image: Solr V2]

The API we just introduced is still a work in progress. A few things are still missing, and because the v2 API is fairly new, we can expect lots of changes in the next few Solr versions.


Topics:
solr, api, integration, solr v2

Published at DZone with permission of Rafal Kuc. See the original article here.
