Getting Started With Apache Solr

Apache Solr is an open-source search server. Learn how to install and start Apache Solr. Also learn some useful commands for starting Apache Solr.

Apache Solr is an open-source search server built on Apache Lucene, a full-text search engine. In essence, Solr is an HTTP wrapper around the inverted index that Lucene provides. An inverted index maps each term to the documents that contain it, which makes full-text search fast at the cost of extra processing whenever a document is added to the database. The inverted file may be the database file itself, rather than its index. It is the most popular data structure in large-scale document retrieval systems, such as search engines.
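
To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is only an illustration of the data structure, not how Lucene implements it: the function names, the whitespace tokenizer, and the sample documents are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Solr is a search server",
    2: "Lucene provides the inverted index",
    3: "Solr wraps the Lucene index",
}
index = build_inverted_index(docs)
print(sorted(search(index, "solr")))          # [1, 3]
print(sorted(search(index, "lucene index")))  # [2, 3]
```

Building the index touches every term of every document, which is the extra work paid at insert time. A lookup, by contrast, only reads the sets for the query terms.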

Now that you have an idea of what Apache Solr does, let's download it and start working with it. You can download the latest version from the Apache Solr downloads page.

It’s easy to install and start Apache Solr. Just follow these steps:

  1. Download Apache Solr.
  2. Extract to the desired location.
  3. Change into the extracted Apache Solr directory.
  4. Type ./bin/solr start -e cloud -noprompt. 
  5. To stop Apache Solr, type ./bin/solr stop -all.

Once Apache Solr has started, you can go to http://localhost:8983/solr/ to see the Solr Admin panel. To change the port from 8983 to something else, use the -p option (e.g., ./bin/solr start -p 4444). When you start Apache Solr for the first time, there is no data to query. Feed it some data with ./bin/post -c gettingstarted example/exampledocs/*.xml. These are example XML documents that ship with Solr and get ingested into the gettingstarted collection.

Now, let’s see some options for starting Apache Solr:

  • -a to add JVM options:
bin/solr start -a "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1044"
  • -c to start Solr in SolrCloud mode, which will also launch the embedded ZooKeeper instance included with Solr.
  • -d to define server directory.
  • -e to run one of the example configurations: cloud, techproducts, dih, or schemaless.
  • -f to run in the foreground.
  • -noprompt to start Solr and to suppress any prompts that may be seen with another option. This has a side effect of accepting all defaults implicitly.

These are the most useful options for starting Apache Solr; you can find more options here.
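
If you start Solr from a script, the options above can be assembled programmatically. The following is a rough sketch, assuming Python 3.8+ and a hypothetical helper named solr_start_command; it only builds the argument list from the flags described in this article and does not launch anything by itself.

```python
import shlex


def solr_start_command(port=None, cloud=False, example=None,
                       foreground=False, noprompt=False, jvm_opts=None):
    """Build the argument list for `bin/solr start` (hypothetical helper)."""
    cmd = ["bin/solr", "start"]
    if cloud:
        cmd.append("-c")             # SolrCloud mode
    if example:
        cmd += ["-e", example]       # e.g. cloud, techproducts, dih, schemaless
    if port is not None:
        cmd += ["-p", str(port)]     # listen on a non-default port
    if foreground:
        cmd.append("-f")             # run in the foreground
    if noprompt:
        cmd.append("-noprompt")      # accept all defaults
    if jvm_opts:
        cmd += ["-a", jvm_opts]      # extra JVM options as one string
    return cmd


cmd = solr_start_command(example="cloud", noprompt=True)
print(shlex.join(cmd))  # bin/solr start -e cloud -noprompt
```

To actually run it, you could pass the list to subprocess.run(cmd, cwd=solr_dir, check=True), where solr_dir is the directory you extracted Solr into.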

After you have installed and started Apache Solr, you can add some data. To insert data, use bin/post -c collection_name path_to_data.

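The default collection is gettingstarted. You can also create your own collection by running bin/solr start -e cloud without -noprompt, which prompts you for a collection name, or by running bin/solr create -c collection_name against a running Solr instance.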
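
Besides bin/post, documents can also be sent to Solr over HTTP through a collection's update handler. The sketch below shows one way to prepare such a request in Python. The helper names and the sample fields (id, title) are assumptions for illustration, and whether those fields are accepted depends on your collection's schema.

```python
import json
import urllib.request


def build_update_request(docs, collection="gettingstarted",
                         base_url="http://localhost:8983/solr"):
    """Prepare a POST of JSON documents to a collection's /update handler."""
    url = f"{base_url}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running Solr instance and return the parsed reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical documents; the field names are assumptions.
sample_docs = [
    {"id": "doc-1", "title": "Getting started with Solr"},
    {"id": "doc-2", "title": "Inverted indexes explained"},
]
request = build_update_request(sample_docs)
print(request.get_method(), request.full_url)
```

Calling send(request) performs the actual upload, so it needs Solr to be running on the default port with the gettingstarted collection in place.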

Now that you have Solr ready and the data has been inserted, you can play around with querying data from the UI at localhost:8983/solr. Then, select the collection from the collection list and click on the query section. You can find more details about querying here.
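
The query screen in the Admin UI ultimately issues an HTTP request to the collection's select handler, so you can run the same kind of query from code. Here is a small sketch that builds such a URL. The helper name build_select_url is hypothetical, and the host, port, and collection are the defaults used in this article.

```python
from urllib.parse import urlencode


def build_select_url(query="*:*", collection="gettingstarted", rows=10,
                     base_url="http://localhost:8983/solr"):
    """Build a URL for a collection's /select handler (hypothetical helper)."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}/{collection}/select?{params}"


# Match all documents; Solr uses *:* as the match-everything query.
url = build_select_url("*:*")
print(url)
```

With Solr running, opening this URL in a browser or fetching it with urllib.request.urlopen(url) should return a JSON response containing the matching documents.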

Published at DZone with permission of Akash Sethi, DZone MVB. See the original article here.
