LinkedIn Open Sources IndexTank - A Search System Kinda Like Solr
LinkedIn had promised to open source the technology after acquiring it from another company. The release is described as having these components:
- IndexEngine: a real-time fulltext search-and-indexing system designed to separate relevance signals from document text. This is because the life cycle of these signals is different from the text itself, especially in the context of user-generated social inputs (shares, likes, +1, RTs).
- API: a RESTful interface that handles authentication, validation, and communication with the IndexEngine(s). It allows users of IndexTank to access the service from different technology platforms (Java, Python, .NET, Ruby and PHP clients are already developed) via HTTP.
- Nebulizer: a multitenant framework to host and manage an unlimited number of indexes running over a layer of Infrastructure-as-a-Service. This component of IndexTank will instantiate new virtual instances as needed, move indexes as they need more resources, and try to be reasonably efficient about it.
--Diego Basch, LinkedIn Director of Engineering
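To make the "separate relevance signals from document text" idea concrete, here is a minimal toy sketch in Python. It is not IndexTank's code or API: the `ToyIndex` class, its method names, and the `likes` signal are all hypothetical, chosen only to show why keeping fast-changing signals apart from the text index makes updates cheap.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical in-memory index; NOT IndexTank's actual implementation."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> doc ids, built from the text
        self.signals = {}                 # doc id -> {signal name: value}
        self.text_index_ops = 0           # how many times text was tokenized

    def add_document(self, doc_id, text, signals=None):
        # Tokenizing text and updating posting lists is the costly step.
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        self.signals[doc_id] = dict(signals or {})
        self.text_index_ops += 1

    def update_signals(self, doc_id, **changes):
        # Signals (likes, shares, ...) change far more often than the text,
        # so they are updated in place without touching the posting lists.
        self.signals[doc_id].update(changes)

    def search(self, term, rank_by):
        # Match on text, then order matches by the chosen signal.
        matches = self.postings.get(term.lower(), set())
        return sorted(
            matches,
            key=lambda doc_id: self.signals[doc_id].get(rank_by, 0),
            reverse=True,
        )


index = ToyIndex()
index.add_document("a", "open source search engine", {"likes": 3})
index.add_document("b", "real-time search service", {"likes": 1})
index.update_signals("b", likes=10)  # no re-indexing of the text
results = index.search("search", rank_by="likes")
```

In this sketch, bumping the `likes` count on document `b` changes its ranking while the text index stays untouched, which is the life-cycle distinction the quote describes.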
And how, you ask, is IndexTank different from Lucene and Solr? An old FAQ answers as follows, though I'm pretty sure more recent versions of Solr have these features too (Solr definitely has geospatial search):
There are many functional differences. For example, documents added to an IndexTank index are immediately searchable; document variables can be updated without having to re-index the whole document and they can be updated very quickly and at a very rapid pace without affecting the index performance; results can be sorted by arbitrary functions that include geolocation support. --IndexTank FAQ
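As a rough illustration of "sorted by arbitrary functions that include geolocation support", the Python sketch below ranks documents with a custom scoring function that mixes a document variable with distance from the user. Everything here is an assumption made for illustration: the `score` formula, the `votes` variable, and the sample coordinates are hypothetical, and this is not IndexTank's scoring-function syntax.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def score(doc, user_lat, user_lon):
    # Hypothetical formula: more votes raise the score, distance lowers it.
    distance = haversine_km(doc["lat"], doc["lon"], user_lat, user_lon)
    return math.log1p(doc["votes"]) - distance / 100.0


# Sample documents with a per-document variable (votes) and a location.
docs = [
    {"id": "sf", "votes": 10, "lat": 37.7749, "lon": -122.4194},
    {"id": "nyc", "votes": 500, "lat": 40.7128, "lon": -74.0060},
]

user = (37.8044, -122.2712)  # a user located in Oakland, California
ranked = sorted(docs, key=lambda d: score(d, *user), reverse=True)
ranked_ids = [d["id"] for d in ranked]
```

With this particular formula the nearby document outranks the far more popular but distant one; a different weighting would of course give a different order, which is the point of letting the function be arbitrary.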
So go try it out if you're interested. IndexTank was previously provided as a paid SaaS, but those services (which were actually used by Reddit) were shut down late in October because of the impending open sourcing.