Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Apr. 25 to May 01). Here they are, in order of popularity:
The overwhelming quantity of data on the horizon is a big issue on its own, but don't forget about new data types, security, and emergent processing technology.
The Apache Lucene and Solr PMC announced version 4.8 of the Apache Lucene library and the Apache Solr search server. It is the next release in the 4.x line of both projects, and also the first version of Lucene that requires JDK 7.
After spending a day with yet another Heisenbug, which seemed to change its shape whenever I got close to the cause, I thought the lessons I learned from the case might be worth sharing.
Hadoop is a fairly tough network problem to solve if you want to do anything more than “throw bandwidth at the problem”. And when you do throw bandwidth at the problem, the extreme burstiness of the traffic will still significantly drag down the performance of the overall solution.
We all know how good it is to have abstraction layers in the software we create. Why not do the same with search queries? Can we even do that in Elasticsearch and Solr?