Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (October 17 - October 24). Here they are, in order of popularity:
Apache Hadoop has slowly been infiltrating the mainstream business world, but many executives still have doubts about whether adopting Hadoop is a sound strategy for their organization. Is Hadoop enterprise-friendly? Is it economical for an organization to use?
A sequel to what was implemented in Part 1 of this tutorial: we continue indexing and improve search conditions using different features provided by the Apache Lucene library.
This tutorial explains the Lucene and Tika frameworks through their core concepts (parsing, MIME detection, indexing, scoring, boosting) via illustrative examples that should be useful not only to seasoned software developers but also to beginners in content analysis and programming.
This is a sequel to what was presented in Part 1 and Part 2 of this tutorial: after indexing and querying, we can highlight the results of a search by making use of Highlighter(s).
Do you remember that time when you spent a whole day trying to fix a problem, only to realize that you had mistyped a configuration setting? Avoiding that is not trivial, because not only you but also the frameworks you use need to take care. But let me outline my suggestion.