
Apache Tika: 1 point Oh!


Apache Tika's all grown up! After emerging from the incubator in 2008 and spending two years as a fledgling sub-project of Lucene, Tika is spreading its wings and soaring as an ASF top-level project and a leading text extraction library and content detection framework. That celebratory tone runs through the presentation given at ApacheCon NA 2011 by Chris Mattmann, senior computer scientist at the NASA Jet Propulsion Laboratory and adjunct assistant professor at the University of Southern California.

Mattmann offers the following as evidence that Tika now deserves to be called a "mature community":

In November, we hope to have released Tika 1.0. This will coincide with a number of other properties that demonstrate Tika has reached the point of a mature community, including:

1. Concrete, stable features, and core interfaces.
2. Tika's use in multiple programming languages and environments.
3. Our growth in Apache, and election of new committers and PMC members (and ASF members).
4. Developer articles appearing quite frequently on Tika.
5. The culmination of a wealth of knowledge in the form of a book that will be published on Tika at the time of the ApacheCon meeting.


I can't say for sure whether the book has been published, but Tika 1.0 was indeed released on November 7, just in time for ApacheCon, so congratulations are in order for Tika reaching yet another goal.

To learn more about how Tika has achieved its success and what comes next for the community, give Mattmann's presentation a listen!
