
Automata Invasion: Finite-State Technology in Lucene






Here's another great presentation from the just-finished Lucene Revolution 2012, given by Robert Muir of Lucid Imagination and Michael McCandless (a DZone MVB) of IBM.

Finite-state technology, including automata and weighted finite-state transducers (wFSTs), provides compact data structures that are well suited to text processing and search applications. Low-level support for both automata and wFSTs is now available in Lucene and has recently enabled a number of surprisingly powerful improvements. In this joint talk, Robert Muir and Michael McCandless give an overview of finite-state technology and then describe how it is used in Lucene today: synonym filtering, fuzzy queries, respelling/suggesting, the terms dictionary, an in-memory postings format (MemoryPostingsFormat), and Japanese analysis (the Kuromoji analyzer).
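To make the fuzzy-query use case a little more concrete, here is a minimal sketch of how an automaton can accept every string within a fixed edit distance of a term. It is an illustration only and is not Lucene's code: the class name FuzzySketch and its methods are hypothetical, the automaton is simulated directly as an NFA rather than compiled to a DFA, and only insertions, deletions, and substitutions count as edits (no transpositions).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of Lucene.
class FuzzySketch {

    // An NFA state is a pair (i, e): i characters of the term consumed,
    // e edits used so far. It is encoded as the int i * (maxEdits + 1) + e.
    static boolean matches(String term, String candidate, int maxEdits) {
        int width = maxEdits + 1;
        Set<Integer> start = new HashSet<>();
        start.add(0);
        Set<Integer> states = closure(start, term.length(), maxEdits);

        for (char c : candidate.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (int s : states) {
                int i = s / width;
                int e = s % width;
                if (i < term.length() && term.charAt(i) == c) {
                    next.add((i + 1) * width + e);          // exact match
                }
                if (e < maxEdits) {
                    next.add(i * width + e + 1);            // insertion: extra char in candidate
                    if (i < term.length()) {
                        next.add((i + 1) * width + e + 1);  // substitution
                    }
                }
            }
            states = closure(next, term.length(), maxEdits);
            if (states.isEmpty()) {
                return false;                               // no live state left
            }
        }
        for (int s : states) {
            if (s / width == term.length()) {
                return true;                                // whole term consumed
            }
        }
        return false;
    }

    // Epsilon moves: skipping a term character costs one edit (deletion)
    // and consumes no input from the candidate.
    static Set<Integer> closure(Set<Integer> states, int termLength, int maxEdits) {
        int width = maxEdits + 1;
        Set<Integer> out = new HashSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            int i = s / width;
            int e = s % width;
            if (i < termLength && e < maxEdits) {
                int t = (i + 1) * width + e + 1;
                if (out.add(t)) {
                    work.push(t);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"lucene", "lucne", "lucenne", "lucana", "solr"}) {
            System.out.println(candidate + " within 1 edit of lucene: "
                    + matches("lucene", candidate, 1));
        }
    }
}
```

The sketch checks one candidate at a time. The benefit of having such an automaton in a search engine comes from running it against an automaton-like terms dictionary, so that terms that cannot match are skipped without being examined individually.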

