Multi-lingual search with Lucene and Elasticsearch

Last night I gave a talk at SkillsMatter London on multi-lingual search with Lucene and Elasticsearch. The talk covered the challenges of indexing text in different languages: tokenization, term normalization and stemming. I started by demonstrating these challenges in individual languages, and ended by discussing mixing texts in several languages in one index: whether it is possible at all, and how to approach it.

We had some issues with the recording, so I had to repeat the first few slides (this is why I go very quickly in the first minutes...), and the audio quality could be better. Nevertheless, the talk presents real-world issues and offers what I believe to be good paths for solving them. Since this is quite a lot of material to cover in blog posts, I think I will just leave it in video form for now.

The video is available here: https://skillsmatter.com/skillscasts/4968-approaches-to-multi-lingual-text-search-with-elasticsearch-and-lucen.

Published at DZone with permission of Itamar Syn-hershko, DZone MVB. See the original article here.
