Acquire Content and Feed it to Search Technologies with ManifoldCF
ManifoldCF is an interesting project currently warming up in Apache's incubation stage, and it was the subject of a presentation at ApacheCon NA 2011.
Don't know what ManifoldCF is? Think of search technologies like Solr. Now imagine you're working at the enterprise level and have multiple content repositories, all containing a ton of data, and you need to feed the data from those repositories into the search technology.
Enter ManifoldCF. ManifoldCF is a connector framework. It integrates with your content repositories, your search engine indexes, and even your authentication provider, so that users only see results for documents they have access to. With its plug-in style architecture, it offers connectors for numerous commercial and open source data sources (e.g. Documentum and SharePoint). If your search technology or content repositories aren't currently supported, you can write your own custom connectors. For more on this and pretty much everything else about ManifoldCF, you might want to check out ManifoldCF in Action. If you're only interested in the quick and dirty version, chapter one is available for free.
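To make the plug-in idea a little more concrete, here is a minimal Python sketch of the three roles described above: a repository side that supplies documents, an index side that receives them, and an authority that decides what each user may see. None of this is ManifoldCF's actual API (ManifoldCF itself is written in Java); every class, function, and field name below is hypothetical and exists only to illustrate the shape of the problem.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_tokens: frozenset  # access tokens carried over from the source repository


class RepositoryConnector(Protocol):
    """Hypothetical plug-in interface: anything that can hand over documents."""

    def fetch(self) -> Iterable[Document]: ...


class InMemoryRepository:
    """Stand-in for a real content repository such as a document store."""

    def __init__(self, docs):
        self._docs = list(docs)

    def fetch(self):
        return iter(self._docs)


class InMemoryIndex:
    """Stand-in for a search engine index; stores documents with their tokens."""

    def __init__(self):
        self.docs = {}

    def add(self, doc):
        # Re-adding the same doc_id overwrites the old copy.
        self.docs[doc.doc_id] = doc

    def search(self, term, user_tokens):
        # A document matches only if the text contains the term AND the user
        # holds at least one token that the document allows.
        return sorted(
            d.doc_id
            for d in self.docs.values()
            if term.lower() in d.text.lower() and d.allowed_tokens & user_tokens
        )


class StaticAuthority:
    """Stand-in for an authentication provider: maps a user to access tokens."""

    def __init__(self, groups):
        self._groups = groups

    def tokens_for(self, user):
        return frozenset(self._groups.get(user, ()))


def crawl(repositories, index):
    """Feed every document from every repository into the index."""
    count = 0
    for repo in repositories:
        for doc in repo.fetch():
            index.add(doc)
            count += 1
    return count
```

In this toy version, crawling two repositories into one index and then calling `index.search("budget", authority.tokens_for("alice"))` returns only the matching documents whose tokens overlap with Alice's. The real project handles far more than this, including the incremental updates mentioned in the talk abstract, which the sketch does not attempt.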
But I digress. Where were we? Ah, right, ManifoldCF at ApacheCon. Karl Wright, the project's founder, one of its principal committers, and the author of the aforementioned ManifoldCF in Action, was on hand to explain how ManifoldCF connects source content repositories to target repositories:
I'll introduce ManifoldCF, and describe the general enterprise content acquisition and indexing problem which led to its development. I will discuss accessing multiple repositories, enforcing repository security, and incrementally keeping indexes up to date. I'll give an overview of its architecture, and demonstrate simple crawls and a secure integration with Apache Solr.
If by now you've worked up an appetite for more, why not listen to the entire presentation?