git clone firstname.lastname@example.org:karussell/jsii.git
Try it out here: http://pannous.info:8124/select?q=google. Filter queries work like id:xy, and sorting works like &sort=id asc. The parameters start and rows can be used for paging. For those who come too late, e.g. because my server crashed, here is an image of the XML response.
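The query parameters mentioned above can be assembled programmatically. The following is a minimal sketch, not part of jsii: `buildSelectUrl` is a hypothetical helper, `http://localhost:8124` is a placeholder base URL, and `fq` as the name of the filter query parameter is an assumption borrowed from Solr's conventions.

```javascript
// Hypothetical helper (not part of jsii): builds a Solr-style /select URL.
// Assumption: filter queries are passed as `fq`, as in Solr.
function buildSelectUrl(baseUrl, options) {
  const params = new URLSearchParams();
  params.set('q', options.q);
  if (options.fq !== undefined) params.set('fq', options.fq);
  if (options.sort !== undefined) params.set('sort', options.sort);
  // start and rows control paging: skip `start` hits, return `rows` hits
  if (options.start !== undefined) params.set('start', String(options.start));
  if (options.rows !== undefined) params.set('rows', String(options.rows));
  return baseUrl + '/select?' + params.toString();
}

// Placeholder host; replace with the address of your own jsii server.
const url = buildSelectUrl('http://localhost:8124', {
  q: 'google',
  fq: 'id:xy',
  sort: 'id asc',
  start: 0,
  rows: 10,
});
console.log(url);
```

Using `URLSearchParams` takes care of escaping characters such as the space in `id asc`, which would otherwise have to be encoded by hand.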
The Solr-compatible XML response format makes it possible to use jsii from applications that use SolrJ. For example, I tried it with Jetwick and the basic search worked; you just have to specify the XML response parser.
My understanding of the basics is now the following:
- The term frequency (tf) weights documents differently. E.g. doc1 contains ‘java’ 10 times but doc2 contains it 20 times, so doc2 is more important for the query ‘java’. If you index tweets you should use tf = min(tf, 3). Otherwise you will often get tweets à la ‘java java java java java java…’ instead of important ones. So for tweets a higher entropy is also relevant.
- The inverse document frequency (idf) gives certain terms a higher (or lower) weight. If a term occurs in all documents, its idf should be low, so that this query term counts less than other terms that match fewer documents.
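The two points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not jsii's actual scoring code: the function names are hypothetical, the idf formula log(N / df) is one common textbook variant, and the cap of 3 is the value suggested above for tweets.

```javascript
// Minimal tf-idf sketch (not jsii's implementation; names are hypothetical).
const MAX_TF = 3; // cap suggested for tweets: tf = min(tf, 3)

// Lowercase the text and split it on non-word characters.
function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Capped term frequency: how often `term` occurs in one document.
function termFrequency(term, tokens, maxTf = MAX_TF) {
  const count = tokens.filter((t) => t === term).length;
  return Math.min(count, maxTf);
}

// Assumed idf variant: log(N / df). A term found in every document gets 0.
function inverseDocumentFrequency(term, docs) {
  const df = docs.filter((tokens) => tokens.includes(term)).length;
  return df === 0 ? 0 : Math.log(docs.length / df);
}

// Score of one document for a query: sum of tf * idf over the query terms.
function score(queryTerms, tokens, docs) {
  return queryTerms.reduce(
    (sum, term) =>
      sum + termFrequency(term, tokens) * inverseDocumentFrequency(term, docs),
    0
  );
}

const docs = [
  'java java java java java java',
  'java search engine in javascript',
  'search with node',
].map(tokenize);

const query = ['java', 'engine'];
docs.forEach((tokens, i) => console.log(i, score(query, tokens, docs)));
```

With the cap in place, the repetitive first document can no longer win on sheer repetition of ‘java’, and the rarer term ‘engine’ contributes more to the score than the more common ‘java’.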
- look into the TODO file before posting an issue
- jsii feeding is NOT thread safe
- I read this on object-oriented JS with node and got some suggestions from node.js users
- git cheat sheet
- There is an older, similar project called jssindex
Published at DZone with permission of Peter Karussell. See the original article here.