git clone git@github.com:karussell/jsii.git
Try it out here: http://pannous.info:8124/select?q=google (for example, filter queries work like id:xy, and sorting works like &sort=id asc). The parameters start and rows can be used for paging. For those who come too late, e.g. because my server crashed, here is an image of the XML response:
The Solr-compatible XML response format makes it possible to use jsii from applications that use SolrJ. For example, I tried it with Jetwick and the basic search worked; just specify the XML response parser:
My understanding of the basics is now the following:
- The term frequency (tf) weights documents differently. E.g. document1 contains ‘java’ 10 times but document2 contains it 20 times, so document2 is more important for the query ‘java’. If you index tweets you should use tf = min(tf, 3). Otherwise you will often get tweets à la ‘java java java java java java…’ instead of important ones. So for tweets a higher entropy is also relevant.
- The inverse document frequency (idf) gives certain terms a higher (or lower) weight. If a term occurs in all documents, its idf should be low, so that this term of a query counts less than other terms that match fewer documents.
- look into the TODO file before posting an issue
- jsii feeding is NOT thread safe
- I read this object oriented JS with node and got some suggestions from node.js users
- git cheat sheet
- There is an older, similar project called jssindex
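To make the tf and idf notes above a bit more concrete, here is a minimal sketch in plain JavaScript. It is not taken from the jsii sources: the helper names (cappedTf, idf, score) are hypothetical, the cap of 3 is the one suggested above for tweets, and idf = ln(N / df) is just one common textbook variant that I assume here for illustration.

```javascript
// Hypothetical helpers for illustration only, NOT part of jsii.
const TF_CAP = 3; // cap suggested in the text for tweets: tf = min(tf, 3)

// Count how often `term` occurs in a tokenized document, capped at `cap`.
function cappedTf(term, tokens, cap = TF_CAP) {
  const count = tokens.filter((t) => t === term).length;
  return Math.min(count, cap);
}

// Assumed formula: idf = ln(N / df), where N is the number of documents
// and df is the number of documents containing the term.
// A term that occurs in every document gets idf = 0.
function idf(term, docs) {
  const df = docs.filter((tokens) => tokens.includes(term)).length;
  if (df === 0) return 0; // unknown term: contributes nothing
  return Math.log(docs.length / df);
}

// Score of a single query term for one document.
function score(term, tokens, docs) {
  return cappedTf(term, tokens) * idf(term, docs);
}

// Tiny example corpus, already tokenized on whitespace.
const docs = [
  'java java java java java java search'.split(' '),
  'java search on node'.split(' '),
  'node search index'.split(' '),
];

console.log(cappedTf('java', docs[0])); // 3 (six occurrences, capped)
console.log(idf('search', docs));       // 0 ('search' is in every document)
console.log(score('java', docs[0], docs), score('java', docs[1], docs));
```

With the cap in place, the spammy first document still scores higher for ‘java’ than the second one, but only by a factor of 3 instead of 6, and a term like ‘search’ that appears everywhere does not influence the ranking at all.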
Published at DZone with permission of Peter Karussell. See the original article here.