DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Curious about the future of data-driven systems? Join our Data Engineering roundtable and learn how to build scalable data platforms.

Data Engineering: The industry has come a long way from organizing unstructured data to adopting today's modern data pipelines. See how.

Threat Detection: Learn core practices for managing security risks and vulnerabilities in your organization — don't regret those threats!

Managing API integrations: Assess your use case and needs — plus learn patterns for the design, build, and maintenance of your integrations.

Avatar

Munish Gogna

developer at self

Singapore, SG

Joined Apr 2010

About

Munish Gogna has been a software developer since 2006. His languages and technologies include: Java, J2EE, SQL, Spring, Hibernate as well as basic shell scripting (Unix flavors). He is currently working in Singapore with a US bank.

Stats

Reputation: 164
Pageviews: 129.2K
Articles: 2
Comments: 11
  • Articles
  • Comments

Articles

article thumbnail
Using Sphinx and Java to Implement Free Text Search
As promised I am going to provide an article on how we can use Sphinx with Java to perform a full text search. I will begin the article with an introduction to Sphinx. Introduction to Sphinx Databases are continually growing and sometimes tend to hold about 100M records and need an external solution for full text search to be performed. I have picked Sphinx, an open source full-text search engine, distributed under GPL version 2 to perform a full text search on such a huge amount of data. Generally, it's a standalone search engine meant to provide fast, size-efficient and relevant full-text search functions to other applications very much compatible with an SQL Database. So my example will be based on the MySQL database, as we cannot produce millions of data to evaluate the real power of Sphinx, we will have a small amount of data and I think that should not be a problem. Here are few Sphinx Unique Features: high indexing speed (up to 10 MB/sec on modern CPUs) high search speed (avg query is under 0.1 sec on 2-4 GB text collections) high scalability (up to 100 GB of text, upto 100 M documents on a single CPU) provides distributed searching capabilities provides searching from within MySQL through pluggable storage engine supports boolean, phrase, and word proximity queries supports multiple full-text fields per document (upto 32 by default) supports multiple additional attributes per document (ie. groups, timestamps, etc) supports MySQL natively (MyISAM and InnoDB tables are both supported) The important features which have been adopted to perform a full text search are the provision of the Java API to integrate easily with the web application and considerably high indexing and searching speed with an average of 4-10 MB/sec & 20-30 ms/q @5GB,3.5M docs(wikipedia) Sphinx Terms & How It Works The fist principle part of sphinx is indexer. It is solely responsible for gathering the data that will be searchable. From the Sphinx point of view, the data it indexes is a set of structured documents, each of which has the same set of fields. This is biased towards SQL, where each row corresponds to a document, and each column to a field. Sphinx builds a special data structure optimized for our queries from the data provided. This structure is called index; and the process of building index from data is called indexing and the element of sphinx which carries out these tasks is called indexer. Indexer can be executed either from a regular script or command-line interface. Sphinx documents are equal to records in DB. Document is set of text fields and number attributes + unique ID – similar to row in DB Set of fields and attributes is constant for index – similar to table in DB Fields are searchable for FullText queries Attributes may be used for filtering, sorting, grouping searchd is the second principle tools as part of Sphinx. It is the part of the system which actually handles searches; it functions as a server and is responsible for receiving queries, processing them and returning a dataset back to the different APIs for client applications. Unlike indexer, searchd is not designed to be run either from a regular script or command-line calling, but instead either as a daemon to be called from init.d (on Unix/Linux type systems) or to be called as a service (on Windows-type systems). I am going to focus on Windows environment so later I will show you how we can install sphinx on windows as a service. Finally search is one of the helper tools within the Sphinx package. Whereas searchd is responsible for searches in a server-type environment, search is aimed at testing the index quickly without building a framework to make the connection to the server and process its response. This will only be used for testing sphinx from command – line and with respect to application’s requirement; searchd service will be used to query the MySql Server with a pre created index. Installation on Windows So now we come to the part of installing Sphinx on Windows: Download Sphinx from the official Sphinx download site i.e http://sphinxsearch.com (I downloaded Win32 release binaries with MySQL support: sphinx-0.9.9-win32.zip) Unzip the file to some folder, I unzipped to C:\devel\sphinx-0.9.9-win32 and added the bin directory to the windows path variable Well Sphinx is installed. Nice, simple, easy. Later I will tell how to set up indexes and search. Sample Application Till now I guess the whole motto of this article is clear to you, let's move ahead to define our sample application. We all use the Address Book to search for people by using their name or e-mail address when we want to immediately address an e-mail message to a specific person, people, or distribution list. We also search for people by using other basic information, such as e-mail alias, office location, and telephone number etc. I think most of the people on this planet are quire familiar with this kind of search, so let's make outlook address book as our sample database schema. Most of the fields are mapped from microsoft outlook, the only additional column is date of joining so that we can filter our queries based on joining dates of the employees. The example that I am going to put forth will use Sphinx to search for a particluar address entry using free text search, meaning the user is free to type in anything, here is our search screen, the DOJ (date of joining) search parameter is optional. The screen is self explanatory, let's move ahead and define our database. As Sphinx works well with MySQL and MySQL being free also, lets create our db scripts around mysql database (Those who wish to install MySQL can dowload it from http://www.mysql.com) Let's create our sample database 'addressbook' mysql> create database addressbook; Query OK, 1 row affected (0.03 sec) mysql> use addressbook; Database changed Note: The fields defined in the following tables are for the purpose of learning only and may not contain a complete set of fields that microsoft address book or any similar software may provide. mysql> CREATE TABLE addressbook ( Id int(11) NOT NULL, FirstName varchar(30) NOT NULL, LastName varchar(30) NOT NULL, OfficeId int(11) DEFAULT NULL, Title varchar(20) DEFAULT NULL, Alias varchar(20) NOT NULL, Email varchar(50) NOT NULL, DOJ date NOT NULL, PhoneNo varchar(20) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; mysql> CREATE TABLE CompanyLocations ( Id int(11) NOT NULL, Location varchar(60) NOT NULL, Country varchar(20) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; It's time to put some dummy data into the table, so let's fill our tables. Our virtual company 'gogs.it' has six offices across India and Singapore as defined in the following insert script. mysql> insert into CompanyLocations (Id, Location, Country) VALUES (1, 'Tower One, Harbour Front, Singapore', 'SG'); insert into CompanyLocations (Id, Location, Country) VALUES (2, 'DLF Phase 3, Gurgaon, India', 'IN'); insert into CompanyLocations (Id, Location, Country) VALUES (3, 'Hiranandani Gardens, Powai, Mumbai, India', 'IN'); insert into CompanyLocations (Id, Location, Country) VALUES (4, 'Hinjwadi, Pune, India', 'IN'); insert into CompanyLocations (Id, Location, Country) VALUES (5, 'Toll Post, Nagrota, Jammu, India', 'IN'); insert into CompanyLocations (Id, Location, Country) VALUES (6, 'Bani (Kathua), India', 'IN'); Now comes the real stuff... The data sphinx is going to index, let's populate that as well...wooooo mysql> INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (1,'Aabheer','Kumar',1,'Mr','u534','Kumar.Aabheer@gogs.it','2008-9-3', '+911234599990'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (2,'Aadarsh','Gupta',6,'Mr','u668','Gupta.Aadarsh@gogs.it','2007-2-23','+911234599991'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (3,'Aachman','Singh',5,'Mr','u2766','Singh.Aachman@gogs.it','2006-12-18','+911234599992'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (4,'Aadesh','Shrivastav',5,'Mr','u3198','Shrivastav.Aadesh@gogs.it','2007-11-23','+911234599993'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (5,'Aadi','manav',1,'Mr','u2686','manav.Aadi@gogs.it','2010-7-20','+911234599994'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (6,'Aadidev','singh',4,'Mr','u572','singh.Aadidev@gogs.it','2010-8-18','+911234599995'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (7,'Aafreen','sheikh',4,'Smt','u1092','sheikh.Aafreen@gogs.it','2007-7-11','+911234599996'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (8,'Aakar','Sherpa',5,'Mr','u1420','Sherpa.Aakar@gogs.it','2009-10-3','+911234599997'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (9,'Aakash','Singh',4,'Mrs','u2884','Singh.Aakash@gogs.it','2008-6-11','+911234599998'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (10,'Aalap','Singhania',4,'Mrs','u609','Singhania.Aalap@gogs.it','2010-10-8','+911234599999'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (11,'Aandaleeb','mahajan',1,'Smt','u131','mahajan.Aandaleeb@gogs.it','2010-10-21','+911234580001'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (12,'Mamata','kumari',5,'Sh','u2519','kumari.Mamata@gogs.it','2009-6-12','+911234580002'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (13,'Mamta','sharma',6,'Smt','u4123','sharma.Mamta@gogs.it','2009-2-8','+911234580003'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (14,'Manali','singh',6,'Mr','u1078','singh.Manali@gogs.it','2008-6-14','+911234580004'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (15,'Manda','saxena',1,'Mrs','u196','saxena.Manda@gogs.it','2010-9-4','+911234580005'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (16,'Salila','shetty',3,'Miss','u157','shetty.Salila@gogs.it','2009-11-15','+911234580006'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (17,'Salima','happy',3,'Mrs','u3445','happy.Salima@gogs.it','2006-7-14','+911234580007'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (18,'Salma','haik',5,'Sh','u4621','haik.Salma@gogs.it','2008-6-23','+911234580008'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (19,'Samita','patil',3,'Smt','u3156','patil.Samita@gogs.it','2006-6-7','+911234580009'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (20,'Sameena','sheikh',5,'Mrs','u952','sheikh.Sameena@gogs.it','2008-8-13','+911234580010'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (21,'Ranita','gupta',5,'Mrs','u2664','gupta.Ranita@gogs.it','2008-10-20','+911234580011'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (22,'Ranjana','sharma',1,'Sh','u3085','sharma.Ranjana@gogs.it','2010-6-21','+911234580012'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (23,'Ranjini','singh',6,'Mrs','u4200','singh.Ranjini@gogs.it','2007-4-13','+911234580013'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (24,'Ranjita','vyapari',2,'Smt','u1109','vyapari.Ranjita@gogs.it','2008-1-22','+911234580014'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (25,'Rashi','gupta',6,'Mrs','u3492','gupta.Rashi@gogs.it','2006-2-2','+911234580015'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (26,'Rashmi','sehgal',3,'Mr','u3248','sehgal.Rashmi@gogs.it','2008-9-9','+911234580016'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (27,'Rashmika','sexy',1,'Mrs','u4599','sexy.Rashmika@gogs.it','2009-3-12','+911234580017'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (28,'Rasika','dulari',3,'Smt','u2089','dulari.Rasika@gogs.it','2009-1-24','+911234580018'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (29,'Dilber','lover',6,'Mr','u4241','lover.Dilber@gogs.it','2007-10-11','+911234580019'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (30,'Dilshad','happy',1,'Mr','u1564','happy.Dilshad@gogs.it','2007-4-8','+911234580020'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (31,'Dipali','lights',5,'Sh','u1127','lights.Dipali@gogs.it','2006-11-1','+911234580021'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (32,'Dipika','lamp',1,'Sh','u2271','lamp.Dipika@gogs.it','2010-12-17','+911234580022'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (33,'Dipti','brightness',5,'Smt','u422','brightness.Dipti@gogs.it','2010-9-25','+911234580023'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (34,'Disha','singh',3,'Sh','u4604','singh.Disha@gogs.it','2006-5-2','+911234580024'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (35,'Maadhav','Krishna',1,'Miss','u2561','Krishna.Maadhav@gogs.it','2007-11-6','+911234580025'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (36,'Maagh','month',5,'Miss','u874','month.Maagh@gogs.it','2008-5-8','+911234580026'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (37,'Maahir','Skilled',4,'Mr','u3372','Skilled.Maahir@gogs.it','2007-8-4','+911234580027'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (38,'Maalolan','Ahobilam',5,'Mrs','u3498','Ahobilam.Maalolan@gogs.it','2007-7-9','+911234580028'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (39,'Maandhata','King',1,'Smt','u2089','King.Maandhata@gogs.it','2009-9-3','+911234580029'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (40,'Maaran','Brave',2,'Miss','u4020','Brave.Maaran@gogs.it','2008-4-5','+9112345606001'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (41,'Maari','Rain',2,'Sh','u3593','Rain.Maari@gogs.it','2007-12-5','+9112345606002'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (42,'Madan','Cupid',4,'Mrs','u795','Cupid.Madan@gogs.it','2007-11-11','+9112345606003'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (43,'Madangopal','Krishna',3,'Sh','u438','Krishna.Madangopal@gogs.it','2007-2-19','+9112345606004'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (44,'sahil','gogna',1,'Sh','u2273','gogna.sahil@gogs.it','2007-10-7','+9112345606005'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (45,'nikhil','gogna',2,'Mr','u1240','gogna.nikhil@gogs.it','2009-9-14','+9112345606006'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (46,'amit','gogna',5,'Sh','u3879','gogna.amit@gogs.it','2006-2-8','+9112345606007'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (47,'krishan','gogna',4,'Miss','u3632','gogna.krishan@gogs.it','2010-9-20','+9112345606008'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (48,'anil','kashyap',4,'Smt','u3939','kashyap.anil@gogs.it','2010-3-15','+9112345606009'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (49,'sunil','kashyap',5,'Mrs','u3493','kashyap.sunil@gogs.it','2008-3-16','+9112345606010'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (50,'sandy','singh',6,'Mrs','u4691','singh.sandy@gogs.it','2009-6-2','+9112345606011'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (51,'vishal','kapoor',3,'Mr','u1087','kapoor.vishal@gogs.it','2010-5-13','+9112345606012'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (52,'bala','ji',5,'Mrs','u4762','ji.bala@gogs.it','2007-8-9','+9112345606013'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (53,'karan','sarin',4,'Miss','u3030','sarin.karan@gogs.it','2008-4-8','+9112345606014'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (54,'abhishek','kumar',4,'Miss','u1093','kumar.abhishek@gogs.it','2008-12-21','+9112345605001'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (55,'babu','the',1,'Miss','u1055','the.babu@gogs.it','2008-7-2','+9112345506001'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (56,'sandeep','gainda',3,'Miss','u1320','gainda.sandeep@gogs.it','2010-5-14','+9112345606301'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (57,'dheeraj','kumar',3,'Miss','u3685','kumar.dheeraj@gogs.it','2007-10-14','+9112345606091'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (58,'dharmendra','chauhan',1,'Smt','u3235','chauhan.dharmendra@gogs.it','2008-8-1','+9112345806001'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (59,'max','alan',3,'Smt','u3465','alan.max@gogs.it','2009-5-5','+9112345608011'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (60,'hidayat','khan',3,'Smt','u958','khan.hidayat@gogs.it','2007-11-18','+911234599101'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (61,'himnashu','singh',4,'Miss','u2027','singh.himnashu@gogs.it','2008-3-2','+911234599102'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (62,'dinesh','kumar',6,'Sh','u3233','kumar.dinesh@gogs.it','2008-5-9','+911234599103'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (63,'toshi','prakash',1,'Mr','u3766','prakash.toshi@gogs.it','2010-9-17','+911234599104'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (64,'niti','puri',3,'Mr','u3575','puri.niti@gogs.it','2009-11-15','+911234599105'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (65,'pawan','tikki',3,'Sh','u3919','tikki.pawan@gogs.it','2006-3-19','+911234599106'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (66,'gaurav','sharma',2,'Sh','u413','sharma.gaurav@gogs.it','2010-4-2','+911234599107'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (67,'himanshu','verma',2,'Mrs','u4732','verma.himanshu@gogs.it','2009-3-20','+911234599108'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (68,'priyanshu','verma',3,'Sh','u183','verma.priyanshu@gogs.it','2010-8-12','+911234599109'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (69,'nitika','luthra',2,'Mrs','u4259','luthra.nitika@gogs.it','2010-7-12','+911234599110'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (70,'neeru','gogna',2,'Sh','u1633','gogna.neeru@gogs.it','2010-6-23','+91532110000'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (71,'bindu','gupta',1,'Sh','u1859','gupta.bindu@gogs.it','2006-11-10','+91532110001'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (72,'gurleen','bakshi',5,'Miss','u1423','bakshi.gurleen@gogs.it','2007-7-1','+91532110003'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (73,'rahul','gupta',3,'Sh','u1223','gupta.rahul@gogs.it','2009-8-11','+91532110004'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (74,'jagdish','salgotra',3,'Mr','u12','salgotra.jagdish@gogs.it','2008-5-19','+91532110005'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (75,'vikas','sharma',3,'Smt','u465','sharma.vikas@gogs.it','2006-6-2','+91532110006'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (76,'poonam','mahendra',2,'Sh','u1744','mahendra.poonam@gogs.it','2009-12-2','+91532110007'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (77,'pooja','kulkarni',3,'Mrs','u1903','kulkarni.pooja@gogs.it','2008-10-6','+91532110008'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (78,'priya','mahajan',6,'Sh','u4205','mahajan.priya@gogs.it','2010-8-5','+91532110009'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (79,'manoj','zerger',1,'Mrs','u3369','zerger.manoj@gogs.it','2009-12-4','+91532110010'); INSERT INTO AddressBook(Id, FirstName, LastName, OfficeId, Title, Alias, Email, DOJ, PhoneNo) VALUES (80,'mohan','master',5,'Mr','u2841','master.mohan@gogs.it','2010-10-7','+91532110011'); Please note that above employee data is just a data *only data* I created using a small java programme using random number generators and reading some names file, so you may find titles getting messed up :( We next create a procedure that we will use from java to fetch records that we just inserted. DROP PROCEDURE IF EXISTS search_address_book; CREATE PROCEDURE search_address_book(IN address_ids VARCHAR(1000) ) BEGIN DECLARE search_address_query VARCHAR(2000) DEFAULT ''; SET address_ids = CONCAT('\'', REPLACE(address_ids, ',', '\',\''), '\''); SET search_address_query = CONCAT(search_address_query, ' select ab.Id as Id , ab.FirstName as FName, ab.LastName as LName, cl.Location as Location, ab.Title as Title, ab.Alias as Alias, ab.Email as Email, ab.DOJ as DOJ, ab.PhoneNo as PhoneNo ' ); SET search_address_query = CONCAT(search_address_query, ' from AddressBook ab left join CompanyLocations cl on ab.OfficeId=cl.Id '); SET search_address_query = CONCAT(search_address_query, ' where ab.id IN (', address_ids ,') '); SET @statement = search_address_query; PREPARE dynquery FROM @statement; EXECUTE dynquery; DEALLOCATE PREPARE dynquery; END; # To get records for ids 1, 6 and 7, we run following commands: call search_address_book('1,6,7'); Configuring Sphinx It turns out that it was not terribly difficult to setup sphinx, but I had a hard time finding instructions on the web, so I'll post my steps here. By default Sphinx looks for 'sphinx.co.in' configuration file to come with indexes and other stuff, lets create and define source and index for our sample application addressbook.conf (read between the lines) ############################################################################# ## data source definition ############################################################################# source addressBookSource { ## SQL settings for 'mysql' ## type = mysql # some straightforward parameters for SQL source types sql_host = localhost sql_user = root sql_pass = root sql_db = addressbook sql_port = 3306 # optional, default is 3306 # pre-query, executed before the main fetch query sql_query_pre = SET NAMES utf8 # main document fetch query, integer document ID field MUST be the first selected column sql_query = \ select ab.Id as Id , ab.FirstName as FName, ab.LastName as LName, cl.Location as Location, \ ab.Title as Title, ab.Alias as Alias, ab.Email as Email, UNIX_TIMESTAMP(ab.DOJ) as DOJ, ab.PhoneNo as PhoneNo \ from AddressBook ab left join CompanyLocations cl on ab.OfficeId=cl.Id sql_attr_timestamp = DOJ # document info query, ONLY for CLI search (ie. testing and debugging) , optional, default is empty must contain $id macro and must fetch the document by that id sql_query_info = SELECT * FROM AddressBook WHERE id=$id } ############################################################################# ## index definition ############################################################################# # local index example, this is an index which is stored locally in the filesystem index addressBookIndex { # document source(s) to index source = addressBookSource # index files path and file name, without extension, make sure you have this folder path = C:\devel\sphinx-0.9.9-win32\data\addressBookIndex # document attribute values (docinfo) storage mode docinfo = extern # memory locking for cached data (.spa and .spi), to prevent swapping mlock = 0 morphology = none # make sure this file exists exceptions =C:\devel\sphinx-0.9.9-win32\data\exceptions.txt enable_star = 1 } ############################################################################# ## indexer settings ############################################################################# indexer { # memory limit, in bytes, kiloytes (16384K) or megabytes (256M) # optional, default is 32M, max is 2047M, recommended is 256M to 1024M mem_limit = 32M # maximum IO calls per second (for I/O throttling) # optional, default is 0 (unlimited) # # max_iops = 40 # maximum IO call size, bytes (for I/O throttling) # optional, default is 0 (unlimited) # # max_iosize = 1048576 # maximum xmlpipe2 field length, bytes # optional, default is 2M # # max_xmlpipe2_field = 4M # write buffer size, bytes # several (currently up to 4) buffers will be allocated # write buffers are allocated in addition to mem_limit # optional, default is 1M # # write_buffer = 1M } ############################################################################# ## searchd settings ############################################################################# searchd { # hostname, port, or hostname:port, or /unix/socket/path to listen on listen = 9312 # log file, searchd run info is logged here # optional, default is 'searchd.log' log = C:\devel\sphinx-0.9.9-win32\data\log\searchd.log # query log file, all search queries are logged here # optional, default is empty (do not log queries) query_log = C:\devel\sphinx-0.9.9-win32\data\log\query.log # client read timeout, seconds # optional, default is 5 read_timeout = 5 # request timeout, seconds # optional, default is 5 minutes client_timeout = 300 # maximum amount of children to fork (concurrent searches to run) # optional, default is 0 (unlimited) max_children = 30 # PID file, searchd process ID file name # mandatory pid_file = C:\devel\sphinx-0.9.9-win32\data\log\searchd.pid # max amount of matches the daemon ever keeps in RAM, per-index # WARNING, THERE'S ALSO PER-QUERY LIMIT, SEE SetLimits() API CALL # default is 1000 (just like Google) max_matches = 1000 # seamless rotate, prevents rotate stalls if precaching huge datasets # optional, default is 1 seamless_rotate = 1 # whether to forcibly preopen all indexes on startup # optional, default is 0 (do not preopen) preopen_indexes = 0 } # --eof-- Once the configuration is done, its time to index our sql data, the command to use is 'indexer' as shown below. C:\devel\sphinx-0.9.9-win32\bin>indexer.exe --all --config C:\devel\sphinx-0.9.9-win32\addressbook.conf CONSOLE: Sphinx 0.9.9-release (r2117) Copyright (c) 2001-2009, Andrew Aksyonoff using config file 'C:\devel\sphinx-0.9.9-win32\addressbook.conf'... indexing index 'addressBookIndex'... collected 80 docs, 0.0 MB sorted 0.0 Mhits, 100.0% done total 80 docs, 5514 bytes total 0.057 sec, 96386 bytes/sec, 1398.43 docs/sec total 2 reads, 0.000 sec, 3.5 kb/call avg, 0.0 msec/call avg total 7 writes, 0.000 sec, 2.5 kb/call avg, 0.0 msec/call avg Note: As I told earlier that Sphinx creates 1 document for each row, as we had 80 rows in the database so a total of 80 docs are created. Time taken is also very very small, believe me I tried with half million rows and it took around 3-4 seconds :) cool isn't it? Once the index is up let's try to search few records, the utility command to perform search is 'search'. Ok Sphinx maharaj* please search for employee whose alias is u4732 C:\devel\sphinx-0.9.9-win32\bin>search.exe --config C:\devel\sphinx-0.9.9-win32\addressbook.conf u4732 CONSOLE: Sphinx 0.9.9-release (r2117) Copyright (c) 2001-2009, Andrew Aksyonoff using config file 'C:\devel\sphinx-0.9.9-win32\addressbook.conf'... index 'addressBookIndex': query 'u4732 ': returned 1 matches of 1 total in 0.001 sec displaying matches: 1. document=67, weight=1, doj=Fri Mar 20 00:00:00 2009 Id=67 FirstName=himanshu LastName=verma OfficeId=2 Title=Mrs Alias=u4732 Email=verma.himanshu@gogs.it DOJ=2009-03-20 PhoneNo=+911234599108 words: 1. 'u4732': 1 documents, 1 hits words: 1. 'u4732': 1 documents, 1 hits As you can see above this is a unique record for Himanshu. Note: You see a lot of information for the result, this is because of following line in our configuration file sql_query_info = SELECT * FROM AddressBook WHERE id=$id If you want to see less columns you need to change the sql_query_info in configuration file. Let's try another search, sphinx maharaj* please tell me which all rows have gurleen or toshi in them. C:\devel\sphinx-0.9.9-win32\bin>search.exe --config C:\devel\sphinx-0.9.9-win32\addressbook.conf --any toshi gurleen CONSOLE: displaying matches: 1. document=63, weight=2, doj=Fri Sep 17 00:00:00 2010 Id=63 FirstName=toshi LastName=prakash OfficeId=1 Title=Mr Alias=u3766 Email=prakash.toshi@gogs.it DOJ=2010-09-17 PhoneNo=+911234599104 2. document=72, weight=2, doj=Sun Jul 01 00:00:00 2007 Id=72 FirstName=gurleen LastName=bakshi OfficeId=5 Title=Miss Alias=u1423 Email=bakshi.gurleen@gogs.it DOJ=2007-07-01 PhoneNo=+91532110003 Exactly two records were returned and this is what we were expecting. The following special operators and modifiers can be used when using the extended matching mode: operator OR: nikhil | sahil operator NOT: hello -sandy hello !sandy field search operator: @Email gogna.sahil@gogna.it For a complete set of search features , I advise you to go through http://sphinxsearch.com/docs/manual-0.9.9.html#searching link. Sphinx as Windows Service Now our main aim is to use sphinx with JAVA API, so let's move towards that now, before java can utilize the true power of Sphinx, we need to start 'searchd' as a windows service so that our java programme can connect to sphinx search engine. Let's install Sphinx as a windows service so that our java program can use this daemon service to query the index that we just created, the command is : C:\devel\sphinx-0.9.9-win32\bin>searchd.exe --install --config C:\devel\sphinx-0.9.9-win32\addressbook.conf --servicename --port 9312 SphinxSearch CONSOLE: Sphinx 0.9.9-release (r2117) Copyright (c) 2001-2009, Andrew Aksyonoff Installing service... Service 'SphinxSearch' installed succesfully. Well now the sphinx is ready to serve us on port 9312 Note: If you try to install Sphinx without admin rights, you may get following error messages. C:\devel\sphinx-0.9.9-win32\bin>searchd.exe --install --config C:\devel\sphinx-0.9.9-win32\addressbook.conf --servicename --port 9312 SphinxSearch CONSOLE: Installing service... FATAL: OpenSCManager() failed: code=5, error=Access is denied. Once done you can start the service as: c:\>sc start SphinxSearch (or alternatively from the services screen, start 'services.msc' in windows Run) If some how you want to delete the service , use c:\>sc delete SphinxSearch Let's create an adapter to fetch data from the database. package it.gogs.sphinx.util; import it.gogs.sphinx.AddressBoook; import it.gogs.sphinx.exception.AddressBookBizException; import it.gogs.sphinx.exception.AddressBookTechnicalException; import java.sql.CallableStatement; import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; import java.sql.SQLException; import java.util.ArrayList; import java.util.List; import org.apache.log4j.Logger; /** * Adapter to fetch data from the database. * * @author Munish Gogna * */ public class AddressBookAdapter { private static Logger logger = Logger.getLogger(AddressBookAdapter.class); private AddressBookAdapter() { // use in static way.. } private static Connection getConnection() throws AddressBookTechnicalException { String userName = "root"; String password = "root"; String url = "jdbc:mysql://localhost/addressbook"; try { Class.forName("com.mysql.jdbc.Driver").newInstance(); return DriverManager.getConnection(url, userName, password); } catch (Exception e) { throw new AddressBookTechnicalException("could not get connection"); } } public static List getAddressBookList(List addressIds) throws AddressBookTechnicalException, AddressBookBizException { List addressBoookList = new ArrayList(); if (addressIds == null || addressIds.size() == 0){ logger.error("AddressIds was null or empty, returning empty list"); return addressBoookList; } Connection connection = null; CallableStatement callableStatement = null; try { connection = getConnection(); callableStatement = connection.prepareCall("{ call search_address_book(?)}"); callableStatement.setString(1, Utils.toCommaString(addressIds)); callableStatement.execute(); ResultSet resultSet = callableStatement.getResultSet(); prepareResults(resultSet, addressBoookList); connection.close(); } catch (SQLException e) { logger.error("Problem connecting MYSQL - " + e.getMessage()); throw new AddressBookTechnicalException(e.getMessage()); } catch (AddressBookTechnicalException e) { logger.error("Problem connecting MYSQL - " + e.getMessage()); throw e; } finally{ if(connection != null){ try { connection.close(); } catch (SQLException e) { logger.error("Problem closing conection - " + e.getMessage()); e.printStackTrace(); } } } return addressBoookList; } private static void prepareResults(ResultSet resultSet, List addressBoookList) throws SQLException { AddressBoook addressBoook; while (resultSet.next()) { addressBoook = new AddressBoook(); addressBoook.setAlias(resultSet.getString("Alias")); addressBoook.setEmail(resultSet.getString("Email")); addressBoook.setfName(resultSet.getString("FName")); addressBoook.setlName(resultSet.getString("LName")); addressBoook.setOfficeLocation(resultSet.getString("Location")); addressBoook.setPhoneNo(resultSet.getString("PhoneNo")); addressBoook.setTitle(resultSet.getString("Title")); addressBoook.setDateOfJoining(resultSet.getDate("DOJ")); addressBoook.setId(resultSet.getLong("Id")); addressBoookList.add(addressBoook); } } } Next we create the SphinxInstance that will parse the keywords and date range and provide us a list of Ids that matches the search. package it.gogs.sphinx.util; import it.gogs.sphinx.DateRange; import it.gogs.sphinx.SearchCriteria; import it.gogs.sphinx.api.SphinxClient; import it.gogs.sphinx.api.SphinxException; import it.gogs.sphinx.api.SphinxMatch; import it.gogs.sphinx.api.SphinxResult; import it.gogs.sphinx.exception.AddressBookBizException; import java.util.ArrayList; import java.util.Date; import java.util.List; import org.apache.log4j.Logger; /** * Instance that will parse our free text and provide the results. * * Note: Make sure that 'searchd' is up and running before you use this class * @author Munish Gogna * */ public class SphinxInstance { private static String SPHINX_HOST = "localhost"; private static String SPHINX_INDEX = "addressBookIndex"; private static int SPHINX_PORT = 9312; private static SphinxClient sphinxClient; private static Logger logger = Logger.getLogger(SphinxInstance.class); static { sphinxClient = new SphinxClient(SPHINX_HOST, SPHINX_PORT); } public static List getAddressBookIds(SearchCriteria criteria) throws AddressBookBizException, SphinxException { List addressIdsList = new ArrayList(); try { if (Utils.isNull(criteria)) { logger.error("criteria is null"); throw new AddressBookBizException("criteria is null"); } if (Utils.isNull(criteria.getKeywords())) { logger.error("keyword is a required field"); throw new AddressBookBizException("keyword is a required field"); } DateRange dateRange = criteria.getDateRage(); if (!Utils.isNull(dateRange)) { if (Utils.isDateRangeValid(dateRange)) { // this is to filter results based on joining dates if they are provided sphinxClient.SetFilterRange("DOJ", getTimeInSeconds(dateRange.getFromDate()), getTimeInSeconds(dateRange.getToDate()), false); } else { logger.error(" fromDate/toDate should not be empty and 'fromDate' should be less than equal to 'toDate'"); throw new AddressBookBizException("fromDate/toDate should not be empty and 'fromDate' should be less than equal to 'toDate'"); } } sphinxClient.SetMatchMode(SphinxClient.SPH_MATCH_EXTENDED2); sphinxClient.SetSortMode(SphinxClient.SPH_SORT_RELEVANCE, ""); SphinxResult result = sphinxClient.Query(buildSearchQuery(criteria), SPHINX_INDEX, "buidling query for address book search"); SphinxMatch[] matches = result.matches; for (SphinxMatch match : matches) { addressIdsList.add(String.valueOf(match.docId)); } } catch (SphinxException e) { throw e; } catch (AddressBookBizException e) { throw e; } logger.info("Total record(s):" + addressIdsList.size()); return addressIdsList; } private static long getTimeInSeconds(Date time) { return time.getTime()/1000; } private static String buildSearchQuery(SearchCriteria criteria) throws AddressBookBizException { String keywords[] = criteria.getKeywords().split(" "); StringBuilder searchFor = new StringBuilder(); for (String key : keywords) { if (!Utils.isEmpty(key)) { searchFor.append(key); if (searchFor.length() > 1) { searchFor.append("*|*"); } } } searchFor.delete(searchFor.lastIndexOf("|*"), searchFor.length()); StringBuilder queryBuilder = new StringBuilder(); String query = searchFor.toString(); queryBuilder.append("@FName *" + query + " | "); queryBuilder.append("@LName *" + query + " | "); queryBuilder.append("@Title *" + query + " | "); queryBuilder.append("@Location *"+ query + " | "); queryBuilder.append("@Alias *" + query + " | "); queryBuilder.append("@Email *" + query + " | "); queryBuilder.append("@PhoneNo *" + query); logger.info("Sphinx Query: " + queryBuilder.toString()); return queryBuilder.toString(); } } Here is the interface that I will expose to the outside world (in my future article I will expose this interface as Web Service) import it.gogs.sphinx.AddressBoook; import it.gogs.sphinx.SearchCriteria; import it.gogs.sphinx.api.SphinxException; import it.gogs.sphinx.exception.AddressBookBizException; import it.gogs.sphinx.exception.AddressBookTechnicalException; import java.util.List; /** * * @author Munish Gogna * */ public interface AddressBook { /** * Returns the list of AddressBook objects based on search criteria. * * @param criteria * @throws AddressBookTechnicalException * @throws AddressBookBizException * @throws SphinxException */ public List getAddressBookList(SearchCriteria criteria) throws AddressBookTechnicalException, AddressBookBizException, SphinxException; } and here is the implementation class for the same. package it.gogs.sphinx.addressbook.impl; import java.util.List; import it.gogs.sphinx.AddressBoook; import it.gogs.sphinx.SearchCriteria; import it.gogs.sphinx.addressbook.AddressBook; import it.gogs.sphinx.api.SphinxException; import it.gogs.sphinx.exception.AddressBookBizException; import it.gogs.sphinx.exception.AddressBookTechnicalException; import it.gogs.sphinx.util.AddressBookAdapter; import it.gogs.sphinx.util.SphinxInstance; /** * Implementation for our Address Book example * * @author Munish Gogna * */ public class AddressBookImpl implements AddressBook{ public List getAddressBookList(SearchCriteria criteria) throws AddressBookTechnicalException, AddressBookBizException, SphinxException { List addressIds= SphinxInstance.getAddressBookIds(criteria); return AddressBookAdapter.getAddressBookList(addressIds); } } ok so far so good, let's run some tests now ............ package it.gogs.sphinx.test; import java.util.Calendar; import java.util.GregorianCalendar; import java.util.List; import it.gogs.sphinx.AddressBoook; import it.gogs.sphinx.DateRange; import it.gogs.sphinx.SearchCriteria; import it.gogs.sphinx.addressbook.AddressBook; import it.gogs.sphinx.addressbook.impl.AddressBookImpl; import it.gogs.sphinx.api.SphinxException; import it.gogs.sphinx.exception.AddressBookBizException; import it.gogs.sphinx.exception.AddressBookTechnicalException; import junit.framework.TestCase; /** * * @author Munish Gogna * */ public class AddressBookTest extends TestCase { private AddressBook addressBook; @Override protected void setUp() throws Exception { super.setUp(); addressBook = new AddressBookImpl(); } @Override protected void tearDown() throws Exception { super.tearDown(); } /** this should be a unique record for Himanshu */ public void test_search_for_himanshu() throws Exception { SearchCriteria criteria = new SearchCriteria(); // remember the first 'search' example?? criteria.setKeywords("u4732"); List addressList = addressBook.getAddressBookList(criteria); assertTrue(addressList.size() == 1); assertTrue("expecting himanshu here", "himanshu".equals(addressList.get(0).getfName())); } /** only two employees have name gurleen or toshi */ public void test_search_for_gurleen_or_toshi() throws Exception { SearchCriteria criteria = new SearchCriteria(); // remember the second 'search' example?? criteria.setKeywords("gurleen toshi"); List addressList = addressBook.getAddressBookList(criteria); assertTrue(addressList.size() == 2); assertTrue("expecting toshi here", "toshi".equals(addressList.get(0).getfName())); assertTrue("expecting gurleen here", "gurleen".equals(addressList.get(1).getfName())); } /** there are 16 people from jammu location */ public void test_search_for_people_from_jammu_location() throws Exception { SearchCriteria criteria = new SearchCriteria(); criteria.setKeywords("jammu"); List addressList = addressBook.getAddressBookList(criteria); assertTrue(addressList.size() == 16); } /** only Aalap, Manda and nitika are having title as Mrs and joined in 2010 */ public void test_joined_in_2010_with_title_Mrs() throws Exception { DateRange dateRange = new DateRange(); GregorianCalendar calendar1 = new GregorianCalendar(); calendar1.set(Calendar.YEAR, 2010); calendar1.set(Calendar.MONTH, Calendar.JANUARY); calendar1.set(Calendar.DAY_OF_MONTH, 1); dateRange.setFromDate(calendar1.getTime()); GregorianCalendar calendar2 = new GregorianCalendar(); calendar2.set(Calendar.YEAR, 2010); calendar2.set(Calendar.MONTH, Calendar.DECEMBER); calendar2.set(Calendar.DAY_OF_MONTH, 31); dateRange.setToDate(calendar2.getTime()); SearchCriteria criteria = new SearchCriteria(); criteria.setKeywords("Mrs"); criteria.setDateRage(dateRange); List addressList = addressBook.getAddressBookList(criteria); assertTrue("expecting 3 records here", addressList.size() == 3); } /** should get a business exception here */ public void test_without_specifying_keywords(){ SearchCriteria criteria = new SearchCriteria(); //criteria.setKeywords("Mrs"); try { addressBook.getAddressBookList(criteria); } catch (Exception e) { assertTrue(e instanceof AddressBookBizException); assertTrue(e.getMessage().indexOf("keyword is a required field") >-1); } } } How we update the Index once database changes? For these kinds of requirements, we can set up two sources and two indexes, with one "main" index for the data which only changes rarely (if ever), and one "delta" for the new documents. First Time data will go in the "main" index and the newly inserted address book entries will go into "delta". Delta index could then be reindexed very frequently, and the documents can be made available to search in a matter of minutes. Also one thing to take from this article is once 'searchd' daemon is running we can't index the data in normal way,we have to use --rotate option in such cases. For some applications where there is a timely batch update for the data, we can configure some cron job to reindex our documents in Sphinx as shown below. C:\devel\sphinx-0.9.9-win32\bin>indexer.exe --all --config C:\devel\sphinx-0.9.9-win32\addressbook.conf --rotate Capsule We asked Sphinx to provide us the Document Ids corresponding to our search parameters and then we used those Ids to fire database query. In case the data we want to return is included in Index (DOJ attribute for example in our case) we can skip the database portion, so choose wisely how much information (attributes) you want to include while you index your sql data. Well that's all ... it's time to say good bye. Take good care of your health and don't forget to vote, its a must :) - Munish Gogna
December 7, 2010
· 37,140 Views · 2 Likes
article thumbnail
Creating and Deploying JAX-WS Web Service on Tomcat 6
Some years back I had to provide a wrapper around an EJB 3.0 remote service to come up with a simple web service project that would be deployed over Tomcat and accessed in a simple http way due to some accessibility issues. Now as I cannot reveal the actual requirement I implemented that time so here I am presenting a simple demo kind of service with following signature. public AccountDetails getAccountDetails(String accountNo, SecurityToken token); The service will return the account details of a particular account number, provided the token is valid (generated using some Security module of the application). In nutshell, the client will ask for a token from Security module and then invoke this method. The service will validate the token to see if the caller can invoke the method or not? In general you should use handlers (message interceptors that can be easily plugged in to the JAX-WS runtime to do additional processing of the inbound and outbound messages) to validate the stuff, freeing implementation class from that overhead, it is just an example exercise so our implementation class will check the security token also.Sounds good.. lets move ahead. Libraries we are going to use include JAXB and JAX-WS, as both of them have sensible defaults, the number of annotations can be kept to the minimum. Also in my opinion, it is always best to develop WSDL and schemas by hand to ensure that the service contract is appropriately defined and also that the schema can be re-used (by other services) and extended if necessary. I do not prefer using annotations for automatically producing WSDL and schema at runtime so let's start with the definition, i am using inline schema as we have a very simple requirement here, though we should have schema definitions in separate xsd files. accounts.wsdl: I am not going to explain the In and outs of this wsdl , in a nutshell it defines one operation 'getAccountDetails' which takes accountNo and returns back the account details like account type, balance etc. Please note that I have also added a security header token that will validate the caller (left to the implementation of the service). Now as we are done with our wsdl and schema, let's generate the portable artifacts from our service definition. JAX-WS includes a tool that can do this for us, we will use this tool to generate portable artifacts like Service Endpoint Interface (SEI), Service and Exception classes.These artifacts can be packaged in a WAR file with the WSDL and schema documents along with the endpoint implementation to be deployed. To generate the artifacts , we run following command: wsimport C:\devel\workspace\webservice\WebContent\WEB-INF\wsdl\accounts.wsdl -p com.mg.ws -keep -Xnocompile This will create the following artifacts in com.mg.ws package. AccountDetails.java AccountDetailsFault.java AccountDetailsFault_Exception.java AccountDetailsPortType.java AccountDetailsTO.java GeAccountDetailsINTO.java ObjectFactory.java package-info.java SecurityToken.java Now as we have got all the artifacts we are ready to implement our service, the interface we need to implement is AccountDetailsPortType , so let's do it. Here is our dummy implementation class: package com.mg.ws.impl;import java.math.BigDecimal;import javax.jws.WebService;import com.mg.ws.AccountDetailsFault;import com.mg.ws.AccountDetailsFault_Exception;import com.mg.ws.AccountDetailsPortType;import com.mg.ws.AccountDetailsTO;import com.mg.ws.GeAccountDetailsINTO;import com.mg.ws.SecurityToken;@WebService(name = "AccountDetailsService", portName = "AccountDetailsPort", endpointInterface = "com.mg.ws.AccountDetailsPortType", wsdlLocation = "WEB-INF/wsdl/accounts.wsdl", targetNamespace="http://gognamunish.com/accounts") public class AccountDetailsServiceImpl implements AccountDetailsPortType {public AccountDetailsTO getAccountDetails(GeAccountDetailsINTO parameters, SecurityToken requestHeader) throws AccountDetailsFault_Exception { AccountDetailsTO detailsTO = new AccountDetailsTO(); // validate token validateToken(requestHeader); // populate response detailsTO = getDetailsFromSomewhere (parameters.getAccountNo()); return detailsTO; } private AccountDetailsTO getDetailsFromSomewhere(String accountNo) throws AccountDetailsFault_Exception { if(accountNo == null || accountNo.trim().length()==0){ AccountDetailsFault faultInfo = new AccountDetailsFault(); faultInfo.setFaultInfo("missing account number"); faultInfo.setMessage("account number is required field"); throw new AccountDetailsFault_Exception("account no missing", faultInfo); } AccountDetailsTO detailsTO = new AccountDetailsTO(); detailsTO.setAccNo(accountNo); detailsTO.setAccType("SAVING"); detailsTO.setBalance(new BigDecimal(10000)); return detailsTO; } private void validateToken(SecurityToken requestHeader) throws AccountDetailsFault_Exception { if ("83711070".equals(requestHeader.getToken()) && requestHeader.getValidTill() != null){ System.out.println("token processed successfully..."); } else { AccountDetailsFault faultInfo = new AccountDetailsFault(); faultInfo.setFaultInfo("Header token Invalid"); faultInfo.setMessage("can't help"); throw new AccountDetailsFault_Exception("invalid token", faultInfo); } } This is just a dummy implementation for illustration purpose only. Now as our service implementation is done, we proceed to package this in a war and deploy it on tomcat. Now we create a standard web.xml, which defines WSServletContextListener, WSServlet and structure of a web project. com.sun.xml.ws.transport.http.servlet.WSServletContextListener AccountDetailsServicecom.sun.xml.ws.transport.http.servlet.WSServlet 1AccountDetailsService/details Next we create a sun-jaxws.xml, defines the web service implementation class. This file is required regardless of whether we publish our web service on tomcat, glassfish or any other server. OK so far so good , let's build our application and deploy it on tomcat, here is the ant script. NOTE: jars in lib should be carefully chosen, otherwise it will make your life hell. Now let's test the application, point your browser to http://localhost:8080/account/details , if you see something like this, it means you have successfully deployed the service. Now let's test the service, we can use the client generated by the wsimport tool as : public static void main(String[] args) throws Exception { AccountDetails accountDetails = new AccountDetails(); AccountDetailsPortType port = accountDetails.getAccountDetailsPort(); AccountDetailsTO details = port.getAccountDetails(new GeAccountDetailsINTO(), new SecurityToken()); } For those who want to invoke the service using soap stuff, they can use tool like soapUI (can be downloaded from www.soapui.org), lets make some soap calls now: case: invalid security header case: valid security header By default, Tomcat does not comes with any JAX-WS dependencies, So, you have to include it manually. Go here Download JAX-WS RI distribution, you will find the wsimport tool in lib directory. Please note for running this project you can just skip this step as I have already done this step for you and included the required libraries in the lib folder of the project. I have included eclipse project for this demo, please get it from resource section. To deploy just change build.properties to point to TOMCAT_HOME and run ant build. Thats all for now ... Please provide your valuable comments or any suggestion and DON'T forget to vote :) It's Munish Gogna signing off now :)
December 1, 2010
· 68,564 Views · 1 Like

Comments

Saving development time with local EntityManager in JPA

Apr 29, 2012 · Munish Gogna

Thanks Henq for pointing it out, I was not aware of Arquillian, may be I will try it and see.
First mod_perl6 handlers ( for Apache )

May 11, 2011 · Gerd Storm

Hi Sandeep, Just posting the stuff out there and then forgetting to reply will not work, simply put a disclaimer saying "this is how I cleared the interview" which people like Alex can read out clearly before shitting. cheers, Munish
10 Tricky Java Interview Questions

May 11, 2011 · James Sugrue

Hi Sandeep, Just posting the stuff out there and then forgetting to reply will not work, simply put a disclaimer saying "this is how I cleared the interview" which people like Alex can read out clearly before shitting. cheers, Munish
Ready made Template for J2EE project (EJB+JPA+Hibernte+Richfaces) deployed on Jboss AS 5.1.0.GA

Apr 17, 2011 · Munish Gogna

point taken ... thanks :)
Yahoo!'s bet on Hadoop

Jan 13, 2011 · Kirill Grouchnikov

Nice article - I think the stage is now set :)
Creating and Deploying JAX-WS Web Service on Tomcat 6

Dec 09, 2010 · Munish Gogna

My latest artilcle on usage of Sphinxsearch using java api - http://java.dzone.com/articles/using-sphinx-and-java
Creating and Deploying JAX-WS Web Service on Tomcat 6

Dec 09, 2010 · Munish Gogna

My latest artilcle on usage of Sphinxsearch using java api - http://java.dzone.com/articles/using-sphinx-and-java
Creating and Deploying JAX-WS Web Service on Tomcat 6

Dec 01, 2010 · Munish Gogna

folks my next article will be based on Sphinx (free text search using Sphinx and Java), please let me know what ingredients you would like to have in that article? I plan to cover: 1. What is sphinx? 2. vs Lucene 3. Installing Sphinx 4. Using Sphinx in Java to handle free text search. 5. Working Code well that's all....
Creating and Deploying JAX-WS Web Service on Tomcat 6

Dec 01, 2010 · Munish Gogna

Yes you are right, it is just an example and I am just a starter in this area. I followed this approach just to make things easy to understand for people like me :(, my future articles will be based on maven 2 only. Thanks for pointing out.
Creating and Deploying JAX-WS Web Service on Tomcat 6

Dec 01, 2010 · Munish Gogna

Yes you are right, it is just an example and I am just a starter in this area. I followed this approach just to make things easy to understand for people like me :(, my future articles will be based on maven 2 only. Thanks for pointing out.
Creating and Deploying JAX-WS Web Service on Tomcat 6

Dec 01, 2010 · Munish Gogna

Yes you are right, it is just an example and I am just a starter in this area. I followed this approach just to make things easy to understand for people like me :(, my future articles will be based on maven 2 only. Thanks for pointing out.

User has been successfully modified

Failed to modify user

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: