Case Study – Scalable SOA for an Inmate Banking System
Background
Digital Solutions, Inc. (DSI) is focused on providing software and technology solutions for the corrections industry. This primarily means the record management software that manages a correctional facility’s population. An Offender Management System (OMS) records the daily life of an inmate from the time they are booked until the time they are released. Our system has undergone many changes over the years, starting as a DOS application written in Clipper and using dBase. During the Windows era, our software became a Win32 application connecting to an Oracle or MS SQL Server database. Finally, in its current incarnation, it is a Java-based n-tier application with a web front end that supports multiple data sources on the back end.
A large portion of the OMS application is devoted to inmate finances. The majority of correctional facilities in the U.S. provide a way for an inmate to have a private trust account for their exclusive use while incarcerated. Money in this account can be used to pay fines and court costs, purchase items from a commissary, or purchase time on the facility’s inmate telephone system. DSI has a sister company, Inmate Telephone, Inc. (ITI), which provides inmate telephone service to many of the facilities where our OMS software is installed. DSI/ITI built a large billing application in-house that handled the transfers from the distributed jail or prison sites (via dial-up networking) and allowed us to accept payments on accounts via our call center. Our live operators would take a call and manually enter credit card numbers into a standard merchant keypad. The process was very time-intensive until about mid-2003, when we developed our first in-house service to process credit cards. Approximately two years later we released a website that enabled users to manage their telephone account online.
Since DSI/ITI already had development experience building telephony systems, we decided to write an interactive voice response (IVR) system to take as many operators out of the loop as possible and to enable our customers to check their balance and make payments 24/7. A central idea in SOA is not to duplicate services, so we decided to expose certain functions of the web site as separate lightweight Web services and to call these Web services from within the telephony application.
The website is deployed on redundant JBoss 4.2 instances, fronted by an Apache 2.0 web server that performs our load balancing. The web server and the JBoss blade servers all run Windows Server variants. The JBoss blades are secured by a firewall that allows network access only to our internal Oracle databases and to a completely separate network segment where our credit card processing service is hosted.
The ITI website was never really written with future integration in mind. We used a proprietary web framework for the application that can best be explained as a cross between Struts and Spring Webflow, although neither project was actually used to build or design our platform. But since a strict separation between the business logic and presentation layer was enforced, it was not difficult to add a new presentation type for integration. The concept of an ‘action’ is central to how our platform works. An action can be thought of as a discrete unit of work, e.g. ‘Login’, ‘Logout’, ‘Make Payment’, etc. The set of actions available for execution at any given time is governed by the actions the user’s session has already executed. This provides a last-line-of-defense security measure, since an action only becomes available for execution once all the proper paths leading up to it have been followed.
In a web application, the user’s path is fairly easy to model in this fashion. If the user clicks on Login, the application verifies the credentials and then presents a screen that reflects the various areas of the application to which the user can navigate. Once there, the user can go to the ‘Statements’ section, the ‘Make Payment’ section, or ‘Account Profile’. Based on the user interaction, the internal state of the web application changes to reflect the new actions available to the user. This handling is analogous to the controller in an MVC framework.
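The gating described above can be sketched in a few lines of Java. This is only an illustration, not the proprietary framework: the class name `ActionSession`, the enum values, and the transition rules are assumptions chosen to mirror the actions named in the text.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch: each session tracks which actions it may run next. */
final class ActionSession {
    enum Action { LOGIN, LOGOUT, STATEMENTS, MAKE_PAYMENT, ACCOUNT_PROFILE }

    // A brand-new session can do nothing except log in.
    private Set<Action> available = EnumSet.of(Action.LOGIN);

    /** Runs the action only if the session's history allows it. */
    boolean execute(Action action) {
        if (!available.contains(action)) {
            return false; // the proper path to this action was not followed
        }
        available = nextActions(action);
        return true;
    }

    /** Assumed transition rules: logging out resets the session to LOGIN only. */
    private static Set<Action> nextActions(Action executed) {
        if (executed == Action.LOGOUT) {
            return EnumSet.of(Action.LOGIN);
        }
        // After login, or after any in-application action, the main areas stay reachable.
        return EnumSet.of(Action.STATEMENTS, Action.MAKE_PAYMENT,
                          Action.ACCOUNT_PROFILE, Action.LOGOUT);
    }
}
```

With rules like these, a request for `MAKE_PAYMENT` on a fresh session is rejected, while the same request after `LOGIN` succeeds.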
The Transition to a Service-Based Architecture
Adapting this framework to a Web service model was pretty straightforward; however, there is no state associated with the Web services, so we flattened out the action model, removing the hierarchy of available actions. And since the expected deployment of this Web service was limited to our own applications, we chose to eschew complicated standards such as SOAP. Quite simply, we were going to mimic what happens when a user clicks on a link or submits a form in our application. Instead of returning an HTML page, we’d return simple XML, without DTDs or schema validation.
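A flattened, stateless action model of this kind might look roughly like the following. This is a sketch under assumptions: the `action` parameter name, the `ActionDispatcher` class, and the shape of the XML response are all hypothetical, since the source does not show the servlet or its output format.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a flattened, stateless action dispatcher. */
final class ActionDispatcher {
    // Every action is reachable directly; there is no hierarchy and no session state.
    private final Map<String, Function<Map<String, String>, String>> handlers = new HashMap<>();

    void register(String action, Function<Map<String, String>, String> handler) {
        handlers.put(action, handler);
    }

    /** Looks up the "action" request parameter and wraps the result in plain XML. */
    String dispatch(Map<String, String> params) {
        Function<Map<String, String>, String> handler = handlers.get(params.get("action"));
        if (handler == null) {
            return "<response status=\"error\"><message>unknown action</message></response>";
        }
        return "<response status=\"ok\">" + handler.apply(params) + "</response>";
    }

    /** Escapes the characters that are unsafe inside XML element text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because each call carries everything it needs in its parameters, any handler can be invoked on its own, which is the same property that puts every action within reach of any caller who can reach the servlet.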
This approach opened another can of worms regarding security. A side effect of flattening out the hierarchy of actions was that the ‘Make Deposit’ command would be available without having to log in. Even though a credit card was required to actually make a deposit, we still didn’t want people circumventing the login page for this. Never underestimate the desire of individuals to cheat the system!
One answer to this security problem ended up being quite simple because of how the application was designed and deployed. Since we were using an action that was sent in as part of a parameter set, there was only one servlet endpoint to contend with. Based on our use cases, there was no need for further granularity of the rights; i.e. if you could access the servlet, you could execute any of the actions. The easy answer was to simply configure the Apache server that was acting as our load balancer to only allow requests to that particular servlet from within our local domain. We added a location directive from within our Virtual Host configuration to accomplish this quite easily:
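<Location /servletName>
    # Evaluate Deny first, then let the Allow lines override it
    Order deny,allow
    Deny from all
    Allow from 172.22.0.0/16
    # 192.168.0.0/16 is the RFC 1918 private range
    Allow from 192.168.0.0/16
</Location>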
This configuration worked fine for us, until the need for a new product pushed the requirements of the Web services to outside our LAN.
Distributed Data Synchronization
The new product was for inmate banking. Specifically, a member of the general public could place money on an inmate’s account via a web site or by walking up to a kiosk in the facility’s lobby. We faced quite a few hurdles setting this up. The first was that each facility kept its inmate database locally, and there was no central repository for the data, nor a shared network. Many of these facilities are distributed throughout the country, and most are not even on a state- or county-level network. Since there was no way to make a business case for providing private networks to these facilities, we had to rely on the internet via readily available and inexpensive DSL and cable circuits.
Many of the specifics regarding the synchronization between the facility and our data center were postponed until later in the process, but what we did know was that 1) the inmate data would be populated into our database somehow, and 2) we would export the payments made to an inmate at a particular facility somehow. So we blissfully went about creating our application on the same model as the Inmate Telephone Deposit system. We used Apache on the front end of a few JBoss instances running on a few blades. mod_jk and the AJP protocol were used between Apache and the blades, just like before.
Once we completed substantial development of the web site, we turned our attention back to the data synchronization and service piece. Again, we flattened the action model, exposing a single servlet that could respond to any of a variety of actions. For this case, we had just three actions: one to add or edit an inmate, one to get untransferred deposits, and a third to change the status of a deposit to transferred. That would be enough for synchronizing the inmates and moving the deposits, but we also wanted deposits to be made remotely. We had concurrently developed a kiosk appliance that could accept cash and credit cards to make a deposit. The appliances were relatively dumb and were expected to get all of the relevant data from a remote source, as they had no internal database. For this service we had actions like ‘Search Inmates’, ‘Get Inmate Details’, ‘Validate Credit Card’, and ‘Make Deposit’. In true SOA fashion, all the business logic was contained within a single point in our enterprise, the InmateBanker website.
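The three synchronization actions can be modeled in memory to show how they fit together. This is a hypothetical sketch: the class `DepositSync`, its method names, and the use of cents for amounts are assumptions, and the real actions were servlet calls backed by a database rather than Java collections.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the three synchronization actions. */
final class DepositSync {
    static final class Deposit {
        final int id;
        final String inmateId;
        final long cents;
        boolean transferred;

        Deposit(int id, String inmateId, long cents) {
            this.id = id;
            this.inmateId = inmateId;
            this.cents = cents;
        }
    }

    private final Map<String, String> inmates = new LinkedHashMap<>();
    private final List<Deposit> deposits = new ArrayList<>();

    /** Action 1: add a new inmate or update an existing one. */
    void saveInmate(String inmateId, String name) {
        inmates.put(inmateId, name);
    }

    /** Records a deposit made through the web site or a kiosk. */
    Deposit makeDeposit(String inmateId, long cents) {
        if (!inmates.containsKey(inmateId)) {
            throw new IllegalArgumentException("unknown inmate: " + inmateId);
        }
        Deposit deposit = new Deposit(deposits.size() + 1, inmateId, cents);
        deposits.add(deposit);
        return deposit;
    }

    /** Action 2: list the deposits the facility has not yet pulled down. */
    List<Deposit> untransferredDeposits() {
        List<Deposit> pending = new ArrayList<>();
        for (Deposit deposit : deposits) {
            if (!deposit.transferred) {
                pending.add(deposit);
            }
        }
        return pending;
    }

    /** Action 3: mark a deposit as transferred once the facility has stored it. */
    boolean markTransferred(int depositId) {
        for (Deposit deposit : deposits) {
            if (deposit.id == depositId && !deposit.transferred) {
                deposit.transferred = true;
                return true;
            }
        }
        return false;
    }
}
```

One plausible reason for splitting the fetch and the status change into separate actions is that a deposit stays in the untransferred list until the facility confirms it, so a dropped connection leads to a repeat fetch rather than a lost deposit.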
We combined our lists of actions and set about writing the business logic in our application to handle them all. One significant benefit of stateless Web services like these is ease of testing. We could speed through development and unit testing since each action was independent of the others. And since we were essentially mimicking a browser, we could perform ad hoc testing simply by typing a URL into our browser and viewing the resulting XML.
Security with EJBCA
We embedded the authentication information within the list of parameters, preferring to let the transport layer perform the authorization, as we had done with our internal Web service. The difference here was that we couldn’t simply rely on the IP addresses of the clients. We tossed out the idea of maintaining a list of our customers’ IP addresses, quickly realizing that it would be a long-term maintenance problem, as some of our smaller clients may be running on dynamic addresses assigned by their provider. Additionally, since we would be letting in any request from that IP address, we would essentially be opening up the Web service to any machine on that LAN.
We turned to using SSL with Client authentication. It was the perfect fit. Our Apache server already had OpenSSL compiled in, and we already had the virtual hosts set up, and the corresponding location directives within it. Requiring a client certificate was as easy as adding this location directive within our SSL enabled virtual host:
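A minimal sketch of such a directive, assuming mod_ssl and reusing the placeholder /servletName path from the earlier example; the verification depth shown is an assumption:

```apache
<Location /servletName>
    # Refuse the request unless the client presents a certificate
    # signed by a CA the server trusts
    SSLVerifyClient require
    # Assumed depth: client certificates issued directly by the CA
    SSLVerifyDepth 1
</Location>
```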
This simple directive meant that a client must first present a valid client certificate in order to access the resource. We did not wish to maintain a whitelist of allowed certificates, nor did we wish to pay a fee to a commercial certificate authority for each certificate we deployed in the field. The solution was to run our own Certificate Authority (CA). Fortunately, a fantastic open source tool, EJBCA, had already done the heavy lifting for us. We deployed EJBCA within our LAN and created two different Certificate Authorities, one for the inmate telephone side of our business and one for the inmate banking side.
We added directives within our Virtual Host to specify the CA certificates that we accept for client authentication. We also specified a directive for the Certificate Revocation List (CRL), which EJBCA can create and manage for us:
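A sketch of what these directives can look like with mod_ssl. The file paths are hypothetical; the source does not give the actual locations, only that the CA certificate and the CRL both come from EJBCA:

```apache
# Hypothetical file locations; both files are PEM exports from the CA
SSLCACertificateFile conf/ssl/inmate-banking-ca.pem
SSLCARevocationFile  conf/ssl/inmate-banking-ca.crl
```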
The CRL is particularly interesting because it gave us the ability to instantly revoke a certificate if a kiosk machine were to be physically compromised.
Once we had the CA created in EJBCA, we could start issuing certificates from it. Each kiosk, which was a consumer of two separate services, received two certificates, one for each side of the business. Individuals could place money on their phone account, or on an inmate’s account from one single application on the kiosk. A third certificate was needed for the service that synchronized the database at the facility with our database in the home office. This application was created as a supplemental application that ran elsewhere on the facility’s LAN.
The EJBCA application essentially had two sides to it. One was for the administration and creation of certificates, and the other was for downloading of certificates to browsers, files, and via URL downloads. We exposed only the ‘public’ facing side of EJBCA outside of our firewall, leaving the administration component within the borders of our LAN. Our reason for doing this was so that the distributed applications could do as much of the work as possible in handling their own certificates.
Both the kiosk and our synchronization piece (internally called the IEngine) were written in Java and used the HttpClient project for much of the web communication. HttpClient does not have native support for client certificates; however, the authors were nice enough to provide reference samples that work almost out of the box: AuthSSLProtocolSocketFactory.java and AuthSSLX509TrustManager.java. We wrote a wrapper class that used these classes along with the HttpClient Protocol class.
Our Java looked like this in our helper class:
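/* First two parameters are for client authentication, second two are for server authentication. */
/* keystoreUrl, keystorePassword, truststoreUrl and truststorePassword are placeholder names (URL, String, URL, String). */
AuthSSLProtocolSocketFactory ssl = new AuthSSLProtocolSocketFactory(
    keystoreUrl, keystorePassword,
    truststoreUrl, truststorePassword);
/* 443 is assumed here as the default HTTPS port. */
Protocol authhttps = new Protocol("https", ssl, 443);
/* Build the actual client and try to connect. */
HttpClient client = new HttpClient();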
The code then proceeded to use the client object as one normally would use it. We created a GetMethod object based on the remotely deployed server name, and used a NameValuePair array to pass the parameters.
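The contributed socket factory pairs client key material (to prove who the kiosk is) with trust material (to verify the server). The same pairing can be sketched with JDK classes alone. This is not the HttpClient sample code, only an illustration of the idea using `javax.net.ssl`; the class name `ClientAuthContext` and its parameters are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

/** Hypothetical helper: builds an SSLContext that can present a client certificate. */
final class ClientAuthContext {
    static SSLContext build(KeyStore clientKeys, char[] keyPassword, KeyStore trustedCas)
            throws GeneralSecurityException {
        // Client authentication: the key store holds the client's certificate and private key.
        KeyManagerFactory keyManagers =
                KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        keyManagers.init(clientKeys, keyPassword);

        // Server authentication: the trust store holds the CA certificates we accept.
        TrustManagerFactory trustManagers =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        trustManagers.init(trustedCas);

        SSLContext context = SSLContext.getInstance("TLS");
        context.init(keyManagers.getKeyManagers(), trustManagers.getTrustManagers(), null);
        return context;
    }
}
```

The socket factory obtained from `context.getSocketFactory()` can then be handed to whichever HTTP client is in use.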
After the request was submitted, the response code was examined. A 200 response meant all was well and the response could be passed up to the application for handling. This didn’t necessarily mean that the actual content of the request was valid, only that the server and client handled everything as instructed.
If a 403 Forbidden status code was returned, it meant that Apache had denied our certificate. We expected the most common cause of this to be a revoked certificate; however, after production deployment we found out it was something else. I’ll get to that a little later.
Receiving a Forbidden code would trigger the application to check whether a new certificate had been issued for this particular client. Each device had a specific ID assigned to it, and it used this ID to call the EJBCA Web service to try to download a new certificate. If a new certificate was available for this client, it was saved to disk, overwriting the previous one. Otherwise, the individual using the kiosk was given an out-of-service error and the kiosk could no longer function normally. Since EJBCA allows a certificate to be downloaded just once, reissuing a certificate requires manual intervention by our Operations staff. But once it is reissued, the client will automatically download and save it. Requiring manual intervention for a reissue is a critical security step, since a rejected certificate is abnormal behavior and really should be investigated.
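The status handling described above reduces to a small decision function. The sketch below is illustrative only: `ResponseHandler`, the two interfaces, and the outcome names are hypothetical, and the `OTHER_ERROR` branch is an assumption because the text only describes the 200 and 403 cases.

```java
/** Hypothetical sketch of how a client might react to the HTTP status of a service call. */
final class ResponseHandler {
    enum Outcome { PASS_TO_APPLICATION, CERTIFICATE_REPLACED, OUT_OF_SERVICE, OTHER_ERROR }

    /** Stand-in for the call that asks the CA for a reissued certificate. */
    interface CertificateSource {
        /** Returns the new certificate bytes, or null if none has been reissued. */
        byte[] fetchNewCertificate(String deviceId);
    }

    /** Stand-in for the on-disk location of the client certificate. */
    interface CertificateStore {
        void save(byte[] certificate);
    }

    static Outcome handle(int status, String deviceId,
                          CertificateSource source, CertificateStore store) {
        if (status == 200) {
            // Transport succeeded; the payload still has to be validated by the application.
            return Outcome.PASS_TO_APPLICATION;
        }
        if (status == 403) {
            byte[] certificate = source.fetchNewCertificate(deviceId);
            if (certificate == null) {
                return Outcome.OUT_OF_SERVICE; // nothing reissued: needs operations staff
            }
            store.save(certificate); // overwrites the previous certificate
            return Outcome.CERTIFICATE_REPLACED;
        }
        return Outcome.OTHER_ERROR; // assumed catch-all, not described in the text
    }
}
```

Keeping the certificate lookup behind an interface makes the 403 path easy to exercise in a unit test without a running CA.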
The Perils of Caching
Going back to the 403 error, we quickly realized that it was not being caused by a denied certificate. It was related to a cache setting in Apache. Apache can use an internal cache to speed up SSL processing. A directive, SSLSessionCache, controls how this is performed. The valid options are an in-memory cache, an on-disk cache, and none. The Apache documentation states that these caches are used to speed up parallel requests from a client. However, there are a few documented problems using the caches with client certificates, as evidenced by Google searches. The issue went away when we changed the directive to not cache. The Apache documentation states that this causes a noticeable performance penalty. However, I’d prefer the code to work correctly, rather than fail faster! The issue would manifest itself as Apache denying the handshake phase of the communication. The Java application running on the client side would then get confused even further, and only a restart of the Java app would cure it. I never narrowed down which side initiated the corruption, since a restart of the Java application created a blank slate on the Java side and a new connection and session within Apache. All I know is that disabling the SSLSessionCache made my problems go away, and that’s a good thing.
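For reference, turning the cache off is a one-line change in the main server configuration. A sketch, assuming the directive is not already set elsewhere:

```apache
# Server-wide setting (outside any <VirtualHost>): disable the SSL session cache
SSLSessionCache none
```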
Let me do another diagram and summarize what we were able to accomplish without much difficulty, and fairly quickly, with a time- and resource-constrained development team.
Key to our entire infrastructure is judicious use of the Apache Web Server. Apache provides frontline security by protecting the resources behind it utilizing a combination of access control rules and client-based certificates. At the same time, Apache serves as the front end for two independent e-commerce sites that are open to the general public. And although Apache is a single point of failure, it is completely stateless in this architecture, meaning it can be replaced with a passive spare by flicking a switch, causing little to no downtime to the enterprise.
Within a special DMZ of our network lie two sets of Java EE applications running on redundant servers. These applications handle requests from the kiosks in the field, as well as general public usage over the internet. The applications are oblivious to the security in front of them and are concerned only with performing business logic. Two additional services round out the infrastructure. The first handles all credit card processing in a secure network that is separate from the application server LAN, the corporate LAN, and any other LAN within our walls. The final critical component is the EJBCA certificate server, which enables the entire secure service architecture.
At this point, we have kiosks deployed in a geographically large area from New Mexico to Maine. The kiosk devices can be shipped from our office to the customer and are plug-n-play from the customer’s perspective. Most importantly, customer satisfaction for the kiosk is very high as it reduces their workload and allows the facility to focus on the central mission of protecting the public and the inmates contained within its walls.
Development was completed and the first kiosk was deployed within a nine-month timeframe, with primarily a single developer available at any given time. This includes the time to develop the kiosk software, the server-side application logic, and the deployment design.
Conclusion
While our approach may not be classified as a classic SOA design, because we didn’t buy someone’s product to do it, it embodies many of the desirable attributes that SOA strives to achieve. Business logic is centralized in a group of core servers that provide discrete, well-defined actions for clients to invoke. The services are loosely coupled, allowing multiple versions of kiosk software and Web site software to coexist peacefully. This architecture also provides a standardized way of controlling access to resources via a well-known product, Apache. The same architecture was overlaid on two distinct business processes, giving both of them a common point of interaction, even though their purposes are different. The kiosk device, as a consumer of the services, integrates the two into one common interface, while our phone IVR provides yet another interface into the same data.
As we continue to add clients to our infrastructure, we have yet to reach the point where any component of our system is saturated. If I had to guess, the Apache server would be the first point where we would start to experience load. At that point, we will probably add a second Web server that is an active clone of the first, and place hardware load balancers in front of them. But until then, DSI/ITI will continue to get the best bang for the buck out of our design.