Agile Message Management with C24 and MongoDB
AUTHORS
Matt Vickery - C24 Technologies & Incept5
Daniel Roberts - 10gen
Iain Porter - C24 Technologies & Incept5
ABSTRACT
This article presents and demonstrates a natural combination of two
enterprise software products - C24 Technologies' C24 iO Studio and
10gen's mongoDB. Both technologies will be introduced, key feature sets
outlined and a practical demonstration of their combined capability
provided through working code. The primary driver for promoting these
two technologies is that they form a powerful toolset that has
underpinned agile software delivery for several significant enterprise
applications that required non-trivial messaging and data
storage capability.
INTRODUCTION
Both C24 iO Studio and mongoDB are enterprise software products that
lead the way in their respective technology fields. This article will
articulate their primary features and explain why the technology support
they provide has become increasingly significant. It will quickly become
obvious that a class-leading, robust message parsing and transformation
capability coupled with a document-oriented database is a compelling
technological combination.
Furthermore, both of these products are highly flexible from an
architectural perspective. Application software can make use of both
technologies through simple-to-use APIs, as shown in the sample
project presented in this article. Both C24 iO and mongoDB have
wide support amongst the Spring community and so both can be used in a
Spring container context. Both products also enjoy support from the
enterprise software community and can be easily integrated, as an
example, into SpringSource's Spring Integration, MuleSoft's
Mule, RedHat's Fuse and Apache's Camel integration platforms using
adapters that those vendors have implemented and provide to their
customer base.
This article will present typical challenges facing technology
architects who are building applications that must gain a competitive
advantage through agile construction and rapid delivery, and will
advocate the right tools for that challenge.
TYPICAL DATA INTEGRATION CHALLENGES IN SOFTWARE SYSTEMS
Inherent in many Message Driven Architectures (MDA) is a messaging
toolkit that must deal with both simple and complex messaging, whether
it be, for example, FIX and SWIFT from the financial industry, ACORD
from the insurance industry or SS7 from the telecommunications industry.
Message Parsing & Business Validation Rules
Building software around message standards (whether formal, de facto or
custom) has generally proven to be highly error-prone and
costly. Building software parsers is error prone because it is highly
complex. Whilst message parsers are necessary to translate raw messages
into well-defined message models, a parser without validation capability
has limited use. Therefore, software is also required to apply business
rules that together form a statement of validation for each message.
Building software parsers and validation rule capabilities is costly
because many of these standards are frequently updated and you then have
to keep pace with that change. Keeping pace generally means updating,
testing and releasing code. Furthermore, for some message standards,
compliance failure can be very costly, for example SWIFT compliance
failure is calculated as a fixed value fine per message.
Message Transformation
A typical MDA use case is one where messages must be transformed from
one message type to another. Even with what appears to be direct
field mapping, data often has to go through a process of cleansing,
enrichment and type change. On a technical level, and as an example,
many developers have had experience of trying to resolve differences
between values based on different type systems. Because we have
different type systems, an XSD date type is different from a JDBC date
type is different from a Java date type is different from an ISO-8601
date type. Further examples of type system variance exist around numeric
types; this can be particularly serious for financial applications.
Although source messages often require some type of validation, an
entire set of rules is often required to validate target transformation
messages as well. The most agile and efficient approach would be to use a
single mechanism.
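As a minimal illustration of the date type-system differences described above, the following plain-Java sketch (using only the JDK's java.time types - this is not C24 iO code) bridges an ISO-8601/XSD-style date string and a legacy java.util.Date, the kind of silent type change a transformation layer performs on every mapped field:

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;

public class DateTypeBridge {

    // Parse an ISO-8601 / XSD-style date string (e.g. "2013-06-28") into a LocalDate.
    public static LocalDate fromIso(String isoDate) {
        return LocalDate.parse(isoDate);
    }

    // Convert to a legacy java.util.Date (midnight UTC) - the sort of type
    // change needed when a target system expects a different date type.
    public static Date toLegacyDate(LocalDate date) {
        return Date.from(date.atStartOfDay(ZoneOffset.UTC).toInstant());
    }

    // Emit the canonical ISO-8601 form again, regardless of the intermediate type.
    public static String toIso(LocalDate date) {
        return date.toString();
    }

    public static void main(String[] args) {
        LocalDate d = fromIso("2013-06-28");
        System.out.println(toLegacyDate(d) + " -> " + toIso(d));
    }
}
```

Even this trivial round-trip involves a time zone decision; scaled across hundreds of fields and numeric precision rules, this is exactly the work a transformation toolkit absorbs.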
The primary point made in this section is that data transformation is
never as easy as one might imagine. Tooling support, provided by experts
in the field, not only provides a competitive advantage through
improved time-to-market deliveries but also saves costs associated
with maintenance and getting parsing and validation wrong.
Key Motivators
With a motive to provide agile, robust and rapid time-to-market software
solutions, we are going to advocate using an industry leading messaging
toolkit instead of building bespoke parser, transformation and
validation rule capability.
Message Storage
Another key technology employed in a typical MDA solution is one that
provides message persistence. Most enterprise applications are required
to store messages for at least a short period of time. That storage
requirement may be triggered on message entry to the system or
subsequent to having been processed through business logic.
A number of short-term storage technologies and strategies are available
for selection. Any database must be able to perform adequately, cope
with schema change easily, scale out to commodity server infrastructure
and not cost more than the servers on which it runs.
Regarding long-term storage, and as an example, some applications used
by the finance industry have requirements to store messages for a number
of years in order to meet regulatory requirements. Some fairly typical
problems associated with this requirement are the cost of database
licenses and servers (a distinct archive DBMS is typically employed),
scalability for a growing data storage requirement and having capability
to cope with a statically defined schema being used within an
evolving business.
A number of vendors compete in this technology space. An important
differentiator for choosing a technology, and probably the first to
consider, is fitness for purpose. Although it is entirely possible to take a
message and build a 4th normal form relational model and then a schema
design from it, enterprises are beginning to demand access to
technologies that are more suited to agile delivery and storage in a
native structure.
Following on from fit-for-purpose requirements, the chosen storage
technology must support performance requirements, schema change through
business evolution, horizontal scalability (scale-out) and have a
reasonable license cost.
BREAKING AWAY FROM THE RDBMS - Agile operational data stores with flexible dynamic schemas
Introduction
During the last fifteen to twenty years, the Relational Database
Management System (RDBMS) has provided a capability that has seen it
become the dominant storage technology in the enterprise application
market. Vendor offerings in this space have become rich and plentiful;
they include Oracle RDBMS, IBM DB2 and Sybase ASE amongst others. Other
lesser known vendors have also appeared within the last 10 years but are
founded upon the same roots; a typical relational model implementation
along with a standards based Structured Query Language (SQL) provision.
Static Versus Dynamic Schemas
Modern software development requires a highly efficient response to
requirements change; this is usually achieved via an agile development
process. In order to be most successful, experience has proven that
agile development needs to be supplemented and facilitated with agile
products, frameworks and tools.
A great example of this is the significant development advantage that
can be leveraged by using dynamic schema capabilities such as those
provided by document databases. In order to express entities in a
document database, rather than formally specifying tables and attributes
in a static Data Definition Language (DDL) script, the document is the
schema.
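To make the "document is the schema" point concrete, consider two documents stored in the same collection; the second introduces a settlCurrency field with no DDL change, migration script or downtime. Field names here are illustrative, not taken from the sample project:

```json
{ "clOrdID": "ORD-1", "symbol": "VOD.L", "qty": 5000 }
{ "clOrdID": "ORD-2", "symbol": "BP.L", "qty": 1200, "settlCurrency": "GBP" }
```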
The dynamic schema capability also means that costly data migrations are not mandatory and maintenance burdens are eased.
Scalability
Most relational database vendors provide vertical scaling capability.
Vertical scaling, or scale-up, has proven to be very costly as it is
usually provided by purchase of bigger servers with more CPU, RAM,
faster storage and high-speed networking. The subsequent cost
of migration of data to the new bigger, faster server is also sometimes
substantial. If you have a requirement to scale out to multiple servers
because of growing data requirements, the task becomes increasingly
complex and, consequently, increasingly expensive. Furthermore, and from a
technical perspective, RDBMS solutions are table driven which makes
high-performance access along with sensible data distribution very
difficult to achieve.
Conversely, document databases naturally support scale-out, data is
generally co-located, and partitioning (or sharding) data across
distributed nodes is generally far less complex. Furthermore, rather
than requiring high-powered servers for scale-up, document databases can
be scaled-out using commodity hardware.
Performance
Regarding both document and relational databases, two aspects of
performance are interesting. The first aspect can be described by
considering a query that must navigate a single object versus one that
must traverse tables that need to be joined prior to projection. Reading
data from a single relational table is very fast but as soon as joins
across tables are required, performance significantly decreases compared
to the equivalent operation using a document database.
The second interesting aspect of performance can be described by
considering a growing installation of commodity hardware servers that
are being used to house big data sets. Because documents are stored
without undergoing normalization, it's easy to distribute documents
across very large clusters of servers and provide linear scalability for
both reads and writes.
ORM Layers
For some types of application, a key challenge that faces the relational
model (or at least its vendor implementation) is the much discussed and
debated 'impedance mismatch'. This challenge is typically overcome by
the introduction of a new layer that brokers between application objects
and relational tables - the Object Relational Mapping (ORM) layer.
ORM layers are key in the technology stack but only exist because two
technologies don't naturally fit together. Rather cleverly, some RDBMS
vendors extended their business around the necessity for ORM, consider
Oracle's TopLink product for example. As an ORM product, Hibernate
became popular, but many developers who required reasonable levels of
performance discovered that it performed very poorly compared to their
application software - it became a bottleneck. Relational tables and
joins have become such a key aspect of performance that very specialist
knowledge is required to design schemas, scale horizontally, write
(distributed and non-distributed) queries and tune databases around
them. Performance is further complicated by the introduction of an ORM
layer - you certainly can't ignore it.
Non-relational Data
As an extreme example of an application that does not fit typical
relational database facilities, designing a relational schema that would
store FpML or SEPA's ISO 20022 messages would require entities with
thousands of attributes spread across a huge number of tables. This is
completely impractical unless the chosen storage design treats
the message as a complete entity or document.
Costly & Heavyweight Product Sets
As RDBMSs have matured, some large vendors have added extensive
capability and built large organisations around their products. The
extensive capability now includes not just the core RDBMS but modelling
and design tools, management and monitoring tools, BI tools, cluster
management tools, analytic tools, public issue tracking and support
tools and highly skilled professional services. This software tool
capability has to be funded, and that is typically done through
licensing. Some RDBMS products have become so complex that vendor
professional services are required to design and tune them.
Architectural Results
Many customers requiring data storage for their documents are writing
software against a technology that forces the impedance mismatch to be
resolved. They resolve this technology mismatch by introducing an ORM
layer in their application, merely because they are using two
technologies that don't fit together. Furthermore, customers then
purchase an RDBMS from a mainstream vendor that has built a global
company around extended capability, much of which is not required for
simple document storage. Expensive consultancy is often required to get
the performance that customers need. Scalability is often restricted to
scale-up because data has been normalised into tables that can't be
easily distributed. This, in turn, removes potential for using commodity
hardware to scale-out.
A New Paradigm
The software industry is undergoing somewhat of an evolution in the
thinking around the fundamentals of storage technologies. For projects
that require a different fit between application messaging and the
underlying storage technology, document databases look very attractive
and are gaining significant traction in today’s agile driven market.
Several very interesting aspects arise from this:
- Dynamic schemas fit well with agile development and lessen the project development and maintenance burden.
- Object graph traversal type queries are computationally cheaper for document databases than the equivalent using relational structures.
- Scale-out, using commodity hardware is a more cost effective approach than typical RDBMS scale-up.
- An ORM layer is not necessary; there is no impedance mismatch problem to solve.
- Large, expensive product sets are not required - purchase and use only what you need.
- Queries can be expressed in much simpler terms; normalization is not exposed to the query writer through having to join tables together to build documents.
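As a sketch of the traversal point above, the following plain-Java example walks a nested order document to total its executions - work that would require a join between an orders table and an executions table in a relational schema. The class and field names are invented for illustration:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DocumentTraversal {

    // Build a nested "document" the way a document database would store it:
    // the order and its executions live in one object, with no join tables.
    public static Map<String, Object> sampleOrder() {
        Map<String, Object> doc = new HashMap<>();
        doc.put("clOrdID", "ORD-1");
        doc.put("executions", List.of(
                Map.of("execID", "EXEC-1", "qty", 3000),
                Map.of("execID", "EXEC-2", "qty", 2000)));
        return doc;
    }

    // Sum the filled quantity by walking the object graph - a single
    // in-memory traversal instead of a relational join and projection.
    @SuppressWarnings("unchecked")
    public static int filledQuantity(Map<String, Object> order) {
        int total = 0;
        for (Map<String, Object> exec : (List<Map<String, Object>>) order.get("executions")) {
            total += (Integer) exec.get("qty");
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(filledQuantity(sampleOrder()));
    }
}
```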
From RDBMS to Document Databases
There's no doubt that developers coming from an RDBMS & SQL
background will be challenged (albeit briefly) when attempting to
understand this new technology - it is a significant and fundamental
paradigm shift. However, learning to use document databases is proving
to be very compelling to developers because it delivers results quickly
and supports change instantly through dynamic schemas. Installation and
setup is usually trivial.
C24 INTEGRATION OBJECTS (IO) - DATA MODELLING AND MANAGEMENT
C24 Technologies is a software house specialising in standards-based
messaging and integration solutions aimed at the wholesale financial
services markets. C24 Integration Objects (C24 iO) Studio is a data
modelling, meta-data management, transformation and messaging
integration toolkit based on Java binding technology.
C24 iO Studio can be found in production use at more than twenty blue chip financial services customers worldwide.
Major features that result in C24 iO Studio being one of the leading players in its market space are:
- Graphical based message model construction using typical XSD like syntax, complex types, simple types and attributes.
- Graphical based transformation construction using source and target messages. Drag and drop mapping links between source and target messages, and apply functions on those links to perform updates, enrichment and type conversions on message fields. A large palette of functions is available for use by transformation designers.
- Out-of-the-box standards library support - this means that you can start processing complex financial messages without writing a single line of custom parser code. Furthermore, financial messaging standards can be enforced using C24 iO validation rules; these are also all included out-of-the-box. C24 Technologies maintains these standards libraries throughout the year; when new standards are published, the models are updated and released to the customer base.
- Rich validation facilities - for standards libraries or custom models, a set of rich validation languages exists, allowing you to go well beyond the capabilities of technologies such as the XSD constraint language.
- Each and every one of the standards-based message models is tested to a degree that would surprise even the most test-oriented developer.
10GEN MONGODB
MongoDB (from "humongous") is a scalable, high-performance, open source
NoSQL database. 10gen has an extensive list of customers that use
MongoDB across a number of vertical industries, including, but not
limited to: SecondMarket, Athena Capital Research, Equilar, SAP, MTV and
craigslist.
Key Features:
- Document-oriented storage - JSON-style documents with dynamic schemas.
- Full Index Support - Index on any attribute, just like you're used to.
- Replication & High Availability - Mirror across LANs and WANs for scale & reliability.
- Auto-Sharding - Scale horizontally without compromising functionality.
- Querying - Rich, document-based queries.
- Fast In-Place Updates - Atomic modifiers for contention-free performance.
C24 IO AND MONGODB TECHNICAL SAMPLE
Introduction
This section is a deep technical dive into a sample application that
demonstrates one potential mechanism for coupling C24 iO and mongoDB.
The sample uses a scenario that's manufactured to resemble a high-level
business operation.
Scenario
Before that technical deep dive, it would be useful to understand the
scenario from a business perspective. The essence of it is that a client
of a brokerage firm places orders, the broker then fills those orders.
Each client order (NewOrderSingle) may give rise to one or more
execution reports (ExecutionReport); orders can be filled with a single
execution or multiple individual executions.
Sample Messages
In order to support that scenario, a set of FIX NewOrderSingle and
ExecutionReport messages, which were generated from a front office
simulator, will be saved into a mongoDB database. All of the messages
used in this sample are provided as static data and are contained in two
files that can be found in the sample project
source (src/main/java/resources).
The ExecutionReports that are loaded represent simulated processing for
the collection of the NewOrderSingle messages. The core driver for this
sample was to demonstrate financial messages being parsed by C24 iO,
saved to a mongoDB database and then queried using mongoDB query
facilities.
Inbound Message Delivery
In a typical production scenario, messages would be received via a JMS
queue or a file reader as raw FIX; they then undergo a series of
operations:
- Message Parsing - C24 iO binds each message to a C24 Java FIX object; this code is provided by the C24 FIX libraries, so no custom parser code is required.
- Message Validation - Once parsed (bound), the message is validated to ensure that it is semantically correct.
- Message Transformation - Following validation, the message is converted from a C24 Java object to a mongoDB object.
- Message Persistence - The mongoDB object is then saved to MongoDB.
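The four steps above can be sketched in plain Java. This is a hedged, self-contained illustration - the class name, the token validation rule and the in-memory "collection" are all invented stand-ins; in the real sample, C24 iO performs the parsing and validation and Spring's MongoTemplate the persistence:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Invented stand-in for the parse -> validate -> transform -> persist pipeline.
public class MessagePipelineSketch {

    private static final char SOH = '\u0001'; // FIX field delimiter

    // Stands in for a mongoDB collection.
    private final List<Map<String, String>> store = new ArrayList<>();

    // 1. Parse - split raw FIX tag=value pairs delimited by SOH.
    public static Map<String, String> parse(String rawFix) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : rawFix.split(String.valueOf(SOH))) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return fields;
    }

    // 2. Validate - a token rule: tag 35 (MsgType) must be present.
    public static boolean isValid(Map<String, String> fields) {
        return fields.containsKey("35");
    }

    // 3 & 4. Transform + persist - here the parsed map already is the "document".
    public void save(Map<String, String> document) {
        store.add(document);
    }

    public int storedCount() {
        return store.size();
    }

    public static void main(String[] args) {
        MessagePipelineSketch pipeline = new MessagePipelineSketch();
        Map<String, String> msg = parse("8=FIX.4.4\u000135=D\u000155=VOD.L\u0001");
        if (isValid(msg)) {
            pipeline.save(msg);
        }
        System.out.println(pipeline.storedCount());
    }
}
```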
Sample Project Distribution
The sample project has been distributed in two forms, the source is
available on Github at:
https://github.com/C24-Technologies/c24-sample-mongo-trading. The first
distribution form is for Internet-enabled environments. Cloning the
Github project and running the usual 'mvn clean test' will download
dependencies, compile all of the application code and run the
integration test classes. The second distribution form provides the
sample project as a package that can be run in a non-Internet enabled
environment. The package is distributed as a zip file that needs to be
unpacked and run using the supplied ant build file or shell script. The
ant build file contains several interesting targets; they are clean,
compile, createNewOrders and createExecutionReports. Each of these
targets needs to be run in turn in order to populate the database with
data necessary for the queries to be executed. The default target
invokes all targets in the correct order automatically and so running
'ant' in the root directory of the project will complete the task.
Running the shell script './run.sh' will also achieve the same result.
Service Dependencies
Whichever distribution is executed, a mongoDB database must be running
and available for service. The directory src/main/java/resources contains a
database configuration file named mongoDB.properties. Connection
parameters for the mongoDB database that you plan to use need to be
configured in that file. Although you would always use authentication
credentials in a production system, none are necessary for this sample.
All that is required is the server (host) name, database name and port
number.
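A minimal mongoDB.properties might look like the fragment below. The exact key names are an assumption based on the description above, so check the file shipped with the project for the real ones:

```properties
# Connection details for the sample's mongoDB instance.
# Key names are illustrative - no authentication credentials are needed for this sample.
mongodb.server=localhost
mongodb.database=c24-fix-sample
mongodb.port=27017
```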
Spring Framework C24 iO Configuration Classes
As the Spring Framework is a popular IoC offering, C24 iO objects used
in this sample are configured using Spring Configuration classes. The
class biz.c24.io.mongodb.fix.configuration.C24Configuration contains all
of the C24 beans plus a Spring PropertyPlaceholderConfigurer (not
shown) that provides access to an external property file (it contains
the database details). Within this configuration class are several key
bean creation methods. The use of these beans will be discussed in the
sections below.
Spring Framework mongoDB Configuration Classes
The Spring Framework creates all of the mongoDB Java objects used in
this application during container instantiation. The configuration class
is very simple and is as follows.
The key mongoDB properties (database, port & server) have been
loaded by the Spring PropertyPlaceholderConfigurer bean defined in the
C24Configuration class. The Spring bean created by getMongoDB()
creates a connection to the database. The mongoDBTemplate bean provides
access to the mongoDB database instance through the normal Spring
template mechanism.
Running The Project
Inbound Message Delivery
For the purposes of this demonstration, the data is going to be loaded into mongoDB
via two data loader classes:
- biz.c24.io.mongodb.fix.application.NewOrderSingleDataLoader
- biz.c24.io.mongodb.fix.application.ExecutionReportDataLoader
These classes load the data from the files in src/main/resources/data-fixture by reading a single line at a time.
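A hedged sketch of that line-at-a-time loading, using only JDK classes (the class name is invented; the real loaders live in biz.c24.io.mongodb.fix.application):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LineMessageReader {

    // Collect one raw message per line, skipping blank lines,
    // mirroring how a data loader consumes a fixture file.
    public static List<String> readMessages(BufferedReader reader) {
        List<String> messages = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    messages.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    public static void main(String[] args) {
        BufferedReader sample = new BufferedReader(new StringReader("msg-1\nmsg-2\n\nmsg-3"));
        System.out.println(readMessages(sample).size());
    }
}
```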
Message Parsing
Parsing the String that represents the FIX message requires two classes:
- The source parser responsible for parsing the message
- The object class to populate
The C24 Parser
Each C24 iO parser extends the abstract
class biz.c24.io.api.presentation.Source; the FIX parser class is
FIXSource. A single instance of this class is required and so the
default Spring bean creation options are used.
C24 iO biz.c24.io.api.data.Element objects are used to tell the
C24 iO parsers (FIXSource in this sample project) which elements the
caller wants extracted from the message during parsing. These two
element beans will be used to tell the FIXSource parser that the caller
wants to receive an object representing a NewOrderSingleMessage and
also an ExecutionReportMessage.
Two utility beans have been created that perform the role of using
C24 iO code to parse and validate raw FIX messages; this is
boilerplate code and so has been captured within a Spring-style C24
template class (C24ParseTemplate).
Message Validation
C24 iO validation rules are defined on each message model, although of
course, they can be shared or re-used. Validation rules are invoked
through use of a validation manager. Again, a single instance is
required which gives rise to the following configuration.
Message Transformation
Once the message has been parsed into a C24 ComplexDataObject, it is
transformed into a mongoDB object prior to being persisted. See the
method asMongoDbObject() in the class
biz.c24.io.mongodb.fix.application.C24ParseTemplateImpl.
Message Persistence
Messages are persisted through use of Spring’s MongoTemplate.
Application Execution
Application code can be invoked in two different ways, firstly through
an application launcher and secondly through an integration test. This
section will follow application launcher code through the new order
creation process.
The application invocation sequence is as follows:
- Through the createNewOrders() method, the CreateNewOrderSingle class code loads the Spring container through specification of a context loader. The only configuration that needs to be loaded explicitly is the MongoDbConfiguration class context; C24Configuration.class is loaded as a direct dependency via an @Import({C24Configuration.class}) statement. The Spring context loader reads the two Spring @Configuration classes and creates all of the beans that have been defined.
- The createNewOrders() method gets the mongoDB template bean [line 35], as well as the C24ParseTemplate bean [line 36], from the Spring container. The same method then loads a file that represents a sample set of FIX NewOrderSingle messages from a source located on the project classpath. The method is now set up for work.
- The file containing the FIX messages contains multiple messages, one per line. The C24ParseTemplate class is called to parse each raw FIX message into a C24 iO Java object [line 45]; this is the bind() method invocation.
- The validation manager validates that the message is semantically correct [line 46].
- The C24ParseTemplate converts the C24 iO object into a mongoDB object [line 47].
- The mongoDB template is invoked with two parameters: the mongoDB object and the name of the collection to which the new document object is added [line 47].
The interesting section in this last code snippet is the code that
parses the raw FIX message [45] and the code that writes it into the
mongoDB database [47]. The C24ParseTemplate code takes a raw string
type message, performs some basic checks and parses that message into a
C24 iO Java object. There are two key methods in this class, the one
that binds the string to the Java object and the one that converts the
Java object into a mongoDB object.
In the bind(ComplexDataObject) method, Lines 21 and 22 show the
simplicity of using C24 iO code to parse a raw string into a C24 iO Java
object. The parser (or source) has a reader set on it. The reader
supplies the raw message as a string. At the moment that readObject(...)
is called, the parser looks for an instance of the element in the
string format message and passes it back to the caller as a C24 iO Java
object. Note that all C24 iO message objects extend the class
ComplexDataObject. This provides the significant benefit that all
C24 iO messages can be handled as a single type.
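The benefit of a shared supertype can be sketched as follows; these classes are illustrative stand-ins rather than the real C24 iO ComplexDataObject hierarchy:

```java
// One code path handles every message kind, whatever its standard,
// because all concrete messages share a common supertype.
public class SingleTypeHandling {

    abstract static class ComplexDataObject {
        abstract String messageType();
    }

    static class NewOrderSingle extends ComplexDataObject {
        String messageType() { return "NewOrderSingle"; }
    }

    static class ExecutionReport extends ComplexDataObject {
        String messageType() { return "ExecutionReport"; }
    }

    // One method parses, validates or persists any message via the supertype.
    public static String describe(ComplexDataObject message) {
        return "Handling " + message.messageType();
    }

    public static void main(String[] args) {
        System.out.println(describe(new NewOrderSingle()));
        System.out.println(describe(new ExecutionReport()));
    }
}
```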
The other interesting method in this class is asMongoDBObject(...),
which takes a C24 iO Java object (ComplexDataObject) and converts it to a
mongoDB document object. C24 iO Java objects can be emitted in any
format that's supported for a message: JSON, XML, CSV, Java source, etc.
The C24ParseTemplate class above contains an example of how you could
format any C24 iO Java object as XML; no transformation is necessary -
this is merely a formatting option.
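The "formatting, not transformation" idea can be sketched with a single in-memory message rendered two ways. The hand-rolled emitters below are illustrative only - the real C24 iO emitters are driven by the message model and are far more complete:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MessageFormatter {

    // Render the same field map as a JSON object.
    public static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(", ");
            sb.append('"').append(e.getKey()).append("\": \"").append(e.getValue()).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    // Render the same field map as an XML element - same content, different shape.
    public static String toXml(String root, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<" + root + ">");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append('<').append(e.getKey()).append('>')
              .append(e.getValue())
              .append("</").append(e.getKey()).append('>');
        }
        return sb.append("</").append(root).append('>').toString();
    }

    public static void main(String[] args) {
        Map<String, String> msg = new LinkedHashMap<>();
        msg.put("clOrdID", "ORD-1");
        msg.put("symbol", "VOD.L");
        System.out.println(toJson(msg));
        System.out.println(toXml("NewOrderSingle", msg));
    }
}
```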
The save() method of the mongoDB template is compiled code,
distributed as part of the Spring Data MongoDB library. The MongoTemplate
contains all of the methods that you typically need for inserting,
updating, deleting and querying a mongoDB database and its
collections. The method call in this example accepts a mongoDB object
as well as the name of a collection; if the collection does not exist, it
will be created on behalf of the caller. Once the save(...) method has
been called, the FIX documents will be present in the database. A
simple mongoDB findOne() query on the NewOrderSingles collection
reveals the following results:
Retrieving results through use of a simple query is useful for checking
that data has been inserted whilst developing an application. However,
when it comes to accessing production data, the task will be approached
with a different goal in mind.
SUMMARY
- This article began by exploring typical challenges for message toolkits and persistence mechanisms in today's software market, considering the necessity for low-cost, high-performance, scalable, robust and agile tools.
- Two key enterprise technologies were introduced, C24 iO Studio and mongoDB, that together, form a powerful partnership within Message Driven Architectures. Together they provide a fully featured messaging toolkit and document-oriented data storage.
- Key features of each product were explored along with driving forces that lead to their existence in today’s software market.
- An example implementation using both technologies has been created, presented in this paper and distributed through two mechanisms: one for Internet-enabled and one for non-Internet-enabled environments.
REFERENCES AND SOURCES
C24 INTEGRATION OBJECTS (IO)
To learn more about C24 Technologies and C24 Integration Objects
including datasheets, customer successes and reference implementations,
please visit www.c24.biz.
MONGODB
To learn more about 10gen and mongoDB, please visit www.10gen.com and www.mongodb.org.
Published at DZone with permission of Matt Vickery, DZone MVB. See the original article here.