Various optimization techniques exist to address runtime service performance concerns. In this article, Cory Isaacson explores these established design options and contrasts them with software pipelines, a new technology developed to accommodate service usage demands without compromising the flexibility and agility that are so important to achieving strategic SOA goals.
As organizations continue to accelerate growth, business applications are consistently required to run faster, support more users, and process more transactions. With limited budgets and no easy way to expand data center capacity, IT executives are faced with the increasingly difficult challenge of squeezing more performance and greater utilization out of existing systems - all while delivering greater flexibility so that systems can be adapted in response to rapidly changing business needs.
These challenges are driving today’s IT executives to:
• Find faster and more efficient ways to process growing volumes of data without increasing capital expenditures.
• Build flexibility into business systems so that new services can be efficiently provisioned in order to capitalize on new market opportunities or comply with emerging regulatory requirements and industry mandates.
• Quickly adjust allocated resources for IT services in order to respond to unpredictable spikes in user demand.
• Drive down IT costs by improving system utilization and IT efficiency.
Runtime performance has always been a primary concern for the professional developer. Throughout IT’s history there have been numerous techniques, approaches, and even architectures offered to specifically address performance and scalability. While the promise of SOA surpasses that of previous IT architectural models in terms of flexibility, adaptability, and strategic gain, it is known for imposing performance demands. Therefore, with service-oriented application development now on the forefront, performance optimization is receiving more attention than ever.
Common factors behind these performance demands include the following:
• Due to the increased emphasis on reuse and agnostic design, services are subject to unprecedented levels of concurrent usage demands.
• The very notion of loosely coupled services implies a messaging-centric communications framework, meaning that applications must now handle not only traditional processing logic, but also message transmission, validation, interpretation, and generation.
• As the use of service-oriented concepts proliferates, messaging volumes are expected to explode, adding tremendous load that will push the limits of IT infrastructures.
Furthermore, the learning curve imposed by these paradigms often resulted in deployed applications that failed to meet performance expectations and, in many cases, required newly developed applications to be prematurely retired.
The Evolution of the CPU
In the past, developers have been able to rely on rapid advances in CPU performance to compensate for the lack of software efficiency in their business applications. With CPU clock speed doubling every 18 months or so, an upgrade of the hardware environment often provided enough of a performance boost to keep up with the growing need for application throughput.
More recently, gains in CPU clock speeds have hit a plateau due to uncompromising physical factors such as power consumption, heat thresholds, and quantum mechanics. Current trends in hardware platforms have thus shifted the focus to multi-core or multi-threaded architectures.
Distributed service-oriented applications, by their nature, take advantage of multi-CPU and multi-server architectures. However, for software applications to truly leverage multi-core platforms, they must be designed and implemented with an approach that emphasizes concurrent processing. This new approach, based on a methodology called software pipelines, can enable businesses to achieve the benefits of concurrent processing without major redevelopment effort.
In order to accomplish dramatic “multiples of performance,” applications must execute more than one task at a time. While this may be obvious, it is not easy to accomplish. Even service-oriented applications, which are already distributed in nature, may be designed to use a serial approach to processing business logic so that the proper order of execution is maintained. It can be difficult to decompose the business logic of the application into a series of steps, some of which can then run concurrently.
Historically, the computer science field has performed extensive research and developed many techniques for building concurrent architectures. Yet past research and development has concentrated on specific areas that do not easily lend themselves to the transactional applications of today’s business systems.
Therefore, while the need for substantial performance improvements of business applications clearly exists and is becoming more pronounced due to the wide-spread transition toward SOA, existing concurrent processing techniques are either limited in applicability or too complex to incorporate. There are four primary approaches that have been developed for concurrent computing, each of which is discussed in the following sections.
Symmetric Multi-Processing (and Mechanical Solutions)
Mechanical solutions at the operating-system level have no doubt benefited many organizations to date by providing a generic one-size-fits-all approach to concurrent computing. Symmetric Multi-Processing (SMP) platforms automatically split running application tasks among multiple processors on a single physical computer, sharing memory and other hardware resources. This approach is highly efficient and easy to implement, as the application developer needs no detailed knowledge of how the SMP divides the workload.
For SMP to be effective, however, a software application must be written using multi-threaded logic. This is a tricky task that is not generally practiced by corporate IT developers. Furthermore, the tight sharing of resources between processors is both limiting in terms of performance and problematic when applied to business application needs. Shared resources become a bottleneck at some level of scalability, as the necessary locking of resources in this type of system is not optimized for any particular application.
Therefore, a given application may scale well to eight processors, but benefit very little from applying 16 processors to the problem. In addition, resource contention (such as shared software components) can be very difficult to debug within this type of black box environment.
Figure 1: An SMP server operating system manages the workload distribution across multiple CPUs.
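The diminishing return in the eight-versus-16 processor example can be approximated with Amdahl's law. The article does not invoke that law by name, so treat this as a rough model rather than the author's analysis. The 10% serialized fraction is an assumed figure chosen only for illustration, standing in for time spent waiting on locked shared resources.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work cannot run in parallel."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


# Assumption for illustration: 10% of the work is serialized by shared-resource locking.
SERIAL_FRACTION = 0.10

for n in (1, 2, 4, 8, 16):
    print(f"{n:>2} processors -> {amdahl_speedup(SERIAL_FRACTION, n):.2f}x")
```

Under this assumption, doubling from eight to 16 processors raises the theoretical speedup only from roughly 4.7x to 6.4x, and real lock contention can make the gain smaller still.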
Automated Network Routing Solutions
Systems enabled with automated network routing divide application requests using some form of predetermined logic. A common approach is “round-robin” routing, where requests are evenly distributed, one after the next, among a set of physical computers that provide exactly the same application functionality. A good example and use case for this type of concurrent architecture is a Web server, wherein each Web page request is delivered to one of several available processors. While the approach can be useful, it is highly limited as the router has no concept or logic for determining the best path for a given request. Furthermore, all downstream processors perform identical processing tasks.
Figure 2: In a round-robin network routing approach service requests are routed to individual servers in a pool of redundant servers.
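Round-robin routing as described above can be sketched in a few lines. The class name `RoundRobinRouter` and the server names are hypothetical. The point of the sketch is that the router never inspects the request, which is exactly the limitation noted in the text.

```python
from itertools import cycle


class RoundRobinRouter:
    """Hands each request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def route(self, request):
        # The request content is ignored: the router has no notion of a "best" path.
        return next(self._rotation)


# Hypothetical pool of identical web servers.
router = RoundRobinRouter(["web-1", "web-2", "web-3"])
assignments = [router.route(f"GET /page/{i}") for i in range(5)]
print(assignments)
```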
Clustering
Clustering is a widely used technique that allows physically separate computers to share the workload of an application over a network. Clustering provides some capabilities for automatic concurrent processing and is also used to support fail-over and redundancy. In this scenario, redundant resources are replicated across the network, resulting in a highly inefficient approach. Because clustering techniques are automated, they must copy everything from one node in a cluster to another whenever a change in state occurs. Alternatively, they must rely on a centralized resource (such as a relational database), which can become an even more serious bottleneck.
Figure 3: Within clustered systems, multiple servers share common resources over a private "cluster interconnect".
All of these techniques serve a purpose, yet each is limited when it comes to massive scalability, particularly when considering the needs of transaction-based, message-centric solutions. In essence, they can only scale mechanically and automatically to a certain level, at which point the overhead of maintaining shared or redundant resources becomes more of a burden than the resulting performance improvement.
Grid Computing
The formation of a resource grid can achieve great scalability by distributing discrete tasks across many machines in a network. In a grid computing environment, it is left to the developer to decide how best to:
1. Divide a single large task into smaller sub-tasks.
2. Utilize the grid environment to distribute the processing.
3. Reassemble the results once processing is complete.
The typical grid architecture includes a centralized task scheduler for distributing and coordinating the tasks with other computing facilities across the network. It has been shown that a grid approach can deliver far higher throughput than the automated approaches described earlier. However, this option can also place an increased burden on the developer due to the previously listed responsibilities.
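The three developer responsibilities listed above (divide, distribute, reassemble) can be illustrated with a small sketch. This is not a grid framework: a local thread pool stands in for the grid nodes and the scheduler, and the function names and the sum-of-squares workload are assumptions chosen to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def split(numbers, parts):
    """Step 1: divide one large task into roughly equal sub-tasks."""
    size = max(1, -(-len(numbers) // parts))  # ceiling division, at least 1
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def sum_of_squares(chunk):
    """The sub-task each node would run independently."""
    return sum(n * n for n in chunk)


def run_on_grid(numbers, nodes=4):
    chunks = split(numbers, nodes)
    # Step 2: distribute the sub-tasks; the thread pool stands in for grid nodes.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    # Step 3: reassemble the partial results into the final answer.
    return sum(partials)


total = run_on_grid(list(range(1, 101)))
print(total)
```

The workload here divides cleanly because each chunk is independent of the others, which is the property that business transactions with ordering constraints often lack.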
Most importantly, grid computing has been modeled primarily to solve the “embarrassingly parallel” problem – long-running, computation-intensive processes commonly found in scientific or engineering applications. Typical and productive examples of grid computing applications include the modeling of fluid dynamics, the tracing of the human genome, and complex financial analytics simulations. Each of these application areas has the common characteristic of dividing a massive, long-running computation among multiple nodes, decomposing the problem into smaller, similar tasks that tend to behave in a predictable manner when considering computational resources.
Figure 4: Within a grid computing platform applications are divided into sub-tasks that can execute independently.
Limitations of Traditional Concurrent Processing Approaches
There are three primary reasons that service-oriented business applications do not lend themselves to traditional concurrent processing techniques.
1. Order of Processing is Critical
Business logic must be performed in a specific sequence to ensure the integrity of the business process. In many cases, applications implement a “first in/first out” (FIFO) queue by waiting for each transaction to be completed before the next one in the queue is processed. For example, a billing application cannot compute the total cost of a bill before it has looked up the rates that apply to the customer and computed subtotals for each different category of services. For a mobile phone bill, the business logic would need to know the total daytime, evening, and weekend minutes before it can compute the bill total.
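The dependency in the mobile phone example can be made concrete with a short sketch. The rates, category names, and the `compute_bill` function are hypothetical. What matters is that the total cannot be computed until every rate lookup and subtotal has finished.

```python
from decimal import Decimal

# Hypothetical per-minute rates for one customer's plan.
RATES = {
    "daytime": Decimal("0.10"),
    "evening": Decimal("0.05"),
    "weekend": Decimal("0.02"),
}


def compute_bill(minutes_by_category, rates=RATES):
    # Steps 1 and 2: look up the rate, then compute each category subtotal.
    subtotals = {
        category: rates[category] * minutes
        for category, minutes in minutes_by_category.items()
    }
    # Step 3: the total depends on every subtotal, so it must come last.
    total = sum(subtotals.values(), Decimal("0"))
    return subtotals, total


subtotals, total = compute_bill({"daytime": 120, "evening": 60, "weekend": 200})
print(subtotals, total)
```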
This order of processing is difficult to maintain in a grid computing environment. While SMP systems are designed to ensure order of execution (unless the application is written with multi-threaded logic), there can be significant performance problems when the volume of transactions reaches a critical threshold.
2. Centrally Shared Resources Create Bottlenecks
Although services are ideally designed with increased autonomy, in most environments service architectures involve the usage of a shared database or other centralized resource. In a typical concurrent processing environment such as an SMP server or a grid infrastructure, the centralized resource presents a bottleneck that limits application throughput. Resource contention eventually creates a performance problem if transaction volumes continue to increase.
3. Unpredictable Behavior and Resource Needs
Compared to a massively parallel scientific application, the average service is much less predictable in its runtime behavior and resource needs. The size and processing requirements of business workloads can vary greatly throughout the day or even within a given hour. This not only makes it more difficult to divide service logic into equally sized components in terms of processing time required, but it also means that allocation of resources must be flexible enough to dynamically respond to the resource requirements of each component.
Introducing Software Pipelines
Software pipelines introduce a new concurrent processing methodology that provides a simple way for service-oriented business applications to implement concurrency while maintaining order of execution priorities and simplicity of application development.
The software pipelines architecture supports peer-to-peer scalability and the decomposition of business processes into specific tasks. These tasks can be subsequently executed in parallel, while the overall workload can be balanced across the resources within one or more servers. It also provides a means by which developers can control the distribution and concurrent execution of various tasks or business process components.
The software pipelines architecture is designed to handle a high-volume stream of transactions, both large and small, and, thus, is ideal for mixed-workload application processing.
Figure 5: Software pipelines are front-ended by a pipeline distributor that routes service requests.
Within a service-oriented solution, software pipelines can be used to group transactions or business logic for which order of execution or priority must be preserved. For example, each customer of a bank might be associated with a specific pipeline. The pipeline would then execute all computations and transactions that relate to the specific customer and preserve the order of execution for transactions that relate to that customer. Other customer transactions could then be executed on different pipelines, which could process those transactions without regard for the order of the first customer’s transactions.
For customer billing transactions there is no concern about whether one customer’s transactions are completed before another’s. Different customer bills can be processed concurrently on separate pipelines while transactions that relate to a single customer bill are processed sequentially within the same pipeline.
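The per-customer behavior described above can be sketched as follows. This is an illustrative assumption about how pipelines might be built, not the article's implementation. Each hypothetical `Pipeline` is a FIFO queue drained by one worker thread, and the distributor picks a pipeline by hashing the customer ID so that a customer's transactions always land in the same queue.

```python
import queue
import threading
import zlib


class Pipeline:
    """A FIFO queue drained by a single worker thread, so order is preserved."""

    def __init__(self, handler):
        self._queue = queue.Queue()
        self._handler = handler
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, transaction):
        self._queue.put(transaction)

    def _run(self):
        while True:
            transaction = self._queue.get()
            try:
                self._handler(transaction)
            finally:
                self._queue.task_done()

    def join(self):
        # Block until every submitted transaction has been handled.
        self._queue.join()


class PipelineDistributor:
    """Routes each transaction to a pipeline chosen from its customer ID."""

    def __init__(self, handler, pipeline_count=4):
        self._pipelines = [Pipeline(handler) for _ in range(pipeline_count)]

    def pipeline_index(self, customer_id):
        # A stable hash keeps all of one customer's transactions on one pipeline.
        return zlib.crc32(customer_id.encode("utf-8")) % len(self._pipelines)

    def submit(self, transaction):
        index = self.pipeline_index(transaction["customer_id"])
        self._pipelines[index].submit(transaction)

    def join(self):
        for pipeline in self._pipelines:
            pipeline.join()


processed = {}
processed_lock = threading.Lock()


def record(transaction):
    # Stand-in for real business logic: remember the order seen per customer.
    with processed_lock:
        processed.setdefault(transaction["customer_id"], []).append(transaction["seq"])


distributor = PipelineDistributor(record, pipeline_count=3)
for seq in range(5):
    for customer_id in ("alice", "bob", "carol"):
        distributor.submit({"customer_id": customer_id, "seq": seq})
distributor.join()
print(processed)
```

Transactions for different customers may interleave in any order, but each customer's sequence numbers come out in the order they were submitted because only one worker ever reads that customer's queue.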
Controlling Pipeline Flow
To implement the software pipelines approach, a pipeline distributor is needed for sorting service requests from the business application into appropriate pipelines and for balancing the load across multiple software pipelines. This component is co-located with a pool of pipelines, and effectively front-ends incoming service requests as shown earlier in Figure 5.
Pipeline distributors route service requests by evaluating message content against configuration rules that can be easily modified without changing individual business services. These configuration rules can be established and modified to distribute workloads and to optimize throughput via concurrent processing according to priority and/or the order of input (FIFO).
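Content-based routing driven by configuration might look like the sketch below. The rule format, field names, and pipeline names are all hypothetical. In practice the rules would more likely be loaded from a configuration file than defined inline, which is what allows them to change without touching the business services.

```python
from collections import deque

# Hypothetical routing rules, evaluated in order; the first match wins.
ROUTING_RULES = [
    {"field": "type", "equals": "payment", "pipeline": "priority"},
    {"field": "region", "equals": "EU", "pipeline": "eu"},
]
DEFAULT_PIPELINE = "general"


def select_pipeline(message, rules=ROUTING_RULES, default=DEFAULT_PIPELINE):
    """Return the pipeline named by the first rule whose field matches the message."""
    for rule in rules:
        if message.get(rule["field"]) == rule["equals"]:
            return rule["pipeline"]
    return default


# Pipeline name -> FIFO queue of messages awaiting processing.
pipelines = {}


def distribute(message):
    pipelines.setdefault(select_pipeline(message), deque()).append(message)


for message in (
    {"id": 1, "type": "payment", "region": "EU"},
    {"id": 2, "type": "query", "region": "EU"},
    {"id": 3, "type": "query", "region": "US"},
):
    distribute(message)

print({name: [m["id"] for m in msgs] for name, msgs in pipelines.items()})
```

Retuning the system then amounts to editing `ROUTING_RULES`: reordering rules changes priority, and adding a rule opens a new pipeline, while `select_pipeline` and the services behind each pipeline stay as they are.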
This design approach enables scalability in two dimensions. Additional pipelines can be added under a given pipeline distributor, and when a distributor has as many pipelines as it can effectively manage, more pipeline distributors can be added. When more pipelines are added to a system, the capacity for managing additional transaction volume grows proportionally. Furthermore, workloads can also be moved between pipelines to avoid bottlenecks.
Working with Software Pipelines
One of the major advantages of the software pipelines approach is that it is simple to implement. Multi-threading an application can require that developers dissect the application into discrete components that run concurrently without disturbing the business logic. With software pipelines the business logic can remain intact and the sorting into pipelines can be based on unique identifiers (such as customer IDs), which are often already maintained as part of the business logic.
Software pipelines furthermore enable developers to:
• implement concurrency only in the performance-critical portions of the business logic
• enforce FIFO only where required
• control distribution of system resources to help maximize utilization
The business rules that are used by pipeline distributors to sort service requests can be modified to tune performance and redistribute workloads. Pools of pipelines can also be allocated to a specific hardware resource and can be moved to take advantage of new hardware resources as they are added to the computing environment.
Conclusion
SOA has delivered big gains in terms of flexibility and adaptability, but has left some organizations wondering whether a service-oriented solution can meet their business application performance needs and expectations. Today’s IT organizations are looking for more efficient ways to process rapidly growing volumes of data and take better advantage of new multi-core hardware platforms.
The software pipelines methodology offers a simple way for business applications to exploit concurrent processing without the large development investment required to add multi-threading to a traditional monolithic business application. It can enable service-oriented applications to support higher transaction volumes within existing IT budgets and help ease the strain on data center capacity.
Finally, this approach to performance optimization results in tangible business benefits, including the reduction of capital expenditures, increased business requirements fulfillment, and more cost effective development processes – all of which are in alignment with the key strategic goals of service-oriented computing.
This article was originally published in The SOA Magazine (www.soamag.com), a publication officially associated with "The Prentice Hall Service-Oriented Computing Series from Thomas Erl" (www.soabooks.com). Copyright ©SOA Systems Inc. (www.soasystems.com)