Interfaces of all kinds are subject to change. When we change the operations or data offered by a service contract, the change will have an impact on the consumers of the interface. In the world of SOA and service-orientation, attention to interface (or service contract) versioning is paramount because the average service-oriented enterprise ends up establishing many more dependencies on a published technical interface than traditional, silo-based application environments do. One change can therefore have a ripple effect by impacting numerous service consumers, each of which may have been composing the service for a different purpose. This article explores common service contract versioning challenges and proposes different approaches with an emphasis on helping you achieve "extreme" XML compatibility. This article was authored by Ronald Murphy and published in The SOA Magazine in May 2009.
Introduction: Why Compatibility Matters
To make a change "backwards compatible," we try to ensure that the data and operations already being used will continue to work. If this doesn't happen, any existing coded uses will need to be updated - usually meaning rebuilding significant parts of the service logic. The first-order effects of this consequence are consumers that don't appreciate the inconvenience you have caused them - and this effect only gets worse with repetition!
There are a number of second-order effects as well:
|•||You can't usually expect consumers to upgrade instantly. Many of them operate on their own fixed release cycles. The more service consumers you have, the greater the diversity in their release cycles.|
|•||Some consumers won't have the technical or business capability to upgrade at all. They either don't have resources, or can only change their consumer programs on narrow cycles, or can't cost justify the change. You will lose these consumers unless you keep older versions of the service contract available.|
|•||Keeping older versions of your service active usually means duplicate copies of a lot of code - which have to be maintained in parallel. If your service contract is part of a Web service, you may have to maintain duplicate decoupled contracts on multiple machines or pools - adding to operational costs. To debug problems, you then have to have multiple environments running. When you perform regression tests, you have to regression test all versions. When you fix bugs, you may even be making changes to multiple copies of the service logic code.|
|•||In many programming environments, you can't even combine multiple versions of a technical contract. For example, in Java, incompatible versions of the same Java class or Java interface cannot be combined in a JVM without classloader trickery. Similarly, many Web service clients can't easily manage multiple versions of the same XML Schema. Suppose two services share a set of types (for example, in a particular XML Schema), but at different version levels? Without compatibility controls, type versioning, or special mapping, these two services can't be used together at all in the same consumer program.|
The more different instances of services we combine together, the more complicated the compatibility matching game gets:
|•||Suppose two products depend on different versions of your product interface. No one can directly combine these two products.|
|•||Suppose version 3 of your product, B, depends on version 2 of product C. One of your users, A, depends on version 3 of your product, but version 1 of product C. The combination doesn't work and you are all out of luck.|
To address compatibility breaks, you may have to distribute multiple versions of your service contract, and maintain compatibility matrices of all the products you are associated with. Then, your regression tests and potential debug environments expand "combinatorially."
Basic Compatibility Rules
In the Internet era, we have had a chance to think about the backwards compatibility of distributed systems. During the development of Web architecture, some simple principles were gradually codified in various standards, most notably the "must-ignore" rule of HTTP and HTML [REF-1]. The blogosphere has further developed this into an informal mathematical theory [REF-2] and some practical guidelines for use in the XML and Web services world [REF-3].
The essential formula for ensuring technical contract compatibility is a set of expectations of both service providers and service consumers:
|•||Providers must only add to the data they emit. They should never stop emitting expected data. This applies to both consumers issuing request data, and services issuing response data. This is what we normally think of as the backwards compatibility obligation of service contract providers.|
|•||Consumers must have a policy of ignoring any such additions, for example new, unknown XML elements in a type. This "secret ingredient", the must-ignore rule, maximizes the "forwards compatibility" or "future-proofness" of service consumers in particular.|
For Web service developers, a useful strategy is to have your latest service version understand as much as possible of the syntax and semantics of its past versions. In fact, if the service knows the version of a requesting consumer, it can actually cater to that version, like a multilingual customer service representative. To the degree that services keep knowledge of all available versions, we reduce the very need for must-ignore behavior on the service end, and that issue, along with many extended compatibility issues, is reduced to a client-side problem.
XML Parsing Strategies
Service developers have a choice of approaches to XML parsing, which have at least two broad variations in style:
In the manual style, a programmer writes code such as a DOM traversal to navigate data element by element. Data types are manually interpreted (Integer.parseInt(), etc.) based on documentation such as an XML schema. (If you regard this as a primitive approach to be dismissed, consider the legions of Ajax, Perl, and PHP clients out in the world!) Followers of this style are advised to adhere to the must-ignore rule - don't throw errors if you find an unexpected element!
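As a rough illustration of manual parsing that honors the must-ignore rule, the following Python sketch reads only the child elements it knows and skips everything else. The element names (Item, ItemId, Title, Currency) and the KNOWN_FIELDS table are hypothetical and are not taken from any real service contract.

```python
import xml.etree.ElementTree as ET

# Fields this (hypothetical) consumer version knows how to interpret.
KNOWN_FIELDS = {"ItemId": int, "Title": str}

def parse_item(xml_text):
    """Read known child elements and silently skip unknown ones."""
    root = ET.fromstring(xml_text)
    item = {}
    for child in root:
        convert = KNOWN_FIELDS.get(child.tag)
        if convert is None:
            continue  # must-ignore: an unexpected element is not an error
        item[child.tag] = convert(child.text or "")
    return item

# A response from a newer service version that added a <Currency> element.
newer_response = (
    "<Item><ItemId>42</ItemId><Title>Lamp</Title>"
    "<Currency>USD</Currency></Item>"
)
print(parse_item(newer_response))
```

The older consumer still obtains ItemId and Title from the newer response, and the added Currency element is simply not seen by its business logic.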
The UPA (Unique Particle Attribution)
The UPA constraint is a rule to keep XML Schema grammars unambiguous, such that any given point in a parse stream is resolvable by only one XML Schema definition element or "particle", such as an "element" definition, a "choice" definition, an "any" definition, etc. The removal of ambiguity creates more rigorous grammars and avoids the need to build a "look-ahead" parser that considers the forward context of a parse in order to resolve particles that are ambiguous when considering only already received data. A typical ambiguity appears if you follow an optional element (such as an element named "myElement" declared with minOccurs="0") with an "any" wildcard. If "myElement" appears in the output, it can match either the optional definition or the following "any". Human users usually have much less trouble with this situation than parsers.
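The ambiguity described above can be imitated with a toy model. The sketch below is not an XML Schema processor: it is a simplified, hypothetical representation of a content model as a list of particles, and it only asks which particles could accept the first child element without any look-ahead.

```python
# Toy content model: (kind, element name, minOccurs). Not a real XSD processor.
CONTENT_MODEL = [
    ("element", "myElement", 0),  # optional element
    ("any", None, 1),             # wildcard that accepts any element name
]

def candidate_particles(model, tag):
    """Return indexes of particles that could accept `tag` as the first child."""
    candidates = []
    for index, (kind, name, min_occurs) in enumerate(model):
        if kind == "any" or name == tag:
            candidates.append(index)
        if min_occurs > 0:
            break  # a required particle cannot be skipped over
    return candidates

print(candidate_particles(CONTENT_MODEL, "myElement"))  # two candidates: ambiguous
print(candidate_particles(CONTENT_MODEL, "other"))      # one candidate: unambiguous
```

When more than one particle is a candidate for the same input, the attribution is not unique, which is the situation the UPA constraint forbids.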
Automatic XML Schema-Driven Parsers
Many parsers will match XML schema types to corresponding code-generated programming language objects. A document is automatically mapped to a tree of programming objects. In some platforms such as .NET, the parsers themselves honor the must-ignore rule. In other cases such as Apache Axis 1.x, they do not. Runtime XML validation is often not applied, since it must be specifically enabled in the parser and tends to slow performance. If validation is applied, typically the schema is evaluated strictly, and the must-ignore rule is not usually heeded at all - a potentially big problem for compatibility.
An XML Schema is a type contract which defines the set of XML document values available to users (consumers) of the contract. In some settings, this is meant to be binding in the sense that any values outside this set should be rejected. But in Web services, it is best to regard XML Schema as specifying the values known to work for a given version of the schema consumer (a service consumer receiving a response, or a service provider receiving a request). The values of some other related XML schemas from a different version may still be compatible. In fact, it is possible to design schemas with the specific goal of "extreme XML compatibility."
The strictness of some XML parsers about extra data - not keeping to the must-ignore rule - can be one significant initial problem. A somewhat common workaround [REF-4] for this is to reserve wildcard placeholders (xsd:any); this practice is rather controversial and in particular is difficult to follow while keeping to the so-called UPA constraint of non-ambiguity [REF-5]. One radical approach is to discard the UPA constraint; there are also limited workarounds possible [REF-6].
What is the consequence of violating the UPA constraint? The constraint itself was controversial during the development of XML Schema; however, it was adopted in the interest of enabling the greatest variety of parsers and keeping parse logic simple and lightweight. In practice, violating the constraint does not seem to confuse any of the parsers that are in wide use. XML schema validators will complain; for example, .NET issues warnings for UPA constraint violations. However, even validators will often function, and validation is not a prominent part of many real-world Web service environments because of the performance cost of performing validation at runtime. In future implementations, this performance factor could shift, but the jury is still out.
XML Schema Evolution
Here, we examine the compatibility impact of various schema changes on consumer-to-service interactions. We assume that a service makes changes to a schema, evolving it to a latest version.
The upcoming backwards compatibility assessments are made from the point of view of the service contract, and refer to whether older consumers can successfully communicate with the new service, such that:
|1.||Request data sent by the older client is still understood and accepted by the newer service.|
|2.||Response data sent by the newer service is still understood and accepted by the older client.|
|•||Addition of new (distinctly named) types is always syntactically and semantically backwards compatible, because the mere knowledge of new types is able to be completely ignored by all consumers and providers. (Deleting any types that are not referenced is also backwards compatible.)|
|•||Addition of new elements in a type can be syntactically backwards compatible, only if the client schema was already coded to anticipate these elements. For a simple example, suppose a service declares an element as maxOccurs="unbounded", and actually happens to emit three instances of this element in the first version. In a later version, the service expands the output to emit five instances. Since we've coded for any number of elements, the schema is flexible to the server-side change. Unfortunately, many additions to types are new unrelated elements, not new instances of the same element. To simulate "must-ignore" against these unknown elements, one can code in wildcards as discussed above; however, this must be used very carefully if the designer wishes to honor the UPA constraint.|
|•||Addition of new elements in a type is semantically backwards compatible if the element has optional semantics. That is, if a new input argument is offered to clients, the older clients must not be required to send this argument; it must be considered optional in the new schema and some default must be assumed. If a new output argument is sent by a service, an older client is required to semantically ignore it, based on the must-ignore rule. (Of course, the client probably has no code to deal with this unknown quantity anyway.)|
|•||Addition of new attributes on an element is technically backward incompatible, but does not generally seem to pose a problem for widely used XML parsers.|
|•||Deleting an element in a type or an attribute on an element is backward incompatible; however, note that if consumers are not using the data, there is usually no practical problem with the deletion. We exploit this in our "paddle wheel" strategy below.|
|•||Addition of new values in an enumeration is backward incompatible. This surprises many initial users of Web services. .NET and Axis 1.x will reject unknown enumerations, and this can cause clients which are processing responses to experience a hard failure (client-side programming exception) in backward incompatible usage situations. Enumerations which are used only in client requests, not in service responses, are usually safe to expand with new values, since the service usually has the superset of all the values.|
|•||Deleting enumeration values is backward incompatible. For example, in client request data, older clients might send the deleted values to the service, which no longer knows about them and will probably throw a parse exception. Deleting of output-only enumeration values is generally benign to older clients since they simply will never see an instance of this value.|
|•||Changing type of elements from one type to another is almost always backward incompatible. There are some subtle exceptions that may not cause much trouble, such as widening a service response float type to a double, or changing between closely related types (string / token). However, such practices must be considered very carefully.|
|•||Changing constraints (e.g. multiplicity) on the type of elements is a form of type change. As such, it is generally backward incompatible - where the restriction occurs on the consuming side. For example, suppose a response-side element is initially restricted to three instances (maxOccurs="3") and clients are coded to this. The service is later updated to emit up to six instances and the schema changed to maxOccurs="6". Some older clients will enforce or assume the old constraint, either syntactically or semantically (in business logic that instantiates arrays of size 3, etc.). Thus, constraints tend to have the effect of impairing the evolution of the types that they decorate: They "freeze" the type along the particular "dimension" they are controlling, such as multiplicity, maximum string size, available values (enumerations), etc. Note that even as simple a constraint as required / optional (minOccurs="0" vs. "1") can have this effect!|
|•||Instantiating an element of a base type with a derived type not known to the consumer is backward incompatible. This means that the compatible use of polymorphism and dynamic type determination (via xsi:type) is limited to a set of known types declared in a particular schema. (There are other reasons not to use polymorphism at all, particularly complexity for manual parse coding, and implementation limitations in some parsers.)|
|•||Changing the ordering of sequences is technically backward incompatible, although most parsers in general use seem to tolerate this. Complicating the matter, XML Schema does not have any practical "unordered" alternatives: For example, the xsd:all construct was reported to have poor support in many early XML schema based parse environments. For best results, you should retain the order of your defined elements from schema version to schema version.|
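The enumeration rule in the list above is the one that most often surprises consumers. For clients that parse responses manually, one possible mitigation is to map unrecognized values to a fallback instead of failing. The sketch below assumes a hypothetical ListingStatus enumeration with an UNKNOWN fallback member; it does not describe how .NET or Axis 1.x behave.

```python
from enum import Enum

class ListingStatus(Enum):
    """Values known to this (hypothetical) client version."""
    ACTIVE = "Active"
    ENDED = "Ended"
    UNKNOWN = "Unknown"  # client-side fallback, an assumption of this sketch

def parse_status(text):
    """Map a wire value to a known member, tolerating values added later."""
    try:
        return ListingStatus(text)
    except ValueError:
        return ListingStatus.UNKNOWN

print(parse_status("Active"))
print(parse_status("Suspended"))  # value added by a newer service version
```

The client's business logic still has to decide what an unknown status means, but the response as a whole no longer causes a hard failure.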
Recommendations for Smooth XML Schema Evolution
In considering the above guidelines for XML Schema compatibility, it is clear that additions (other than enumeration values) are generally quite safe, while most kinds of type changes, and most deletions, will cause backward incompatibility. This leads to a set of principles for extreme compatibility of XML Schema:
|•||Evolve your XML schemas by adding new elements and types.|
|•||Find a way to code client schemas to allow for new elements. For maximum compatibility, make liberal use of xsd:any - and choose an approach toward handling the UPA rule.|
|•||Keep constraints on your existing elements loose. Although this will remove some of the power of XML Schema as a typing language, you are still controlling many aspects of type, such as complex type structure. Tightening the syntax screw to the last degree will indeed ensure a given version of schema has maximum intolerance for bad data, but it will unfortunately also impose intolerance for future "similar" schemas along the evolution path of your data. This means that clients and services at differing schema levels are significantly more likely to have compatibility problems.|
|•||In complexType definitions (or any global element definitions), don't change the type or semantics of elements from one version to the next. Instead, introduce a new element with the new type, and keep the old element around until you have a chance to deprecate and obsolete it.|
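The last recommendation above, adding a new element instead of changing the type of an existing one, might look like the following sketch. The element names Price and PriceAmount are hypothetical: the service keeps emitting the old whole-number element while also emitting a new decimal element, and a newer client prefers the new element when it is present.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

def build_item(price_cents):
    """Emit the old whole-number <Price> alongside the new decimal <PriceAmount>."""
    item = ET.Element("Item")
    ET.SubElement(item, "Price").text = str(price_cents // 100)  # old element, type unchanged
    ET.SubElement(item, "PriceAmount").text = str(Decimal(price_cents) / 100)  # new element
    return ET.tostring(item, encoding="unicode")

def read_price(xml_text):
    """Newer client: prefer the new element, fall back to the old one."""
    item = ET.fromstring(xml_text)
    amount = item.findtext("PriceAmount")
    if amount is not None:
        return Decimal(amount)
    return Decimal(item.findtext("Price"))

print(build_item(1999))
print(read_price(build_item(1999)))
```

An older client that reads only Price keeps working unchanged, and the old element can be deprecated later once usage has dropped.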
Service Contract Evolution Models
Armed with an understanding of how XML Schema compatibility works and the guidelines for maximizing compatibility, we turn now to interface evolution as a whole. The comments here actually apply beyond XML Schema based contracts such as those used by Web services; many contracts such as APIs have similar concepts.
The two most common service contract evolution models are the following:
In the ramp model, each new feature is added following the basic compatibility rules. In general, the data, operations, and messages/events of the new feature are a superset of the old; older features, data, etc. are not retired from the service contract schema definitions. Eventually, certain older features may be rejected in logic using business errors, but syntactically the features are still there. Advantages of the ramp approach are:
|1.||It is very easy to maintain; there is minimum disruption to customers.|
|2.||Many features develop along evolutionary, backwards compatible lines for the most part, so this approach tends to mimic what we do at the programming level.|
Disadvantages of the ramp approach are:
|1.||Deprecated or obsolete data imposes clutter on the contract.|
|2.||For areas of a contract where there are major changes in semantics, you can confuse people who see the old features.|
In the staircase model, an all-new version of the service contract is developed periodically. No features of the old contract are directly compatible with the new contract; you must upgrade and convert all your client logic to use the new contract. (Some or many types in the new contract version may have the same structure and meaning as those of the old version, but they are still considered distinct types, and mapping is usually required to convert data of the old service contract's types to data of the new contract's types.) Advantages of the staircase approach:
|1.||For major changes in semantics, this clarifies that the client will be working with an entirely new kind of contract.|
|2.||It becomes possible to simultaneously reference the two generations of service contracts, without ambiguity.|
Disadvantages of the staircase approach:
|1.||Clients will have to adjust to an all-new contract, possibly converting or mapping every data reference in the application from the old type to the new one. (This can be mitigated by sticking to parallel structures/meanings and keeping the type change subtle, as with a namespace; but there will still be some impact in many environments.)|
|2.||Service contract designers are actually encouraged to ignore compatibility. While this can be freeing to the design in the case of a major upgrade situation, it can also lead to wanton adoption of change for no good benefit.|
|3.||We see all the incompatibility consequences cited in the first section of this article. Usually, a transition plan is needed in which two or more versions of the contract are kept available so that clients have some time to adapt to the changes. This may be a short duration while deploying the new service, or it can be an extended period - problems or limitations might exist in the new contract version, or clients might refuse to upgrade. Keeping multiple versions of services increases developer code maintenance costs and can increase operational costs considerably (parallel pools of machines, multiple endpoints, etc.).|
In practice, it is actually possible and common to combine the above two models:
|•||The ramp strategy is used for related "minor" updates of a service contract. The contract is kept backwards compatible for as long as the business policy dictates this: As long as practically possible given the feature evolution; or until a fixed time period has elapsed; etc.|
|•||The staircase strategy is usually employed for "major" updates of a contract, as an occasional interruption to a stream of ramp updates. Where an entire contract must be broadly upgraded with significant changes throughout (e.g. widen a large number of data types from 32 bit to 64 bit), this may be the only practical choice. For some features that tend to evolve in a very evolutionary way, the "major" update may never be needed, or there may be so many customers that major updates can only be done very rarely and with a long transition window.|
In many environments, some backward incompatible changes are at least occasionally needed (e.g. cleanups of deprecated types), but these are localized to only certain feature areas. This leads us to a refinement that fits our "extreme compatibility" slogan...
In the paddle wheel strategy, at a given release point some features are added backward compatibly, and a minimal set of features is changed backward incompatibly. The service smooths out differences between versions by giving clients the data they need, giving an overall effect of backward compatibility.
Particularly for larger service contracts, the paddle wheel strategy can be very effective:
|•||It removes the "ever expanding universe" problem of the ramp strategy.|
|•||It increases overall compatibility over time, and reduces the number of incompatible pairings of clients and services - and since each of these would be a potential operational issue, we are increasing quality by avoiding them.|
|•||It usually allows you to avoid keeping multiple service versions in multiple pools. The cost of one service implementation being version "polylingual" is generally much lower, since incompatibilities are localized and only certain parts of the service code must switch various feature sets in and out based on client version.|
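A single "polylingual" service implementation of the kind described above could be sketched as follows. The sketch assumes the client's contract version is available to the service as an integer; the field names (LegacyCategory, CategoryPath) and the version threshold are hypothetical.

```python
def build_response(listing, client_version):
    """Shape one response for the contract version the client was built against."""
    # Compatible area: emitted the same way for every client version.
    response = {"ItemId": listing["item_id"], "Title": listing["title"]}
    # Localized incompatible area: the only part that switches on version.
    if client_version < 3:
        response["LegacyCategory"] = listing["category_code"]  # retired in version 3
    else:
        response["CategoryPath"] = listing["category_path"]    # introduced in version 3
    return response

listing = {
    "item_id": 42,
    "title": "Lamp",
    "category_code": 17,
    "category_path": "Home/Lighting",
}
print(build_response(listing, client_version=2))
print(build_response(listing, client_version=3))
```

Only the branch on client_version differs between versions, which is why one service pool can usually serve several client generations.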
Versioning Enforcement Techniques for XML Schema
Versioning enforcement is itself an aspect of compatibility. In the above strategies, we usually have the following goals:
|•||For staircase evolution, we want versioning to be enforced: Since contract versions are explicitly mutually exclusive, we should not allow data of one contract version to be accepted in the context of another version.|
|•||For ramp and paddle wheel evolution, we want versioning of data in the "compatible area" to be advisory only, not enforced. If data is supposed to stay compatible across versions, we violate this by enforcing version differences (rejecting the data of another version).|
The following changes to Web services will cause enforcement to be strict:
|•||Renaming a service, or changing the service endpoint. (These may cause only minor changes in clients, but they do tend to cause some degree of change and redeployment.) These should be avoided for ramp and paddle wheel evolution.|
|•||Changing the schema namespace. This should be avoided for ramp evolution. It can be used for paddle wheel evolution, only if a segment of the schema can be collectively targeted to a new namespace. This will have the effect of increasing the total number of namespaces, which can cause extra complexity for some types of clients.|
Renaming services or using new schema namespaces can be employed as a way of enforcing staircase evolution, if desired.
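To see why a namespace change acts as strict enforcement, consider a namespace-aware client that looks up elements by qualified name. In the sketch below, the two namespace URIs and the element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs for two generations of the same schema.
V1_NS = "urn:example:items:v1"
V2_NS = "urn:example:items:v2"

def read_title(xml_text, namespace):
    """Look up <Title> qualified by the namespace this client was built against."""
    root = ET.fromstring(xml_text)
    return root.findtext(f"{{{namespace}}}Title")

v2_doc = f'<Item xmlns="{V2_NS}"><Title>Lamp</Title></Item>'

print(read_title(v2_doc, V2_NS))  # client built against the new namespace
print(read_title(v2_doc, V1_NS))  # client built against the old namespace finds nothing
```

The structure and meaning of Title did not change at all, yet the older client can no longer find it, which is the desired effect for staircase evolution and an unwanted one for ramp evolution.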
Namespaces: A Closer Look
In strict XML Schema doctrine, most changes to a type result in a different type - even additions, which are disruptive since there is no automatic wildcard or must-ignore provision for them. Some designers take this notion of type difference to an extreme, by either renaming each changed type within the same namespace, or transferring the latest type to a new namespace. There are some advantages to this approach - we'll look mainly at the specific practice of using namespaces to distinguish type versions:
|1.||The old and new types can't be confused. In XML schema compiler environments, they will translate to different programming language types (usually the same type name in a different package or programming namespace). This means that the types can be combined and used together in one programming environment (such as a single JVM).|
|2.||In the namespace approach to versioning, the version of any particular type in use is quite clear. Each element will effectively carry the version over the wire, as its namespace prefix.|
Unfortunately, there are some practical disadvantages:
|1.||As we mentioned before, a direct change in type creates a compatibility break - we are following the "step" methodology for all items directly or indirectly changed, and we're likely to have to maintain all active type versions in all their namespaces. There is typically a mass "copy-paste" such that an entire hierarchy of elements of various types is cloned into a related hierarchy with the new types. Where consumers want to mix and match the types of the various versions - as when business logic must populate output data corresponding to different versions - then we must typically code a mapping layer which targets or translates the right variation of a particular data set.|
|2.||Even worse, namespace changes cause a ripple effect throughout the type dependency graph. Suppose you have a network of related types in different schemas and namespaces. Each time you change any schema's namespace, all depending schemas are affected incompatibly - the elements they reference are of a different type, by virtue of the namespace change. So if you're handling schema versioning of these schemas through namespaces, you need to update all their own namespaces too. Even if there was only one stray reference to some type that you want to change or remove, you've now done a mass version update which is backward incompatible. So clearly, you want to reduce this phenomenon to major changes in your type system. This is significant for large-scale development in particular. If you have many interrelated projects on different schedules producing different schemas, it is not a good idea to version these schemas using namespaces.|
|4.||The complexity and data volume of wire data tends to increase with the number of namespaces. Complexity shows up in areas such as testing, extra support calls from customers, more intricate documentation, and extra learning curve for customers and your own staff.|
|5.||Some simpler clients actually don't deal with XML namespaces at all - they are namespace "unaware". You run the risk of breaking these clients.|
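The ripple effect described in the second disadvantage above is a transitive closure over schema dependencies. The following sketch computes which schemas would need a new namespace when one schema's namespace changes; the schema names and the dependency map are hypothetical.

```python
from collections import deque

def schemas_to_reversion(depends_on, changed):
    """Return every schema that directly or indirectly references `changed`.

    `depends_on` maps a schema name to the set of schemas it references.
    """
    # Invert the map: schema -> schemas that reference it.
    dependents = {}
    for schema, refs in depends_on.items():
        for ref in refs:
            dependents.setdefault(ref, set()).add(schema)

    affected = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected

depends_on = {
    "common": set(),
    "orders": {"common"},
    "billing": {"orders"},
    "search": set(),
}
print(sorted(schemas_to_reversion(depends_on, "common")))
```

In this example, a namespace change in the shared schema forces new namespaces on two other schemas, even though only one of them references it directly.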
Because of this, we recommend that changing the namespaces of types in a schema should only be used if you are pursuing a step (staircase) evolution strategy - and hopefully your incompatible steps occur once a year or less.
In this article, we've looked at compatibility from a practical standpoint, with some definitions and rules that apply to XML Schema and Web services. Our analysis of schema evolution strategies led us to a form of "extreme XML compatibility", the paddle wheel strategy. Although other approaches can work well in your environment, the paddle wheel strategy seems to optimize a wide variety of trade-offs including overall backward compatibility of a service contract, the amount of effort required to maintain an evolving Web service, and ease of use for clients.
[REF-1] See overview of the historic rules in http://www.pacificspirit.com/blog/2004/01/28/extensibility_and_ignore_rule_in_web_architecture
[REF-2] Three-part series starting with http://davidbau.com/archives/2003/12/01/theory_of_compatibility_part_1.html
[REF-4] http://www.xml.com/pub/a/2002/11/20/schemas.html ; http://dev-forums.ebay.com/thread.jspa?threadID=500001723
[REF-6] http://www.w3.org/2001/tag/doc/versioning-xml, section 7.4, and http://msdn.microsoft.com/en-us/library/ms950793.aspx
This article was originally published in The SOA Magazine (www.soamag.com), a publication officially associated with "The Prentice Hall Service-Oriented Computing Series from Thomas Erl" (www.soabooks.com). Copyright ©SOA Systems Inc. (www.soasystems.com)