BPEL is complex and BPMN is simple, right? After all, BPMN has a nice graphical notation. The BPEL standard only specifies what the language looks like in XML. That alone ought to be enough to claim the prize for BPMN.
However, what if you use BPMN’s notation for a process but use BPEL for the executable representation? This removes the graphical vs. XML distinction and can "hide" the non-graphical BPEL as represented in XML. You end up with a BPMN model everyone can understand and a BPEL model your computers can execute. It's like the two sides of a coin: there are different pictures on each side, but the coin itself is always both sides at once.
However, the question of which is simpler gets more complicated when you also consider that the new BPMN 2.0 specification includes hundreds of constructs in its meta-model that have no graphical representation. Now, which is simpler: BPMN with BPEL, or BPMN with the new BPMN 2.0 execution language? What may seem obvious (BPMN with BPMN 2.0 execution) isn't the slam-dunk choice many people might expect it to be.
BPMN 2.0 has two different -- but equal -- compliance points for execution: BPEL Process Execution Conformance and Process Execution Conformance. This means that BPMN 2.0 standardizes the use of BPEL as the execution language for BPMN, but it also offers the option of making BPMN executable by using new constructs that have been added to the BPMN notation specifically to support execution. These new constructs depend on the execution semantics that have been defined for almost everything in BPMN.
So, which is simpler? Believe it or not, using BPMN with BPEL execution is dramatically simpler than trying to execute processes using the new BPMN 2.0 execution language. I know this sounds counter-intuitive, so I will justify it in this post and a series of follow-up posts on the same subject.
Before I get into the details of why I believe BPMN with BPEL is better, a little history might help clarify the question. There are some factors that caused the BPMN 2.0 standard to eventually become more complex than BPEL. (I know, I know, BPEL has the reputation of being far too complex...but hear me out.)
BPMN was designed to be a language for communicating from one person to another, not from a person to a machine. Languages used for human communication have a natural, and appropriate, tendency to grow. Whenever people find that they frequently need to convey something that is awkward to express with their current vocabulary, they invent a new word. English, which is especially amenable to such growth, surpassed one million words last year. Just consider "unfriend" or "netbook," new words to express new ideas.
The same is true for graphical modeling languages. Look at UML (Unified Modeling Language). It started as the unification of three fairly simple graphical notations (best known by their respective primary inventors: James Rumbaugh, Ivar Jacobson, and Grady Booch). Once they unified their modeling languages and people started using them in earnest, they grew larger and larger, with new diagrams and new elements on those diagrams with each successive version. Sure, there was always overlap in what could be expressed by different diagrams or different elements, but in each case, there were situations where one was more natural to the reader than the other. The fact that different constructs have imprecise, overlapping meanings is of little concern in a language meant for people, since people are comfortable choosing among a variety of ways of expressing the same thing, each with its own nuances and connotations.
But while notation creep is a useful way of expanding spoken languages or graphical notations, it is not such a good thing for a language that must be directly executable on a computer.
That's because it is always a problem to take such a large language and give it formal executable semantics. The problem usually isn’t with a lack of rigor in the definition of any one construct. The problem is with the exponential number of combinations of those constructs.
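To put rough numbers on that growth, here is a small Python sketch. The construct counts are hypothetical, picked only to show the shape of the curve; they are not taken from the BPMN or BPEL specifications.

```python
from math import comb


def interaction_counts(n_constructs: int) -> dict:
    """Count how many ways n constructs can be combined.

    n_constructs is a hypothetical language size, not a figure from any spec.
    """
    return {
        "pairs": comb(n_constructs, 2),
        "triples": comb(n_constructs, 3),
        # every subset containing at least two constructs
        "subsets_of_two_or_more": 2 ** n_constructs - n_constructs - 1,
    }


for n in (10, 20, 40):
    print(n, interaction_counts(n))
```

Doubling the number of constructs roughly quadruples the pairs that need to be checked against each other, and the count of larger combinations grows far faster than that.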
Good programming languages typically add new fundamental primitives very cautiously. Consider how much hard preparatory work was done in the Java community before Java introduced generics into the language, or the hand-wringing that is gripping that community as they grapple with the addition of closures to the language. The way it typically works is that some eminently respectable, highly credentialed expert (like Neal Gafter, in the case of closures) will make a seemingly very well-thought-out proposal that describes how the new construct will simplify the lives of so many programmers. Then another equally eminent expert (like Josh Bloch, in this case) will find unintended consequences of the new construct when it is used in combination with other things in the language.
That was just for one language feature. The BPMN 2.0 execution language has dozens of features that have never really been used together in an execution language. For example, it not only has a variety of ways of handling the control flow for multiple incoming sequence flows; an activity also can’t execute until all of the required inputs from one of its input sets have become available. In other words, it has a fairly complex data flow model intertwined with its control flow model.
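A minimal Python sketch of what that intertwining means for an engine follows. Everything in it is an assumption made for illustration: the `Activity` class, the `join_all` flag (a stand-in for the several join behaviors the specification actually defines), and the example names are all hypothetical, and this is not how the specification itself words its execution semantics.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    name: str
    incoming_flows: set          # names of incoming sequence flows
    input_sets: list             # each entry is a set of required data item names
    join_all: bool = False       # hypothetical: wait for every flow, or any one
    tokens: set = field(default_factory=set)
    available_data: set = field(default_factory=set)

    def control_ready(self) -> bool:
        """Control-flow half: have enough incoming flows delivered a token?"""
        if not self.incoming_flows:
            return True
        if self.join_all:
            return self.incoming_flows <= self.tokens
        return bool(self.incoming_flows & self.tokens)

    def satisfied_input_set(self):
        """Data-flow half: return the first input set that is fully available."""
        for input_set in self.input_sets:
            if input_set <= self.available_data:
                return input_set
        return None

    def can_start(self) -> bool:
        """The activity runs only when both halves agree."""
        if not self.control_ready():
            return False
        return not self.input_sets or self.satisfied_input_set() is not None


ship = Activity(
    "Ship order",
    incoming_flows={"from_payment", "from_packing"},
    input_sets=[{"order", "address"}, {"order", "pickup_code"}],
    join_all=True,
)
ship.tokens.add("from_payment")
ship.available_data |= {"order", "address"}
print(ship.can_start())   # False: data is ready, but one flow has not arrived
ship.tokens.add("from_packing")
print(ship.can_start())   # True: both control flow and data flow are satisfied
```

Even in this toy version, whether an activity starts depends on two independent conditions, and every join style multiplies against every arrangement of input sets.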
Another example is message correlation. BPEL has, in the past, been criticized for the complexity of its approach to correlation, but BPMN has two different correlation mechanisms. Key-based correlation is basically equivalent to BPEL’s correlation mechanism, although the standard has invented all new terminology for the various components. It then defines a new concept of context-based correlation. Rather than trying to convince you that it is complex, I’ll just include the complete explanation of it from the BPMN 2.0 specification (yes, in a 500-page specification, there are no examples or additional explanations for these concepts):
In context-based correlation, the Process context (i.e., its Data Objects and Properties) may dynamically influence the matching criterion. That is, a CorrelationKey may be complemented by a Process-specific CorrelationSubscription. A CorrelationSubscription aggregates as many CorrelationPropertyBindings as there are CorrelationProperties in the CorrelationKey. A CorrelationPropertyBinding relates to a specific CorrelationProperty and also links to a FormalExpression which denotes a dynamic extraction rule atop the Process context. At runtime, the CorrelationKey instance for a particular Conversation is populated (and dynamically updated) from the Process context using these FormalExpressions. In that sense, changes in the Process context may alter the correlation condition.
Confused yet? Are you wondering not just why BPMN 2.0 needed to define and redefine an important concept like message correlation, but also wondering how, precisely, to implement BPMN correlation?
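For what it's worth, here is one possible reading of the two mechanisms, sketched in Python. This is my interpretation of the paragraph quoted above, not something the specification confirms, and every name in it (`KeyBasedConversation`, `ContextBasedConversation`, `matches`, the `orderId` property) is hypothetical.

```python
from typing import Any, Callable, Dict, Optional

Message = Dict[str, Any]
Context = Dict[str, Any]


class KeyBasedConversation:
    """Key-based: the first message fixes the key; later messages must match it."""

    def __init__(self, extractors: Dict[str, Callable[[Message], Any]]):
        self.extractors = extractors          # one rule per correlation property
        self.key: Optional[Dict[str, Any]] = None

    def matches(self, message: Message) -> bool:
        values = {name: fn(message) for name, fn in self.extractors.items()}
        if self.key is None:
            self.key = values                 # first message initializes the key
            return True
        return values == self.key


class ContextBasedConversation:
    """Context-based (assumed reading): the expected key is recomputed from the
    process context on every match, so context changes alter what correlates."""

    def __init__(
        self,
        extractors: Dict[str, Callable[[Message], Any]],
        bindings: Dict[str, Callable[[Context], Any]],
        context: Context,
    ):
        # "as many bindings as there are properties in the key"
        if set(extractors) != set(bindings):
            raise ValueError("need exactly one binding per correlation property")
        self.extractors = extractors
        self.bindings = bindings
        self.context = context                # shared, mutable process context

    def matches(self, message: Message) -> bool:
        expected = {name: rule(self.context) for name, rule in self.bindings.items()}
        actual = {name: fn(message) for name, fn in self.extractors.items()}
        return actual == expected


extract = {"orderId": lambda m: m["order"]}

key_based = KeyBasedConversation(extract)
print(key_based.matches({"order": 7}))   # True: initializes the key
print(key_based.matches({"order": 8}))   # False: different key value

ctx = {"currentOrder": 7}
context_based = ContextBasedConversation(
    extract, {"orderId": lambda c: c["currentOrder"]}, ctx
)
print(context_based.matches({"order": 7}))   # True
ctx["currentOrder"] = 8                      # the process context changes...
print(context_based.matches({"order": 7}))   # False: ...and so does the condition
```

If this reading is right, an engine has to re-evaluate its subscriptions whenever process data changes, which is a very different implementation burden from comparing an incoming message against a stored key.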
These are just a couple of the ways that BPMN’s new execution language is more complex than using BPMN with BPEL. BPEL is now a known quantity. It's widely implemented. Many production applications are running BPEL today. Many people have experience with it, and the concepts in the language are well understood. With BPMN 2.0, BPEL now has a standardized notation, so there is no need to work with a new language that is a big bag of constructs whose interactions have never been exercised together.
(This article is adapted from my original post on the VOSibilities blog at http://www.vosibilities.com.)