The Rich Engineering Heritage Behind Dependency Injection
(The rest of this article discusses DI from this perspective. Apologies to advocates of other techniques, including annotation-based approaches which I'm aware that Spring also supports. I have no real experience with these although some of the following will still be applicable).
Spring has led the way to mainstream acceptance of DI, particularly for Java server side architectures. DI has proven its worth as a way of making the components of a system and the architecture explicit. It also makes components easier to unit test, because their dependencies can be replaced with test doubles.
So what is the engineering heritage behind DI? Did it just appear out of nowhere? Is it a passing fad, something to be replaced when the hype cycle recedes, a reaction against the horrors of EJB JNDI overuse? Or does it have a sound basis in software engineering theory? If so, what can we learn from this?
Well, to academics familiar with the research behind software components over the last twenty years, it is clear that DI is very closely related to a well-established area of software research known as architecture description languages (ADLs). The job of an ADL is to assemble or wire components together via configuration.
In fact, the Spring bean configuration language is (technically speaking) a non-hierarchical ADL with implicit connectors. In this article, I'll briefly compare Spring configuration and some of the older approaches, showing how similar they are. I'll also show how ADLs developed, giving a unique insight into possible future developments in this area.
The bottom line is that Spring DI represents mainstream acceptance of the component approach used by ADLs. DI is here to stay, and will get more important as it is applied to larger systems, to more varied domains, and at a finer-grained level. This is because it represents a principled and academically sound way to assemble software from components. In addition, the fact that DI was largely developed independently of any ADL research provides powerful validation of the concepts from both a practical and an academic point of view.
I do research into software components as part of my PhD, having created my own experimental ADL called Backbone, which focusses on representing system evolution. My supervisors (Professor Jeff Magee and Professor Jeff Kramer) created Darwin (in conjunction with other researchers), one of the better known ADLs, back in the mid '80s. As such, this article is intentionally biased towards the academic literature.
I've also worked with Rod Johnson on a previous commercial project, although I have no affiliation with SpringSource and I haven't contributed to Spring in any way either conceptually or via code. I currently use Spring.Net on a commercial project.
(Many thanks to Rod for reviewing and commenting on a draft of this article)
What is a software component?
This can be a seriously hard question to answer, and if you ask ten different developers what a component is, you are likely to get at least ten different answers!
The intuition behind software components, however, is quite easy to pin down: making software should be like wiring together electronics components. Components should be interchangeable, tangible units which are clear about which services they provide and which services they require. In short, we want to wire up components together to make a software system. We also want to take existing components out of a system, and wire in new ones as long as their interfaces are compatible.
This is the intuition that many people get from reading possibly one of the most influential papers on the subject – Professor Doug McIlroy's address on Mass Produced Software Components, delivered to the NATO software engineering conference. This address was given in 1968!
Let me then offer a simple definition of a software component distilled out of the history of ADLs:
A component is a unit of software that can be instantiated, and is insulated from its environment by explicitly indicating which services (via interfaces) are provided and required.
(Note that this is a minimal definition, taken from academic literature. In particular, it avoids mentioning features such as language independence, distribution and deployment which are obviously valuable, but not strictly necessary for a component system)
This is a surprisingly simple definition, but it is also profound, and it has taken a long time for people to achieve any level of consensus on this. A Java class definition generally satisfies the “provided” part of the equation (via interface implementation), but fails as a component because it doesn't have to explicitly indicate what interfaces it requires from other components. It could depend on concrete classes, or all manner of other things.
Interfaces are really the key here. They allow components to be substituted as long as the interfaces match up.
Further, unlike a module, a component can be instantiated. In general, most ADLs allow components to be instantiated as many times as required, much as a class can be instantiated many times.
So that's really it at the Java level. A Java class can be used as a component as long as there is some convention or restriction which constrains it to be explicit about what interfaces it requires. This is how Spring bean configuration works, and the mainstream acceptance of this technique represents an important milestone which should not be underestimated.
Below is the UML representation of a Doubler component that receives a character from a method on the provided Input interface, and sends it to its two required interfaces. Next to the UML picture is the equivalent Java code, which perhaps shows the situation more clearly. The “required” interface fields should only be set by the ADL / DI mechanism.
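A minimal sketch of such a component in Java might look like the following. The method name `accept`, the setter names and the `RecordingDisplay` helper are assumptions made for illustration; only the names `Doubler` and `Input` come from the text.

```java
// Hypothetical shape of the Input interface: the text names it but does not show its methods.
interface Input {
    void accept(char c);
}

// Leaf component: provides Input and requires two further Input interfaces.
class Doubler implements Input {
    // "Required" interfaces: meant to be set only by the ADL / DI mechanism.
    private Input input1;
    private Input input2;

    public void setInput1(Input input1) { this.input1 = input1; }
    public void setInput2(Input input2) { this.input2 = input2; }

    @Override
    public void accept(char c) {
        // Fan the received character out to both required interfaces.
        input1.accept(c);
        input2.accept(c);
    }
}

// Illustrative sink that records what it receives.
class RecordingDisplay implements Input {
    final StringBuilder shown = new StringBuilder();

    @Override
    public void accept(char c) { shown.append(c); }
}

class DoublerDemo {
    public static void main(String[] args) {
        RecordingDisplay left = new RecordingDisplay();
        RecordingDisplay right = new RecordingDisplay();

        // This wiring is what a DI container or ADL would normally perform.
        Doubler doubler = new Doubler();
        doubler.setInput1(left);
        doubler.setInput2(right);

        doubler.accept('x');
        System.out.println(left.shown + " " + right.shown); // prints "x x"
    }
}
```

The setters follow the JavaBeans convention, so a bean definition could set `input1` and `input2` by property name without the class knowing anything about the container.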
Spring's DI doesn't require that simple components like this be configured. It can work with JavaBeans conventions, or use constructor injection or annotations which are specified directly at the code level.
Leaf and Composite Components
So Java classes can be used as components. However, if that is all there was to ADL components, then a powerful concept would be missing: the notion of composing increasingly higher level components from lower level ones. This is known as a composite component. A Java class is a leaf component – it cannot be further decomposed into other component instances.
A composite component is formed by wiring up instances of other components. Technically, it has no intrinsic behaviour of its own – this is all provided by the instances. This wiring up then becomes a new component in its own right, and any internal details and connections can be hidden.
Consider a graphical view of a composite component called FancyDisplay:
The FancyDisplay composite contains four displays, and it uses three instances of the Doubler component to fan out the input to the Display component instances.
Note that FancyDisplay has no Java counterpart but only has an ADL representation. This is because it is just wiring instructions for how to connect together the component instances inside it. In Backbone (the ADL that I developed for my PhD work), the definition would look as follows:
inp provides Input;
Doubler d1, a, b;
Display a1, a2, b1, b2;
c1 joins input1@a to d@a1;
c2 joins input2@a to d@a2;
c3 joins input1@b to d@b1;
c4 joins input2@b to d@b2;
c5 joins inp to inp@d1;
c6 joins input1@d1 to inp@a;
c7 joins input2@d1 to inp@b;
This facility to compose instances into another component is (predictably enough!) known in the ADL literature as composition. It's a very simple concept, but very profound, because it means that at whatever level you look at a system, you will just see components wired up together. A software system is then fractal-like in nature – the same form at any level of abstraction.
Composition is a powerful way to hide complexity. At the top level, an entire system may be represented by a single component which internally contains N other component instances wired together. If you want more detail, just zoom into each of those component definitions and look at their internal wiring and so on until you hit the leaf components.
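Since Java has no native notion of a composite, the closest hand-written equivalent is a class that contains nothing but wiring. The sketch below mirrors connectors c1 to c7 of the Backbone definition. The `accept` method and the use of plain fields for required interfaces are assumptions for illustration, and the wrapper class itself is exactly the kind of artefact a true composite would make unnecessary.

```java
interface Input {
    void accept(char c);
}

class Doubler implements Input {
    Input input1, input2; // required interfaces, set by the wiring code below

    public void accept(char c) {
        input1.accept(c);
        input2.accept(c);
    }
}

class Display implements Input {
    final StringBuilder shown = new StringBuilder();

    public void accept(char c) { shown.append(c); }
}

// Stand-in for the FancyDisplay composite: wiring only, no behaviour of its own.
class FancyDisplay {
    final Display a1 = new Display(), a2 = new Display();
    final Display b1 = new Display(), b2 = new Display();
    private final Doubler d1 = new Doubler(), a = new Doubler(), b = new Doubler();
    final Input inp; // the composite's single provided interface

    FancyDisplay() {
        a.input1 = a1;  // c1
        a.input2 = a2;  // c2
        b.input1 = b1;  // c3
        b.input2 = b2;  // c4
        inp = d1;       // c5
        d1.input1 = a;  // c6
        d1.input2 = b;  // c7
    }
}

class FancyDisplayDemo {
    public static void main(String[] args) {
        FancyDisplay fancy = new FancyDisplay();
        fancy.inp.accept('z');
        // Every one of the four displays has received the character.
        System.out.println(fancy.a1.shown + "" + fancy.a2.shown + fancy.b1.shown + fancy.b2.shown); // prints "zzzz"
    }
}
```

From the outside, `FancyDisplay` can be wired up like any other component with a provided `Input`; the three internal `Doubler` instances are hidden.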
Spring is slightly unusual in this respect. A Spring bean must be associated with a class, so technically speaking it does not offer composite components, at least not in the way that most ADLs would understand them. However, it's fairly obvious that you can get a level of composition, because one bean can then become a new component to be wired up inside another bean.
To support composites fully, Spring would have to allow a bean definition to be specified without an associated class definition. The benefits are that it simplifies analysis and system construction. To do this, Spring configuration would need to add explicit connectors and ports. I'll cover the former, but not the latter in the next section.
(Rod Johnson has pointed out to me that similar effects can be achieved in Spring with the new Spring configuration namespace features. I haven't yet explored this in depth, but I lean heavily towards including direct support for composite idioms)
One feature that distinguishes ADLs from earlier technologies such as module interconnection languages is connectors (see c1 to c7 in the previous diagram). A connector literally “wires” two components together. Making connectors explicit means that much more complex structures can be wired together using the ADL.
Spring has implicit connectors, allowing a bean's property to be set to an instance or reference of another bean instance.
The composite example of FancyDisplay above is easy to express using explicit connectors (the “connectors:” part of the textual definition), but difficult to do if the connectors are implicit. How would c1 through to c4 be expressed? Spring XML configuration could not wire up the composite component in a single definition. For simple examples, such as this, it is perhaps not a problem.
As systems grow in complexity, experience with ADLs has shown that explicit connectors become very important for handling internal component wiring. Below is the CTvPlatform composite component taken from the software used for certain Philips television sets. This is expressed using the Koala ADL, which I will discuss further below. Imagine creating this without explicit connector (or composite) support!
Some advanced ADLs even have first-class connectors. This means that connectors act a bit like components in their own right, and can encapsulate a network transport and any error conditions that might arise.
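As a rough illustration of the idea, a first-class connector can be imitated in Java by an object that sits between a required and a provided interface and absorbs failures on the way. The `GuardedConnector` class, its failure counter and the `accept` method are hypothetical names chosen for this sketch and are not taken from any particular ADL.

```java
interface Input {
    void accept(char c);
}

// Hypothetical first-class connector: forwards calls and keeps transport errors to itself.
class GuardedConnector implements Input {
    private final Input target;
    private int failures = 0;

    GuardedConnector(Input target) {
        this.target = target;
    }

    @Override
    public void accept(char c) {
        try {
            target.accept(c);
        } catch (RuntimeException e) {
            // The error condition is handled inside the connector; the sender never sees it.
            failures++;
        }
    }

    int failures() {
        return failures;
    }
}

class ConnectorDemo {
    public static void main(String[] args) {
        StringBuilder received = new StringBuilder();

        // A receiver that simulates a failing transport for one particular character.
        Input flaky = c -> {
            if (c == '!') {
                throw new IllegalStateException("transport down");
            }
            received.append(c);
        };

        GuardedConnector connector = new GuardedConnector(flaky);
        connector.accept('a');
        connector.accept('!');
        connector.accept('b');

        System.out.println(received + " failures=" + connector.failures()); // prints "ab failures=1"
    }
}
```

Because the connector itself implements `Input`, neither the sender nor the receiver needs to change when it is inserted into the wiring.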
A (Biased and Imperial College-Centric) History of ADLs
ADLs evolved out of attempts to make software architecture explicit as a set of connected components. Conic, an early ADL that used Pascal to implement leaf components, was created by a number of professors at Imperial College (including my supervisors) in the mid-1980s. Conic used a separate textual configuration language (ADL) to wire up the components. This language is the equivalent of the Spring XML configuration facility. Conic was used to instantiate and wire up components in a distributed system, and supported runtime change through the application of dynamic changes to those wiring instructions.
Intriguingly, the 1985 paper on Conic first coined the term “configuration programming” which was in wide academic use by 1990. This is a term we are quite (un)comfortable with today, perhaps because of the over-emphasis on XML configuration in Java server-side systems.
The successor to Conic was Darwin, which used C++ for implementing leaf components. Darwin influenced Microsoft's COM, although COM subsequently removed some of the elements that make ADLs so powerful, such as explicit connectors and textual configuration.
Darwin also was the foundation of Koala, an ADL used by Philips for software in some of its consumer electronics such as high-end television sets. Koala is an interesting example of the use of an ADL in a very resource-limited environment, and like Spring can deal with the wiring up of the internals of a non-distributed application. The Koala ADL also features a number of interesting control structures, one of which mimics an electronic bus. Another supports event pumps, which cleverly allow logical threads to be efficiently mapped onto physical threads, allowing the optimisation of extensive threading in very “small” environments.
Darwin, Conic and Koala all feature composite components. COM supports a limited form of composition, and is a rather clever system despite its implementation complexity and obscure specifications.
An interesting ADL that doesn't have composite components is Chiron-2, also known as C2. This cryptically named ADL is designed specifically to model GUI architectures. The internal architecture of a C2 component provides a way to ensure reuse in different scenarios. C2 ultimately led to C2SADEL and C2DRADEL, which deal with system evolution and change to a running system.
It's also important for me to mention that only a fairly small subset of ADLs are implementation focussed. Most ADLs are only usable for analysis. The ADLs I know of that can wire together implementation components are Conic, Darwin, Koala (consumer electronics), MetaH (avionics), Unicon and Backbone (my ADL). I'm sure there are others.
It may surprise readers to know that the second version of the Unified Modeling Language (UML) contains a perfectly workable ADL somewhere inside the tangled mess of concepts. This subset of UML was heavily influenced by the amazing Objectime (later re-branded as Rational Realtime) and its associated ROOM methodology and ADL. Objectime featured a complete graphical modeling environment and also allowed a form of structural inheritance. Spring bean inheritance is related to this construct.
Finally, it would be remiss of me not to mention ACME, which is a generic ADL that can be used as an interchange format to reconcile all the other ADLs. Think of it as a principled union of all ADL concepts. This trend towards a generic approach is also taken by xADL, which can “act as the basis for the rapid development of domain/project-specific ADLs”. xADL is heavily focussed on configurations expressed using extensible XML schemas.
Why not just use an ADL?
Given the extensive research and history behind ADLs, why can't we just use these for our production systems?
Well, you couldn't even if you wanted to! At least not unless you work in certain closed commercial environments, or are prepared to deal with academic prototypes. ADLs simply didn't catch on in mainstream software engineering. They were perhaps “before their time”, and got swept away in the OO-mania of the period. It also didn't help that they weren't promoted outside of the academic community. I have personally experienced this, as I initially ended up creating my own ADL independently from any (direct) knowledge of the literature, simply because I didn't know that the academic area existed. In retrospect, I was probably implicitly influenced by COM and UML2, which were in turn influenced by ADLs.
Spring and other DI approaches, however, allow us to apply ADL principles to our architectures today. They are production-tested and fit well with modern Java (and other) development. My point is to show that these approaches have a solid background in theory, and also to point the way towards future innovations.
The future of Dependency Injection?
Despite the lack of initial mainstream acceptance, ADLs were (and still are) a fertile area for research. Basically, much of the work done in the ADL area now takes the form of specification and subsequent analysis. The structure of the system is described using a configuration, and other information is “hung off” the structure. This information might describe the properties of a component, or might be associated with the entire system, e.g. describing and analysing the concurrency properties of an architecture or determining when parts of an architecture are able to be dynamically upgraded in a running system (apologies for the last link; I can't find a freely accessible version of the paper). This information is then analysed for certain properties.
It is from this research that we can learn a lot about how DI might evolve. It's possible that Spring or another DI approach can bring these very useful techniques into mainstream usage.
(Please note that I'm not trying to take anything away from Spring or its DI approach. In particular, Spring offers many fascinating facilities (such as aspects) that were never considered in ADLs. My intent though is to look at what other ADL features we might be able to usefully incorporate in future work)
Dynamic runtime changes
When a system is expressed as wired up components, the architecture of the system can be remodelled through changing around the connectors and swapping in and out various components. OSGi and other systems have started to address this in terms of low-level lifecycle mechanisms, but DI / ADL approaches are able to deal with this at the architectural level.
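At the level of a single connection, this kind of change amounts to re-pointing a required interface while the system is running. The sketch below assumes a hypothetical `Forwarder` component whose `rewire` method would be called by the configuration mechanism rather than by application code; an `AtomicReference` is used so that the swap is visible to other threads.

```java
import java.util.concurrent.atomic.AtomicReference;

interface Input {
    void accept(char c);
}

// Hypothetical component whose required interface can be re-pointed at runtime.
class Forwarder implements Input {
    private final AtomicReference<Input> out = new AtomicReference<>();

    // Intended to be called by the DI / ADL runtime, not by the component itself.
    void rewire(Input newTarget) {
        out.set(newTarget);
    }

    @Override
    public void accept(char c) {
        out.get().accept(c);
    }
}

class Sink implements Input {
    final StringBuilder shown = new StringBuilder();

    @Override
    public void accept(char c) { shown.append(c); }
}

class RewireDemo {
    public static void main(String[] args) {
        Sink original = new Sink();
        Sink replacement = new Sink();

        Forwarder forwarder = new Forwarder();
        forwarder.rewire(original);
        forwarder.accept('a');

        // Swap the downstream component without touching the Forwarder's code.
        forwarder.rewire(replacement);
        forwarder.accept('b');

        System.out.println(original.shown + " " + replacement.shown); // prints "a b"
    }
}
```

A real system would also have to decide what happens to calls that are in flight during the swap, which this sketch ignores.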
Evolution and extensibility
All systems evolve over time, and similarly this can be represented by switching components and changing wiring. However, advanced ADLs are able to capture this information by overlaying the changes on the original design (using architectural deltas). This allows the new system to be expressed as changes to the old system, allowing someone other than the original creator to evolve or modify it without destroying the original version in the process. Some of my work deals with this area. Other research includes MAE, a version control system specifically designed for handling ADL architectures.
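The overlay idea can be illustrated with a deliberately simplified model in which a configuration is just a map from connector names to wiring descriptions, and a delta lists connectors to remove and connectors to add. The `Delta` class, the connector `c8` and the display instance `a3` are hypothetical; real ADLs with delta support are considerably richer than this.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, highly simplified architectural delta.
class Delta {
    private final List<String> removed;
    private final Map<String, String> added;

    Delta(List<String> removed, Map<String, String> added) {
        this.removed = removed;
        this.added = added;
    }

    // Returns the evolved configuration; the base is copied and never modified.
    Map<String, String> applyTo(Map<String, String> base) {
        Map<String, String> result = new LinkedHashMap<>(base);
        for (String name : removed) {
            result.remove(name);
        }
        result.putAll(added);
        return result;
    }
}

class DeltaDemo {
    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("c1", "input1@a to d@a1");
        base.put("c2", "input2@a to d@a2");

        // Evolution: drop c2 and wire input2@a to a new, hypothetical display instance a3.
        Delta delta = new Delta(List.of("c2"), Map.of("c8", "input2@a to d@a3"));
        Map<String, String> evolved = delta.applyTo(base);

        System.out.println(base.keySet());    // prints [c1, c2]
        System.out.println(evolved.keySet()); // prints [c1, c8]
    }
}
```

The original configuration survives untouched, so the evolved system is described purely as a change set layered on top of it.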
Analysis of Behavior and Protocols
Since components communicate via interfaces, it is possible to model the protocol of a component (and interfaces) using an extended sequence diagram or using a simple textual language. This is basically “behavior driven design” for components. This information allows components to be checked when they are placed in a different configuration, i.e. will a component function as intended when I wire it into a system?
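One very lightweight way to picture a protocol is as a finite state machine over the calls an interface accepts, against which a sequence of calls can be checked. The sketch below is an assumption-laden toy: the `Protocol` class, the state names and the `open` / `send` / `close` calls are invented for illustration, and the analysis tools used in ADL research are based on far more expressive formalisms.

```java
import java.util.List;
import java.util.Map;

// Hypothetical protocol model: state -> (call -> next state).
class Protocol {
    private final String start;
    private final Map<String, Map<String, String>> transitions;

    Protocol(String start, Map<String, Map<String, String>> transitions) {
        this.start = start;
        this.transitions = transitions;
    }

    // True if every call in the sequence is allowed in the state where it occurs.
    boolean accepts(List<String> calls) {
        String state = start;
        for (String call : calls) {
            Map<String, String> allowed = transitions.getOrDefault(state, Map.of());
            state = allowed.get(call);
            if (state == null) {
                return false; // this call is not permitted at this point in the protocol
            }
        }
        return true;
    }
}

class ProtocolDemo {
    public static void main(String[] args) {
        // Protocol: open first, then any number of sends, then close.
        Protocol channel = new Protocol("closed", Map.of(
                "closed", Map.of("open", "open"),
                "open", Map.of("send", "open", "close", "closed")));

        System.out.println(channel.accepts(List.of("open", "send", "send", "close"))); // prints true
        System.out.println(channel.accepts(List.of("send")));                          // prints false
    }
}
```

Checking whether a component fits a new configuration then becomes a matter of comparing the calls its neighbours can make against the protocol it declares. The sketch does not distinguish final states, so it only rejects calls made out of order.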
There has been much work on checking the operation of components in a concurrent environment. In fact, entire books have been written about it. Concurrency bugs in large systems are notoriously hard to detect and remove, but the techniques in this area offer a relatively lightweight way to find these through analysis.
The earliest ADLs were designed to manage and activate the architecture of complex distributed systems, and monitor the nodes making up the distributed network. Darwin allowed a separate model of a set of computers to be mapped onto a component architecture, and it would take care of the distribution and error handling. You might like to think of this as an aspect-oriented form of component distribution which can avoid some of the pitfalls of transparent remoting.
(Spring DI can wire up a distributed application, but does not handle the activation side)
Composite components and certain other constructs (not covered here) allow the ADL component approach to be applied at fine-grained levels normally associated only with lightweight classes. This allows the same consistent approach to be used at the highest and lowest levels of a system.
DI is mainly used to wire together server side components. C2 showed that even client software could benefit, and encoded common GUI patterns within its structure. Perhaps this has already happened in Spring RichClient, and I've just overlooked it. There is a wealth of other domains and lots of other control structures to mine.
An enhanced component model
Adding full composite components and explicit connectors will add to the expressiveness of current DI approaches. Explicit ports would also be useful, as they complete the component model, but are outside of the scope of this short article.
Combining components and modules
Although ADLs to some degree were seen as replacing modules, the need for modules didn't actually go away. Modules are used for packaging and deployment, and for distinguishing between the private, implementation part of a system and its public interface. Spring has gone some way to addressing this already with the OSGi integration. It will be interesting to see where this work goes.
Creation of product families
Koala was specifically designed to allow an entire product-line of television software to be created from one configuration by choosing options and plugging in different components into a base configuration. DI approaches can be easily modified to allow this, and are well suited to allowing this type of flexibility, which is often required in commercial systems.
Advanced design and analysis tools
A number of advanced tools exist to edit ADL architectures and analyse them for various properties. The Software Architect's Assistant manages Darwin configurations and handles the distributed mappings. ArchEdit is an Eclipse-based editor for xADL configurations. There are many more tools, but perhaps a common theme is the focus on compositional components and analysis techniques for areas like concurrency (safety and liveness). These techniques are generally backed by formal (logical/mathematical) specifications.
Dependency injection is here to stay. It is related to component techniques and architectural approaches that have proven themselves in the commercial and academic arenas for almost twenty years now. There are other aspects of ADLs that may be useful in the future to consider for DI, particularly the analysis side, and a more complete component model.
I started writing an online article about my PhD work, which is on system extensibility and evolution using ADLs. However, I decided I really needed to write this article first and get some of the background points out there. I wanted to do this because there appears to be little interest in DI in academia (been there, done that, academics don't usually care about industry take-up) and no visibility of ADLs and formal analysis in commercial development (looks too complicated, not production ready, inaccessible).
Of some discomfort to me is the fact that many of the papers I referenced are not easily available outside of the academic libraries and their electronic subscriptions. I had to hunt very hard to find free versions of many of the papers I wanted to cite.
It struck me as I was writing this article that there is a huge divide between software engineering practice and research. Whilst popular practical techniques will eventually form the basis for research, and some research will eventually filter down into practice, there doesn't seem to be a large interplay between the two communities, which strikes me as both surprising and disturbing. I'd be very interested to know your thoughts on this divide and whether it affects you in your work.