DSL Adoption with JetBrains MPS
DSLs, or Domain Specific Languages, have been known in software engineering for many years. Despite this fact, they aren't widely used today. In this article we take a look at what DSLs are and why they aren't widely accepted by mainstream developers. Then we describe how JetBrains MPS solves the main problems which stop DSLs from being widely used.
A DSL is a language tailored to a particular problem domain. For example, declarative languages designed to solve a narrow class of problems are DSLs. Among such languages are SQL, regular expressions, XPath, and Prolog. The main advantage of DSLs is that they are very close to their problem domains, so problems from the domain can be specified and solved very succinctly in them. Another advantage is that a person doesn't have to be a software developer in order to code in a DSL. You can give a DSL to a domain expert, and she can write code in it thanks to her knowledge of the domain itself. This looks simple in theory: just take your domain, look at its abstractions, create a language for it, and describe and solve your problem in this language. But despite this simplicity, we rarely see DSLs in real-world programs.
Why DSLs aren't widely used in mainstream programming
Let's take a look at what prevents DSLs from being widely used. The first reason is the focus on closed, standalone DSLs rather than on extensions to general-purpose languages (GPLs), probably because of the potential ambiguities in text-based languages. The second is the complexity of creating good IDE support for such languages. Let's take a closer look at these reasons, one by one.
Most of the effort in the DSL community is directed toward standalone DSLs, but we can get much more value if we add new constructs to an existing language, for example Java. If we can use language extensions the way we use libraries and frameworks, these languages have much greater value to us than a standalone language used for only some part of our program. This is especially noticeable for languages intended to be edited by professional developers. With this approach, developers get the high abstraction level of a DSL and the power of a GPL at the same time, which isn't achievable with existing DSL technologies.
In addition to being able to create extensions for existing GPLs, we need to make them composable, since reuse is hardly possible without this property. Suppose we have one language extension that adds support for monetary values to Java and another that adds support for useful mathematical notation to Java. We must be able to use them in the same program simultaneously, despite the fact that they were created by different vendors.
In order to solve the problem of GPL extensibility and composability, we first need to solve another important problem: the ambiguity of text-based languages. Almost all existing languages are implemented with text-based grammars. These grammars have one very upsetting issue: they can be ambiguous, i.e. there might be several interpretations of the same program. Even worse, if we make the grammar of <GPL X + extension A> unambiguous and make the grammar of <GPL X + extension B> unambiguous, this doesn't mean that the grammar of <GPL X + extension A + extension B> is unambiguous. As a result, such extensions won't be composable, which greatly reduces their value.
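A toy illustration of this composition problem, written in Python for brevity. The two "extensions" and their rules are hypothetical and far simpler than a real grammar: one treats a number followed by a three-letter code as a monetary value, the other treats a number followed by a name as implicit multiplication. Each is unambiguous on its own, but together they give the same text two readings.

```python
import re

def money_literal(tokens):
    """Hypothetical extension A: '<number> <CURRENCY>' is a monetary value."""
    if len(tokens) == 2 and tokens[0].isdigit() and re.fullmatch(r"[A-Z]{3}", tokens[1]):
        return f"Money({tokens[0]}, {tokens[1]})"
    return None

def implicit_product(tokens):
    """Hypothetical extension B: '<number> <name>' is implicit multiplication."""
    if len(tokens) == 2 and tokens[0].isdigit() and tokens[1].isidentifier():
        return f"Mul({tokens[0]}, Var({tokens[1]}))"
    return None

def interpretations(source, rules):
    """Return every reading of `source` that some rule accepts."""
    tokens = source.split()
    readings = []
    for rule in rules:
        reading = rule(tokens)
        if reading is not None:
            readings.append(reading)
    return readings

# Each extension alone yields exactly one reading of the same text.
print(interpretations("5 EUR", [money_literal]))     # ['Money(5, EUR)']
print(interpretations("5 EUR", [implicit_product]))  # ['Mul(5, Var(EUR))']
# Loaded together, the text becomes ambiguous: two readings.
print(interpretations("5 EUR", [money_literal, implicit_product]))
```

Neither extension author did anything wrong here; the ambiguity only appears when a third party combines the two.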
In order to be productive, a developer needs intelligent tools at hand. Since the invention of intelligent code editors such as IntelliJ IDEA and Eclipse, developers feel really bad when they have to use plain text editors for other languages. Plain text editors don't highlight errors, don't provide context-sensitive help, and don't show completion menus with possible values. There are frameworks for building intelligent editors, for example the IntelliJ IDEA Language API, Xtext, and Oslo, but none of them supports composable language extensions. Even with these frameworks, creating industrial-grade language support requires substantial knowledge of programming languages and takes a lot of time. As you can see, tool support is very important for developers, but such tools are very hard to implement.
Let's summarize the main problems:
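- People focus on the wrong kind of DSLs. In order to achieve a substantial productivity gain, we need to create GPL extensions instead of standalone DSLs. Such extensions are rare because text-based languages are not composable, and current technologies don't support composability.
- Good IDE support is hard to build. Even with existing frameworks, industrial-grade support for a new language requires substantial expertise and a lot of time.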
How MPS solves these problems
Now let's take a look at how MPS solves the problems described above. In order to allow language extension, MPS doesn't work with a program as text; instead, it stores a program as an abstract syntax tree (AST) and edits it directly. This choice of storage greatly reduces the complexity of creating IDE support. Since MPS has the complete syntax tree at hand, it can easily provide context-sensitive completion, error checking, and other crucial aspects of an intelligent editor.
MPS solves the problem of text grammar ambiguity in a radical way: by not storing language code as text. If there's no text, there's no text grammar, and thus no need to use a parser with all of its associated grammar ambiguities. Instead, MPS stores each program directly as a syntax tree. This does not mean that languages in MPS don't have grammar. They have one, but it isn't text-based or concrete grammar - it's an abstract grammar. Abstract grammar describes the structure of the program's syntax tree. If you are familiar with XML, MPS's abstract grammar is very similar to XML Schema.
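To make the idea of an abstract grammar concrete, here is a minimal Python sketch. It is not MPS code or the MPS structure language: the `Concept` and `Node` classes, the `GRAMMAR` table, and the `validate` function are all assumptions made for illustration. Much like an XML Schema, the grammar only says which properties and children each kind of node may have, and a tree either conforms to it or doesn't.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """One entry of a toy abstract grammar."""
    name: str
    properties: dict = field(default_factory=dict)  # property name -> Python type
    children: dict = field(default_factory=dict)    # role name -> required child concept

@dataclass
class Node:
    """A syntax tree node; it names its concept explicitly, so nothing is parsed."""
    concept: str
    properties: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)    # role name -> Node

# Hypothetical two-concept grammar; children are restricted to literals to keep it tiny.
GRAMMAR = {
    "IntLiteral": Concept("IntLiteral", properties={"value": int}),
    "PlusExpression": Concept(
        "PlusExpression", children={"left": "IntLiteral", "right": "IntLiteral"}
    ),
}

def validate(node, grammar):
    """Return a list of structural errors; an empty list means the tree conforms."""
    concept = grammar.get(node.concept)
    if concept is None:
        return [f"unknown concept {node.concept}"]
    errors = []
    for name, expected_type in concept.properties.items():
        if not isinstance(node.properties.get(name), expected_type):
            errors.append(f"{node.concept}.{name} must be {expected_type.__name__}")
    for role, expected_concept in concept.children.items():
        child = node.children.get(role)
        if child is None:
            errors.append(f"{node.concept} is missing child '{role}'")
        elif child.concept != expected_concept:
            errors.append(f"{node.concept}.{role} must be {expected_concept}")
        else:
            errors.extend(validate(child, grammar))
    return errors

one = Node("IntLiteral", {"value": 1})
two = Node("IntLiteral", {"value": 2})
print(validate(Node("PlusExpression", children={"left": one, "right": two}), GRAMMAR))
print(validate(Node("PlusExpression", children={"left": one}), GRAMMAR))
```

The first call reports no errors; the second reports the missing `right` child. Note that no step ever has to decide how a piece of text should be read.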
As grammar ambiguities are eliminated, languages can be easily combined with each other. You can add new constructs to an existing language, which is called language extension. You can embed General Purpose Languages (GPLs) inside of DSLs as well. This means that languages are composable, which promotes reuse of languages. In MPS, we have done a lot of experimenting with language composability. We have created many Java extensions:
- Collections Language, which adds first class support for collections in the style of C#'s Linq;
- Dates Language, which allows working with dates more easily;
- Math Language, which adds mathematical constructs like sums and intervals directly to the language; and many more.
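Why composition becomes straightforward can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how MPS is implemented: the language names, the concepts (`SumOfRange`, `Money`), and the evaluator tables are hypothetical. The point is that every node records which language and concept it belongs to, so constructs from independently written extensions can sit in one tree without any risk of being confused with each other.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    language: str                      # which language defines this node's concept
    concept: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

# Each hypothetical language maps its concepts to an evaluator(properties, child_values).
BASE = {"IntLiteral": lambda props, args: props["value"]}
MATH = {"SumOfRange": lambda props, args: sum(range(args[0], args[1] + 1))}
MONEY = {"Money": lambda props, args: (args[0], props["currency"])}

# Composing languages is just putting them side by side, keyed by language name.
LANGUAGES = {"base": BASE, "math": MATH, "money": MONEY}

def eval_node(node, languages):
    """Evaluate children first, then dispatch on (language, concept)."""
    evaluator = languages[node.language][node.concept]
    child_values = [eval_node(child, languages) for child in node.children]
    return evaluator(node.properties, child_values)

# A monetary value whose amount is the sum 1 + 2 + 3 + 4, mixing all three languages.
tree = Node("money", "Money", {"currency": "EUR"}, [
    Node("math", "SumOfRange", children=[
        Node("base", "IntLiteral", {"value": 1}),
        Node("base", "IntLiteral", {"value": 4}),
    ]),
])
print(eval_node(tree, LANGUAGES))  # (10, 'EUR')
```

Adding a fourth extension means adding one more entry to `LANGUAGES`; none of the existing languages has to change.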
Most of the language definition languages that we have in MPS both embed and extend Java. For example, Type System Definition Language has type system rules which look like usual DSLs, but instead of having custom syntax there, you write Java code within special constructs inside of the rules.
Since we got rid of the text-based storage, we cannot use a regular text editor. In order to edit the syntax tree, we use a special editor. In MPS, we have a projectional editor. For each syntax node, it creates a projection which can be interacted with, and as a result of these interactions, the syntax tree is changed. In MPS, we've done our best to make this editor as close to a text editor as possible. For example, if you type 1+2+3, the syntax tree for this expression will be created. You don't need to choose PlusExpression from the completion menu twice - our editor behaves just like a text editor in many respects. Of course, not everything that is possible in a text editor is possible in the MPS editor, but all such issues can be worked around without losing productivity. In our experience, a person can get used to our editor so that she is as productive there as in a text editor in two weeks' time.
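The 1+2+3 example can be sketched as follows. This Python toy is an assumption-laden illustration, not the MPS editor: the `ToyProjectionalEditor` class and its rules are hypothetical and handle only digits and `+`. It shows how keystrokes can be turned directly into tree edits, with no text buffer and no parser involved.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class IntLiteral:
    value: int

@dataclass
class Plus:
    left: "Expr"
    right: Optional["Expr"] = None  # None is a hole the user has not filled yet

Expr = Union[IntLiteral, Plus]

class ToyProjectionalEditor:
    """Turns keystrokes into syntax tree edits instead of appending characters."""

    def __init__(self):
        self.root = None

    def type_char(self, ch):
        if ch.isdigit():
            self._type_digit(int(ch))
        elif ch == "+":
            if self.root is None or (isinstance(self.root, Plus) and self.root.right is None):
                raise ValueError("'+' needs a complete expression on its left")
            # Wrap the whole expression so far: this makes '+' left-associative.
            self.root = Plus(left=self.root)
        else:
            raise ValueError(f"unsupported key {ch!r}")

    def _type_digit(self, digit):
        if self.root is None:
            self.root = IntLiteral(digit)
        elif isinstance(self.root, Plus):
            if self.root.right is None:
                self.root.right = IntLiteral(digit)  # fill the hole
            else:
                self.root.right.value = self.root.right.value * 10 + digit
        else:
            self.root.value = self.root.value * 10 + digit  # extend the literal

    def type_text(self, text):
        for ch in text:
            self.type_char(ch)
        return self.root

print(ToyProjectionalEditor().type_text("1+2+3"))
```

Typing the five characters produces `Plus(Plus(1, 2), 3)` without ever choosing a node type from a menu, which is the text-like behavior described above.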
As a result of syntax-tree-based storage, smart editing features are much easier to implement. In fact, many of them are provided automatically, for example code completion, find usages, and version control support. During the development of IntelliJ IDEA, we have created support for many text-based languages. Implementing such support takes at least several man-months. In MPS, similar features can be implemented in several days. This is so easy because we have the existing infrastructure for such features and languages with which this infrastructure can be configured.
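A small Python sketch suggests why a feature like code completion can come almost for free once the language structure is known. The concept table, the role table, and the `completion_menu` function are hypothetical and are not the MPS API: the menu for a position is simply every concrete concept that fits the role being edited.

```python
# Hypothetical concept hierarchy: concept name -> (super-concept or None, is_abstract).
CONCEPTS = {
    "Expression": (None, True),
    "IntLiteral": ("Expression", False),
    "PlusExpression": ("Expression", False),
    "Statement": (None, True),
    "ReturnStatement": ("Statement", False),
}

# Hypothetical child roles: (parent concept, role) -> concept the child must have.
ROLES = {
    ("PlusExpression", "left"): "Expression",
    ("PlusExpression", "right"): "Expression",
    ("ReturnStatement", "value"): "Expression",
}

def is_subconcept(name, ancestor):
    """Walk up the super-concept chain looking for `ancestor`."""
    while name is not None:
        if name == ancestor:
            return True
        name = CONCEPTS[name][0]
    return False

def completion_menu(parent_concept, role, prefix=""):
    """List concrete concepts allowed in `role`, filtered by what was typed so far."""
    target = ROLES[(parent_concept, role)]
    return sorted(
        name
        for name, (_, is_abstract) in CONCEPTS.items()
        if not is_abstract
        and is_subconcept(name, target)
        and name.lower().startswith(prefix.lower())
    )

print(completion_menu("ReturnStatement", "value"))        # ['IntLiteral', 'PlusExpression']
print(completion_menu("ReturnStatement", "value", "pl"))  # ['PlusExpression']
```

A language author who adds a new expression concept to the table gets it in every relevant completion menu without writing any editor code.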
MPS isn't just an editor. You can create a complete IDE for your languages with its help. We have languages for editor customization, scoping rules definition, type systems, generators, and data flow. All these languages are created with themselves, fully applying the bootstrap principle.
We have also been using MPS for developing commercial software. Our new bug tracking system, code-named Charisma, is created completely with MPS, and more programs are on the way.
DSLs aren't widely used in mainstream programming for two main reasons: because text-based languages are usually impossible to combine, and because it is difficult to create an intelligent editor for them. MPS solves both of these problems by getting rid of text-based code storage and providing a common infrastructure for intelligent editor creation.
MPS Beta 2 was recently released under the Apache 2.0 license. We are going to release MPS 1.0 in May 2009. You are welcome to download it from http://www.jetbrains.com/mps and start creating DSLs right now.