Eclipse Modeling Framework: Q&A With the Authors
This week at EclipseZone has been all about EMF, with the launch of the refcard and a review of the new EMF book. Today we meet the four authors behind the second edition of the Eclipse Modeling Framework book.
The gang of four that wrote the book are Dave Steinberg, Frank Budinsky, Marcelo Paternostro and Ed Merks. I asked them each about the book and about the Eclipse Modeling Framework, including their top tips for its use.
Competition Time! DZone, together with the Eclipse Foundation, are happy to offer you the chance to win either the new edition of the EMF book or an Eclipse shirt. To be in with a chance of winning just tell us about your use of modelling, EMF or what you like about EMF. We will then randomly select a winner based on the entries received by 3 February, 2009.
James Sugrue: Could you please introduce yourselves?
Ed Merks: My name is Ed Merks. I'm the lead for the EMF project and the co-lead for the top-level Eclipse Modeling project. I work as a software consultant in partnership with itemis AG. I've been working on EMF since the very beginning.
Frank Budinsky: My name is Frank Budinsky. I'm a senior architect at IBM and a founding member of the EMF project at Eclipse. I've been involved in framework and generator design for many years, including as design lead for the EMF-based Tuscany Service Data Objects (SDO) project at Apache, and as co-chair of the SDO technical committee at OASIS.
Dave Steinberg: My name is Dave Steinberg. I'm a software developer at IBM and an EMF committer. I started working on EMF with Ed and Frank before it was open-sourced in 2002.
Marcelo Paternostro: My name is Marcelo Paternostro. I've been an EMF committer since 2004 and a user for even longer. I am actually quite proud of being one of the first developers to adopt it, when it was still an internal component at IBM. Before joining the EMF team, I worked on TPTP's first incarnation, Hyades, and on components for IBM's WebSphere Studio Application Developer, now Rational Application Developer (RAD).
James: This has been a long awaited book. Did you find the second edition difficult to write? Were there any particularly difficult sections?
Ed: The long waiting was the most difficult part because I only wrote a very small part of the content. Mostly I worked very hard on my delegation skills, i.e., I delegated the challenging task of writing the book to Marcelo and Dave.
Frank: The problem with writing a book about a fast moving technology is that one can easily fall into the trap of not writing quickly enough to keep up with the changes happening to the technology itself. The distance to the finish line never seems to change since there's always something new that has to be covered. We fell into that trap for a little while on this book but with a concerted push at the end, we managed to finally get it done. With Dave acting as lead author for this edition, there would be no compromise in the quality and completeness, just to get it done. I think the end result speaks for itself and was worth the wait.
Dave: Yes, writing the second edition was a difficult task. EMF has grown so much since the first edition that a lot more judgement was needed in determining what to cover and in how much detail. I also think we set the bar higher for ourselves this time, demanding a higher level of quality in terms of content, consistency, and language.
For me, the most difficult sections were actually the same ones as last time: the ones that could really be described as reference material. For example, there are chapters describing the mappings from UML, annotated Java, and XML Schema to Ecore. There are also sections describing every last generator and resource option. There were lots of little details to cover in those sections -- many more than when we did the first edition -- and we were really striving for completeness. So, gathering and organizing all that information was pretty challenging.
By contrast, the rest of the book talks a lot about concepts and is very example-driven. I think that kind of material is much easier and more fun to write. It's probably more fun to read, too, but people also need the reference material when they're looking for an answer to a specific question.
Marcelo: This is my first book. To me it was very surprising to realize how hard it is to write about a technology that I know from end to end. It is amazing how challenging it is to put concepts in words and how distinct the writer's and developer's mind-sets are.
On a more personal note, English is not my native language (I am from Brazil and moved to Canada 8 years ago). This book has given me a fantastic opportunity to improve my communication skills. It was like a huge "pair programming" experience to me: I would write the content that I set myself to work on and then often rewrite it with Dave, whom I usually refer to as an "English Grammar God". I like to think that he noticed a huge improvement between my very first lines back 4 years ago and the ones I wrote in 2008 ;-)
Dave: Okay, clearly I'm getting good value for the money I pay Marcelo to heap praise on me. And, yes, his writing has definitely improved since we started this project. Actually, I'm pretty sure that's true for all of us.
James: For those who have already read the first edition, and are familiar with EMF, what are the main chapters that they should take notice of?
Marcelo: We've revised and improved every single chapter of the first edition and added a bunch more. The new edition has more pages than the first one and it doesn't contain the API summaries!
Trying to answer your question without saying "all chapters", I would suggest Chapters 15, 20, and 21. The first is a very deep and complete trip to the persistence world. The second shows how to run EMF in environments that are not the usual Eclipse IDE and serves well as a good review of some important concepts. The last one, Chapter 21, could be called "what we were also doing while we were writing this book", as it shows the main features introduced in the 2.3 and 2.4 releases.
Dave: I'd definitely agree with Chapter 21. It covers a lot of important new topics, including generics, content types, reference keys, and Ecore validation.
For people working with XML Schema, I'd add that Chapters 8 and 9 are very important. The mapping from XML Schema to Ecore is totally different from when the first edition was written. Chapter 8 introduces the extended modeling concepts in Ecore that support this mapping and Chapter 9 details it.
There's so much new material in Chapters 14-18, all about programming with models and the EMF runtime, that I couldn't even choose.
Frank: I'd say someone that is familiar with the first edition should leaf through the table of contents and notice how much new material there is and how much more detailed the coverage is in many of the original chapters. EMF has come a long way in the last 5 years, and the second edition has thorough coverage of all the new features. I'm sure that most readers, no matter how familiar with EMF they may be, will notice lots of neat things they didn't know about.
James: In your opinion, what is the most under-utilized part of EMF?
Marcelo: Although I can certainly think of parts that are less used, I don't believe there is a definitive answer to this question. When a developer decides that it is time to move past the basic "this is a code generation tool" idea, she ends up finding something in EMF that fits perfectly with what she is doing or that is in total agreement with her way of designing and implementing code. And, fortunately I guess, this "something in EMF" varies from person to person. I truly believe that we have at least one user for each line of code we've written.
Dave: Yeah, I agree with that. I'm not sure I'd say that it's under-utilized, but there's a whole lot of flexibility in EMF that many people don't know about. Most people probably assume that a modeled class in EMF maps to a generated class with one field per feature. In fact, there are numerous memory-saving options available in the generator, and you can even opt to delegate all storage of feature values onto an arbitrary external backing store, giving you total flexibility to do relational persistence, soft references, and custom lazy loading, for example.
Most people probably don't need anything fancier than one field per feature. But, if at some point they find they do, they might not immediately be aware that EMF will probably still be valuable to them.
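To make Dave's contrast concrete, here is a minimal plain-Java sketch of the two storage layouts he describes: one field per feature versus delegating every feature value to an external backing store. It deliberately does not use EMF's own classes or generator options; `FeatureStore`, `MapFeatureStore`, `FieldBackedBook`, and `StoreBackedBook` are hypothetical names invented for this illustration, and generated EMF code will look different.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical backing-store contract (an assumption for illustration, not EMF's API).
interface FeatureStore {
    Object get(Object owner, String feature);
    void set(Object owner, String feature, Object value);
}

// One possible store: an in-memory map keyed by owner identity, then feature name.
// A real store could instead talk to a database or load values lazily.
final class MapFeatureStore implements FeatureStore {
    private final Map<Object, Map<String, Object>> values = new IdentityHashMap<>();

    @Override
    public Object get(Object owner, String feature) {
        Map<String, Object> features = values.get(owner);
        return features == null ? null : features.get(feature);
    }

    @Override
    public void set(Object owner, String feature, Object value) {
        values.computeIfAbsent(owner, o -> new HashMap<>()).put(feature, value);
    }
}

// Conventional layout: the object holds one field per feature.
final class FieldBackedBook {
    private String title;

    String getTitle() { return title; }
    void setTitle(String title) { this.title = title; }
}

// Delegating layout: the object holds no feature fields, only a reference to the store.
final class StoreBackedBook {
    private final FeatureStore store;

    StoreBackedBook(FeatureStore store) { this.store = store; }

    String getTitle() { return (String) store.get(this, "title"); }
    void setTitle(String title) { store.set(this, "title", title); }
}
```

Both classes expose the same getter and setter, so calling code does not need to know which layout is in use. Swapping the store implementation is where the flexibility Dave mentions (relational persistence, soft references, custom lazy loading) would come from.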
Ed: I think EMF is a little like an iceberg, not so much because it's so big it can sink the Titanic, but rather that only the 10% above the waterline is noticed. As the guys suggested, that waterline is defined differently by different people.
James: Considering the rise of modelling, do you find yourselves working a lot with the other Eclipse projects?
Marcelo: I am very happy to say yes. Almost every week I am pinged by Eclipse committers to discuss EMF related designs. And, over the years, the discussions have progressed to advanced topics, which to me says that certain developers are trying to exploit all the features available in EMF or even tweaking it to better suit their needs. This is extremely rewarding.
Dave: Definitely. I've lost track of all the projects using EMF, but there's a lot of support and collaboration going on.
Ed: The usage of EMF has grown well beyond my capacity to track. These days I spend the vast majority of my time helping people to get the most value out of EMF. The modeling project as a whole is growing so dramatically that just working with all the other people providing the additional technology layers for the modeling onion is a full time task.
Frank: EMF is also being used by open source projects at other organizations, not just at Eclipse. For example, the Tuscany SDO project at Apache.org is built using EMF.
James: It seems EMF really is core to Eclipse. I've seen that e4 is even based around an EMF model. What are your opinions on this? Has modelling finally been taken seriously?
Marcelo: I am very excited about this. I've been developing components for Eclipse for more than 8 years now and I always see places in which EMF could greatly improve my interaction with Eclipse's inner parts. For example, imagine having an EMF based implementation of the preferences API: one would be able to write reflective code to play with them, following a known model (Ecore). Not to mention that the serialization work would be done completely under the covers.
Although I am not aware of anyone working on the preferences bit, I was really excited when I heard that the UI team had decided to model the workbench for the new version of Eclipse.
Actually, in November I became an e4 committer and I am now working on this model exactly (besides my EMF and non-open-source commitments).
Regarding the increasing adoption of modeling, which we are all seeing, I have an interesting story to tell. When I came to Canada in 2000, I was actually shocked to see that people were not into things like class and sequence diagrams. Even Ed and Frank have confessed that they didn't really believe in models before working on EMF ;-) This was all very different from my previous experiences. In Brazil, both at university and work, dealing with models was a constant part of my day-to-day activities. For example, by '95, in the days before UML, I was fairly fluent in 4 or 5 modeling techniques. To the best of my knowledge, using models is also a "done deal" in Europe, which shows that there is a cultural aspect to this subject. I can't say precisely why modeling sounds more appealing to specific groups of people, but I am happy to see that I was not wrong when I chose to learn it more than a decade ago.
Ed: There are so many misconceptions associated with modeling that it's definitely an uphill struggle to get people to take note. I don't think it's so much that it's not taken seriously, but rather that people have a serious dislike for it as a starting point. As Marcelo suggests, even personally when I first started, I really did not like those darned class diagrams. It all smacked too much like those useless flowcharts we were forced to produce in university and which I always did after the fact. Besides, I didn't know how to read them and I believed that real programmers just don't draw silly pictures. Of course I've matured since then. I began to realize that a very simple high level abstraction---it doesn't have to be a picture---can specify very succinctly one heck of a lot of code that's otherwise terminally tedious and hence error prone to produce by hand.
Everyone seems to have to go through this same maturation phase on their own at their own pace.
James: From all of the people you've seen use EMF, what cases have made you smile?
Marcelo: I really like seeing the entire ecosystem that is built on top of EMF on Eclipse. To be more precise, I remember actually cheering when I heard that the Jazz guys are using EMF to implement the infrastructure of services they offer.
Dave: Well, it was pretty mind blowing when I first heard that Airbus and NASA were interested in EMF.
But what's really cool, I think, is the number of smaller and independent consultants in Europe that have built businesses atop EMF, especially since they have been so active in giving back and helping build a modeling community at Eclipse.
Much closer to home, I remember many years ago when Ed and Frank returned from a trip to Rational (this was before Rational was acquired by IBM) and reported that the team they spoke to had been blown away by EMF. That was the first feedback from anyone outside of IBM that I had heard about, so I was pretty impressed. Rational embarked on a new strategy, rebuilding their modeling tools on EMF, and ended up being acquired by IBM. And now, in a strange twist of fate, I actually work for that team!
Ed: I smile almost every day now because someone always seems to have done something surprisingly cool.
James: Apart from this book, what other books would you recommend for a developer's bookshelf, both Eclipse and non-Eclipse?
Dave: The best thing about this book, from the perspective of an author that wants to sell many copies, is that there is no competition!
Seriously, as far as I'm aware, there really is no other current book with much more than a passing reference to EMF. Richard Gronback's upcoming book on the whole Eclipse Modeling Project looks most promising, though. There's so much powerful technology in the project, much more than just the EMF core we discuss in our book, so I think the two titles will complement each other really well.
Marcelo: I have to confess that I haven't read other Eclipse books. On Java, I like to read books that go really deep into subjects like performance or concurrency. Also, I often recommend that Java developers read any book that teaches languages that manipulate pointers directly, as I can't understand how one can really grasp what is happening in their code without this knowledge.
Ed: Mostly I read books that have elves and dwarves in them. There's so little personal time left most days that reading a technical book just never seems to happen.
James: What are your main tips for users of EMF?
Marcelo: "Be a better developer. Exploit all aspects of the tools that are within your reach. And, most important, be curious."
On several occasions, I have noticed that people treat EMF like Java itself: they are only seeing the beautiful wrapper without caring too much about the content. Think about it: anyone can read a Java 101 book and start writing code that works. Probably this person doesn't know the difference between a list and a set and actually doesn't care about it as long as the application being developed does what it is supposed to do. Unfortunately several people stop at this level or just beyond it. And this applies even to some developers who have worked with Java for a few years.
Others, instead, choose to invest time to find out what is going on and what is available to them. And, when they do, they start to write code that improves the garbage collection effectiveness, or they discover techniques or existing code that, when properly used, reduce the complexity of their own implementations by an order of magnitude. The same goes for EMF: investing the time to understand it and to explore what is available can be extremely rewarding. If nothing else, such users will be reading code that has been improved for over 5 years with the input of thousands of people using it in surprisingly different contexts.
Ed: As Marcelo suggests, keep in mind that EMF is like an iceberg. There's a lot that doesn't meet the eye, but understanding those parts and exploiting them will provide significant value. Take the time to learn something as often as possible. After this many years of clients with problems, we've solved a very large portion of them so it's very likely that there are lurking solutions you just don't realize are there. Don't be afraid to use the newsgroup.
Frank: My top 3 EMF tips:
- Start with clean models. If your model is a mess, so will be your application.
- Don't be afraid to regenerate. EMF's merging generator is really well tested and safe. It's not going to corrupt or wipe out any of your hard work.
- Check out EcoreUtil. On page 503 of the book it says "if you ever find that you're thinking about writing a generic EMF utility of some kind, you should probably look in EcoreUtil first, as there's a reasonably good chance it already exists".
James: What are the cases where someone should not use EMF, even though they may be tempted?
Dave: To be honest, after years of helping people with their use of EMF, I haven't yet encountered a single case where I've concluded that they shouldn't be using it in the first place. I'm not claiming that such cases don't exist, just that I haven't found them yet. And, I must admit, I don't spend much time sitting around wondering what they might be.
Marcelo: There are several cases... When someone is building applications using C, C++, Ada, Cobol, ... Also when a developer is not in front of a computer or, better, an electronic device that could be actually running an EMF based application.
In all seriousness, this is a tough question to answer. Not only because I am an EMF fan and advocate but because I can't picture an application that doesn't convey some sort of model. And if there is a model, it would be hard to find reasons not to have it properly specified and then even harder not to use a tool that does a great job on transforming its concepts into code.
Ed: I think if your application involves manipulating structured data in Java, it's highly likely that EMF will provide significant value in terms of formalizing that as a model. EMF will not help you write the algorithms themselves though.
Frank: I wrote a Dr. Dobb's Journal article about EMF a few years ago, in which I said "Have you wondered how many of the applications you write manipulate data? The answer is pretty close to 100 percent.". I followed this with my usual blurb about how EMF lets you leverage the explicit or implicit model of your data to build your application more quickly and correctly. I was basically trying to say that EMF could be used to help write almost 100% of your applications.
About a week or two later, I received a phone call from an old friend of mine from University, who I hadn't talked to in years. He said he was reading his latest issue of DDJ and saw my name. It was a pleasant surprise to talk to him, but after catching up, he mentioned that my "close to 100%" comment didn't apply to him. Ouch :-) He'd been writing all kinds of strange control logic which really didn't have much of a data model. Oh well, maybe I exaggerated a bit, but even my old friend agreed that the number is probably still very high for typical applications.
James: What is your own personal favourite chapter in the book?
Marcelo: I have to say Chapter 2. I have never found a text that summarizes hard concepts as well as it does. Before you ask, no, I didn't write it ;-)
Dave: Do I really have to pick just one? Chapter 2 really stands out for me, too. It's a fantastic introduction to all of the most important concepts in EMF. It's packed with information, and yet very readable. I could easily imagine publishing it, on its own, as a concise guide to EMF. It's also one of the very few parts of the first edition that was left almost completely unchanged for the second. Only a little bit of language polishing was done.
Another favourite is Chapter 20, which talks about using EMF in RCP and stand-alone applications. Most of the discussion in the book is quite focused on EMF in the Eclipse IDE, so this chapter revisits some of the information already discussed, and pulls it together with additional details relevant to running EMF in different contexts.
Of the chapters that I wrote, I would probably have to pick Chapter 15, which is about persistence, just for the sheer amount of content packed into it. It's not as concise as Chapter 2, but I think it contains some of the most detailed discussion in the book, and answers so many commonly asked questions. I hope people will find it helpful.
See? That's three favourites! I guess I couldn't do it!
Ed: My favorites are all the ones I didn't have to help write.
Frank: OK, I wrote Chapter 2 :-) I agree it's a pretty good chapter for beginners, but for more advanced EMF users, the detailed reference chapters are really the most valuable ones. Since I work a lot with XML these days, I find the wealth of information about the XSD mapping in Chapter 9 to be what I tend to look at the most. I guess that makes it my favorite.