I’m currently in the process of implementing Seph, and I’ve reached an inflection point. This point is the last responsible moment to choose what I will target with my language. Seph will definitely be a JVM language, but after that there is a range of options - some quite unlikely, some more likely. The valid choices are:
- Target Java 1.4
- Target Java 5/6
- Target Java 7
- Target Java 7 with extensions
Of these, the first options isn’t really interesting for Seph, so I’ll strike it out right now. The other three choices are however still definitely possible - and good choices. I thought I might talk a little bit about why I would choose each one of them. I haven’t made a final decision yet, so that will have to be the caveat for this post.
Before talking about the different choices, I wanted to mention a few things about Seph that matters to this decision. The first one is that I want Seph to be useful in the real world. That means it should be reasonably fast, and runnable for people without too much friction. I want the implementation to be small and clean, and hopefully as DRY as possible - if I end up with both and interpreter and just-in-time compiler, I want to be able to share as much of these implementations as possible.
The easiest way to go forward would be to only use Java 5 or 6. This would mean no extranice features, but it would also mean the barrier to entry would be very low. It would mean development on Seph would be much easier and wouldd in general make everything simpler for everyone. The problem with it would mainly be implementation complexity and speed, which would both suffer compared to any of the Java 7 variants.
There are many good reasons to go with Java 7, but there are also some horrible consequences of doing this. For Seph, the things that would make things from Java 7 is method handles, invoke dynamic and defender methods. Other things would be nice, but the three previous ones are the killer features for Seph. Method handles make it possible to write much more succinct code, not generate lots of extra classes for each built in method, and many other things. It also becomes possible to refer to compiled code using method handles, so the connection between the JIT and the interpreter would be much nicer to represent.
Invoke dynamic is quite obvious - it would allow me to do much nicer compilation to bytecode, and much faster. However, I could still build the same thing myself, to much greater cost and it would also mean inlining wouldn’t be as easy to get.
Finally, defender methods is a feature of the new lambda proposal that allow you to add new methods to interfaces without breaking backwards compatibility. The way this works is that when you add a new method to an interface, you can specify a static method that should be called when that interface method is invoked and there are no other implementations on the concrete classes for a specific object. But the interesting side effect of this feature is that you can also use it to specify default implementations for the core language methods without depending on a shared base class. This will make the implementation much smaller and more flexible, and might also be useful to specify required and optional methods in an API.
The main problem with Java 7 is that it doesn’t exist yet, and the time schedule is uncertain. It is not entirely certain exactly what the design of the things will look like either - so it’s definitely a moving target. Finally, it will make it very hard for people to help out on the project, and also it won’t make Seph a possible language for people to use until they upgrade to Java 7.
Java 7 with extensions
It turns out that the interesting features coming in Java 7 is just the tip of the iceberg. There are many other proposed features, with partial implementations in the DaVinci project (MLVM). These features aren’t actually complete, but one way of forcing them to become more complete is to actually use them for something real and give lots of feedback on the feature. Some of the more interesting features:
This feature will allow you to say after the fact that a specific class implements an interface, and also specify implementations for the methods on that interface. This is very powerful and would be extremely helpful in certain parts of the language implementation - especially when doing integration with Java. The patch is currently not very complete, though.
Allowing the JVM to perform proper tail calls would make it much easier to implement many recursive functional algorithms easily. Since Seph will have proper tail calls in the language, this will mean that I will have to implement this myself if the JVM doesn’t do it, which means Seph will be slower based on this. The patch seems to be quite good and possible to merge and harden to the JDK at some point. Of all the things on this list, this seems to be one of things that we can actually envision see being added in the Java 7 or Java 8 time frame.
Both coroutines and continuations seem to be possible to do in a good way, at least partially. Coroutines might be interesting for Seph as an alternative to Kilim, but right now it seems to be a bit unstable. Continuations would allow me to expose continuations as a first class citizen which is never bad - but it wouldn’t give me much more than that.
Hotswapping of code would make it possible to do agressive JITting and then backing out from that when guards fail and so on. This is less interesting when we have invoke dynamic, but will give some more flexibility in terms of code generation.
Fixnums, tuples, value types
We all want ways of making numbers faster - but these features might also make it possible to efficiently represent simple composite data structures, and also things like multiple return values. These are fairly simple features, but have no real patch right now (I think).
Light weight code loading (anonymous classes)
It is horrible to load byte code at runtime in Java at this point. The reason is that to be able to make sure your loaded code gets garbage collected, you will have to load each chunk of code in a new class in a new classloader. This becomes very expensive very fast, and also endangers permgen. Anonymous classes make this go away, since they don’t have names. This means you don’t actually have to keep a reference to older classes, since there is no way to get to them again if you lost the reference to them. This is a good thing, and makes it possible to not generate class loaders every time you load new code. THe state of this seems to be quite stable, but at this point JVM dependent.
Of course, all of these lovely features comes with a price. Two prices in fact. The first price is that all the above features are incomplete, ranging from working patches to proof of concepts or sketches of ideas. That means that the ground will change under any language using it - which introduces hard version dependencies and complicates building. The other price is that none of these features are part of anything that has been released, and there are no guarantees that it will ever be merged in Java at any point. So the only viable way of distributing Seph would be to distribute standard build files with a patched OpenJDK so that anyone can download and use that specific JDK. But that limits interoperability and causes lots of other problems.
Somewhere in between
My current thinking is that all of the above choices are bad. For Seph I want something inbetween, and my current best approach looks like this. You will need a new build of MLVM with invoke dynamic and method handles to develop and compile Seph. I will utilize invoke dynamic and method handles in the implementation, and allow people to use Rémi Forax’ JSR 292 backport to run it on Java 5 and 6. When Java 7 finally arrives, Seph will be more or less ready for it - and Seph can get some of the performance and maintainability benefits of using JSR 292 immediately. At this point I can’t actually use defender methods, but if anyone is clever enough to figure out a backport that will allow defender methods to work on Java 5 or 6, I would definitely use them all over the place.
This doesn’t actually preclude the possibility of creating alternative research versions of Seph that uses some of the other MLVM patches. Charles Nutter have shown how much you can do by using flags to add features that are turned off by default. So Seph could definitely grow the above features, but currently I won’t make the core of the language depend on them.