Square Data Framework 0.1: The Beginning
I am extremely proud to present the initial version of our debut commercial offering, one which we believe will revolutionize the way all data-driven applications are created. We believe it to be so revolutionary that it will make much of what we presently know as the middleware industry redundant.
This story begins a little over 18 months ago, during one of our post-project debriefing sessions.
The project was a particularly interesting one, in that the system we were required to create had to run off a centralized LAMP server, but also had to be usable offline. In the end we opted for a Swing-based fat client. As you might imagine, getting it to sync with the PHP-based server was not a pleasant experience.
As is usual during these sessions, I asked the team whether they had any advice as to how we might improve the process in future, to which one of our developers sarcastically remarked that it would be nice if the whole application could just generate itself from the spec sheet. Those of us with development experience got the joke.
One of our designers, however, didn't find it all that funny. She explained that she thought that the whole point of a computer was to process human instructions, nothing more. If that was so, surely we would simply need to make sure that the specifications are in a form that the computer can understand. She actually had a very valid point.
I decided to set aside some resources each week to investigate whether this point could be turned into something more than just a nice idea.
It turns out, it could, and what we have to show is far more than we could ever have dreamed of.
Before continuing, I'd like to point out that we are not claiming to have created the silver bullet for all types of software. Our system will not allow you to create the next Doom 3 or Photoshop, for example. Our focus lies very specifically on data-driven software systems, that is, those which model real-life constructs and processes – known in some circles as business software.
There were two distinct parts to creating our system.
Firstly, we had to design a specification format which was both suitable for digital processing and understandable by non-programmers, yet at the same time, capable of describing all application requirements.
Secondly, we had to develop software capable of creating working applications from specifications in this form.
Central to both parts was deciding on a particular type system suitable for modeling our domain.
Initially, the plan was to work in reference to an object-oriented type system, as the idea was to create a code generator which could generate the necessary code from the formatted specifications.
That idea didn't sit well with our database expert, however, who favored a relational model. He argued that the real value of any data set was in the relationships between the individual pieces of data, something that was not well supported by the object-oriented model. He also had a point. One that countless others had made before him; it was simply another incarnation of the much loved object-relational impedance mismatch.
It was while thinking about a solution to this problem that the epiphany came. Using an object-oriented type system to model a data domain makes no sense at all. This might sound like anathema to some, particularly if you're an application server vendor, but the reasons make perfect sense.
What it boils down to, is that object-oriented type systems were never designed to model real world domains in the first place. The object-oriented paradigm is first and foremost, a programming paradigm – it was designed to model sets of instructions going to the CPU, not real world domains.
If you try to model your domain using the same type system that you can use for creating user interfaces, it is inevitable that you will encounter more than your fair share of difficulties.
Industry's solutions to these problems have taken what we like to call an evolutionary approach to innovation. Instead of trying to solve the underlying problem, which we would call taking a revolutionary approach, the preference has been to provide workarounds to make the already existent, but inefficient, solutions more efficient.
Instead of developing a screwdriver, industry has decided to develop tools which make screws look more like nails.
This approach has led to the development of things such as criteria APIs, annotation APIs, meta-model APIs and expression languages – none of which formed part of the original object-oriented programming paradigm.
The result, as with the hammer and screw analogy, is a development process which is both needlessly inefficient and complex.
From the point of view of the software megacorps, however, this evolutionary approach to innovation is the only one which makes sense – it makes no sense for them to invest in something which would make their flagship products redundant, no matter how efficient it may be.
We decided, therefore, to frame our type system purely in terms of the fundamental problem, and not in terms of any of the presently available, but inefficient, solutions to the problem.
It follows that our type system is completely platform and programming language independent.
At the end of the day, the only thing that matters to the client is whether the system can handle all of the required business cases, whether it can safely and securely get all data from one place to the next, and whether it can do all of this in a manner which is financially viable. Whether the system is implemented using a troop of message-passing monkeys or whether it uses some advanced, space-age technology, is irrelevant.
Of course, depending on the client's priorities, the set of suitable technologies changes. For clients who prioritize the stability and security of the system above all else, we believe that the only suitable solution is a JVM-based one. At present, that means going for either a JEE or Spring-based solution. Nevertheless, this is only a consequence of the state of currently available technologies, and not of the client's requirements per se.
Perhaps the greatest advantage of our programming-language-independent type system is that it allows data to travel from one system to another without any loss in semantics, that is, it completely eliminates any impedance mismatch. The implications of this for system scalability and interoperability are highly significant.
The constraints placed on data defined in terms of programming-language-specific, object-oriented type systems are, in this regard, enormous, and have required complex solutions which would otherwise be unnecessary.
If we take a look at a typical Java-only system, there are a couple of examples of this.
The most obvious is the complication required to persist and load data defined in terms of Java objects, using a data store not designed in terms of an object-oriented type system. This has given rise to the use of so-called object-relational mappers, which, while offering a seemingly sufficient solution to the immediate problem, are not able to make efficient use of the benefits provided by the relational model.
The other issue occurs when there is a need for some sort of distributed computing, either for scalability purposes or to expose network access to facilitate interaction with rich clients. Because data loses its semantics outside the context of a Java object graph, means had to be devised to allow networked access to the object graph. In response, we have seen the development of things such as RMI and the Terracotta solution. While these certainly are impressive feats of engineering, we believe that they solve problems which needn't be there in the first place.
The situation becomes even worse when inter-system interoperability is concerned, which is becoming a common requirement as businesses become increasingly interdependent.
In these cases, the lack of a common programming platform means that the above solutions are not an option. We then have to resort to using so-called web services and ESBs, both of which inherently require data to be transferred in a lowest-common-denominator form, losing all the benefits provided by their native type systems.
We believe that all of these issues can be eliminated by using a type system designed to transfer data without incurring any loss in semantics, which is exactly what we have developed.
Our type system reduces all data into sets of related key-value pairs, where each set represents a particular domain entity, and each key-value pair, a particular entity property. This is compatible with both the relational model as well as that of the growing NoSQL movement. This model makes sharing data, either between intra-system components or between independent systems, a trivial affair.
The type system defines 12 native value types: 9 primitive types, 2 collection types and an object pseudo-type, which defines a relationship to another entity.
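To make the model concrete, here is a minimal sketch of an entity as a set of related key-value pairs, with a relationship represented by the object pseudo-type. The `Entity` class and its methods are our own illustration, not part of the framework's actual API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an entity is a named set of key-value pairs.
// A reference to another entity is itself just a value (the object pseudo-type).
public class Entity {
    private final String type;                       // e.g. "ShoppingCart"
    private final Map<String, Object> properties = new LinkedHashMap<>();

    public Entity(String type) { this.type = type; }

    public String getType() { return type; }

    public Entity set(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    public Object get(String key) { return properties.get(key); }

    public static void main(String[] args) {
        Entity customer = new Entity("Customer").set("name", "Ada");
        Entity cart = new Entity("ShoppingCart")
                .set("owner", customer)   // object pseudo-type: a relationship
                .set("itemCount", 3);
        System.out.println(((Entity) cart.get("owner")).get("name")); // prints "Ada"
    }
}
```

Because an entity is nothing more than a typed set of pairs, it maps naturally onto a relational row or a NoSQL document.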
It is in terms of this type system, that we have defined our application specification format.
Each specification takes the form of an XML document matching a particular schema. We've decided to go for XML as it is the most universally understood markup language, and its validation support is fairly good. Since each specification declares the set of entity types to be supported by the application, we have come to call them type set declarations, or simply type sets, for short.
Each type declares all data-related requirements pertaining to a particular entity type. At a minimum, this just includes the set of properties defining that type.
Each property may also declare one or more validation requirements. Each validation requirement may be defined in terms of an expression, which may be dependent on one or more other properties. This allows you to declare, for example, that Property A must be at least equal to twice the value of Property B raised to the power of Property C.
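The cross-property rule in the example above can be sketched as a predicate over an entity's property values. This is only an illustration of the idea; the property names and the `Predicate`-based representation are our own assumptions, not the framework's expression syntax.

```java
import java.util.Map;
import java.util.function.Predicate;

// Sketch of an expression-based validation requirement: the rule
// "a must be at least equal to twice the value of b raised to the power of c",
// evaluated against an entity's property values.
public class ValidationSketch {
    // A validation requirement is just a predicate over the property values.
    static final Predicate<Map<String, Double>> RULE =
            p -> p.get("a") >= 2 * Math.pow(p.get("b"), p.get("c"));

    public static void main(String[] args) {
        System.out.println(RULE.test(Map.of("a", 20.0, "b", 2.0, "c", 3.0))); // 20 >= 16: true
        System.out.println(RULE.test(Map.of("a", 10.0, "b", 2.0, "c", 3.0))); // 10 >= 16: false
    }
}
```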
In addition to standard properties, virtual properties may also be declared. A virtual property is one whose value is defined by an expression. For example, a shopping cart's total may be defined as the sum of the prices of its contents.
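The shopping cart example can be sketched as follows: the total is never stored, only computed from the defining expression. The `Item` record and `total` method are hypothetical names chosen for this illustration.

```java
import java.util.List;

// Sketch of a virtual property: a shopping cart's "total" is not stored,
// but defined by an expression over the cart's contents.
public class VirtualPropertySketch {
    record Item(String name, double price) {}

    // The virtual property's defining expression: the sum of the item prices.
    static double total(List<Item> contents) {
        return contents.stream().mapToDouble(Item::price).sum();
    }

    public static void main(String[] args) {
        List<Item> cart = List.of(new Item("book", 12.5), new Item("pen", 2.5));
        System.out.println(total(cart)); // 15.0
    }
}
```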
In addition to properties, a type may declare a set of supported actions. Each action may define a number of input parameters, each of which may also define a number of validation requirements. Each action also defines a sequence of operations. Each operation defines either a modification to the data set or a service execution. We have restricted the definition of the service term to refer only to operations which have no effect on the data set.
An example of an action would be the shopping cart's Purchase action. It would take a single PaymentMethod object as an input parameter. Its first operation would be a service execution that processes the payment, using both the provided payment method and the shopping cart's total property. The next operation would be a data modification which transfers the shopping cart's contents to a newly created PurchaseOrder object. The final operation would be another service execution that sends the confirmation and notification emails.
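The Purchase action above can be sketched as an ordered list of operations, executed in sequence. The operation names and the `Consumer`-based representation are purely illustrative, not the framework's actual operation model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of an action as a sequence of operations, loosely modeled on the
// Purchase example: a service execution, then a data modification, then
// another service execution.
public class ActionSketch {
    static final List<String> log = new ArrayList<>();

    static Consumer<String> serviceExecution(String name) {
        return input -> log.add(name + "(" + input + ")");
    }

    static Consumer<String> dataModification(String name) {
        return input -> log.add(name);
    }

    public static void main(String[] args) {
        List<Consumer<String>> purchase = List.of(
                serviceExecution("processPayment"),      // uses the payment method + total
                dataModification("createPurchaseOrder"), // moves cart contents to a new order
                serviceExecution("sendConfirmation"));   // no effect on the data set
        purchase.forEach(op -> op.accept("CreditCard"));
        System.out.println(log);
    }
}
```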
Finally, each type may declare a set of predefined queries. Each query may optionally define a number of input parameters, each of which may also define a number of validation requirements. Each query declares a condition which defines a particular sub-set of entities. The condition may be composed of a number of sub-conditions and may be defined in terms of the input parameters, or expressions thereof.
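Such a predefined query might be sketched as a parameterized condition composed of sub-conditions, selecting a sub-set of entities. The `Product` type, the query name and the threshold parameter are all hypothetical, chosen only to illustrate the structure.

```java
import java.util.List;
import java.util.function.Predicate;

// Sketch of a predefined query: an input parameter (maxPrice), a condition
// composed of two sub-conditions, and the sub-set of entities it selects.
public class QuerySketch {
    record Product(String name, double price, boolean inStock) {}

    // "Affordable, in-stock products", with the price threshold as a parameter.
    static List<Product> affordableInStock(List<Product> all, double maxPrice) {
        Predicate<Product> cheap = p -> p.price() <= maxPrice; // sub-condition 1
        Predicate<Product> stocked = Product::inStock;         // sub-condition 2
        return all.stream().filter(cheap.and(stocked)).toList();
    }

    public static void main(String[] args) {
        List<Product> products = List.of(
                new Product("pen", 2.0, true),
                new Product("desk", 120.0, true),
                new Product("ink", 5.0, false));
        System.out.println(affordableInStock(products, 10.0)); // only "pen" matches
    }
}
```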
We believe that this provides a suitable basis for conclusively describing any set of data-related requirements. We do foresee some critics arguing that it would be naïve to assume that such a high-level construct could be used to fully describe even the most complex of business processes. We believe this to be an unfounded argument, however. The fact that clients are able to describe their requirements without resorting to low-level implementation details, refutes this claim.
As alluded to earlier, this is only one part of the story. The other part required us to develop something which could transform a type set into something useful.
It was during this development that we discovered the true power of the object-oriented paradigm. Its real strength lies not in the fact that it allows you to model real world entities and processes, but in the fact that it allows you to model abstractions themselves.
We believe the significance of this to be ground-breaking, especially as far as software management is concerned.
The prevailing wisdom has been to take a quantitative approach towards project management, as evidenced by the existence of MS Project and the like. This is quite possibly what makes for the worst part of any corporate developer's job.
While still working in the corporate sector, I personally suffered the misfortune of having been tasked to collect and deliver the 'numbers' to our non-technical managers each week. It was a constant struggle, trying to explain to management why our tasks were taking longer than the estimates that we ourselves had provided.
For some reason, management seemed unable to grasp the fact that no two development tasks are ever the same, no matter how similar they may appear, and it is for this reason that we were extremely unwilling to provide estimates in the first place. When we were forced to provide estimates, it was always with the disclaimer that they couldn't be considered accurate. Somehow, they always ended up on the project plan regardless, requiring me to re-explain what we had already tried to explain in the beginning, and making Fridays my worst day of the week in the process.
The fact that these types of practices are still in place the world over, indicates that, while fundamentally flawed, they still provide a suitable amount of accuracy to be considered useful.
We now believe that any accuracy in a project's estimates is an indication of an unnecessary amount of repetition required by the development system. In practice, this equates to the amount of boilerplate code required by the system. It also indicates, however, the presence of abstractions which, when extracted effectively, can eliminate all the redundant repetition. This results in a much more efficient development process, though it will require traditionally-minded managers to adopt a new way of thinking.
The abstractions that we have identified for our domain of data-driven applications, are, in fact, those defined by our type system, since it was designed explicitly to model the requirements of these types of applications.
As a result, our implementation includes class definitions for each of the elements that can be declared in a type set. This means that everything is an object, starting with the type set itself, right through to individual validation requirements, query conditions, expressions and property paths. In this regard, we would consider our implementation to be far more object-oriented than any of the other presently available solutions; there is no use of byte-code manipulation, annotations, aspects or unnecessary XML files, and certainly no use of 'magic strings' for any purpose, whether it be for defining validation constraints, queries or anything else. The ability to access each part of our type set as an object, has allowed us to do things which wouldn't even be possible, let alone feasible, using current solutions.
Now, having covered the background, I can explain what to expect from our demo.
- Before starting, it will request the locations of the type set to be used as well as a corresponding SQLite database.
- It will first check the validity of the provided type set.
- It will then check the provided SQLite file. If it is empty, it will automatically create a new database with a schema compatible with the provided type set, including association tables for each of the collection properties. If it is not empty, it will check each of the tables common to the type set to ensure that they are structurally compatible, that is, that they contain columns for each of the declared, non-virtual properties. It will create new tables for each of the types newly declared by the type set.
- Following this, the application-proper will start.
- The first thing you'll notice is a large tabbed pane with a tab for each of the declared types. The label of each of the tabs will reflect the plural display name declared by the relevant type.
- The header of each tab will contain two buttons. The first button will be labeled “New X”, where X is the type's declared display name. The other button will be labeled “Create Query”.
- The body of each tab will include another tabbed pane. Initially, it will only contain a single tab labeled “All”. Depending on whether a display text expression has been declared, this tab will contain either a list or a table, with columns for each of the type's declared properties, having headers matching each of the properties' declared display names. Regardless of whether it is a list or a table, this will contain representations of each of the entities contained in the data source of the relevant type. If it is a list, the entities will be sorted alphabetically.
- If the New X button is pressed, a dialog will appear with a title matching the type's display name.
- This dialog will contain a field for each of the type's declared properties. The label of each field will match the property's display name. The type of each field is dependent on the property type. For text properties, a text field is used. For numeric and temporal properties, a formatted text field is used. For collection properties, a list is used. For object properties, a combo box is used. The combo box will contain sorted representations of all the entities of the matching type. If the property is virtual, its field will be uneditable. To the right of all editable fields will be a validation indicator; a tick if the field's value fulfills all declared validation requirements or a cross if not. If it is a cross, its tooltip will contain a list of messages declared by the failed validation requirements.
- If a display text expression has been declared, the dialog's title will be augmented with the result of this expression as each of the relevant properties' fields are updated.
- At the bottom of the dialog will be a “Create” button. This button will remain disabled until all fields contain valid input.
- Once the Create button is pressed, the dialog will close and a new entity will be created with the specified property values.
- At this stage, you will notice that the list or table in the middle of the screen will now contain an entry representing the newly created entity. This will be your first encounter with a feature which we believe to be a world first – live queries.
- If you double-click on the newly created entry, a new dialog will be opened, similar to that used to create the corresponding entity, which will display all the entity's property values. Unlike the dialog used to create the entity, none of the fields will be editable and the Create button will be replaced by an “Edit” button.
- Once the Edit button is pressed, all fields which may be edited, that is all those corresponding to non-virtual, mutable properties, will become editable. In addition, the Edit button will be replaced by “Apply” and “Cancel” buttons. The Apply button will become disabled as soon as any invalid property values are entered. Pressing the Cancel button will reset any changes, while pressing Apply will persist them.
- Back on the main tab, pressing the Create Query button will present a dialog allowing you to define an ad-hoc query.
- The first section of this dialog allows you to specify the properties to be included in the result set. These can be either direct or indirect properties, where an indirect property selection specifies a path from the source type to the destination property. To select these properties, a special path menu is provided. The top section includes all direct properties while the bottom section contains sub-menus which allow the navigation to indirect properties.
- The next section allows you to specify a condition which restricts the set of entities included in the final result set. An editor is provided to allow you to specify a potentially unlimited tree of sub-conditions and condition groups. Each single sub-condition may place a constraint on either a direct or indirect property, virtual or otherwise. This constraint may be defined in terms of either a literal value or a relative property value, indirect or otherwise, virtual or otherwise, of a relevant type.
- The third section allows you to specify zero or more properties, indirect or otherwise, virtual or otherwise, which define the sort order of the result set.
- The last section allows you to restrict the number of entities in the result set. You can specify whether the results should start at a specified offset as well as whether they should be limited to a finite number of entities.
- Once you press the OK button, the dialog will close and a new sub-tab will be created alongside the already existent All tab.
- The new tab will contain a table with columns for each of the properties specified in the first section of the previous dialog. It will contain entries for all those entities which fulfill the specified conditions and fall into the specified range after being sorted according to the specified order.
- This table will always remain up-to-date. Whenever an entity's property values change such that it no longer fulfills the specified conditions, it will be removed. Whenever a previously unmatching entity's property values change such that it does fulfill the conditions, it will be added – in the correct order. Whenever an entity's property values change such that it needs to be reordered, it will be moved. Whenever any of an entity's observed properties are changed, the relevant column will be updated. Whenever a new entity is created which fulfills the specified conditions, it too will be added, in the correct order.
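The mechanics of the live query described above can be sketched as a result set that re-evaluates an entity's membership whenever one of its observed properties changes. This is our own minimal, single-threaded illustration of the idea (the `Person` and `LiveResult` classes are invented for this sketch), not the framework's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Sketch of a "live query": a result list that re-evaluates an entity's
// membership in the result set whenever a property value changes.
public class LiveQuerySketch {
    static class Person {
        int age;
        final List<Runnable> listeners = new ArrayList<>();
        Person(int age) { this.age = age; }
        void setAge(int age) { this.age = age; listeners.forEach(Runnable::run); }
    }

    static class LiveResult {
        final Predicate<Person> condition;
        final List<Person> matches = new ArrayList<>();
        LiveResult(Predicate<Person> condition) { this.condition = condition; }

        void observe(Person p) {
            p.listeners.add(() -> refresh(p)); // re-check on every property change
            refresh(p);
        }

        void refresh(Person p) {
            boolean in = matches.contains(p);
            boolean shouldBeIn = condition.test(p);
            if (shouldBeIn && !in) matches.add(p);    // now fulfills the condition
            if (!shouldBeIn && in) matches.remove(p); // no longer fulfills it
        }
    }

    public static void main(String[] args) {
        LiveResult adults = new LiveResult(p -> p.age >= 18);
        Person p = new Person(16);
        adults.observe(p);
        System.out.println(adults.matches.size()); // 0
        p.setAge(18);                              // property change triggers the update
        System.out.println(adults.matches.size()); // 1
    }
}
```

A full implementation would also have to maintain the sort order and observe only the properties the condition actually depends on, but the membership logic is the essence of it.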
And this – the support for live queries – is the primary reason why current solutions will never be able to compete. The constraints placed on a domain model defined in terms of an object-oriented type system, make this nature of support impossible.
The range of possibilities opened up by this type of support is almost unimaginable. We can easily imagine a situation in retailing where, the moment a purchase is rung up, it is instantaneously reflected on not only the CEO's live dashboard, but also on the watch-lists of the relevant depots and suppliers. In fact, it is exactly this type of situation which will be supported by our final, 1.0 release.
It must be stated that the implementation which ships with this initial release does not support the full set of features to be supported by the final release. For this reason, the demo does not yet support actions or predefined queries. To gain an overview of the order in which features will be introduced, we recommend looking through the roadmap on our website.
The primary focus of the initial release has been on the development of an implementation architecture which is highly flexible and pluggable, especially as far as the user interface is concerned, as we are certainly aware that one size does not fit all.
Our demo is built on top of our officially supported application framework, all the sources of which are available. The application framework is completely toolkit-agnostic, meaning porting our Swing-based demo to Apache Pivot, JavaFX, SWT or Android should be fairly trivial.
The intent of the implementation is to completely decouple all aspects of the application.
Firstly, the data source implementation is abstracted away. It is, in fact, impossible for any parts of the user interface to get hold of references to any of the JDBC code or any of the interfaces it implements.
Secondly, all parts of the user interface have no dependencies on each other. The application framework provides a type-safe registry which is used to create all user interface components as well as to facilitate all user actions.
For example, the button that is used to create new entities has no reference to the code which creates the resulting dialog. It simply informs the framework that the user has requested to create a new entity of a particular type. The framework then checks its registry to see if any special implementation has been provided for that type, otherwise it falls back to the default – in the case of the demo, the dialog mechanism.
The dialog mechanism, likewise, has no reference to the code which creates the view it contains, it only provides the buttons and the frame. It simply requests a new view for a particular type of entity from the framework. Again, the framework checks its registry to see if any special implementation has been provided for that type, otherwise it falls back to the default – in the case of the demo, a generic view which automatically generates itself based on the type's declared properties.
This generic view also has no reference to the code which creates its editors; it only arranges them and provides them with labels and validation indicators. It also simply requests a new editor for a particular value type from the framework. Again, the framework checks its registry to see if any special implementation has been provided for that value type, otherwise it falls back to the default. It goes without saying that the editors have no dependencies on the underlying data; they are informed by the framework which value they need to take and whether they may be editable.
Finally, the query builder also has no reference to the code which creates the resulting tables. It simply requests a new view for the particular query from the framework. Again, the framework checks its registry to see if any special implementation has been provided for that query type, otherwise it falls back to the default – in the case of the demo, the table.
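The lookup-with-fallback pattern running through all of these examples can be sketched as a registry of factories keyed by type, with a generic default used whenever no special implementation has been registered. The class and method names here are invented for illustration; the framework's actual registry is type-safe rather than string-keyed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch of the registry-with-fallback idea: components ask the framework for
// a view for a given entity type; a specially registered factory wins,
// otherwise a generic default is used.
public class RegistrySketch {
    private final Map<String, Supplier<String>> factories = new HashMap<>();
    private final Supplier<String> fallback = () -> "generic view";

    void register(String type, Supplier<String> factory) {
        factories.put(type, factory);
    }

    String viewFor(String type) {
        // Fall back to the default when no special implementation is registered.
        return factories.getOrDefault(type, fallback).get();
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.register("Invoice", () -> "custom invoice view");
        System.out.println(registry.viewFor("Invoice"));  // custom invoice view
        System.out.println(registry.viewFor("Customer")); // generic view
    }
}
```

The same lookup applies at every level: views for types, editors for value types, and result views for queries.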
This results in a user interface platform which is as flexible as can be, yet still allows the use of a customizable set of defaults. We believe it also to be one which is completely vendor-friendly, as it allows, for the first time, user interface vendors to focus on what they do best: creating top-quality user interface components. Whether it be windowing systems, docking frameworks, specialized editors or charting components, there is no longer any need for vendors to worry about integrating with the underlying data source or about integrating their components with those used throughout the rest of the application.
What may seem to be ominously missing, is any mention of support for HTML-based front-ends. It was definitely something we considered, especially since that is what seems to be used by almost all so-called 'web' applications these days. After contemplating its fitness for purpose, however, we reached the general consensus that it is not something we will be supporting. While there is nothing stopping anybody else from attempting an HTML-based implementation, it is certainly not something that we will be doing, neither is it something that we would encourage – it just doesn't make sense.
The simple reason is that the HTML solution was never designed for creating user interfaces for dynamic applications. The HTML format itself was initially designed only as a means of formatting research papers. To this day, using it to create anything other than a document-based layout requires at least one hack, which is not surprising considering that this was never one of the intended use-cases. The fact is, however, that a document-based layout will never be sufficient for any user interface which exposes access to a rich data model.
The other issue with the HTML solution is that it is built on top of an inherently stateless protocol. While this is completely adequate for a web of interlinked, static documents, the same cannot be said for one that is to support full-scale, dynamic applications.
Again, instead of recognizing that the HTML solution was inadequate, industry decided to work around its shortcomings.
The first problem to solve was that of a lack of persistent state. There have been multiple attempted workarounds for this problem, including hidden fields, cookies and server-side sessions. Of these, sessions have proved to be the most useful and prevalent, though even they are fraught with issues, as anyone who has come across the 'back button' problem can attest to. Not surprisingly, instead of taking this as a further sign of problems in the underlying system, workarounds have since emerged for this secondary problem as well.
The second problem to solve was that of a lack of a dynamic interface, a consequence of the static nature of HTML documents. Initially, this problem was simply ignored, requiring the user to wait for an entire page to reload each time some new information was required, no matter how little. In some quarters this problem is still ignored, with the reason given that increased network speeds have made this problem insignificant. This is, of course, unacceptable, as desktop applications which are 15 years old provide the user with a better experience than that.
Strangely, we believe that the proverbial screwdriver has, in this case, been with us all along, at least since the launch of the Java platform. The JVM plays perfect host to truly distributed, rich clients, which is not surprising given that this is what it was initially designed to do. It provides a well thought out security model, allows applications to make full use of all available processing power and provides real desktop integration support.
We believe that part of the reason for the JVM's slow uptake on the desktop is due to what can be considered deployment issues, while the other part is due to the inherent complexities involved in getting an object-oriented domain across a network. The latter will be completely eliminated by our framework. Deployment issues are admittedly still a bit of a problem; especially noted has been the recent spate of JWS regressions. We do believe, however, that the applet model is beginning to show large amounts of promise, particularly for consumer-facing applications. We believe that the functionality allowing users to drag applets on to the desktop will be extremely useful in weaning them off their browser addictions. Combined with a suitable dynamic user interface framework, our data framework will allow the creation of fully sandboxed applications which stream their user interfaces, business logic and data from the server.
We will be shipping two server-based implementations in addition to our client-based implementation with the final, 1.0 release. There will be no significant differences between the interfaces of any of the implementations as, conceptually, there is no difference between pushing updates directly to a user interface component or indirectly, via the network; the message handling is identical.
We've decided that our flagship server implementation will be Scala-based, if for no other reason than its conceptual elegance, which we find simply astounding. Additionally, it has clearly been designed to support both the use of an extended type system and an actor-based programming model. The latter will be a perfect match for our message-based architecture, while the former will be used to increase the efficiency of our abstractions. Our secondary server implementation will be a Java port of the Scala implementation and will make use of Akka's Active Objects for actor-based support.
In closing, I would like to clarify our stance with regard to the community.
We like to consider ourselves a community-friendly group of people and believe in the values of sharing technology. We will, therefore, be releasing the full source of all our implementations. We also have nothing against anyone decompiling any of our implementations before their source is released.
We are, however, not a charity organization and do, therefore, require a stable revenue stream. After analyzing the models used by other community-friendly businesses, we have decided to settle for a more traditional approach.
We will not be selling support as it goes against our fundamental principle of providing the same top-quality service to all users of our framework. We will be taking all issues seriously and will not be releasing our final, 1.0 release until each individual issue has been resolved.
We will also not be offering any training sessions. We believe that if, after having worked through the provided documentation, specifications and samples, you still need help, then either you shouldn't be in software in the first place or our abstractions need work, quite possibly the latter. In any event, it is not a sustainable situation.
We will, therefore, be charging commercial developers – and commercial developers only – an annually renewable development fee of the equivalent of €900. The first renewal, however, will not be due until one year after the final release. There will be no deployment or royalty fees, and there will be absolutely no costs for non-commercial developers.
We do anticipate that the megacorps will not be taking our announcement lightly. As a result, it has been deemed necessary to include certain terms in our license agreement in order to prevent the abuse of our intellectual property, even though they may seem unpalatable to some.
In terms of community involvement, we will be launching our forum shortly. It will be used as a means to gain feedback as well as to discuss all key decisions. The first discussion which we would like to extend to the community revolves around whether we should be supporting the deletion of entities, an issue which has yet to be resolved internally.
Finally, to reiterate why it is that we believe our solution will make much of the middleware industry redundant: current solutions will never allow applications to be generated from their specifications, and they do not, and can never, support live queries.
I will leave you with a rebuttal to an expected response – it just can't be that simple. Well, if you decide to play ice hockey wearing a pair of running shoes, of course it won't be.