
The Missing Link of Software Engineering

A high-level look at how natural language differs from the specifics of object-oriented programming.

· Java Zone


One of the major challenges of software engineering is the number of information gaps between its activities (requirements, design, algorithms, user interface, testing, bug tracking, documentation, etc.). Can we easily check whether all requirements are covered by a design? Can we tell whether all design points are covered by an implementation or by user interface controls, or whether all requirement items are covered by tests and documentation? Sometimes restrictions are discovered during coding and testing, and they should then be treated as part of the requirements. Or imagine enumerating all options to generate every possible use case for test coverage. Wouldn't it be great if we could control all of this and use it with Continuous Integration?

Why can't we fill these gaps today? Actually, we do, but only in our minds. What prevents us from doing this at least semi-automatically? The most probable cause is our belief that meaning cannot be extracted from ambiguous natural language, and partly it is our enchantment with natural language itself. On the first point, we agree that this is a difficult task, but why can't we make it easier by using some interim forms between natural language and formal data (and metadata)? The second point may be tougher, but try to accept it: natural language is a convenient but not ideal representation of meaning, and it is just one of many possible representations.

We are too accustomed to the conventions of natural language, and this creates most of the problems with meaning. The conventions say that a sentence must have a subject, a predicate, and an object, and that words and concepts are divided into nouns, verbs, adjectives, adverbs, etc. But meaning may concern only static objects, or mostly dynamic behavior, so we don't need a subject or a predicate in every meaning "statement". Generally speaking, most verbs are complexes of objects and actions (which reflects their dualism in space-time). We don't notice that "I go home" today may involve not only us but also a vehicle, a road, and wind (or winding). Knowing which words in this phrase are the subject, predicate, noun, or verb does not help us understand what the phrase means. Identification and abstraction do.

In "I go home" there are at least three participants, each of which can be identified with a certain level of abstraction. "I" can be identified as "John Doe", which could be enough if no other John Does are known. "go" can be identified as just "moving" from "here" to "home". In the simplest case, it is enough to know about the similarity between "go" and "moving" (not "becoming"). "home" can be identified as "Main St. 1, Springfield" if that is unique enough for us. That is, we try to identify what this information is similar to, is different from, is included in, and includes. And this can be expressed with "has" and "is" relations. "Springfield has Main Street, 1, which is our home" tells us more than "home is a noun".

Yes, you understood correctly. Natural language is more meaningful when it uses the good old "is" and "has" relations from object-oriented design. But they are not as formal and strict as in programming. You can use "is" to compare anything with anything. You can use "has" to reach any level of detail you want. Understanding this leads us to one conclusion: since both natural language and strict computer formats use similar relations, a bridge between them is possible. Such a bridge may take the form of markup that uses natural language words and a set of basic relations, and that can be understood by an ordinary user. If we use curly brackets, it may look like "Springfield {has} Main Street", "Main Street, 1 {is} home {of} John Doe". This form differs only slightly from natural language: (a) meaningful identifiers have explicit boundaries, and (b) basic relations are inside markup. But it allows us to convert natural language sentences into stricter data and vice versa, with a level of identification and abstraction that depends on our needs.
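
To make the curly-bracket form a little more concrete, here is a minimal sketch of how such statements might be turned into stricter data and back. It is only an illustration under stated assumptions: the function names, the fixed set of relations, and the rule that a statement with several relations is read as a left-to-right chain are all hypothetical and are not taken from any existing library.

```python
import re

# Assumption: only the three basic relations mentioned in the article are allowed.
RELATIONS = {"is", "has", "of"}
_RELATION = re.compile(r"\{(\w+)\}")


def parse_statement(text):
    """Split 'A {rel} B {rel} C' into a chain of (left, relation, right) triples."""
    # re.split with one capture group alternates: identifier, relation, identifier, ...
    parts = _RELATION.split(text)
    identifiers = [part.strip() for part in parts[0::2]]
    relations = parts[1::2]
    if any(not identifier for identifier in identifiers):
        raise ValueError(f"missing identifier in: {text!r}")
    unknown = [rel for rel in relations if rel not in RELATIONS]
    if unknown:
        raise ValueError(f"unknown relation(s) {unknown} in: {text!r}")
    # Assumption: each relation links the identifier before it to the one after it.
    return [
        (identifiers[i], rel, identifiers[i + 1])
        for i, rel in enumerate(relations)
    ]


def to_markup(triple):
    """Render one triple back into the curly-bracket form."""
    left, rel, right = triple
    return f"{left} {{{rel}}} {right}"


if __name__ == "__main__":
    for statement in (
        "Springfield {has} Main Street",
        "Main Street, 1 {is} home {of} John Doe",
    ):
        print(parse_statement(statement))
```

Because identifier boundaries are explicit, a comma inside "Main Street, 1" does not confuse the parser, which is exactly the property (a) above.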

Let us return to software engineering and applications. What is the meaning behind an application? It has different layers: the domain that the application covers, server-side code, database code, GUI, Web UI, etc. Each layer is a complex of objects and actions, conditions, causes and effects, etc., which are linked by different kinds of "is" and "has/of" relations. These layers also have intersection points. In the simplest case, you can consider only objects that are linked by "is" and "has/of" relations and layers that do not overlap. Imagine we have an application with the requirement "show and animate the current position of planets in the Solar System". The domain consists of "planet" and "planetary system" objects, which are linked as "planetary system {has} planet", "Solar System {is} planetary system", "Mercury {is} planet", etc. The user interface has "Planetary view" and "Menu", which, in turn, has "Animate planetary view" and "Exit" items. And in terms of the algorithm, the application has "Read data from database", "Write data to database", "Calculate position", "Display planetary view", and "Animate planets" parts.
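
The layers described above can be written down as plain data and queried. The following is a rough sketch, not a definitive model: the statements come from the example, while the function names and the rule that a thing inherits the "has" relations of whatever it "is" are assumptions borrowed from object-oriented design.

```python
# Statements from the planetary example, grouped by layer: (left, relation, right).
STATEMENTS = {
    "domain": [
        ("planetary system", "has", "planet"),
        ("Solar System", "is", "planetary system"),
        ("Mercury", "is", "planet"),
    ],
    "user interface": [
        ("User interface", "has", "Planetary view"),
        ("User interface", "has", "Menu"),
        ("Menu", "has", "Animate planetary view"),
        ("Menu", "has", "Exit"),
    ],
    "algorithm": [
        ("Algorithm", "has", "Read data from database"),
        ("Algorithm", "has", "Write data to database"),
        ("Algorithm", "has", "Calculate position"),
        ("Algorithm", "has", "Display planetary view"),
        ("Algorithm", "has", "Animate planets"),
    ],
}


def related(subject, relation):
    """Right-hand sides of every statement 'subject {relation} ...' in any layer."""
    return [
        right
        for layer in STATEMENTS.values()
        for (left, rel, right) in layer
        if left == subject and rel == relation
    ]


def kinds_of(thing):
    """Everything the thing "is", following "is" transitively."""
    kinds = []
    for kind in related(thing, "is"):
        kinds.append(kind)
        kinds.extend(kinds_of(kind))
    return kinds


def parts_of(thing):
    """Direct "has" parts plus parts inherited from each kind (an assumption)."""
    parts = related(thing, "has")
    for kind in kinds_of(thing):
        parts.extend(related(kind, "has"))
    return parts


def instances_of(kind):
    """Things declared with 'thing {is} kind'."""
    return [
        left
        for layer in STATEMENTS.values()
        for (left, rel, right) in layer
        if rel == "is" and right == kind
    ]


if __name__ == "__main__":
    print(parts_of("Solar System"))
    print(instances_of("planet"))
    print(parts_of("Menu"))
```

With only three domain statements, the sketch can already answer that the Solar System has planets and that Mercury is one of them, although no single statement says so directly.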

"Show the current position of planets in the Solar System" requirement linked with "planet" and "planetary system" domain objects, "Planetary view" user interface object, and "Display planetary view" part of algorithm. "Animate planets" requirement relates to "planet" and "planetary system" domain objects, "Planetary view" and "Animate planetary view" user interface objects, and "Display planetary view" and "Animate planets" parts of algorithm. This is not really groundbreaking, yet, you could use such strings as tags in, say, a bug tracking system. But what tags miss are relations between them. For example, the first requirement relates not only to "Planetary view" control but also to any control, which is included into this view, so to cover this requirement we need to have tests/documentation for all controls too. The second requirement concerns "Animate planetary view" menu item, which is included into a menu, so, if some change relates to the whole menu, then this menu item may be affected too. If we have an option for enabling/disabling animation, then the second requirement consists of at least two alternative use-case, which can be inferred according with this option and which are to be covered by tests. As clicking on "Animate planetary view" menu item causes animation, we have a simple cause-effect chain (or UI path to this function), which can be reflected in tests and documentation appropriately.

As you can see, meaningful markup allows bottom-up and top-down movement between natural language and code. More importantly, such movement can be formalized, and the challenge stated at the beginning of the article can be accepted. A slightly longer explanation is available at https://github.com/meaningfuljs/meaningfuljs/blob/master/doc/the-missing-link.md.


software engineering, semantics, continuous integration, object-oriented
