As a recovering “design by committee” offender, I have to be careful when lurking near standards groups' mailing lists, for fear that my instincts may take over and I might join the fray. But tonight a few tweets containing alluring words like “header” and “metadata” got the better of me and sent me plowing through a long and heated discussion thread in the OGF OCCI mailing list archive.
I found the discussion fascinating, both from a technical perspective and a theatrical perspective.
Technically, the discussion is about whether to use HTTP headers to carry “metadata” (by which I think they mean everything that’s not part of the business payload, e.g. an OVF document or other domain-specific payload). I don’t have enough context on the specific proposal to care to express my opinion on its merits, but what I find very interesting is that this shines another light on the age-old issue of how to carry non-payload info when designing a protocol. Whatever you call these data fields, you have to specify (in decreasing order of architectural importance):
- How you deal with unknown fields: mustUnderstand or mustIgnore semantics.
- How you keep them apart (preventing two people from defining fields with the same name, telling different versions apart).
- How you parse their content (and are they all parsed in the same manner or is it specific to each field).
- Where they go.
SOAP provides one set of answers.
- They go at the top of the XML doc, in a section called the SOAP header.
- They are XML-formatted.
- They are namespace-qualified.
- You can tag each one with a mustUnderstand attribute to force any consumer who doesn’t understand them to fault.
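To make these four answers concrete, here is a minimal Python sketch of a consumer applying them to a SOAP 1.1 envelope. It is an illustration under stated assumptions, not a conforming SOAP stack: the `billing` and `tracing` namespaces, the `UNDERSTOOD` set, and the `MustUnderstandFault` and `process_headers` names are all made up for the example, and it ignores actors/roles and proper fault messages.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# Hypothetical set of header blocks this consumer understands, written as
# ElementTree qualified names: "{namespace}localname".
UNDERSTOOD = {"{http://example.org/billing}Account"}

# Hypothetical message: one mandatory header block, one optional one.
MESSAGE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <b:Account xmlns:b="http://example.org/billing" soap:mustUnderstand="1">42</b:Account>
    <t:Trace xmlns:t="http://example.org/tracing">abc</t:Trace>
  </soap:Header>
  <soap:Body/>
</soap:Envelope>"""


class MustUnderstandFault(Exception):
    """Raised when a mandatory header block is not understood (illustrative)."""


def process_headers(envelope_xml, understood=UNDERSTOOD):
    """Return the qualified names of the header blocks this consumer accepts."""
    root = ET.fromstring(envelope_xml)
    # Where they go: a Header element at the top of the envelope.
    header = root.find(f"{{{SOAP_ENV}}}Header")
    accepted = []
    if header is None:
        return accepted
    for block in header:
        # How they are kept apart: block.tag carries the namespace, so two
        # "Account" fields from different namespaces never collide.
        mandatory = block.get(f"{{{SOAP_ENV}}}mustUnderstand") == "1"
        if block.tag in understood:
            accepted.append(block.tag)
        elif mandatory:
            # Unknown field flagged mustUnderstand: the consumer must fault.
            raise MustUnderstandFault(block.tag)
        # Unknown field without the flag: silently ignored.
    return accepted
```

With the sample `MESSAGE`, `process_headers(MESSAGE)` accepts the `Account` block and skips the unflagged `Trace` block; adding `soap:mustUnderstand="1"` to `Trace` would raise `MustUnderstandFault` instead.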
You may agree or not with the approach SOAP took, but it’s important to realize that at its core SOAP is just this: the answer (in the form of the SOAP processing model) to these simple questions (here is more about the SOAP processing model and the abuses it has suffered if you’re interested). WSDL is something else. The WS-* stack is also something else. It’s probably too late to rescue SOAP from these associations, but I wanted to point this out for the record.
Whatever you answer to the four “non-payload data fields” questions above, there are many practical concerns that you have to consider when validating your proposal. They may not all be relevant to your use case, but if so, decide explicitly that they are not. They are things like:
- Ability to process in a stream-based system
- Ease of development (tool support, runtime accessibility…)
- Ease of debugging
- Field length limitations
- Ability to structure the data in the fields
- Ability to use different transports (way overplayed in SOAP, but not totally irrelevant either)
- Ability to survive intermediaries / proxies
Now leaving the technology aside, this OCCI email thread is also interesting from a human and organizational perspective. Another take on the good old Commedia dell standarte. Again, I don’t have enough context in the history of this specific group to have an opinion about the dynamics. I’ll just say that things are a bit more “free-flowing” than when people like my friend Dave Snelling were in charge in OGF. In any case, it’s great that the debate is taking place in public. If it had been a closed discussion they probably would not have benefited from Tim Bray dropping in to share his experience. On the plus side, they would have avoided my pontifications…