Validating EDI Data in Java
This article reveals how the EDI validator notifies an application about validation events, and dives into EDI standards and implementations.
One of the most common requirements when dealing with EDI data is the need to validate messages. Last time, we looked at reading EDI data in Java which included a basic example of validating an X12 acknowledgment message. In that article, a sample schema (a set of validation rules) was given along with some basic Java code using the StAEDI library to set up the validator. This time, let's dig a little deeper into how the validator notifies an application about validation events and also discuss the differences between EDI standards and implementations — and how to validate both.
EDI Event Streams with Errors
While reading through an EDI message using the EDIStreamReader in StAEDI, the various structures found in the data are reported to an application as events. When a schema has been provided and the EDI data does not match the schema in some way, the invalid segments and elements are also reported as events as they are found in the data.
Let's look at a contrived schema to understand how the validation rules are declared. In this example, the transaction structure declares that any EDI message conforming to the schema must begin with a segment SAA, contain up to five occurrences of a loop starting with segment S11, and finally, have a segment SZZ. Note that the schema does not mention the message header and trailer segments (ST and SE for X12, or UNH and UNT for EDIFACT). Those segments are handled separately by the parser and should not be in the message schema.
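To make the ordering rule concrete, here is a minimal, library-free sketch of the same constraint in plain Java. It is not how StAEDI represents or evaluates a schema; the class name, the method name, and the decision to accept only S12 as a loop member are assumptions made for illustration.

```java
import java.util.List;

// Hypothetical sketch (not StAEDI code): checks the transaction-level rule
// "SAA first, up to five loops starting with S11, SZZ last".
final class TransactionOrderCheck {
    static final int MAX_LOOPS = 5; // "up to five occurrences" from the schema description

    static boolean conforms(List<String> tags) {
        if (tags.size() < 2) return false;
        if (!tags.get(0).equals("SAA")) return false;
        if (!tags.get(tags.size() - 1).equals("SZZ")) return false;

        int loops = 0;
        // Everything between the first and last segment must belong to an S11 loop.
        for (String tag : tags.subList(1, tags.size() - 1)) {
            if (tag.equals("S11")) {
                loops++;                       // a new loop occurrence begins
            } else if (tag.equals("S12") && loops > 0) {
                // S12 is accepted only as a member of an open loop (assumption)
            } else {
                return false;                  // unexpected segment
            }
        }
        return loops <= MAX_LOOPS;
    }
}
```

Calling `TransactionOrderCheck.conforms(List.of("SAA", "S11", "S12", "SZZ"))` returns true, while a list that omits SAA or contains six S11 segments is rejected.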
For the purposes of demonstrating the structure of the transaction's schema, all segments are composed of the same two element types. The elements themselves have length requirements that any conforming transaction must meet.
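The length requirement on an element boils down to a simple bounds check. The sketch below is a hypothetical illustration in plain Java rather than StAEDI's validator; the class, enum, and method names are assumptions, and the bounds are passed in by the caller.

```java
// Hypothetical sketch (not StAEDI code): classifies an element value
// against the minimum and maximum length declared for its type.
final class ElementLengthCheck {
    enum Result { OK, TOO_SHORT, TOO_LONG }

    static Result check(String value, int minLength, int maxLength) {
        if (value.length() < minLength) return Result.TOO_SHORT;
        if (value.length() > maxLength) return Result.TOO_LONG;
        return Result.OK;
    }
}
```

With a minimum length of 2 and an assumed maximum of 5, `ElementLengthCheck.check("2", 2, 5)` yields `TOO_SHORT`, which is the kind of problem a validator reports for an undersized element.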
A sample message using this schema might look like the following. This example shows multiple occurrences of the schema's L0000 loop (starting with segment S11). The first occurrence contains two S12 segments whereas the second contains none, because S12 segments are optional (minOccurs is 0 by default).
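The grouping of segments into loop occurrences can be sketched in a few lines of plain Java. This is a hypothetical illustration, not StAEDI's parser: it assumes `*` as the element separator and `~` as the segment terminator (as in the `S12*2~` fragment discussed in this article), and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not StAEDI code): counts the S12 segments that belong
// to each loop occurrence started by an S11 segment.
final class LoopGrouping {
    static List<Integer> s12CountsPerLoop(String message) {
        List<Integer> counts = new ArrayList<>();
        for (String segment : message.split("~")) {
            // The segment tag is everything before the first element separator.
            String tag = segment.split("\\*", 2)[0].trim();
            if (tag.equals("S11")) {
                counts.add(0);                              // new loop occurrence
            } else if (tag.equals("S12") && !counts.isEmpty()) {
                int last = counts.size() - 1;
                counts.set(last, counts.get(last) + 1);     // member of the open loop
            }
        }
        return counts;
    }
}
```

For an invented message such as `SAA*AA*11~S11*X1*22~S12*33~S12*2~S11*X2*44~SZZ*ZZ*55~` (the element values are placeholders, not the article's data), the method returns one count per loop occurrence: two S12 segments in the first and none in the second.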
There is something wrong here, however. Did you notice that the second occurrence of S12 contains an element that is too short? The value 2 is only one character long, so it does not meet the schema's required minimum length of 2 characters. Now let's take a look at how a Java program would receive the events from this simple message. The following code snippet skips over the envelope segments and jumps straight to the segments from the example message.
As can be seen in this example, the EDI stream is received by an application as a series of events. The events that are only available when a schema has been configured are those dealing with loop boundaries and errors. In this case, we can see that the S11 segments initiate a START_LOOP event and also that the data error that was noted earlier (where the segment S12*2~ contained an element shorter than the requirement) resulted in the ELEMENT_DATA_ERROR event coupled with the DATA_ELEMENT_TOO_SHORT error type.
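To illustrate the idea of a validated event stream without depending on the library, the following plain-Java sketch turns a message into a list of event labels. It is a simplified stand-in and not StAEDI's implementation: the event names START_LOOP, ELEMENT_DATA_ERROR, and DATA_ELEMENT_TOO_SHORT come from the discussion above, while the class name, the other labels, the exact ordering of events, and the single shared minimum length are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not StAEDI code): emits a flat list of event labels,
// adding loop boundaries and a length error the way a schema-aware reader might.
final class EventSketch {
    static final int MIN_LENGTH = 2; // minimum element length from the example schema

    static List<String> events(String message) {
        List<String> out = new ArrayList<>();
        boolean inLoop = false;
        for (String segment : message.split("~")) {
            String[] parts = segment.trim().split("\\*");
            String tag = parts[0];
            if (tag.isEmpty()) continue;
            // A new S11, or the closing SZZ, ends any loop that is still open.
            if (inLoop && (tag.equals("S11") || tag.equals("SZZ"))) {
                out.add("END_LOOP");
                inLoop = false;
            }
            if (tag.equals("S11")) {
                out.add("START_LOOP");
                inLoop = true;
            }
            out.add("START_SEGMENT " + tag);
            for (int i = 1; i < parts.length; i++) {
                if (parts[i].isEmpty()) continue; // absent element, nothing to validate
                if (parts[i].length() < MIN_LENGTH) {
                    out.add("ELEMENT_DATA_ERROR DATA_ELEMENT_TOO_SHORT " + parts[i]);
                } else {
                    out.add("ELEMENT_DATA " + parts[i]);
                }
            }
            out.add("END_SEGMENT " + tag);
        }
        return out;
    }
}
```

Feeding it a fragment that contains `S12*2~` produces an `ELEMENT_DATA_ERROR DATA_ELEMENT_TOO_SHORT 2` entry between the start and end of the S12 segment, and every S11 segment is preceded by a `START_LOOP` entry.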
Standards Versus Implementations
The schema XML above is an example of a standard schema. Standard schemas are the rules published by standards bodies such as ANSI (X12) or the UN (EDIFACT). In the example, the transaction element is used in the XML to identify the standard message structure. Additionally, all of the "type" elements are used to identify the standard segment, composite element, and simple element structure and requirements.
Most EDI exchanges go beyond the standards, however. Business partners often define how the standard must be structured for their particular industry or use case. This is when an implementation schema becomes useful. Implementations allow for further refinement of the rules and also allow loops and segments to carry different data. Implementations of loops and segments may include some or all of the components defined by the standard, but must always adhere to the standard.
We can now extend the schema from the earlier example. Below, only the implementation elements are shown along with their sub-elements. All of the types used in the standard example above are implied (segmentType XML elements).
In this implementation example, we can see several things.
- The implementation restricts the number of occurrences of the SAA segment to one.
- There are two "types" of the standard loop L0000: 0000A and 0000B. Each type is identified by the value of the first element of the first segment, S11. When element S1101 is X1, the rules for loop 0000A apply, and when S1101 holds the other enumerated value, the rules for loop 0000B apply. The value of the discriminator attribute on the loop indicates which element of the loop's first segment contains the enumerated values used to identify that instance of the standard loop.
- Note the differences between the two occurrences of the loop. In the "A" type, element S1102 is forbidden (not used) and the occurrences of segment S12 are limited to two (three fewer than allowed in the standard). In the "B" type, S1102 is allowed, but the S12 segment is omitted and therefore not allowed.
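The discriminator logic in the list above can be sketched in plain Java. This is a hypothetical illustration and not StAEDI's implementation-schema engine: the mapping of X1 to 0000A and the per-type rules follow the description above, whereas the value "X2" for 0000B is an assumed placeholder and the class and method names are invented.

```java
import java.util.Map;

// Hypothetical sketch (not StAEDI code): picks an implementation loop from the
// discriminator element S1101 and applies that loop's extra restrictions.
final class ImplementationLoops {
    // "X1" -> 0000A follows the article; "X2" is an ASSUMED placeholder value.
    static final Map<String, String> LOOP_BY_S1101 = Map.of("X1", "0000A", "X2", "0000B");
    // Type A allows at most two S12 segments; type B omits S12 entirely.
    static final Map<String, Integer> MAX_S12 = Map.of("0000A", 2, "0000B", 0);

    // Returns the implementation loop code, or null when no type matches.
    static String select(String s1101) {
        return LOOP_BY_S1101.get(s1101);
    }

    static boolean valid(String s1101, boolean hasS1102, int s12Count) {
        String loop = select(s1101);
        if (loop == null) return false;                     // unknown discriminator value
        if (hasS1102 && loop.equals("0000A")) return false; // S1102 is not used in type A
        return s12Count <= MAX_S12.get(loop);
    }
}
```

Under these assumptions, `valid("X1", false, 2)` passes, `valid("X1", true, 0)` fails because S1102 is forbidden in the "A" type, and `valid("X2", false, 1)` fails because the "B" type does not allow S12.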
Validation of EDI data can be a complicated task — especially when developers need to check both the standard rules as well as the rules specific to a particular industry or trading partner. Using the validation features in the StAEDI Java library, this task becomes a little bit simpler.