Parse Multi-Segment Flat Files in MuleESB

How do you parse multi-segment flat files with MuleSoft? Read this article to find out.

Nowadays, most systems exchange data as XML, JSON, and similar formats, but legacy applications still use flat files to exchange data between systems.

A flat file can contain multiple lines of data, with each line carrying its own line identifier. Each line can have many fields of data, either separated by a delimiter such as a comma or pipe, or laid out in fixed-width columns.
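
To make the two layouts concrete, here is a minimal Python sketch that reads one delimited line and one fixed-width line. It only illustrates the idea and is not how Mule parses files internally; the sample lines, field values, and widths are made up.

```python
# Hypothetical sample lines; the field values and widths are made up.
delimited_line = "Account|1 Main Street|Springfield|USA"
fixed_width_line = "Account" + "1 Main Street".ljust(30) + "Springfield".ljust(15)

def parse_delimited(line, delimiter="|"):
    """Split a delimited line into its line identifier and data fields."""
    identifier, *fields = line.split(delimiter)
    return identifier, fields

def parse_fixed_width(line, id_length, widths):
    """Slice a fixed-width line into its line identifier and data fields."""
    identifier = line[:id_length]
    fields, start = [], id_length
    for width in widths:
        # Trailing spaces are padding, not data.
        fields.append(line[start:start + width].rstrip())
        start += width
    return identifier, fields

print(parse_delimited(delimited_line))
print(parse_fixed_width(fixed_width_line, id_length=7, widths=[30, 15]))
```

In the fixed-width case the reader must know every field width in advance, which is the role the flat file schema plays in Mule.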

How to Parse Multi-Segment Flat Files With MuleSoft?

If the file you want to transform contains multiple types of records, you need a structure definition that controls how these different records are combined in the file. To read a multi-segment flat file in Mule, the flat file schema plays the key role. The schema must follow certain rules, and Mule imposes some limitations on it.

Keywords in a Flat File Schema:

Group: Several segments grouped together. A group can also contain child groups.

Segment: A line of data that contains any number of elements.

Element: A data item with an associated data type and value.

Structure: A structure identifies the segments it contains; each segment has a unique identifier code (tag) that identifies its data.

ID: Structure identifier.

Name: Structure name.

tagStart: Starting position of the segment identifier; a required parameter in the flat file schema. MuleSoft supports only the value 0.

tagLength: Number of columns in the segment identifier tag (all segment tags must be the same length).

Data: List of the segments and groups in the structure.
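
As a rough illustration of what tagStart and tagLength mean, the Python sketch below cuts the tag out of each line and looks up the matching segment. It is a simplified model, not Mule's implementation. The tags and segment ids come from the schema in this article, while the function names are hypothetical.

```python
# Tags taken from the schema shown in this article; both are 7 characters long.
SEGMENT_TAGS = {"flights": "flight", "Account": "Account"}  # tag -> segment id
TAG_START = 0
TAG_LENGTH = 7

def check_tag_lengths(tags, tag_length):
    """Raise if any tag does not match the declared tagLength."""
    for tag in tags:
        if len(tag) != tag_length:
            raise ValueError(f"tag {tag!r} is not {tag_length} characters long")

def segment_id_for(line):
    """Return the segment id whose tag matches the line, or None."""
    tag = line[TAG_START:TAG_START + TAG_LENGTH]
    return SEGMENT_TAGS.get(tag)

check_tag_lengths(SEGMENT_TAGS, TAG_LENGTH)
```

Because every tag is read from the same fixed column range, a tag of a different length (for example 'flight' with 6 characters) could not be matched reliably, which is why all tags need the same length.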

Flat File Schema:

form: FLATFILE
structures:
- id: 'MultiSegment'
  name: MultiSegment
  tagStart: 0
  tagLength: 7
  data:
  - groupId: 'Data'
    items:
    - groupId: 'Flightsdata'
      count: '>1'
      items:
        - { idRef: 'flight', count: 1 }
        - { idRef: 'Account', count: '>1' }
segments:
- id: 'flight'
  name: flight
  tag: 'flights'
  values:
  - { name: 'cid', type: String, length: 10 }
  - { name: 'airlineName', type: String, length: 15 }
  - { name: 'price', type: String, length: 15 }
  - { name: 'departureDate', type: String, length: 15 }
  - { name: 'planeType', type: String, length: 15 }
  - { name: 'origination', type: String, length: 15 }
  - { name: 'code', type: String, length: 15 }
  - { name: 'emptySeats', type: String, length: 15 }
  - { name: 'destination', type: String, length: 15 }
- id: 'Account'
  name: Account
  tag: 'Account'
  values:
  - { name: 'Billing_Street', type: String, length: 30 }
  - { name: 'Billing_City', type: String, length: 15 }
  - { name: 'Billing_Country', type: String, length: 15 }
  - { name: 'Billing_State', type: String, length: 13 }
  - { name: 'Name', type: String, length: 30 }
  - { name: 'BillingPostalCode', type: String, length: 15 }
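
To try the schema out, you need an input file whose lines are padded to exactly these widths. The following Python sketch builds one 'Account' line. It assumes that the field values follow the tag directly and are left-justified and space-padded; the article does not state how Mule expects padding, so treat that as an assumption. The helper name and the record values are hypothetical.

```python
# Field widths copied from the 'Account' segment in the schema above.
ACCOUNT_FIELDS = [
    ("Billing_Street", 30),
    ("Billing_City", 15),
    ("Billing_Country", 15),
    ("Billing_State", 13),
    ("Name", 30),
    ("BillingPostalCode", 15),
]

def build_line(tag, fields, record):
    """Return the tag followed by each value padded to its field width."""
    parts = [tag]
    for name, width in fields:
        value = str(record.get(name, ""))
        if len(value) > width:
            raise ValueError(f"{name} exceeds {width} characters")
        # Assumption: values are left-justified and padded with spaces.
        parts.append(value.ljust(width))
    return "".join(parts)

# Hypothetical record; the values are made up for illustration.
sample = {"Billing_Street": "1 Main Street", "Billing_City": "Springfield", "Name": "Acme"}
line = build_line("Account", ACCOUNT_FIELDS, sample)
print(len(line))  # 7 tag characters + 118 data characters
```

Missing fields are written as spaces, so every line of a given segment has the same length.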

By default, all defined segments are mandatory. To make a segment optional, set its usage code property to O.
Ex: { idRef: 'Account', usage: O }

Mule Flow:

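The flow has three steps: a File inbound endpoint reads files from the input directory, a Transform Message (DataWeave) component parses the flat file and maps it to XML, and a File outbound endpoint writes the result to the output directory.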

Configuration XML:

<flow name="multi-segment-flatfileFlow">
    <file:inbound-endpoint path="src/main/resources/Input" moveToDirectory="src/main/resources/Archive" responseTimeout="10000" doc:name="Read files from Input Directory"/>
    <dw:transform-message doc:name="Transform Message" metadata:id="5b9b30a6-f75b-4b8b-b297-e35026bfaed8">
        <dw:input-payload mimeType="text/plain">
            <dw:reader-property name="schemaPath" value="Multi-structure-segment-schema.ffd"/>
            <dw:reader-property name="structureIdent" value="MultiSegment"/>
        </dw:input-payload>
        <dw:set-payload><![CDATA[%dw 1.0
%output application/xml skipNullOn="everywhere"
---
{
    flightsdata: {
        flights: {
            flight: {
                cid: payload.Data.flight.cid,
                airlineName: payload.Data.flight.airlineName,
                price: payload.Data.flight.price,
                departureDate: payload.Data.flight.departureDate,
                planeType: payload.Data.flight.planeType,
                origination: payload.Data.flight.origination,
                code: payload.Data.flight.code,
                emptySeats: payload.Data.flight.emptySeats,
                destination: payload.Data.flight.destination
            },
            Accounts: {
                (payload.Data.Account map ((account , indexOfAccount) -> {
                    Account: {
                        Billing_Street: account.Billing_Street,
                        Billing_City: account.Billing_City,
                        Billing_Country: account.Billing_Country,
                        Billing_State: account.Billing_State,
                        Name: account.Name,
                        BillingPostalCode: account.BillingPostalCode
                    }
                }))
            }
        }
    }
}]]></dw:set-payload>
    </dw:transform-message>

    <file:outbound-endpoint path="src/main/resources/Output" outputPattern="#[org.mule.util.StringUtils.substringBefore(message.inboundProperties.originalFilename, '.')+'.xml']" responseTimeout="10000" doc:name="Write File in Output Directory"/>
</flow>

Topics:
mule 3.8, dataweave, segments, schema, mulesoft, muleesb, parsing, parse
