
Don’t Let Your DataMapper Streaming Be Out of Control



Originally authored by Mariano Simone

As Nicolas pointed out in “7 things you didn’t know about DataMapper,” it’s not a trivial task to map a big file to some other data structure without eating up a lot of memory.

If the input is large enough, you might run out of memory: either while mapping, or while processing the data in your flow-ref.

Enabling the “streaming” option in DataMapper makes this a lot easier (and more efficient!).

[Screenshot: just enable “streaming” in the DataMapper properties]
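If you prefer the XML view, that checkbox corresponds to an attribute on the transform element. Here's a minimal sketch, assuming a DataMapper configuration named CSV_To_Maps (the config-ref value is a made-up placeholder, not from the original example):

<!-- streaming is enabled with the stream attribute on the transform element;
     CSV_To_Maps is a hypothetical DataMapper configuration name -->
<data-mapper:transform config-ref="CSV_To_Maps" stream="true" doc:name="CSV To Maps"/>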

But streaming alone doesn’t let you decide how many records get passed on to the next processor at a time: in the worst-case scenario, you might end up processing just one line at a time. If your next processor is a database, that means as many queries as there are lines in your file.

There is, however, a little trick to gain fine-grained control over how many lines are processed at once: setting the batchSize attribute of the foreach scope:

<!-- each iteration receives a list of up to 100 records -->
<foreach batchSize="100" doc:name="For Each">
    <logger message="Got a bunch! (of #[payload.size()])" level="INFO" doc:name="Logger"/>
</foreach>
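Putting the pieces together, a complete flow might look like the sketch below. The endpoint path and config name are hypothetical placeholders; the point is the shape: a streaming transform feeding a foreach that hands the next processor 100 records at a time.

<flow name="streamingBatchFlow">
    <!-- hypothetical inbound endpoint: picks up files from a placeholder directory -->
    <file:inbound-endpoint path="/tmp/input" doc:name="File"/>
    <!-- streaming transform: emits records incrementally instead of one big in-memory list -->
    <data-mapper:transform config-ref="CSV_To_Maps" stream="true" doc:name="CSV To Maps"/>
    <foreach batchSize="100" doc:name="For Each">
        <!-- payload here is a list of up to 100 records; a database processor placed
             in this scope would run once per batch instead of once per line -->
        <logger message="Got a bunch! (of #[payload.size()])" level="INFO" doc:name="Logger"/>
    </foreach>
</flow>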

If you want to see this in action, go grab the example app, import it into Anypoint Studio, and start playing around ;)

