
How to Plan and Set Up System Data Integrations in 5 Easy Steps




Integrating data with an integration-platform-as-a-service (iPaaS) cloud database can sound like a scary process, but it really comes down to five simple steps.

Business owners who are hesitant to move to big data, and to the improved performance that comes with it, need not worry: using a data integration platform is easier and less risky than ever.

Step 1:  Determine how your data should sync

Before you set up an integration, you need to be clear about what you expect to gain from it, as well as what data you have consistent access to.

Commonly, this data includes marketing (lead) data, sales opportunity data, finance invoices and even things like employee wages and user information.

A good practice that we recommend is to create a spreadsheet and list out 3 main columns:

  • System to integrate (Zoho, Salesforce.com, HubSpot, Marketo, etc.)
  • Object the data lives in (lead, account, opportunity, etc.)
  • Fields to sync (the specific fields on that object that hold the data)

Grab our spreadsheet template here if you want a head start.  Keep in mind that many data integration apps will let you do this data mapping right in their applications.
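
If you prefer to keep the plan next to your code, the same three columns can be captured as a small data structure. The following is only a minimal sketch in Python: the systems, objects and field names are made-up examples, and `FieldMapping` and `rows_for_spreadsheet` are hypothetical helpers, not part of any integration product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FieldMapping:
    """One row of the planning spreadsheet (all values are illustrative)."""
    system: str              # system to integrate, e.g. "Salesforce.com"
    obj: str                 # object the data lives in, e.g. "lead"
    fields: Tuple[str, ...]  # fields to sync on that object


# Hypothetical plan: which fields of which object sync from each system.
PLAN = [
    FieldMapping("Salesforce.com", "lead", ("email", "company", "status")),
    FieldMapping("HubSpot", "contact", ("email", "company", "lifecycle_stage")),
]


def rows_for_spreadsheet(plan: List[FieldMapping]) -> List[List[str]]:
    """Flatten the plan into a header row plus one row per mapping."""
    rows = [["System", "Object", "Fields to sync"]]
    for mapping in plan:
        rows.append([mapping.system, mapping.obj, ", ".join(mapping.fields)])
    return rows


if __name__ == "__main__":
    for row in rows_for_spreadsheet(PLAN):
        print(" | ".join(row))
```

Keeping one row per system and object makes it easy to spot fields that exist in one system but have no counterpart in another before you start mapping.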

Once you have planned how you want your data to sync, you can move on to actually making it happen.

Step 2:  Choose the right integration system and input your data

Once you have determined how you want your business data to sync, you can set the criteria for your data, choose your platform, and enable the integration.

When choosing a platform, pay attention to features such as price, bi-directional syncing (does the data flow into and out of each system?), the number of systems you can connect, and how many records you can sync in a given time.

We'd like to think that our system works for businesses of all sizes (you can try it for free), but if you need a simple, one-way integration, there are other systems out there that may work for you as well.

Step 3:  Map your systems, objects and fields

Once you have determined the data that you want to integrate, you can map the fields across each platform so that they connect seamlessly.  Check out our support site for an easy tutorial on data mapping.


When mapping your fields, keep in mind that on most platforms, data will sync both to and from a particular system.  This "bi-directional" syncing is a powerful feature because it keeps your data in sync as it changes.  In some cases, though, you may not want a particular system's data to be overwritten when an update happens, so make sure that the platform you choose can handle this.
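
To make the "do not overwrite" idea concrete, here is a minimal sketch in Python of an update that respects protected fields. It is an illustration under our own assumptions, not how any particular platform implements it: `merge_update`, the `protected` set and the field names are all hypothetical.

```python
from typing import Any, Dict, Set

Record = Dict[str, Any]


def merge_update(target: Record, incoming: Record, protected: Set[str]) -> Record:
    """Return a copy of `target` updated with values from `incoming`.

    A protected field that already exists in `target` keeps its current
    value; every other field takes the incoming value.
    """
    merged = dict(target)  # copy, so the caller's record is not mutated
    for field, value in incoming.items():
        if field in protected and field in target:
            continue  # this system "owns" the field, so skip the overwrite
        merged[field] = value
    return merged


if __name__ == "__main__":
    # Hypothetical records: the CRM owns "owner", marketing sends an update.
    crm = {"email": "a@example.com", "owner": "Dana", "phone": "555-0100"}
    update = {"email": "a@example.com", "owner": "Unassigned", "phone": "555-0199"}
    print(merge_update(crm, update, protected={"owner"}))
```

In this sketch the phone number follows the incoming update while the owner stays as the CRM had it, which is the behaviour you would want to confirm your platform supports.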

Step 4:  Refine your integration by setting up filters

Once integrated, the initial transition can be overwhelming. At first, it can seem like an explosion of information.  When you see the sales performance across multiple branches or departments within your organization, how will you know what is relevant data that can improve company productivity, and what is just a distraction?


Filters help narrow down which data should be synced and provide a way to make sure that bad or unwanted data doesn't sync.

Imagine a sales manager who depends on leads from the marketing team for her sales reps.  Before an integration filter was put in place, reps were complaining about lead quality, and only wanted leads that had either signed up for a free trial of their product, or downloaded a certain whitepaper.  A filter allowed the sales manager to only sync "qualified" leads to her reps, which resulted in better productivity and more closed deals.
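
A filter like the one in this example boils down to a simple rule applied to each record before it syncs. The sketch below shows the idea in Python; the field names (`started_free_trial`, `whitepaper_downloads`) and the whitepaper identifier are assumptions made up for illustration, and a real platform would typically let you configure this rule in its own interface.

```python
from typing import Any, Dict, Iterable, List

Lead = Dict[str, Any]

# Hypothetical identifier of the whitepaper that marks a lead as qualified.
QUALIFYING_WHITEPAPER = "integration-guide"


def is_qualified(lead: Lead) -> bool:
    """A lead qualifies if it started a free trial or downloaded the whitepaper."""
    started_trial = bool(lead.get("started_free_trial", False))
    downloads = lead.get("whitepaper_downloads", [])
    return started_trial or QUALIFYING_WHITEPAPER in downloads


def leads_to_sync(leads: Iterable[Lead]) -> List[Lead]:
    """Keep only the leads that pass the filter."""
    return [lead for lead in leads if is_qualified(lead)]


if __name__ == "__main__":
    sample = [
        {"name": "Ana", "started_free_trial": True},
        {"name": "Ben", "whitepaper_downloads": ["integration-guide"]},
        {"name": "Cy", "whitepaper_downloads": ["pricing-sheet"]},
    ]
    for lead in leads_to_sync(sample):
        print(lead["name"])
```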

Step 5: Start your integration - sync historical data or start fresh?

At this point, you've chosen your platform, mapped your fields and set up filters.  Awesome, you're ready to go!  One last thing to consider is whether you want to sync historical data between the systems that you're integrating, or start your integration from scratch (sync on a "go-forward" basis).

Many integration platforms store your data in their own database, which means that they can sync a "cleansed" copy of your data into a particular system.  This is immensely helpful, as it avoids massive exporting, data scrubbing and re-importing of data between systems.

Your other option is just to sync on a "go-forward" basis, which means that any new or changed records will sync, but all of your existing data may not.
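
The difference between the two options can be sketched as a cutoff on each record's last-modified time. This is a simplified illustration in Python, assuming every record carries a `modified_at` timestamp; `records_to_sync` and the field name are hypothetical, and real platforms may track changes differently.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Optional

Record = Dict[str, Any]


def records_to_sync(
    records: Iterable[Record],
    go_forward_from: Optional[datetime] = None,
) -> List[Record]:
    """Select the records to include in a sync.

    With no cutoff, this is a historical sync and every record is included.
    With a cutoff, only records modified at or after it are included.
    """
    if go_forward_from is None:
        return list(records)
    return [r for r in records if r["modified_at"] >= go_forward_from]


if __name__ == "__main__":
    # Hypothetical data: one old record and one changed after go-live.
    go_live = datetime(2024, 1, 1, tzinfo=timezone.utc)
    data = [
        {"id": 1, "modified_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
        {"id": 2, "modified_at": datetime(2024, 2, 1, tzinfo=timezone.utc)},
    ]
    print(len(records_to_sync(data)), "records in a historical sync")
    print(len(records_to_sync(data, go_live)), "records in a go-forward sync")
```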

Conclusion: Quick wins

By integrating data across multiple platforms, an organization that is new to the data management process can quickly gain value from their integration efforts, sometimes in as little as a few days.

Other companies that have invested years of resources into old-school, hard-to-use integrations can create relevant fields and tune systems to use their existing analytical data as a shortcut to a working big-data system.

Whether you are an established business or a start-up, integrating your data does not have to be a huge task. With these 5 steps, you can be on your way to maximizing the power of your data.


