Managing Enterprise Integration Projects: Lessons Learned from the Trenches
Tips to properly plan, staff, and execute an Enterprise Integration project at scale.
Enterprise integration has progressed from good old ESBs and messaging systems to modern-day solutions like service meshes. Hype cycles and technologies come and go, but the underlying practice of efficiently managing the lifecycle of an integration project remains the same.
In this article, I have put together the lessons I have learned the hard way from many integration projects as an integration consultant. Whether you are an architect or a developer, you may find this information useful when planning for a new integration project or upgrading the current one.
The Planning Stage
Don't Make Decisions Right After Watching Vendor Demos
When you are in the evaluation stage, you'll sit through many vendor presentations and demos. Don't judge an integration product on those alone. A product may look good in a demo, but it is your responsibility to make the final call by evaluating it against your actual production workloads.
Before making a decision, conduct a PoC with each vendor product to see how it behaves under the traffic you anticipate over the next 2-3 years. If you are replacing an existing system, also consider the migration path and the support the vendor provides for it.
Get the Team Composition Right
When planning for a new integration project, it is always better to hire people with the right skillset from day one. Today, many integration projects demand expertise beyond the integration middleware scope. DevOps, infrastructure, observability, databases, security, and programming are some top skills you should expect from a new hire.
For example, when your team is developing an integration, it often has to reach out to other teams to get things done. You might want to consult a DBA to validate the DB schema, get help from Ops engineers to plan the deployment, and get guidance from the QA team to design the performance testing scenario. Collaboration is good for the project, but if you depend on others too much, it will slow down development.
What if your own team has the expertise mentioned above? Then it will be self-sufficient enough to solve its own problems and move fast. Hence, it is critical to have a team with diverse talent when it comes to planning, building, and managing integration projects.
Open Source or Commercial Vendors?
Ultimately, this decision boils down to a trade-off between two factors: time and money. Which of the two does your organization have more of?
Organizations with ample budgets spend heavily on commercial integration tools, support services, and quality talent. Their main intention is to complete their integration projects as soon as possible and hit the market. Time is critical for them, no matter how much it costs to build and support the project.
On the flip side, there are organizations with constrained budgets and resources. However, they have plenty of time to experiment with open-source tools. They often support the product themselves and provide contributions to open-source communities.
When selecting integration vendors, weigh your options along these two dimensions.
The Implementation Stage
Get Your Integration DevOps Process Right
Traditionally, developers built all the integrations and threw the final artifacts over to operations to deploy in production. The operations team had nightmares deploying and troubleshooting them, as they lacked tool-specific integration knowledge.
The majority of integration middleware servers require a restart after deploying a new artifact. The server has to be taken out of the load balancer pool, the artifact file has to be deployed, and the server has to be added back to the pool. Most of the time, the operations team has to repeat the process across multiple servers to keep them synchronized and consistent. Altogether, a new artifact deployment is a time-consuming, error-prone, and manual process.
Imagine the pressure on both the development and operations teams when you have to do multiple deployments in a day. This slows down the entire development, test, and deployment cycle, to the point where even a small fix for an integration can take weeks to deploy.
This can be eliminated if integration developers have a robust and fast process to verify their changes locally and push them to production in a reliable manner. A well-established CI/CD pipeline will build the developer changes automatically, test them, and finally deploy the build artifacts across multiple environments with minimal human intervention. That is scalable, efficient, and reliable, and it keeps both your developers and your operations team happy.
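As a rough illustration, the manual steps described above (take a server out of the pool, deploy, put it back) can be scripted as a rolling deployment that a pipeline runs for you. This is a minimal sketch, not any particular tool's API: the four callables passed in (`remove_from_pool`, `deploy_artifact`, `health_check`, `add_to_pool`) are hypothetical hooks that you would implement against your own load balancer and middleware.

```python
from typing import Callable, Iterable, List


def rolling_deploy(
    servers: Iterable[str],
    remove_from_pool: Callable[[str], None],
    deploy_artifact: Callable[[str], None],
    health_check: Callable[[str], bool],
    add_to_pool: Callable[[str], None],
) -> List[str]:
    """Deploy to one server at a time so the rest keep serving traffic.

    Returns the list of updated servers when every server succeeds.
    Raises at the first failed health check, leaving that server out
    of the pool so it can be investigated.
    """
    updated: List[str] = []
    for server in servers:
        remove_from_pool(server)      # drain traffic away from this node
        deploy_artifact(server)       # copy the artifact and restart the server
        if not health_check(server):  # verify before it takes traffic again
            raise RuntimeError(f"{server} failed its health check; rollout halted")
        add_to_pool(server)
        updated.append(server)
    return updated
```

Halting at the first unhealthy server is a deliberate choice in this sketch: the remaining servers keep running the previous artifact, so a bad build never reaches the whole fleet.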
Hence, think about establishing a proper DevOps process from day one to manage your integration development process.
Follow Proper Resiliency Patterns
Don't look only at the happy path when integrating two systems through an integration middleware. You can't guarantee the availability of the source and target systems, as they come and go outside your control. But you have total control over how the middleware behaves on a rainy day.
If the source system expects a response in a synchronous manner, try to utilize the reliability features that come with the middleware, such as retries and circuit breakers. For messages that require a reliable delivery, use asynchronous messaging instead of request-reply operations.
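To make the two reliability features named above more concrete, here is a minimal, framework-free sketch of retries with exponential backoff wrapped around a simple circuit breaker. Most middleware products ship their own configurable versions of both, so treat this only as an illustration of the behavior. The class and function names, thresholds, and timeouts are all assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("target system is marked unavailable")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def call_with_retries(func: Callable[[], T], attempts: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry func with exponential backoff; give up at once if the circuit is open."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not keep hammering a target the breaker has given up on
        except Exception:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

A caller would typically combine the two, for example `call_with_retries(lambda: breaker.call(send_request))`, where `send_request` is whatever function talks to the target system.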
Most importantly, if a flow fails in the middle, don't let it fail silently. Log enough detail to reconstruct what happened, and implement compensating transactions to restore consistency after a failure.
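One way to picture a compensating transaction is a flow where every step is paired with an undo action. The sketch below is an assumption-laden illustration rather than a feature of any specific middleware: the `Step` tuple layout and the `run_with_compensation` helper are hypothetical names. When a step fails, the failure is logged and the steps that already completed are undone in reverse order.

```python
import logging
from typing import Callable, List, Tuple

logger = logging.getLogger("integration.flow")

# Each step is (name, action, compensation). The compensation undoes the action.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> bool:
    """Run steps in order; on failure, log it and undo completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            completed.append(step)
        except Exception:
            # Never fail silently: record which step broke, with the stack trace.
            logger.exception("step %r failed; compensating %d completed step(s)",
                             name, len(completed))
            for done_name, _, compensate in reversed(completed):
                try:
                    compensate()
                except Exception:
                    logger.exception("compensation for %r failed; manual follow-up needed",
                                     done_name)
            return False
    return True
```

Note that the failed step itself is not compensated in this sketch, on the assumption that a failed action left nothing behind to undo. If your actions can fail halfway, the compensation needs to cover that case as well.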
Properly Secure the Data in Motion
You are responsible for the data that flows through your integration middleware. It is always better to proactively secure the data in motion rather than performing damage control after a corporate data breach.
When receiving data from and sending data to an external system, use the transport-level or application-level security schemes supported by the middleware. Most tools today support standards such as mutual TLS and OAuth 2.0.
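For a sense of what mutual TLS involves on the client side, the following sketch builds a TLS context with Python's standard `ssl` module. In practice you would usually configure this through your middleware's own settings. The function name and the three file path parameters are hypothetical; the files would come from your own PKI.

```python
import ssl


def build_mtls_client_context(ca_file: str, cert_file: str, key_file: str) -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the server and presents a client certificate."""
    # Trust only the CA that signed the partner's server certificate.
    # The default context also enables hostname checking and requires a valid server certificate.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Refuse legacy protocol versions.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Present our own certificate so the server can authenticate us (the "mutual" part).
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

The returned context can be passed to standard-library clients, for example `http.client.HTTPSConnection(host, context=context)`. The server must also be configured to request and verify client certificates for the exchange to be mutual.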
The Operations and Maintenance Stage
Get Your Observability Stack Right
You are responsible for delivering any message that arrives at the integration middleware to its final destination. That could go wrong in many ways. Perhaps the middleware failed to process the request, or the target system didn't respond. Or, the middleware didn't receive anything from the source system. How can you confidently say what really happened?
This is where observability tools help. Use distributed tracing tools to trace the end-to-end traversal of a message across systems. That way, you can spot the places where messages are lost. Jaeger is a good example of a distributed tracing tool.
Use log aggregation tools like Logstash, Fluentd, and Graylog to ship middleware logs to a central location so that log analysis can be done in one place. Tools like Elasticsearch, Kibana, and Splunk provide rich log analysis support.
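The core idea behind tracing a message end to end is that every hop carries the same identifier. The toy sketch below shows only that idea, using the standard library. A real tracing tool such as Jaeger does far more (timing, sampling, span hierarchies). The header name `X-Correlation-ID` and all function names here are assumptions made for the example.

```python
import uuid
from typing import Dict, List, Tuple

TRACE_HEADER = "X-Correlation-ID"  # assumed header name, not a standard


def ensure_correlation_id(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers with a correlation ID, reusing the incoming one if present."""
    outgoing = dict(headers)
    outgoing.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    return outgoing


def record_hop(trace_log: List[Tuple[str, str]], hop_name: str, headers: Dict[str, str]) -> None:
    """Record that the message with this correlation ID passed through hop_name."""
    trace_log.append((headers[TRACE_HEADER], hop_name))


def hops_seen(trace_log: List[Tuple[str, str]], correlation_id: str) -> List[str]:
    """List the hops a given message reached, in the order they were recorded."""
    return [hop for cid, hop in trace_log if cid == correlation_id]
```

If `hops_seen` returns the source and the middleware but not the target for a given ID, the message was lost between the middleware and the target system.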
By enabling real-time telemetry on your server fleet, you can get notified about outages, servers under heavy load, and the overall health of the fleet. This helps the operations team to proactively attend to issues rather than waiting for a disaster.
Debugging Tools Are Friends of Your Team
Your team members shouldn't have to play a distributed murder mystery game right after an incident happens in the system. They should have a proper set of debugging tools to isolate failures.
Tools that mock the source and target systems are critical for troubleshooting integration middleware in isolation. Apache JMeter, SoapUI, and Postman are a few examples of such tools.
To quickly identify integration bottlenecks, your team members should also be familiar with techniques like Java heap dump analysis and SQL query tracing.
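When a dedicated tool is not at hand, even a few lines of code can stand in for a target system. The sketch below uses only Python's standard library to start a stub HTTP endpoint that accepts any POST and remembers what it received. The handler name, the 202 response, and the JSON body are illustrative assumptions, not the behavior of any real target.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class MockTargetHandler(BaseHTTPRequestHandler):
    received = []  # request bodies seen so far, handy for assertions

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        MockTargetHandler.received.append(body)
        payload = json.dumps({"status": "accepted"}).encode("utf-8")
        self.send_response(202)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def start_mock_target(port: int = 0) -> HTTPServer:
    """Start the stub on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), MockTargetHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Point the middleware's outbound endpoint at `server.server_address`, replay the failing message, and inspect `MockTargetHandler.received` to see exactly what the middleware sent. Call `server.shutdown()` when you are done.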
Scale Proportionally to Your Source and Target Systems
When your upstream systems scale up and send more traffic in, the integration layer should also scale proportionally. Otherwise, there will be a performance bottleneck in the middle.
When sending traffic out to slow downstream systems, follow best practices to avoid overwhelming them. For example, you can place a message queue between the middleware and the downstream system so that the middleware puts messages there rather than sending them directly. That way, the queue acts as a buffer and absorbs sudden spikes in incoming traffic. You can also consider throttling messages at the integration layer as a preventive measure.
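The buffering and throttling ideas above can be combined in a few lines. This sketch uses an in-process `queue.Queue` as a stand-in for a real message broker and a token bucket as the throttle. The `TokenBucket` class, the `drain` function, and the rates used are assumptions for illustration only.

```python
import queue
import time
from typing import Any, Callable


class TokenBucket:
    """Allow up to `capacity` sends in a burst, refilled at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Refill in proportion to the elapsed time, never beyond capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


def drain(buffer: "queue.Queue[Any]", send: Callable[[Any], None], bucket: TokenBucket) -> int:
    """Forward buffered messages until the buffer is empty or the bucket runs dry."""
    sent = 0
    # Check the buffer first so no token is spent when there is nothing to send.
    while not buffer.empty() and bucket.try_acquire():
        send(buffer.get_nowait())
        sent += 1
    return sent
```

The middleware keeps calling `buffer.put(message)` at whatever rate traffic arrives, while a single consumer calls `drain` periodically. The downstream system then never sees more than the configured rate, and a spike only makes the queue longer.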
Don't Be Afraid if You Are Still in the Brownfield
It doesn't matter whether you use cloud-native technologies, like Kubernetes and service meshes, or go with VMs and ESBs. The important thing is to start small, iterate faster, and learn from your mistakes.
When you want to send a message from system A to B through an ESB, you don't necessarily have to deploy everything on Kubernetes, at least for the first iteration. Start with the technology stack that you can afford to build and maintain in the longer run. You can embrace new trends as your integration project gains a solid footing within the organization.