Creating Value From Legacy Data – the Whys and the Hows
How can we ‘reactivate’ legacy systems and, in doing so, not only put all the available data to use but also do something good for our climate?
While there are many challenges young companies might struggle with, they have certainly escaped one that is a blessing and a curse at the same time: legacy IT systems. Data is one of a company’s most valued assets, as knowledge (read, ‘data’) empowers better, more informed business decisions. Moreover, chances are your organization already has most of the knowledge it needs. The only caveat is that it might be inaccessible and therefore pretty useless.
In fact, according to various estimates, up to 97% of collected data is stored away and never used again, thus delivering zero value to the organization. Furthermore, a large portion of it is so-called ‘dark data’: data that is redundant, faulty, or too old to hold any value at all, or data that has simply been forgotten.
And – here’s a surprising twist – this has an unpleasant effect on our climate. According to a recently published study by the American company Veritas Technologies, if we continue collecting and archiving all this ‘data waste’, we’ll end up with 5.8 million tons of CO2 being pumped into the atmosphere this year alone.
So, the main question is: how can we ‘reactivate’ legacy systems and, in doing so, not only put all the available data to use but also do something good for our climate?
How Did We Get There?
Most organizations are fully aware of the fact that ‘oldie’ is sometimes not ‘goldie’. In today’s world, companies are increasingly relying on digital technologies to develop new products, increase operational efficiency, reduce costs, and – most importantly – ensure that customers have a satisfying experience.
So, how is it that many companies still end up with systems, sometimes decades old, that either run in parallel or are simply redundant, thereby driving up maintenance costs and potentially posing security risks and compliance issues?
Sometimes, a data migration strategy that was not entirely thought through is the culprit. A company decides to replace one mission-critical piece of technology with a better one. To avoid disrupting business operations, it lets both systems run in parallel “for a limited period” until data migration is over, only to find out that not all data can be migrated as is because of, for example, outdated and incompatible formats. Or the business logic of the legacy system wasn’t understood as well as had been assumed, and the cumbersome source code is the only documentation the IT team has.
Another reason could be that the legacy system is, in the truest sense of the word, a legacy from a merger or acquisition. It’s a common issue when an acquired company brings outdated, inefficient, or redundant systems into “the new relationship”. Typically, both sides are aware of this and try to address the issue during the pre-acquisition due diligence process. However, this often results in patches and workarounds that merely help overcome the limitations of such systems. Consequently, instead of solving the issue efficiently and for the long term, companies often end up with incompatibilities among individual layers of the technology stack.
In the end, whether due to mergers and acquisitions, unforeseen issues during software replacement, or simply as a consequence of organic growth, the result is the same. Potentially valuable data – hence, knowledge – is locked away and inaccessible for use.
In an Ideal World… Well, There Is No Such Thing
In an ideal world, you would have a sure-fire blueprint for how to avoid all those pitfalls. A detailed strategy with a checklist of things that need to be accounted for and taken into consideration, and several successful exit scenarios. But, well, we are not in an ideal world. So, instead of trying to figure out how to avoid ending up with legacy systems, let’s consider how we can make the best out of having them.
‘Legacy’ is by no means equal to ‘bad’. Although some will argue that ‘legacy’ is tantamount to ‘old’, it really just means that a system has been superseded by an application (or applications) that is faster, more efficient, or better at addressing a business’s challenges, and yet remains difficult to replace.
Legacy systems may contain historical data that is extremely valuable for smart analytics; for example, to be able to predict future purchase trends based on the developments in the purchase behavior over the past years or maybe, even past decades. Or they may contain data that is incomplete or inconsistent, but when enriched with third-party data sets, can provide extreme value, for example, for training AI models.
The key factors for making use of such data are accessibility and availability. This mainly means two things: making sure that a) legacy systems are compatible with newer systems and b) data formats are consistent across the applications, or at least can be effectively translated between them. This demands a clear, well-structured integration strategy that designs pathways to connect legacy and modern applications and systems.
Legacy System Integration — Bringing Data Out of the Dark Side
Having an integration strategy in place might seem like an obvious answer. ‘Well, Captain Obvious, what are you going to preach next?’ But let’s go through some statistics. According to Accenture research in the financial services area, 44% of financial services firms named difficulty integrating new technology into existing structures as one of the biggest obstacles to digital transformation in their sector.
In mergers and acquisitions, many analysts and experts cite the lack of necessary integration expertise as a notoriously common issue. According to various studies, between 60% and 85% of companies name missing communication points between legacy and modern applications as one of the main issues hampering their digital innovation initiatives.
Workarounds and quickly hacked ‘communication bridges’ can help in the short term, but they act more as Band-Aids than actual solutions.
Instead, integration can be achieved through APIs (application programming interfaces) that would expose data in legacy systems and/or by introducing an integration layer through a middleware across the entire infrastructure.
Yes, designing APIs from scratch for a system that, in the worst case, was built in pre-API times will be tough, time-intensive, and budget-heavy. But it might well be worth it, especially considering that such APIs can be used equally well for API-led connectivity and for an integration platform layer.
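To make the idea of exposing legacy data a little more tangible, here is a minimal sketch in Python. It assumes, purely for illustration, that the legacy system exports customer records as fixed-width text lines; the column layout, the `Customer` type, and the `get_customer` function are all hypothetical and not taken from any particular product. The sketch shows the core of a thin API facade: read the old format, translate it into a structure modern applications understand, and return it as JSON.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

# Hypothetical fixed-width layout of a legacy customer record:
# columns 0-5 hold the ID, 6-25 the name, 26-33 the signup date (YYYYMMDD).
LAYOUT = {"customer_id": (0, 6), "name": (6, 26), "signup": (26, 34)}


@dataclass
class Customer:
    customer_id: str
    name: str
    signup: str  # ISO 8601 date, e.g. "1999-12-31"


def parse_record(line: str) -> Customer:
    """Translate one fixed-width legacy line into a structured record."""
    fields = {key: line[start:end].strip() for key, (start, end) in LAYOUT.items()}
    raw = fields["signup"]
    # Convert the legacy YYYYMMDD date into ISO 8601.
    fields["signup"] = f"{raw[:4]}-{raw[4:6]}-{raw[6:8]}"
    return Customer(**fields)


def get_customer(records: List[str], customer_id: str) -> Optional[str]:
    """Facade function: return the matching customer as JSON, or None."""
    for line in records:
        customer = parse_record(line)
        if customer.customer_id == customer_id:
            return json.dumps(asdict(customer))
    return None
```

In a real project, a function like `get_customer` would sit behind an HTTP endpoint or an integration platform connector, so that consumers never need to know the legacy layout.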
As a standalone strategy, an integration platform layer would typically provide connectivity capabilities not only through APIs but also through a variety of different protocols. Such a layer offers a high degree of standardization, thereby lowering the level of resources and expertise required to create availability and accessibility across legacy and modern data systems.
Either approach will improve the compatibility of legacy systems with modern applications. An integration platform layer can additionally help with the transformation of different data formats. This helps bring together previously disconnected data sources to provide context for data-driven analysis. It also opens up numerous opportunities for getting the most out of legacy data, for example by cleaning, enriching, and structuring it for further use.
The ultimate goal is to eliminate data waste, whether for business purposes, climate benefits, or both. This means ensuring that all available data is accounted for and, if no longer necessary, disposed of. And a clear legacy system integration strategy is one of the most essential elements of that.
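As a rough illustration of what cleaning, enriching, and structuring can look like in practice, the following Python sketch processes rows pulled from different legacy sources. Everything in it is an assumption made for the example: the field names (`id`, `date`, `country`), the list of date formats, and the `regions` lookup that stands in for a third-party data set. It drops rows with a missing ID or an unreadable date, removes duplicates, unifies dates to ISO 8601, and adds a region.

```python
from datetime import datetime

# Hypothetical date formats used by different legacy systems.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%m/%d/%Y")


def normalize_date(value):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_and_enrich(rows, regions):
    """Drop faulty or duplicate rows, unify formats, and add a region."""
    seen = set()
    result = []
    for row in rows:
        key = row.get("id", "").strip()
        date = normalize_date(row.get("date", ""))
        # Skip rows without an ID, with an unreadable date, or already seen.
        if not key or date is None or key in seen:
            continue
        seen.add(key)
        country = row.get("country", "").strip().upper()
        result.append({
            "id": key,
            "date": date,
            "country": country,
            # Enrichment step: look up the region in an external data set.
            "region": regions.get(country, "unknown"),
        })
    return result
```

Rows that fail the checks are simply skipped here; in practice, they would more likely be logged or routed to a review queue so that nothing is silently lost.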
Published at DZone with permission of Olga Annenko, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.