AIGenOps: Generative AI and Platform Engineering

In the domain of regulated software, especially in the banking environment where we work, there are constraints of security, quality, and network limitations.

Nicolas Fantoni

Riccardo Soro

Jul. 29, 24 · Opinion


A While Ago...

We have been collaborating with a client in finance for some time now, and in a moment of relaxation, we started discussing generative artificial intelligence. Caught up in the excitement, as in a positive feedback loop, we began to sketch out how to integrate and implement it in the real-world scenario in which we found ourselves.

Combining the LLM/AI skills and knowledge of a DevOps engineer with the vision of a platform engineer, we began to define the requirements, constraints, and loads of a real scenario in the area of regulated software, and then to outline possible processes and solutions.

But in Which Context?

In the domain of regulated software, especially in the banking environment where we work, there are constraints of security, quality, and network limitations. Added to this are CI/CD loads that can be very high in volume, already overloaded developers, and cost management. This leads to the list of initial requirements:

On-prem system or on managed VM or private cloud, potentially air-gapped
No performance drops in CI and CD pipelines
A zero-trust model with approval from the dev
Selection of components to be impacted
Limitation of generated objects

The last two points are intended less as hard constraints and more as common-sense practices for addressing the problem effectively.

Which in Detail...

In regulated software, where network reachability is restricted and trade secrets may be involved, you do not want your data and code to be sent to unsecured or unverified external systems. Therefore, the system should be hosted on private machines in well-segregated networks. Generative AI processes consume significant resources and can require long processing times. Therefore, to limit the impact on time and performance, they cannot be introduced into the CI/CD cycle: we instead assumed an asynchronous and independent “continuous generative loop.”
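As a rough illustration of that decoupling, here is a minimal sketch in which the pipeline only records a request and returns immediately, while a separately scheduled loop picks requests up later. The names `on_pipeline_finished`, `run_generative_tick`, and `Proposal` are hypothetical, the in-memory queue stands in for whatever hand-off mechanism the platform actually provides, and the model call is left as a placeholder.

```python
from dataclasses import dataclass
from queue import Empty, Queue


@dataclass
class Proposal:
    app: str
    commit: str
    # Every proposal starts unapproved: a human must review it before merge.
    status: str = "pending_review"


# Hypothetical hand-off point between CI/CD and the generative loop.
generation_requests: Queue = Queue()


def on_pipeline_finished(app: str, commit: str) -> None:
    """Called at the end of a CI run: records a request, never waits on the model."""
    generation_requests.put_nowait((app, commit))


def run_generative_tick(max_requests: int = 10) -> list[Proposal]:
    """One iteration of the independent loop, scheduled outside CI/CD."""
    proposals = []
    for _ in range(max_requests):
        try:
            app, commit = generation_requests.get_nowait()
        except Empty:
            break
        # Placeholder: the call to the self-hosted model would go here.
        proposals.append(Proposal(app=app, commit=commit))
    return proposals
```

The point of the design is that the cost of generation never shows up in pipeline duration: the pipeline's only extra work is a non-blocking enqueue.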

Because the system is subject to certification and verification, and improper changes must be kept out, a zero-trust model is necessary: the “continuous generative loop” proposes pull requests (also referred to below as PRs) that a manager must validate and approve. Given these assumptions, and remembering that one of the principles behind platform engineering is “starting with the dev,” generating thousands of lines of code across all applications is not an option if processing cost and time are to be contained. The generation part should then:

Select and Prioritize Only Those Applications on Which To Take Action

Suppose, for example, there are three applications:

One with coverage around 85% and a few code smells
One with coverage around 80%, many code smells, and a few minor bugs
One with 60% coverage and critical vulnerabilities

The system should prioritize the last application and work on that one first, to even out the overall quality level of the application pool.
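One possible way to express this ranking is a weighted "need" score per application. The sketch below is only an assumption about how such an algorithm could look: the weights, the `AppQuality` fields, and the concrete counts used for "a few" and "many" are invented for illustration and are not taken from a real system.

```python
from dataclasses import dataclass


@dataclass
class AppQuality:
    name: str
    coverage: float  # test coverage as a percentage, 0-100
    code_smells: int
    minor_bugs: int = 0
    critical_vulns: int = 0


# Hypothetical weights: tune them to your own quality gates.
WEIGHTS = {
    "coverage_gap": 1.0,
    "code_smells": 0.5,
    "minor_bugs": 2.0,
    "critical_vulns": 50.0,
}


def need_score(app: AppQuality) -> float:
    """Higher score means the application needs help more urgently."""
    return (
        WEIGHTS["coverage_gap"] * (100.0 - app.coverage)
        + WEIGHTS["code_smells"] * app.code_smells
        + WEIGHTS["minor_bugs"] * app.minor_bugs
        + WEIGHTS["critical_vulns"] * app.critical_vulns
    )


def prioritize(apps):
    """Return applications ordered from most to least in need of action."""
    return sorted(apps, key=need_score, reverse=True)


# The three applications from the example, with made-up counts.
apps = [
    AppQuality("app-a", coverage=85, code_smells=5),
    AppQuality("app-b", coverage=80, code_smells=40, minor_bugs=3),
    AppQuality("app-c", coverage=60, code_smells=0, critical_vulns=2),
]
print([a.name for a in prioritize(apps)])  # ['app-c', 'app-b', 'app-a']
```

Giving critical vulnerabilities a much larger weight than the other terms is what pushes the third application to the top, even though it has no code smells in this made-up data.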

Limit the Objects Generated

If we require a manager or developer to validate a pull request and it contains a massive number of changes, the worst-case scenarios are that the PR is either rejected in its entirety or checked only superficially, with the risk of introducing errors.

To make sure that the generation activity works in synergy with the day-to-day work of the devs, one has to select and prioritize the activities, going for the few (hopefully!) most impactful bugs and vulnerabilities, or writing tests for the least-covered class with the highest impact.

The selection and prioritization approach allows for faster processing and lower costs, and acts only on the applications that really need external help; above all, it does not disrupt the work of the developers.
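The idea of capping what goes into a single PR can be sketched as a simple "top N by severity" filter. The severity scale, the `Finding` type, and the default cap of three items are assumptions made for this example; a real implementation would take its findings from the quality and security tooling already in place.

```python
from dataclasses import dataclass

# Hypothetical severity scale: lower rank means more urgent.
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2}


@dataclass
class Finding:
    id: str
    severity: str
    file: str


def select_for_pr(findings, max_items=3):
    """Keep only the few most severe findings so the PR stays reviewable."""
    # sorted() is stable, so findings of equal severity keep their input order.
    ranked = sorted(findings, key=lambda f: SEVERITY_RANK[f.severity])
    return ranked[:max_items]
```

Anything left out of the current PR is not lost: it simply remains a candidate for a later iteration of the loop.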

What Next?

The next steps will be:

Define the application prioritization and selection algorithms
Define the selection and prioritization algorithms for quality/vulnerability resolutions and code coverage
Based on the principles of innovation management and platform engineering, identify early adopters and pioneers to implement a usable solution in a real-world environment, with the help of skilled developers who can collaborate on developing it optimally for the end user

In Conclusion

It is possible to introduce generative AI into an IDP (internal developer platform) in a regulated context, respecting all the constraints and requirements of the environment, without neglecting the end users and their experience with the system.


Opinions expressed by DZone contributors are their own.
