DevOps vs. “Traditional” CI/CD
DevOps implies changes in behaviors (cultural changes), while CI/CD is about tools and mechanisms.
Why should such a distinction need to be made?
- Because, frequently enough, we hear Clients claim that they “are already using modern pipelines” but are frustrated to no end!
- To educate on how Meta, Amazon, Netflix, Uber, etc. (who publicly share a lot) are doing things VERY differently, with enormously different outcomes.
- To help enterprises fully understand that DevOps/DevSecOps implies "Organizational Change" or, in simpler terms, Changes in Behaviors.
___________________________________________________________________________________
DevOps is primarily about cultural changes towards relentlessly avoiding manual steps, relentlessly decreasing manually introduced errors, and relentlessly increasing efficiency, speed, quality, and automation.
- In short, relentlessly “Shift Left.”
For those who enjoy working with a Maturity-Model view of this same topic, take a look at the blog of CMU’s Software Engineering Institute at: https://insights.sei.cmu.edu/blog/a-framework-for-devsecops-evolution-and-achieving-continuous-integrationcontinuous-delivery-cicd-capabilities/
___________________________________________________________________________________
Do you prefer to focus on outcome-based differences?
Sorry, that is well documented (Google's search results and all vendors' websites are full of such details), so it is not covered here.
___________________________________________________________________________________
Instead... let’s aim to understand HOW DevOps/DevSecOps differs from what CI/CD is.
Let's deep-dive along the following four dimensions:
- Organizational and Financial
- Quality
- Accelerate
- Security
____________________________________________________________________________________
1) Organizational and Financial Dimensions:
- Industry-wide experience is that DevOps/DevSecOps requires a specialized skillset and a dedicated resource/capacity (that is, a "new" investment is needed) towards achieving business/corporate outcomes and objectives, which will not be listed here.
- Testing-team is now a top-tier team with a highly valued skillset. Cognizant calls them “QE = Quality Engineering.”
- Testing-team will provide inputs towards defining multiple-VARIATIONS of a solution’s Pipeline — so that each variation will address Risks (w.r.t. Quality) differently.
- Example: For a time-critical upcoming demo of a single new feature to business leadership, why bother testing everything? Focus JUST on testing only those use cases related to that new feature. What's the benefit? Faster development cycle, allowing more sprints till the "last moment" before the demo.
- One feature I've used in such a scenario: the JUnit-XML-compatible test reports generated by AWS CodeBuild are evaluated by a StepFunction Lambda-Task to determine whether the tests for the "important" use cases passed. If they did, the rest of the "failed" tests are ignored and the pipeline/build is declared successful.
- Use of “Feature-Flags” (perhaps not suitable for LifeSciences Clinical-Ops systems & highly regulated solutions)
- There are No Silos: Previously, Developer-teams were separate from Testing-teams, and the Infrastructure-team was separate from the Operations-team, which was separate from the Security-team, etc.
- Not all industries can fully achieve the motto “You Build it, You pick up the call at 4 am.”
- But, still, try to see how far you can go with this. Obviously, do this in collaboration with any regulators as well as corporate Quality-Assurance teams.
- If a developer hates being frequently woken up at 4 am, you know that the quality jumps up by an order of magnitude just because of this single line item.
- On-demand Environments for branching are created as an integral part of the Pipeline.
- This enables a new capability to "try out something" (a.k.a. experiment) for a relatively small additional infrastructure cost to the project.
- Unfortunately, this also causes uncertainty in estimating Monthly & Quarterly infrastructure costs. ALERT! This aspect is a major disruptor!!
- Worse, it requires someone (non-technical employee/role) to now spend a significant amount of time monitoring trends and providing summary reports to senior management (if not to Finance-department directly)
- Any Volunteers to do the infra-costs Forecasting (look-ahead estimates), and then to defend yourself as to why your forecasts were so BAD !?!?!?
- And that, too, so bad so frequently?!?
- This will likely require additional License/Subscription-costs for tools like "Cloudability," etc., to demonstrate to senior management that the enterprise will have better oversight and better controls over costs.
- I will add a new trend that is arising from a "pay-per-use" mentality: Some of the Non-Production environments (for example, UAT, Training...) should be shut down/destroyed by default and be brought back online on-demand only. For that to be viable, all environments should take < 30 minutes to be fully up and running!
- This requires very advanced automation expertise, very similar to Cloud-native Disaster-Recovery activities.
- Initially, as this automation matures, you will have a frustrated user base. Learn to manage that! There are a lot of cost savings here!
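The report-evaluation step mentioned earlier in this section (JUnit-XML reports judged only on the "important" use cases) can be sketched roughly as follows. This is a minimal illustration, not the actual Lambda that was used: the function name `evaluate_junit_report`, the `important_tests` list, and the `classname.name` naming scheme are all assumptions, and fetching the report from CodeBuild is left out.

```python
import xml.etree.ElementTree as ET


def evaluate_junit_report(xml_text, important_tests):
    """Judge a JUnit-XML report only on the 'important' test cases.

    `important_tests` is a hypothetical list of "classname.name" strings
    agreed with the Testing-team for this pipeline variation.
    """
    root = ET.fromstring(xml_text)
    ran, failed = set(), set()
    # iter() also finds <testcase> elements nested under <testsuites>/<testsuite>.
    for case in root.iter("testcase"):
        name = f"{case.get('classname', '')}.{case.get('name', '')}"
        if case.find("skipped") is not None:
            continue  # a skipped test did not run, so it cannot count as passed
        ran.add(name)
        if case.find("failure") is not None or case.find("error") is not None:
            failed.add(name)

    important = set(important_tests)
    missing = important - ran            # important tests that never ran
    important_failed = important & failed
    return {
        "passed": not missing and not important_failed,
        "important_failed": sorted(important_failed),
        "missing": sorted(missing),
        "ignored_failures": sorted(failed - important),
    }
```

Returning the ignored failures alongside the verdict keeps them visible in the pipeline output, so "ignored for this demo build" does not quietly become "forgotten."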
___________________________________________________________________________________
2) Quality
- Avoid making human code reviews a mandatory requirement for kicking off the pipeline.
- Instead, rely heavily on tools (Code Quality as well as Security).
- Just as Regulatory-agency Auditors do, humans should review only specific portions of the code base, picked solely based on instinct and experience, and these reviews should be done asynchronously.
- All environments should be identical (except perhaps the DEV-environment). But even then, the DEV-environment should, as needed, be able to be scaled up to be identical to Production.
- Everyone’s Objective: No builds/pipelines should break.
- If automated-build breaks, it's an “all-hands-on-deck” situation (everyone stops their work) until builds are smoothly running again.
- FYI: I'm told that this "collective punishment" works great. Cue the scenes from the movie "Platoon" (about the US war in Vietnam).
- YOU cannot go home if YOU break the build/pipeline.
- Do you have ANY idea how you WILL enforce the above? Or at least ensure the performance review impacts the person who breaks the build and goes home?
- No “fixing” will ever be done in TEST, SIT, QA, or UAT.
- Nor in Production, of course.
- The ONLY exception is when a formal SRE function exists: not the Problem-Management function, but an SRE function that is a technically skilled team.
- Everything should go back all the way to DEV-Environment.
- Replicate the issue as the 1st step.
- Always replicate any issue (whether PROD or NON-Prod) within the DEV environment.
- QE then assists (with both Unit-testing and others) to ensure it will not happen EVER again.
- Of course, access to PROD & UAT logs — for read-only purposes — will be enabled/provided as per need and by the process.
__________________________________________________________________________________
3) Accelerate
- Developers MUST be checking in code AT LEAST once per hour, if not much more frequently.
- Non-stop automated builds & automated testing covering unit, regression, and end-to-end.
- No Manual builds, even for Production.
- Unit-testing scripts should be identical on the Developer-laptop and in the Pipeline.
- Enables/empowers (but does not require) the use of multiple languages, platforms, and frameworks, instead of rigorous/stringent rules for how software is built.
- DevSecOps is about one single fully-unified pipeline (not 3 separate pipelines: one for CI, another for Deployment, and a third for Security).
___________________________________________________________________________________
4) Security
- The entire organization should have CONSENSUS when answering the philosophical question: If there's any non-compliance or any deviation, do you just "send alerts," ... or should you "auto-remediate"?
- A more "tangible" variation of the above question is: Should "AWS Config / Amazon Inspector / Amazon Macie" send alerts, or should they trigger a Lambda/StepFunction to 'destroy/jail' things?
- Yes, answers may be different by Business-Unit, and different by Criticality-Tier of systems, etc. But everyone (incl. business teams) should be clear about the answer. 100% clear. 100% aligned.
- Warning: Any enterprise that lazily chooses "Send alerts" as a catch-all answer is condemned to throw away all their investment into DevOps and WILL suffer attrition of their best talent.
- Clearly documented ownership/responsibility and a well-defined process must exist to ensure Container-images are always “up to date” (w.r.t. patches).
- Clearly documented Automated-Process must exist — to AUTOMATICALLY trigger the Pipeline to update the deployed application with the latest Container-images.
- Any deployment to UAT and PROD can, of course, require a FORMAL & manual change-request/change process.
- This is where... “ServiceNow-integrated Automation” is now becoming the new best-practice trend!
- It replaces the manual processes and manual steps, except for one step: The "Approval"!
- Periodically, rebuild by using the latest versions of Libraries/Dependencies and fix any issues caused by the Upgrade. As noted in the point above, "Any deployment to UAT and PROD can... "
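One way to make the "send alerts vs. auto-remediate" consensus explicit is to write it down as data that the automation consults. The sketch below is purely illustrative: the business-unit names, tier labels, and the `decide_action` helper are hypothetical, and the actual remediation (for example, the Lambda/StepFunction that would 'destroy/jail' a resource) is not shown.

```python
from enum import Enum


class Action(Enum):
    ALERT = "alert"
    AUTO_REMEDIATE = "auto_remediate"


# Hypothetical policy table: (business_unit, criticality_tier) -> agreed response.
# Every entry is meant to be an explicit, signed-off organizational decision.
POLICY = {
    ("payments", "tier-1"): Action.AUTO_REMEDIATE,
    ("payments", "tier-3"): Action.ALERT,
    ("marketing", "tier-3"): Action.ALERT,
}


def decide_action(business_unit, criticality_tier, policy=POLICY):
    """Return the agreed response for a non-compliance finding.

    There is deliberately no catch-all default: a missing entry means the
    organization has not yet agreed on an answer, and that gap is surfaced.
    """
    try:
        return policy[(business_unit, criticality_tier)]
    except KeyError:
        raise LookupError(
            f"No agreed response for {business_unit}/{criticality_tier}"
        ) from None
```

Raising on a missing entry, rather than silently falling back to "alert," is the design choice here: it forces the conversation about alignment to happen before an incident instead of during one.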
There are so many DevSecOps articles that have covered SECURITY quite well.
Even so, the following points must be repeated "ad nauseam."
- NO! – DevOps does not mean you can pull in any Docker Image and use it.
- Use only approved externally sourced Docker images/Container-images that are already stored in a “secured” Container-Image-Repo (example: a Private Cloud/AWS ECR Repo).
- No exceptions.
- This must be enforced operationally or via automation!
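As a rough idea of what "enforced via automation" could look like, a pipeline step might reject any image reference that does not point at an approved registry. This is a sketch under assumptions: the registry hostname below is a made-up example in the AWS ECR naming format, and `is_approved_image` is a hypothetical helper, not part of any tool.

```python
# Hypothetical approved registry (ECR-style hostname with a made-up account ID).
APPROVED_REGISTRIES = ("123456789012.dkr.ecr.us-east-1.amazonaws.com",)


def is_approved_image(image_ref, approved_registries=APPROVED_REGISTRIES):
    """Return True only if the image reference names an approved registry."""
    registry, sep, remainder = image_ref.partition("/")
    if not sep or not remainder:
        # No registry component at all (e.g. "nginx:latest"), which would
        # implicitly resolve to a public registry.
        return False
    return registry in approved_registries
```

For example, `is_approved_image("nginx:latest")` and `is_approved_image("docker.io/library/nginx:1.25")` would both be rejected, while an image under the approved hostname would pass. A check like this only covers the "where does it come from" question; the approval of what goes into that repo remains a separate process.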
Bringing DevOps into your enterprise is "Organizational Change."
99% of attempts at Organizational-Change get stuck, if not fail!
So...good luck!