DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • Designing Fault-Tolerant Messaging Workflows Using State Machine Architecture
  • Dynamic Forms With Camunda and Spring StateMachine
  • Structured Logging in Spring Boot 3.4 for Improved Logs
  • Symbolic and Connectionist Approaches: A Journey Between Logic, Cognition, and the Future Challenges of AI

Trending

  • Bridging UI, DevOps, and AI: A Full-Stack Engineer’s Approach to Resilient Systems
  • What’s Got Me Interested in OpenTelemetry—And Pursuing Certification
  • Developers Beware: Slopsquatting and Vibe Coding Can Increase Risk of AI-Powered Attacks
  • How To Build Resilient Microservices Using Circuit Breakers and Retries: A Developer’s Guide To Surviving
  1. DZone
  2. Software Design and Architecture
  3. Security
  4. Securing Your Machine Identities Means Better Secrets Management

Securing Your Machine Identities Means Better Secrets Management

Machine identities make up the majority of the over 12.7 million secrets discovered in public on GitHub in 2023. Let's look at how we got here and how we fix this.

By 
Dwayne McDaniel user avatar
Dwayne McDaniel
·
Jul. 09, 24 · Analysis
Likes (1)
Comment
Save
Tweet
Share
3.2K Views

Join the DZone community and get the full member experience.

Join For Free

In 2024, GitGuardian released the State of Secrets Sprawl report. The findings speak for themselves; with over 12.7 million secrets detected in GitHub public repos, it is clear that hard-coded plaintext credentials are a serious problem. Worse yet, it is a growing problem, year over year, with 10 million found the previous year and 6 million found the year before that. These are not cumulative findings!

When we dig a little deeper into these numbers, one overwhelming fact springs out: specific secrets detected, the vast majority of which are API keys, outnumber generic secrets detected in our findings by a significant margin. This makes sense when you realize that API keys are used to authenticate specific services, devices, and workloads within our applications and pipelines to enable machine-to-machine communication. This is very much in line with research from CyberArk, machine identities outnumber human identities by a factor of 45 to one. This gap is only going to widen continually as we integrate more and more services in our codebases and with ever-increasing velocity.

Specific vs Generic Secrets bar graph

Specific vs Generic secrets in the State of Secrets Sprawl Report 2024 findings

Secrets sprawl is clearly a problem for both human and machine identities, so why should we call out this distinction?

Machine Identities

"Machine identities" is a term used to distinguish this area of secrets sprawl and its unique challenges apart from human identities and credentials. Each is problematic, but each calls for different approaches. We are following the naming convention from industry leaders in secrets management, such as CyberArk, and analyst firms who define the industry, such as Gartner, in standardizing this terminology. Gartner defines the term in their 2020 IAM Technologies Hype Cycle report as, "Simply put, a machine identity is a credential used by any endpoint (which could be an IoT device, a server, a container, or even a laptop) to establish its legitimacy on a network." This term covers all API access keys, certificates, Public key infrastructure (PKI), and any other way possible to authenticate machine-to-machine communication. 

Is a Machine Identity the Same as a Non-Human Identity?

From a purely grammatical perspective, it must be a non-human identity if it is not a human identity. So why use the specific term machine identity? Well, practically speaking, a non-human could be a dog, a plant, or even a planet. When using the term "non-human" we must also necessarily further qualify what we mean, while the term 'machine identity' already has a widely accepted definition that narrows the scope to the secrets sprawl problem space.

For example, Venafi, a leading machine identity management platform, succinctly states, "The phrase “machine” often evokes images of a physical server or a tangible, robot-like device, but in the world of machine identity management, a machine can be anything that requires an identity to connect or communicate—from a physical device to a piece of code or even an API." 

How Did We Get Here?

Before we can talk about what to do about the issues of machine identities and secrets sprawl, it might be helpful to take a historical look at how we arrived at this point in the industry. In the early days of computer science, the only "entities'"we had to worry about accessing our machines and our code were humans. In the days of ENIAC or early UNIX systems, using a simple password and perhaps sturdy locks on the doors were really all you needed to ensure only the proper people could access a system. People love passwords, and we have for thousands of years. The Roman garrison used "watchwords," which needed to be updated nightly, meaning we have been practicing manual password rotation for a couple of millennia now. 

So, naturally, when it came time to implement machine-to-machine authentication, ensuring that we were only allowing access to trusted systems to recognize and communicate with one another, it was only natural we would turn to our old friend the password, in the form of a long and hard to guess token to get the job done. This system works okay until you remember the problem statement that started this article: we keep leaking these credentials into our code and into places around our code like Jira, Slack, and Confluence at an alarming rate.

Solving Both Human Identity and Machine Identity Sprawl

Now that we have a common vocabulary and understand the two areas of concern, human and machine, what are our next steps? Let's start with human identities. People need to be able to authenticate to gain access to systems to get their work done. Using phishing-resistant MFA, preferably hardware-based, at every juncture where a human uses a password is a solid approach. Even if a password is leaked, it is much harder to exploit and gives the user time to rotate the credential. While not a silver bullet, Microsoft believes this could stop up to 99.9% of fraudulent sign-ins. Even better, if there is a way to eliminate that password, such as with a passkey using FIDO2 or hardware-based biometrics for authentication, then we should probably move in that direction. 

Dealing with machine resources requires a different approach, as we can't just turn on MFA for machines. We also can't disrupt these machine identities, as the business of the enterprise is to do business, and the connections must continue to allow our systems to function and satisfy the availability leg of the CIA Triad. Similarly, we can not devote endless resources and hours to this issue, as new vulnerabilities in the form of CVEs, misconfiguration, and licensing issues continue to be other areas security teams need to tackle.

CIA triad

In an ideal world, we could immediately move all of our systems to leverage short-lived certificates or JWTs that are issued at run time when needed and only live for the life of the request. Indeed, there are frameworks such as SPIFFE and its implementation, SPIRE, that can help organizations achieve this goal. While this is indeed a great approach, it has the real-world issues of developer adoption, development time and effort, and the overhead of running such services at scale. 

While we can dream up many such ideal scenarios, we need to address the current situation head-on. Developers will continue to use machine identities, which can be leaked and exploited by attackers. At the same time, we know that if a malicious actor gets their hands on a secret, they can only leverage it if it is still valid. We believe the best practical solution for any organization is to rotate secrets much more frequently.

Automatically Rotating Secrets More Frequently

One of the other stand-out findings from our State of Secrets Sprawl Report was the fact that of all the valid secrets we discovered in public, over 90% were still valid five days later. We believe this points to the fact that teams expect secrets to be long-lived and that the current manual approach to secrets rotation is hard. Further evidence of these conclusions can be found in breach reports involving companies such as Cloudflare. 

In this Secret Management Maturity Model white paper, a clear differentiator in organizations in the Advanced and Expert categories is that they have adopted regular credential rotation policies. It is very unlikely these mature organizations are doing manual rotation, as that would be an overwhelming, time-consuming, and error-prone process, which potentially could mean disaster in our interconnected architectures.

We need a way to automate the rotation process. The good news is that awesome tools are available, such as CyberArk's Conjure or AWS Secrets Manager, that make the process of auto-rotation pretty straightforward. Of course, this assumes all of your machine identities already and totally live within their system. 

Auto-Rotation of Secrets First Means Knowing All Your Machine Identities

Now, we could ask for every developer and infrastructure owner to give security teams a list of all their credentials in plaintext for all their various workloads, services, and devices, but obviously, that is a terrible and highly problematic idea.

In all seriousness, what is needed is a scalable end-to-end solution that can help you systematically and automatically find all the plaintext credentials inside of your code base, leaked out onto GitHub publicly, or even found in the communication tools that surround your code.

Look for solutions that:

  • Gather all the data about a secret sprawl incident into a single logical unit
  • Are reachable by an API call or webhook, making it possible to interoperate with other systems
  • Can handle any volume of files to scan and can scan in multiple systems, both historically and in real-time
  • Offer developer tooling that helps prevent the issue in the first place

With such a tool in hand, you can find and then implement auto-rotation solutions.

Final Thought

No matter how you tackle the Machine Identity crisis in your organization, make sure you start sooner rather than later, as you will never have as few secrets in your environments as you do right this moment.

Machine Information security management Password manager

Published at DZone with permission of Dwayne McDaniel. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Designing Fault-Tolerant Messaging Workflows Using State Machine Architecture
  • Dynamic Forms With Camunda and Spring StateMachine
  • Structured Logging in Spring Boot 3.4 for Improved Logs
  • Symbolic and Connectionist Approaches: A Journey Between Logic, Cognition, and the Future Challenges of AI

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!