
Securely Authenticate to Google Cloud From GitHub

Learn how to authenticate to GCP and a suite of tools from GitHub.

By Nicolas Fränkel, DZone Core · Updated May. 10, 22 · Tutorial

Recently, I designed a simple metrics-tracking system. A Python script queries different providers' APIs for metrics, e.g., Twitter, GitHub, etc. The idea is to run this script each day, store the results in Google BigQuery, and build a nice data visualization in Google Data Studio. I'm a big fan of automation, so I run the script with GitHub Actions.

Accessing Google Cloud With a Service Account

I query the different APIs with different Python libraries. All of them allow authenticating by passing a parameter, generally a token. One can store the value in a GitHub secret, expose it as an environment variable in the GitHub Action, and use it in the code, as in the sketch below.
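For instance, here's a minimal sketch of that pattern, assuming a hypothetical secret named GH_METRICS_TOKEN that the workflow exposes as an environment variable:

Python
 
import os

import requests  # any HTTP client would do

# The workflow exposes the GitHub secret as an environment variable, e.g.:
#   env:
#     GH_METRICS_TOKEN: ${{ secrets.GH_METRICS_TOKEN }}
token = os.environ["GH_METRICS_TOKEN"]  # hypothetical secret name

# Pass the token as a bearer credential when calling the provider's API
response = requests.get(
    "https://api.github.com/repos/octocat/hello-world",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()
print(response.json()["stargazers_count"])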

The Google BigQuery API doesn't work like this, though. The documentation states that the GOOGLE_APPLICATION_CREDENTIALS environment variable should point to a file on the file system. Hence, you need to:

  1. Create a Service Account
  2. Download a JSON credentials file that references it
  3. Store this file in a reasonably secure way

The third point is indeed a big issue.
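For reference, here's a minimal sketch of what that file-based flow looks like on the Python side; the key file path is a placeholder:

Python
 
from google.cloud import bigquery
from google.oauth2 import service_account

# Option A: point GOOGLE_APPLICATION_CREDENTIALS at the JSON key file and let
# the client pick it up automatically:
#   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
#   client = bigquery.Client()

# Option B: load the key file explicitly (placeholder path)
credentials = service_account.Credentials.from_service_account_file(
    "/path/to/service-account.json"
)
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

# Either way, the JSON key file has to exist somewhere on disk.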

I considered a couple of alternatives:

  • YOLO: Store the file in the repository; it's a private repository anyway.

  • Hack the library: The Google Analytics Python library offers a function to pass the file's content itself. You can store the content in an environment variable and thus keep the data in a GitHub secret.

    But I didn't find anything similar in the Google BigQuery library. If any Google developer reads this, please note that it's not a good developer experience. Within the same language stack, I'd expect all libraries to be consistent regarding cross-cutting concerns such as authentication.

    The solution would have been to hack the library to offer the same functionality as the Analytics one (a sketch of that idea follows below).

I discarded the first option for obvious reasons, and the second because my Python skills are close to zero.
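For completeness, here's roughly what that "pass the content" workaround could look like, going through the generic google-auth library rather than the BigQuery library itself; the secret name is hypothetical, and this is not the route the post ends up taking:

Python
 
import json
import os

from google.cloud import bigquery
from google.oauth2 import service_account

# GOOGLE_CREDENTIALS_JSON is a hypothetical GitHub secret holding the entire
# JSON key file as a string.
info = json.loads(os.environ["GOOGLE_CREDENTIALS_JSON"])
credentials = service_account.Credentials.from_service_account_info(info)
client = bigquery.Client(credentials=credentials, project=info["project_id"])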

Authenticate to Google Cloud

Securely authenticating to Google Cloud, or any other cloud provider, from GitHub is a widespread concern. To address it, GitHub provides the OpenID Connect flow:

GitHub Actions workflows are often designed to access a cloud provider (such as AWS, Azure, GCP, or HashiCorp Vault) in order to deploy software or use the cloud's services. Before the workflow can access these resources, it will supply credentials, such as a password or token, to the cloud provider. These credentials are usually stored as a secret in GitHub, and the workflow presents this secret to the cloud provider every time it runs.

However, using hardcoded secrets requires you to create credentials in the cloud provider and then duplicate them in GitHub as a secret.


With OpenID Connect (OIDC), you can take a different approach by configuring your workflow to request a short-lived access token directly from the cloud provider. Your cloud provider also needs to support OIDC on their end, and you must configure a trust relationship that controls which workflows are able to request the access tokens. Providers that currently support OIDC include Amazon Web Services, Azure, Google Cloud Platform, and HashiCorp Vault, among others.

-- About security hardening with OpenID Connect

Simplified, the overall flow during the Run phase is the following:

  1. The GitHub repo gets a short-lived token from Google Cloud.
  2. Workflows on the same repo can call secured Google Cloud APIs via the Google client libraries because the latter know about the token.

The exact details are pretty involved; they go well beyond this entry-level blog post, and to be honest, I didn't read much deeper. However, GitHub and Google offer a GitHub Action to take care of these details.

Authenticate to Google Cloud is the name of the GitHub Action that handles the authentication part. The action offers two modes: a JSON credentials file (again!) and Workload Identity Federation. WIF provides the server-side part that makes GitHub's OIDC calls work out of the box.

Once you've set up Google Cloud, the GitHub workflow configuration is straightforward:

YAML
 
jobs:
  metrics:
    runs-on: ubuntu-latest
    permissions:
      contents: 'read'
      id-token: 'write'
    steps:
      - uses: actions/checkout@v3                                         # 1
      - uses: actions/setup-python@v3                                     # 2
        with:
          python-version: 3.9.10
      - uses: 'google-github-actions/auth@v0'                             # 3
        with:
          service_account: '${SERVICE_ACCOUNT_EMAIL}'
          workload_identity_provider: 'projects/${PROJECT_ID}/locations/global/workloadIdentityPools/${WI_POOL_NAME}/providers/${WI_PROVIDER_NAME}'
      - run: 'python main.py'                                             # 4
  1. Check out the repo
  2. Set up the Python environment
  3. Authenticate to get the token. The Action writes the token to a Google-specific location, which is discarded along with the work environment when the workflow finishes.
  4. Enjoy your automatically-authenticated calls!
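For illustration, once the auth step has run, main.py can create the BigQuery client without any explicit credentials. A minimal sketch, with made-up project, dataset, and row values:

Python
 
from google.cloud import bigquery

# No key file and no token handling in the code: the client library discovers
# the short-lived credentials written by the auth step on its own.
client = bigquery.Client()
errors = client.insert_rows_json(
    "my-project.metrics.daily",  # made-up project, dataset, and table
    [{"date": "2022-05-01", "source": "github", "value": 42}],
)
assert not errors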

The GitHub Action documentation contains all the information needed to set up Google Cloud for Workload Identity Federation.

Conclusion

In this post, I've highlighted the problem of using a Service Account file to authenticate to Google Cloud. Fortunately, GitHub and Google Cloud provide all the infrastructure needed to get short-lived tokens securely. In particular, the GitHub marketplace offers the Authenticate to Google Cloud Action. Its documentation is excellent. I'd advise you to use it instead of less secure alternatives.

To go further:

  • BigQuery API Client Libraries
  • Authenticate to Google Cloud GitHub Action
  • Workload identity federation
  • About security hardening with OpenID Connect

Originally published at A Java Geek on May 1st, 2022

