DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workkloads.

Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Data Security Considerations in Cloud Data Warehouses
  • The Data Access Layer in Jmix: JPA on Steroids
  • Enhancing Avro With Semantic Metadata Using Logical Types
  • Constructing Secure and Strong Data Access Layers With JPA

Trending

  • Zero Trust for AWS NLBs: Why It Matters and How to Do It
  • How to Convert XLS to XLSX in Java
  • Why Documentation Matters More Than You Think
  • Optimize Deployment Pipelines for Speed, Security and Seamless Automation
  1. DZone
  2. Data Engineering
  3. Databases
  4. Controlling Access to Google BigQuery Data

Controlling Access to Google BigQuery Data

Secure your BigQuery data with IAM roles, authorized views, authorized datasets, authorized routines, and authorized materialized views.

By 
Karteek Kotamsetty user avatar
Karteek Kotamsetty
·
Feb. 21, 25 · Tutorial
Likes (1)
Comment
Save
Tweet
Share
14.2K Views

Join the DZone community and get the full member experience.

Join For Free

Google BigQuery, Google Cloud's data warehouse, provides a comprehensive suite of tools to help you control who can access your valuable data and what they can do with it. This blog post dives into the essential principles and practical techniques for managing data access in BigQuery and covers everything from basic Identity and Access Management (IAM) to more advanced features like authorized datasets, views, routines, and materialized views.

We'll guide you through setting up granular permissions, ensuring your data remains secure and accessible only to authorized individuals and services. This guide will equip you with the knowledge you need to take control of your BigQuery data.

Controlling Access to Your BigQuery Data

Data access management is crucial for ensuring your data's security and integrity. It's the process of defining, enforcing, and monitoring the rules and policies that determine who can access what data. BigQuery offers a set of features that will help you manage data access effectively:

Identity and Access Management (IAM)

Identity and Access Management (IAM) is the cornerstone of BigQuery's access management system. It allows you to control access to your BigQuery resources, such as projects, datasets, tables, and views, by assigning roles to users, groups, and service accounts. These roles define the specific actions each entity can perform. 

IAM permissions are hierarchical; roles granted at a higher level, like the project level, are inherited by resources at lower levels, like datasets, tables, and views, unless explicitly overridden at a lower level. This means a role granted at the project level gives access to all datasets and tables within that project unless a more restrictive role is applied at the dataset or table level.

It's best to use the predefined roles with the least necessary privileges. For example, you can grant a user the role of "BigQuery Data Viewer" to allow them to read data. The "BigQuery Job User" role allows users to run queries and see the results of those queries, but they cannot directly browse table data. 

However, keep in mind that users with the "Job User" role can still access data through queries, so be careful when granting this role if you have highly sensitive data. 

Following the principle of least privilege is crucial. Always grant the minimum necessary permissions to users and service accounts to ensure the security of your data. Roles like "BigQuery Data Owner" and "BigQuery Admin" provide full control over all resources and should be granted only to the administrators who are directly responsible for managing the data and who absolutely require such broad access.

To grant the BigQuery Data Viewer role at the individual table level, navigate to the desired table in the BigQuery console. Then, click on the option "Share" and then click on "Manage Permissions."

click on 'Share' and then on 'Manage Permissions'

To grant access, select the user, group, or service account and select the role "BigQuery Data Viewer." Then, click on "Save."

Select the BigQuet Data Viewer role then hit Save

Controlling access to the data could also be achieved using authorization. 

Controlling Access With Authorized Datasets, Views, Materialized Views, and Routines

BigQuery's access control system relies heavily on authorized datasets, views, and authorized routines and materialized views.

Authorized Views

A BigQuery view is essentially a stored query or a virtual table defined by a SQL statement. An authorized view allows you to give users access to the results of a query without giving them access to the underlying tables from which the query pulls. This is incredibly powerful for data security.  

Imagine you have a table with sensitive customer data, but you want analysts to be able to see aggregated booking data. You can create a view and then authorize that view to access the customer data. Analysts can then query the view and get the aggregated data, but they never see the raw customer data.  This granular control is key to protecting sensitive information.

Steps to Creating Authorized View

In the BigQuery Studio, run the query below and click on "Go to View."

SQL
 
CREATE OR REPLACE VIEW `dzone-tutorial.dzone_dataset.bookings_view` AS

SELECT

    booking_id,

    first_name,

    last_name,

    trip_from,

    trip_to,

    cost

FROM

    `dzone-tutorial.dzone_dataset.bookings`;

Click Go to View


Click on "Query."

Click on Query

Run the query below and go to the results to showcase the data in the view created.

Go to Results

Click on the "dzone_dataset," click on "Sharing," and then click on "Authorize Views."

Click on Authorized Views

Type "bookings_view," select the view created, and click on "Add Authorization." You have granted "bookings_view" permission to access the data in the dataset.

Add Authorization

Authorized Datasets

If you have many views that all need access to the same underlying dataset, managing individual authorized views can become tedious. This is where authorized datasets come in. Instead of authorizing each view one by one, you can authorize the entire dataset that contains those views. This means that any view within that authorized dataset automatically inherits the permission to access the shared data. 

This dramatically simplifies access management, especially when dealing with a large number of views.  It's like creating a "data access hub" for your views.

Steps to Creating Authorized Datasets

In the BigQuery Studio, select a dataset you want to create and authorize, and click on "Authorize Datasets."

Authorize Datasets

Start typing the Dataset name; in this case, it is "dzone_dataset," and click on "Add Authorization."

Add Authorization

To remove the authorization, select the dataset you just authorized and click on "Remove Authorization."

Remove Authorization

Authorized Routines

BigQuery routines allow you to encapsulate complex logic for stored procedures and functions. Authorized routines add a layer of security by letting you share the results of a routine without disclosing the underlying query or data.

Think of it like a black box: the user can call the routine and get the output, but they don't see what's happening inside. This is particularly useful for scenarios like data aggregation, where you want to provide summary statistics without revealing the individual data points. You could have a routine that calculates the average booking value, and users could call this routine without ever seeing the individual booking details.

Steps to Creating Authorized Routines

In the BigQuery Studio, run the query.

SQL
 
CREATE OR REPLACE PROCEDURE `dzone-tutorial.dzone_dataset.get_total_bookings_cost`(

    IN start_date DATE,

    IN end_date DATE

)

BEGIN

    SELECT SUM(cost)

    FROM `dzone-tutorial.dzone_dataset.bookings`  

    WHERE date_of_birth BETWEEN start_date AND end_date; 

END;

Run this query

Click on "dzone_dataset" and click on "Sharing." Then, click on "Authorized Routines."

Authorize Routines

Select the dataset "dzone_dataset," select the "Routine get_total_bookings_cost," and click on "Add Authorization."

Add Authorization

Authorized Materialized Views

Materialized views are pre-computed views.  They store the results of a query, which significantly improves query performance.  Authorized materialized views combine the performance benefits of materialized views with the security features of authorized views.  They allow you to share pre-calculated data without granting access to the source data.  This is ideal for dashboards or reports that need to be generated quickly and securely.

Steps to Creating Authorized Materialized View

In the BigQuery Studio, run the query.

SQL
 
CREATE OR REPLACE MATERIALIZED VIEW `dzone-tutorial.dzone_dataset.bookings_summary` AS

SELECT

    trip_from,

    trip_to,

    COUNT(*) AS total_bookings,

    AVG(cost) AS average_cost

FROM

    `dzone-tutorial.dzone_dataset.bookings`  

WHERE date_of_birth BETWEEN '1985-01-01' AND '2023-12-31'

GROUP BY

    trip_from,

    trip_to;

Run this query

Summary

Managing access to your BigQuery data is a critical responsibility. Google BigQuery's IAM, combined with authorized datasets, views, routines, and materialized views, provides the tools you need to build a robust security framework for your data. 

We covered the basics of IAM roles and permissions, emphasizing the principle of least privilege. We then explored how authorized views allow you to share query results without exposing underlying data, and how authorized datasets streamline access management for multiple views. Furthermore, we examined how authorized routines enable secure sharing of aggregated or calculated data, and how authorized materialized views provide both performance and security benefits.  

By understanding and implementing these techniques, you can ensure that your BigQuery data is protected while remaining accessible to those who need it. Remember, data security is an ongoing process. Regularly review and update your access control policies to stay ahead of potential threats and maintain the integrity of your data.

Data access Data security Materialized view Google (verb) Database

Opinions expressed by DZone contributors are their own.

Related

  • Data Security Considerations in Cloud Data Warehouses
  • The Data Access Layer in Jmix: JPA on Steroids
  • Enhancing Avro With Semantic Metadata Using Logical Types
  • Constructing Secure and Strong Data Access Layers With JPA

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!