Automate Azure Databricks Unity Catalog Permissions at the Table Level
This article provides steps and scripts to apply permissions at the Unity Catalog table level, reducing manual effort related to permissions at table level.
Join the DZone community and get the full member experience.
Join For FreeDisclaimer: All the views and opinions expressed in the blog belong solely to the author and not necessarily to the author's employer or any other group or individual. This article is not a promotion for any cloud/data management platform. All the images and code snippets are publicly available on the Azure/Databricks website.
In my other DZone articles, I have discussed what Databricks Unity Catalog is, how the privilege model works in Unity Catalog, schema level, and script to automate permission management at both the catalog and schema levels.
In this article, I aim to provide the script that will automate the permission management at the Unity Catalog table level.
Privileges at the Unity Catalog Table Level
In Unity Catalog (Databricks), applying permissions at the table level is necessary when you want to control access to specific tables or views, rather than applying permissions to all objects within a schema. Table-level permissions are useful in the following scenarios:
1. Granular Access Control
When you need to grant or restrict access to specific tables or views without affecting the entire schema.
This is especially important when a schema contains sensitive data, and you want to allow access to some tables while limiting access to others.
2. Protecting Sensitive Data
If certain tables within a schema contain confidential or restricted data (e.g., personal identifiable information, financial data), you can apply table-level permissions to ensure that only authorized users can view or query these tables.
For example, you might allow access to some summary or aggregated data tables but restrict access to raw, detailed tables containing sensitive information.
3. Delegating Access to Specific Teams/Users If the Schema Contains Tables Intended for Multiple Teams
When different users or teams need access to different tables within the same schema. For example, the sales team might need access to customer-related tables, while the finance team needs access to revenue tables.
Applying permissions at the table level ensures that each team can access only the tables relevant to their work.
4. Compliance With Data Governance
When enforcing strict data governance policies, you might need to control access at a more granular level (down to individual tables). Table-level permissions help ensure compliance by granting access only to the data that a user or role is authorized to work with.
5. Handling Mixed Access Needs Within a Schema
In cases where a schema contains tables with varying levels of sensitivity or confidentiality, applying permissions at the schema level may be too broad. Table-level permissions allow you to manage access for each table individually based on specific needs.
Permissions That Can Be Applied at the Table Level
- SELECT: Grants read access to the table, allowing users to query it.
- MODIFY: Gives the ability to add, delete, and modify data to or from an object.
- APPLY TAG: Gives the ability to apply tags to an object.
- ALL PRIVILEGES: Gives all privileges.
Automation Script
Prerequisites
- Unity Catalog is already set up.
- Principal(s) is/are associated with the Databricks workspace.
- User running the permission script has proper permissions on the table(s), schema, and catalog.
Step 1: Create a Notebook and Declare the Variables
Create a notebook in Databricks workspace. To create a notebook in your workspace, click the "+" New in the sidebar, and then choose Notebook.
A blank notebook opens in the workspace. Make sure Python is selected as the notebook language.
Copy and paste the code snippet below into the notebook cell and run the cell.
catalog = 'main' # Specify your catalog name
schema = 'default' # Specify your schema name
tables_arr= 'test1,test2' # Specify the Comma(,) seperated values of table name
tables = tables_arr.split(',')
principals_arr = '' # Specify the Comma(,) seperated values for principals in the blank text section (e.g. groups, username)
principals = principals_arr.split(',')
privileges_arr = 'SELECT,APPLY TAG' # Specify the Comma(,) seperated values for priviledges in the blank text section (e.g. SELECT,APPLY TAG)
privileges = privileges_arr.split(',')
Step 2: Set the Catalog and the Schema
Copy, paste, and run the below code block in a new or in the existing cell and run the cell.
query = f"USE CATALOG `{catalog}`" #Sets the Catalog
spark.sql(query)
query = f"USE SCHEMA `{schema}`" #Sets the Schema
spark.sql(query)
Step 3: Loop Through the Principals and Privileges and Apply Grant at the Catalog, Schema, and Tables
Copy, paste, and run the code block below in a new or existing cell, then run the cell to apply the permissions.
for principal in principals:
query = f"GRANT USE_CATALOG ON CATALOG `{catalog}` TO `{principal}`" # Use catalog permission at Catalog level
spark.sql(query)
query = f"GRANT USE_SCHEMA ON SCHEMA `{schema}` TO `{principal}`" # Use schema permission at Schema level
spark.sql(query)
for table in tables:
for privilege in privileges:
query = f"GRANT `{privilege}` ON `{table}` TO `{principal}`" # Grant priviledges on the tables to the pricipal
print(query)
spark.sql(query)
Validation
You can validate the privileges by opening Databricks UI and navigating to "Catalog" in the Data Explorer. Once the catalog shows up in the Data section, click on the catalog, then expand the schema and select the table inside the schema where you have applied the permissions, and go to the "permissions" tab. You can now see all the privileges applied to the table. Below is the screen shot of permissions applied in the out-of-box catalog table inside the main catalog and information_schema
schema.
You can also run the below SQL script in a notebook to display all the permissions for a table as part of your validation.
SHOW GRANTS ON TABLE table_name;
Conclusion
Automating privilege management in Databricks Unity Catalog at the table level helps ensure consistent and efficient approach of applying permissions at the lowest level in the Unity Catalog. The code provided demonstrates a practical way to assign multiple table-level privileges for multiple principals and tables in a single catalog and schema. In a catalog where 100's of table are present and different permissions needs to be provided for different principals, the above automation significantly reduces manual error and effort.
Published at DZone with permission of Soumya Barman. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments