How to Disable the Download Button in SageMaker Studio

If you want to ensure that your data scientists' cloud environment is secure from data leaks, remove this feature from SageMaker Studio.

By Roger Oriol · Sep. 27, 22 · Tutorial

Many enterprises choose a cloud environment to power the work of their data science team. If you chose Amazon SageMaker Studio, this article might interest you. Having both the data lake and the data science environment in the same cloud makes it easy to integrate them: you can choose what data any given data scientist is able to see. You might also want data scientists to use this data only inside the SageMaker Studio environment. However, SageMaker Studio has a download button that lets data scientists download any data they have been working on. Once the data is on their computers, they are free to share it anywhere and with anyone.

Luckily, it is possible to disable this download button. Until recently, this could only be done in SageMaker Notebooks. This article from Ujjwal Bhardwaj shows how to disable it in SageMaker Notebooks.

AWS has since updated SageMaker Studio so that the download button can be disabled there too. The update lets us configure Studio to use JupyterLab version 3. In this version, JupyterLab refactored some features, including the download button. Those features are now plugins that JupyterLab ships by default, instead of being hardcoded in the JupyterLab core. This means that the plugins can be disabled, and once they are, they no longer show up in the UI.

The plugins that include a download button in the JupyterLab UI are the following:

  • @jupyterlab/docmanager-extension:download
  • @jupyterlab/filebrowser-extension:download

There are a couple of ways to disable those plugins. The most straightforward is to run these commands in a SageMaker Studio terminal:

 
conda activate studio
jupyter labextension disable @jupyterlab/docmanager-extension:download
jupyter labextension disable @jupyterlab/filebrowser-extension:download
restart-jupyter-server


You can also use the JupyterLab configuration files. Edit the file /opt/conda/envs/studio/etc/jupyter/labconfig/page_config.json with the following content:

 
{
  "disabledExtensions": {
    "@jupyterlab/docmanager-extension:download": true,
    "@jupyterlab/filebrowser-extension:download": true
  }
}


and run the command:

 
restart-jupyter-server


You might also have to refresh the page for the changes to take effect.
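
To confirm that the configuration file contains both entries, a quick text check can help. The following is a minimal sketch, not part of JupyterLab or SageMaker: `check_download_disabled` is a hypothetical helper name, and the demo runs against a temporary copy of the file rather than the real path. It only greps for each plugin ID followed by `true`; it does not parse the JSON.

```shell
#!/bin/bash
# Sketch: check that a page_config.json disables both download plugins.
# check_download_disabled is a hypothetical helper, not a JupyterLab command.
check_download_disabled() {
  local config_file="$1"
  local plugin
  for plugin in \
    "@jupyterlab/docmanager-extension:download" \
    "@jupyterlab/filebrowser-extension:download"; do
    # Look for the quoted plugin ID followed by ": true" on the same line
    if ! grep -Eq "\"${plugin}\"[[:space:]]*:[[:space:]]*true" "$config_file"; then
      echo "not disabled: ${plugin}"
      return 1
    fi
  done
  echo "both download plugins are disabled"
}

# Demo against a throwaway copy of the configuration shown above
demo_config="$(mktemp)"
cat > "$demo_config" <<'EOF'
{
  "disabledExtensions": {
    "@jupyterlab/docmanager-extension:download": true,
    "@jupyterlab/filebrowser-extension:download": true
  }
}
EOF
check_download_disabled "$demo_config"
```

Inside Studio, the same function could be pointed at /opt/conda/envs/studio/etc/jupyter/labconfig/page_config.json instead of the temporary file.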

The problem with these approaches is that the changes only last for the duration of the session. To make them permanent, you have to create a Studio Lifecycle Configuration. The Lifecycle Configuration executes a script every time the JupyterServer starts. In this script, you will write the page_config.json file from the previous example.

The content of the script will be:

 
#!/bin/bash
# Write the JupyterLab page config that disables both download plugins
cat > /opt/conda/envs/studio/etc/jupyter/labconfig/page_config.json <<'EOF'
{
  "disabledExtensions": {
    "@jupyterlab/docmanager-extension:download": true,
    "@jupyterlab/filebrowser-extension:download": true
  }
}
EOF
# Restart the server so that JupyterLab picks up the new configuration
restart-jupyter-server


There are several ways to create a Lifecycle Configuration. You can do it via the Console, using a CloudFormation stack, or via the AWS CLI. Using the CLI you could do:

 
aws sagemaker create-studio-lifecycle-config \
--region <your-region> \
--studio-lifecycle-config-name my-studio-lcc \
--studio-lifecycle-config-content "$LCC_CONTENT" \
--studio-lifecycle-config-app-type JupyterServer
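
The value passed to `--studio-lifecycle-config-content` has to be the base64-encoded script, so `$LCC_CONTENT` needs to be prepared before running the command above. Here is a minimal sketch of that step. The file name `disable-download.sh` and the temporary directory are assumptions for illustration, and the script body is a shortened stand-in for the lifecycle script from earlier in the article.

```shell
#!/bin/bash
# Sketch: prepare LCC_CONTENT for create-studio-lifecycle-config.
# disable-download.sh is a hypothetical file name for the lifecycle script.
work_dir="$(mktemp -d)"
script_file="$work_dir/disable-download.sh"

# Stand-in script body so that this sketch runs on its own
cat > "$script_file" <<'EOF'
#!/bin/bash
restart-jupyter-server
EOF

# Base64-encode the script and strip line wraps so the value is a single token
LCC_CONTENT="$(base64 < "$script_file" | tr -d '\n')"
echo "$LCC_CONTENT"
```

Reading the file through stdin and removing newlines with `tr` avoids relying on wrap-related flags, which differ between `base64` implementations.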


$LCC_CONTENT is a string with the base64-encoded content of the script described before. Then, when you create a user profile in the SageMaker Domain, you can bind the Lifecycle Configuration to it:

 
aws sagemaker create-user-profile --domain-id <DOMAIN-ID> \
--user-profile-name <USER-PROFILE-NAME> \
--region <REGION> \
--user-settings '{
  "JupyterServerAppSettings": {
    "LifecycleConfigArns": ["<LIFECYCLE-CONFIGURATION-ARN-LIST>"]
  }
}'
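
When this step is scripted, it can be easier to build the `--user-settings` JSON from a variable than to edit it by hand. The sketch below assumes a single Lifecycle Configuration ARN; `build_user_settings` is a hypothetical helper, and the ARN in the demo is a made-up placeholder, not a real resource.

```shell
#!/bin/bash
# Sketch: build the --user-settings JSON from one lifecycle configuration ARN.
# build_user_settings is a hypothetical helper; the ARN below is a placeholder.
build_user_settings() {
  local lcc_arn="$1"
  printf '{"JupyterServerAppSettings": {"LifecycleConfigArns": ["%s"]}}' "$lcc_arn"
}

USER_SETTINGS="$(build_user_settings "arn:aws:sagemaker:us-east-1:111122223333:studio-lifecycle-config/my-studio-lcc")"
echo "$USER_SETTINGS"
```

The result can then be passed as `--user-settings "$USER_SETTINGS"`. The real ARN is returned in the `StudioLifecycleConfigArn` field of the `create-studio-lifecycle-config` response.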


From now on, every time a data scientist opens their instance of SageMaker Studio, it should no longer display a download button. This effectively blocks them from downloading files located in their Studio through the UI, as long as they are not able to revert these changes themselves from their terminal. Also, note that disabling the download plugins only removes the download buttons from the interface. It does not block any other means of downloading files that may exist.


Published at DZone with permission of Roger Oriol. See the original article here.

Opinions expressed by DZone contributors are their own.
