Running Apache Superset in a Docker

As a response to questions from a previous article, this author explores how to set up Apache Superset as a Docker image.

By Abhishek Sharma · Updated Jan. 29, 19 · Tutorial

A couple of days ago, I wrote a post about running Apache Superset in a production environment that serves hundreds or thousands of users. Superset community members and users appreciated the post, for which I am thankful. However, on the Superset Slack and Gitter channels, many users asked how to set up Superset as a Docker container and how to run it. In this post, I explore the Superset Docker image in more detail. I hope that after reading it you will have a conceptual understanding of how to set up Superset as a Docker container and of the benefits of doing so.

Container? Image?

First, let’s quickly look at what the terms "container" and "image" mean and how they relate to Docker.

As per Wikipedia, a container is any structure that holds a product for storage, packaging, and shipping. The same applies to containers in the software world.

A container is a standard unit of software that packages up the code and all its dependencies, so the application runs quickly and reliably from one computing environment to another.


Now, let’s look at what the term "image" means.

A container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, run-time, system tools, system libraries, and settings.


Finally, the relationship with Docker.

Container images become containers at runtime and in the case of Docker containers — images become containers when they run on Docker Engine. Available for both Linux and Windows-based applications, containerized software will always run the same, regardless of the infrastructure.

There are many other container runtimes, but Docker is the most popular of them.

Back to Superset Docker Image

There are multiple active repositories and images of Superset available on GitHub and DockerHub. Below is a list of some of them.

  • Apache Superset Docker repo
  • Popular repo
  • Another repo
  • And recently published by me

Why so many repositories? Are they different? Aren’t they supposed to be the same and provide the same functionality, i.e., packaging Superset and its dependencies? Yes, they should be identical, but there are multiple ways and modes to start Superset. Ideally, a single image would be generic enough to handle all of these methods and commands, but that is not the case, and that’s why there are multiple repositories.

I started working on Superset with the goal of running it in a completely distributed manner so that hundreds or thousands of users can access it concurrently. In the beginning, I explored the Apache Superset code but realized that several changes are required to run Superset across multiple containers in a distributed architecture, and that’s why I decided to create a separate repository.

Features of the Docker image of Superset

  • There are multiple ways to start the container: either with the docker-compose command or with the docker run command.
  • All Superset components, i.e., the web application, the Celery worker, and the Celery Flower UI, can run in the same container or in different containers.
  • All database plugins and packages are installed by default.
  • On its first run, the container sets up the required Superset metadata database along with sample data and creates the Fab-manager user account with the credentials username: admin and password: admin.
  • Apart from the Superset config file packaged in the container image, a custom config file such as superset_config.py can be mounted into the container. There's no need to rebuild the image to change the configuration.
  • The default configuration uses MySQL as the Superset metadata database and Redis as the cache and Celery broker, both of which can be easily replaced.
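
To make the last two points concrete, here is a minimal sketch of what a mounted superset_config.py override might look like. It is not the file shipped with the image: the environment variable names (SUPERSET_METADATA_DB_URL, SUPERSET_REDIS_URL) and the default URLs with host names mysql and redis are assumptions for illustration. The setting names themselves (SQLALCHEMY_DATABASE_URI, CACHE_CONFIG, CELERY_CONFIG) are standard Superset configuration keys.

```python
# superset_config.py: hypothetical override file mounted into the container.
import os

# Assumed environment variable names and default URLs; adjust to your setup.
METADATA_DB_URL = os.environ.get(
    "SUPERSET_METADATA_DB_URL",
    "mysql://superset:superset@mysql:3306/superset",
)
REDIS_URL = os.environ.get("SUPERSET_REDIS_URL", "redis://redis:6379/0")

# Superset metadata database; any SQLAlchemy URL can replace MySQL here.
SQLALCHEMY_DATABASE_URI = METADATA_DB_URL

# Cache settings that point Superset's cache at Redis.
CACHE_CONFIG = {
    "CACHE_TYPE": "redis",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_URL": REDIS_URL,
}


class CeleryConfig(object):
    """Celery settings: Redis acts as both broker and result backend."""

    BROKER_URL = REDIS_URL
    CELERY_IMPORTS = ("superset.sql_lab",)
    CELERY_RESULT_BACKEND = REDIS_URL


CELERY_CONFIG = CeleryConfig
```

Swapping the metadata database or the broker then comes down to changing the two URLs, without touching the image.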

Starting the container with the docker-compose command starts three containers: mysql5.7 as the metadata database, redis3.4 as the cache and Celery broker, and the Superset container.

  • It expects multiple environment variables, which are defined in the docker-compose.yml file. Default values for these variables are present in the .env file.
  • Default environment variables can be overridden either by editing the .env file or by passing them on the command line, for example the SUPERSET_ENV environment variable.
  • The permissible values of SUPERSET_ENV are local and prod.
  • In local mode, one Celery worker and the Flask-based Superset web application run.
  • In prod mode, two Celery workers and the Gunicorn-based Superset web application run.
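
The rules in this list can be summarized in a few lines of Python. This is only an illustration of the described behavior, not the image's actual entrypoint: the function names and the MODES table are hypothetical, and the fallback to local when SUPERSET_ENV is unset is an assumption based on the plain docker-compose up -d command starting local mode. The overlay step mirrors how Docker Compose lets values set in the shell take precedence over those in the .env file.

```python
import os

# Process layout per mode, as described in the list above.
MODES = {
    "local": {"celery_workers": 1, "web_server": "flask"},
    "prod": {"celery_workers": 2, "web_server": "gunicorn"},
}


def effective_env(dotenv_defaults, shell_env=None):
    """Overlay shell variables on the .env defaults; shell values win."""
    shell_env = os.environ if shell_env is None else shell_env
    merged = dict(dotenv_defaults)
    # Only variables that the .env file declares are considered here.
    merged.update({k: v for k, v in shell_env.items() if k in dotenv_defaults})
    return merged


def resolve_mode(env):
    """Return (mode, layout) for SUPERSET_ENV; assumes 'local' when unset."""
    mode = env.get("SUPERSET_ENV", "local")
    if mode not in MODES:
        raise ValueError(f"SUPERSET_ENV must be 'local' or 'prod', got {mode!r}")
    return mode, MODES[mode]


# Example: .env says local, but the shell overrides it with prod.
env = effective_env({"SUPERSET_ENV": "local"}, {"SUPERSET_ENV": "prod"})
mode, layout = resolve_mode(env)
print(mode, layout)
```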

Starting the container with the docker run command can be used for a completely distributed setup; it requires the metadata database URL and the Redis URL to start the container.

  • A single server container or multiple server containers (behind a load balancer) can be spawned. In a server container, the Gunicorn-based Superset web application runs.
  • Multiple Celery worker containers can run on the same or different machines. In a worker container, the Celery worker and the Flower UI run.

How to Run

  • First, copy the superset_config.py, docker-compose.yml, and .env files into your execution environment. Please follow the directory structure below.

docker-superset
     |__config
     |    |__superset_config.py
     |
     |__docker-files
     |    |__docker-compose.yml
     |    |__.env   
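
Before starting anything, it can help to confirm that the layout matches the tree above. The following is a small, optional sketch; the helper name missing_files is hypothetical and not part of the repository.

```python
from pathlib import Path

# Files named in the directory tree above, relative to docker-superset/.
EXPECTED_FILES = [
    "config/superset_config.py",
    "docker-files/docker-compose.yml",
    "docker-files/.env",
]


def missing_files(root):
    """Return the expected files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Prints an empty list when the layout is complete.
print(missing_files("docker-superset"))
```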


  • Running a container using the docker-compose command:

Starting a Superset image as a Superset container in local mode:

cd docker-superset/docker-files/ && docker-compose up -d

Starting a Superset image as a Superset container in prod mode:

cd docker-superset/docker-files/ && SUPERSET_ENV=prod SUPERSET_VERSION=<version-tag> docker-compose up -d


  • Running a container using the docker run command:

Starting a Superset image as a server container:

cd docker-superset && docker run -p 8088:8088 -v "$(pwd)/config":/home/superset/config/ abhioncbr/docker-superset:<version-tag> cluster server <superset_metadata_db_url> <redis_url>

Starting a Superset image as a worker container:

cd docker-superset && docker run -p 5555:5555 -v "$(pwd)/config":/home/superset/config/ abhioncbr/docker-superset:<version-tag> cluster worker <superset_metadata_db_url> <redis_url>
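
When several server and worker containers have to be started from a script, the two docker run commands above can be assembled programmatically. The sketch below is a hypothetical helper, not something the repository provides: the function name, the example URLs, and the use of the latest tag are illustrative assumptions. It resolves the config directory to an absolute path because Docker treats a non-absolute source in -v as a named volume rather than a bind mount of a local directory.

```python
import os
import shlex

IMAGE = "abhioncbr/docker-superset"
# Published port per role: 8088 for the web server, 5555 for the Flower UI.
PORTS = {"server": 8088, "worker": 5555}


def docker_run_args(role, version_tag, db_url, redis_url, config_dir="config"):
    """Build the `docker run` argument list for a server or worker container."""
    if role not in PORTS:
        raise ValueError(f"role must be 'server' or 'worker', got {role!r}")
    port = PORTS[role]
    host_config = os.path.abspath(config_dir)  # absolute path: bind mount
    return [
        "docker", "run",
        "-p", f"{port}:{port}",
        "-v", f"{host_config}:/home/superset/config/",
        f"{IMAGE}:{version_tag}",
        "cluster", role, db_url, redis_url,
    ]


# Example with placeholder URLs; prints the command instead of running it.
args = docker_run_args(
    "server", "latest",
    "mysql://superset:superset@db-host:3306/superset",  # hypothetical URL
    "redis://redis-host:6379/0",                        # hypothetical URL
)
print(shlex.join(args))
```

Passing the resulting list to subprocess.run would start the container; printing it first makes it easy to review the command.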


  • Note: There is no need to build the image if you are not changing it. You can pull an image from DockerHub using the command below, where the tag can be any Superset version or "latest".

docker pull abhioncbr/docker-superset:<version-tag>


Extending Superset Docker Image

  • No changes to the image are required to add new environment variables. For example, a BigQuery connection in Superset requires GOOGLE_APPLICATION_CREDENTIALS, which can be easily provided through the docker-compose.yml file or passed on the command line.
  • Also, changes made to the superset_config.py file are reflected in the container simply by mounting the config file into the container.
  • For any further changes or bugs, please contact me or contribute to the repository.
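
As a companion to the BigQuery example in the first point, the snippet below sketches a sanity check that could run inside the container (for instance from a mounted superset_config.py) to confirm that the variable arrived and points to a readable file. The helper name credentials_path is hypothetical; only the GOOGLE_APPLICATION_CREDENTIALS variable name comes from the text above.

```python
import os


def credentials_path(environ=None):
    """Return the credentials file path, or raise if it is unset or missing."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"credentials file not found: {path}")
    return path
```

Note that the credentials file itself also has to be available inside the container, for example through a mounted volume.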

Happy Superset Exploration!!!
