DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Solving Four Kubernetes Networking Challenges
  • Why Use LocalPV with NVMe for Your Workload?
  • Dynatrace Perform: Day Two
  • Kubernetes Services Explained

Trending

  • Secure by Design: Modernizing Authentication With Centralized Access and Adaptive Signals
  • Endpoint Security Controls: Designing a Secure Endpoint Architecture, Part 1
  • Kullback–Leibler Divergence: Theory, Applications, and Implications
  • A Complete Guide to Modern AI Developer Tools
  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. Deep Dive Into Spark Cluster Management

Deep Dive Into Spark Cluster Management

Learn about the different cluster management modes that you can run in your Spark application - standalone, Mesos, Yarn, and Kubernetes - and how to manage them.

By 
Sangeeta Gulia user avatar
Sangeeta Gulia
·
Oct. 05, 17 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
11.1K Views

Join the DZone community and get the full member experience.

Join For Free

This blog aims to dig into the different cluster management modes in which you can run your Spark application.

Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program, which is called the Driver Program. Specifically, to run on a cluster, SparkContext can connect to several types of Cluster Managers, which allocate resources across applications. Once the connection is established, Spark acquires executors on the nodes in the cluster to run its processes, does some computations, and stores data for your application. Next, it sends your application code (defined by JAR or Python files passed to SparkContext) to the executors. Finally, SparkContext sends tasks to the executors to run.

spark1.jpeg

Spark offers these types of cluster managers:

  1. Standalone

  2. Mesos

  3. Yarn

  4. Kubernetes (experimental)

In addition to the above, there is experimental support for Kubernetes. Kubernetes is an open-source platform for providing container-centric infrastructure.

Standalone Mode

It is the easiest of all in terms of setup and provides almost all the same features as the other cluster managers if you are only running Spark. If you would like to run Spark along with other applications or would like to use richer resource scheduling capabilities (i.e. queues, etc.), both YARN and Mesos provide these features. Of these two, YARN is most likely to be preinstalled in many of the Hadoop distributions.

The Spark standalone mode requires each application to run an executor on every node in the cluster, whereas with YARN, you can configure the number of executors for the Spark application.

Mesos Mode

Mesos consists of a master daemon that manages the agent daemons that are running on each cluster node. Mesos framework is responsible for running the tasks on these agents. The master enables fine-grained sharing of resources (CPU, RAM, …) across frameworks by giving them resource offers. The master decides how many resources to offer to each framework according to a given organizational policy, such as fair sharing or strict priority.

spark2.jpeg

A framework running on top of Mesos consists of two components:

  1. A scheduler that registers with the master to be offered resources

  2. An executor process that is launched on agent nodes to run the framework’s tasks

Agent nodes report to master about free resources available to them. The master determines how many resources are offered to each framework and the frameworks’ schedulers select which of the offered resources to be utilized. When a framework accepts the offered resources, it passes a description of the tasks it wants to run on them to Mesos. In turn, Mesos launches the tasks on the corresponding agents.

spark3.jpeg.png

Yarn Mode

Spark with Yarn can be deployed in two modes.

Cluster Deployment Mode

In this mode, SparkDriver runs in the Application Master on Cluster host.

A single process in a YARN container is responsible for both driving the application and requesting resources from YARN.

The client that launches the application does not need to run for the lifetime of the application.

Cluster mode is not well-suited for using Spark interactively. Spark applications that require user input, such as spark-shell and pyspark, require the Spark driver to run inside the client process that initiates the Spark application.

spark4.jpeg

Client Deployment Mode

In client mode, the Spark driver runs on the host where the job is submitted.

The ApplicationMaster is responsible only for requesting executor containers from YARN. After the containers start, the client communicates with the containers to schedule work.

It supports spark-shell, as the driver runs at the client side.

spark5.jpeg

After going through the details about the three Cluster Managers, now we should understand the use case scenario where one should use which cluster management mode.

To understand that, let's look at their design priorities and how they approach scheduling work.

Standalone is good for small Spark clusters, but it is not good for bigger clusters (there is an overhead of running Spark daemons — master + slave — in cluster nodes). These daemons require dedicated resources. So standalone is not recommended for bigger production clusters.

YARN was created out of the necessity to scale Hadoop. Prior to YARN, resource management was embedded in Hadoop MapReduce V1 and it had to be removed in order to help MapReduce scale. The MapReduce 1 JobTracker wouldn’t practically scale beyond a couple thousand machines. The creation of YARN was essential to the next iteration of Hadoop’s lifecycle, primarily around scaling, whereas Mesos was built to be a scalable global resource manager for the entire data center.

Mesos determines which resources are available, and it makes offers back to an application scheduler (the application scheduler and its executor is called a “framework”). Those offers can be accepted or rejected by the framework. This model is considered a non-monolithic model because it is a “two-level” scheduler where scheduling algorithms are pluggable. Whereas when a job request comes into the YARN resource manager, YARN evaluates all the resources available and it places the job. It’s the one making the decision of where jobs should go; thus, it is modeled in a monolithic way.

When you evaluate how to manage your data center as a whole, you’ve got Mesos on one side that can manage all the resources in your data center, and on the other, you have YARN, which can safely manage Hadoop jobs but is not capable of managing your entire data center.

The language used to develop Apache Mesos is C++ because it is good for time-sensitive work, whereas Yarn is written in Java.

Kubernetes cluster AI application

Published at DZone with permission of Sangeeta Gulia, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Solving Four Kubernetes Networking Challenges
  • Why Use LocalPV with NVMe for Your Workload?
  • Dynatrace Perform: Day Two
  • Kubernetes Services Explained

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!