Installing and Running Presto

Learn how to configure and run Presto, an open-source distributed SQL query engine that helps with running interactive analytic queries.

by Pallavi Singh · May. 16, 17 · Tutorial

In my previous blog, I gave an introduction to Presto. In today's blog, I'll be talking about installing and running Presto.

The basic prerequisites for setting up Presto are:

  • Linux or Mac OS X.
  • Java 8, 64-bit.
  • Python 2.4+.

Installation

  1. Download the Presto Tarball from here.
  2. Unpack the Tarball.
    1. After unpacking, you will see a directory presto-server-0.175, which we will call the installation directory.

Configuring

Inside the installation directory, create a directory called etc. This directory will hold the following configurations:

  1. Node properties: Environmental configuration specific to each node.
  2. JVM config: Command line options for the Java Virtual Machine.
  3. Config properties: Configuration for the Presto server.
  4. Log properties: Configuration for the log levels.
  5. Catalog properties: Configuration for connectors (data sources).

Now, we will set up the above properties one by one.

1. Setting Up Node Properties

Create a file called node.properties inside the etc folder. This file will contain the configuration specific to each node. Given below is a description of the properties we need to set in this file.

  • node.environment: The name of the Presto environment. All the nodes in the cluster must have an identical environment name.
  • node.id: The unique identifier for this node. It must be different on every node and should stay the same across restarts and upgrades.
  • node.data-dir: The path of the data directory.

Note: Presto will store the logs and other data at the location specified in node.data-dir. It is recommended to create the data directory outside the installation directory, so that it is easily preserved when you upgrade Presto.

You can put the following default content:

node.environment=production
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/var/presto/data
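
If you are setting up several nodes, you may prefer to generate this file instead of editing it by hand. Below is a minimal Python sketch of that idea. The function name render_node_properties and the choice of a random UUID for node.id are my own assumptions for illustration; Presto only needs the three properties described above.

```python
import uuid


def render_node_properties(environment, data_dir, node_id=None):
    """Return the text of etc/node.properties for one node."""
    # Assumption: a random UUID is used when no id is supplied. Generate it
    # once per node and keep the file, so the id stays stable across restarts.
    if node_id is None:
        node_id = str(uuid.uuid4())
    lines = [
        f"node.environment={environment}",
        f"node.id={node_id}",
        f"node.data-dir={data_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints the three properties with a freshly generated node.id.
    print(render_node_properties("production", "/var/presto/data"), end="")
```

The returned text can be written to etc/node.properties once per node; every node should use the same environment name but its own node.id.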

2. Setting Up JVM Config

Create a file named jvm.config inside the etc folder. This file lists the command-line options used to launch the JVM.

You can put the following default content:

-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError

Note: Please keep in mind that the file must contain exactly one option per line.
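
Since the one-option-per-line format is easy to get wrong, a small check can help before starting the server. The Python sketch below is only a heuristic and not part of Presto; the function name check_jvm_config is hypothetical, and the second rule can give a false positive for an option whose value legitimately contains a space followed by a dash.

```python
def check_jvm_config(text):
    """Return a list of likely problems in the text of etc/jvm.config."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        option = line.strip()
        if not option:
            continue  # this sketch simply skips blank lines
        tokens = option.split()
        if not tokens[0].startswith("-"):
            problems.append(f"line {number}: does not start with '-': {option}")
        # Heuristic: a later token starting with '-' usually means that two
        # options were placed on the same line.
        if any(token.startswith("-") for token in tokens[1:]):
            problems.append(f"line {number}: looks like several options: {option}")
    return problems


if __name__ == "__main__":
    sample = "-server\n-Xmx16G\n-XX:+UseG1GC\n"
    print(check_jvm_config(sample))
```

An empty list means that the sketch found nothing suspicious; it does not verify that the JVM actually accepts each option.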

3. Setting Up Config Properties

Create a file named config.properties in the etc folder. This file contains the configuration for the Presto server. A Presto server can act as both a coordinator and a worker at the same time. Before setting up the config file, let's discuss the properties in brief:

  • coordinator: If set to true, the node acts as a coordinator: it accepts queries from clients and manages query execution. On nodes that are only workers, set this to false.
  • node-scheduler.include-coordinator: Enables scheduling of work on the coordinator. Can be set to true/false.
  • http-server.http.port: Specifies the port on which the Presto HTTP server listens.
  • query.max-memory: Specifies the maximum amount of distributed memory that a query may use across the cluster.
  • query.max-memory-per-node: Specifies the maximum amount of memory that a query may use on any single node.
  • discovery-server.enabled: Can be set to true/false. Presto uses the discovery service to find all the nodes in the cluster. If true, the coordinator runs an embedded version of the discovery service.
  • discovery.uri: The URI of the discovery server. When the coordinator runs the embedded discovery service, this is the URI of the coordinator.
  • query.queue-config-file: Specifies the file from which the queue configuration is read.

Now, let's set the properties in config.properties.

If the node is a coordinator, you can use the following as default content:

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080

If the node is a worker, you can use the following as default content:

coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=http://example.net:8080

For a single node doubling up as worker and coordinator, you can use the following as default content:

coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080
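
The three variants above differ only in a few properties, so they can be produced from one template. Here is a rough Python sketch of that; the function name render_config_properties and the role labels "coordinator", "worker", and "single" are hypothetical names I chose for illustration, and the default values simply mirror the examples above.

```python
def render_config_properties(role, discovery_uri, port=8080,
                             max_memory="50GB", max_memory_per_node="1GB"):
    """Return the text of etc/config.properties for the given role.

    role is "coordinator", "worker", or "single" (a hypothetical label for
    one node that acts as both coordinator and worker).
    """
    if role not in ("coordinator", "worker", "single"):
        raise ValueError(f"unknown role: {role}")
    is_coordinator = role != "worker"
    props = [("coordinator", str(is_coordinator).lower())]
    if is_coordinator:
        # Work is scheduled on the coordinator only in the single-node case.
        include = str(role == "single").lower()
        props.append(("node-scheduler.include-coordinator", include))
    props += [
        ("http-server.http.port", str(port)),
        ("query.max-memory", max_memory),
        ("query.max-memory-per-node", max_memory_per_node),
    ]
    if is_coordinator:
        # Mirrors the coordinator examples, which enable the discovery server.
        props.append(("discovery-server.enabled", "true"))
    props.append(("discovery.uri", discovery_uri))
    return "".join(f"{key}={value}\n" for key, value in props)


if __name__ == "__main__":
    print(render_config_properties("worker", "http://example.net:8080"), end="")
```

For the single-node case, passing max_memory="5GB" gives the same content as the last example above.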

4. Setting Up Log Level

Create a file called log.properties in the etc folder. It will be used to set the minimum log level. The only property you need to set in this file is com.facebook.presto=INFO.

This property can have the following values: DEBUG, INFO, WARN, and ERROR.
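
If the log level comes from a script or a deployment variable, it is worth rejecting a mistyped value early. A minimal Python sketch follows; the function name render_log_properties is hypothetical, and the accepted values are just the four levels listed above.

```python
VALID_LEVELS = ("DEBUG", "INFO", "WARN", "ERROR")


def render_log_properties(level="INFO", logger="com.facebook.presto"):
    """Return the text of etc/log.properties with a single logger entry."""
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"log level must be one of {VALID_LEVELS}, got {level}")
    return f"{logger}={level}\n"


if __name__ == "__main__":
    print(render_log_properties("INFO"), end="")
```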

5. Setting Up the Catalog

Presto accesses the data via connectors that are specified by means of catalogs. Catalogs are registered by creating a catalog property file for each connector. Create a directory called catalog in etc. Inside the etc/catalog directory, create a catalog. For instance, create a catalog for JMX.

Create jmx.properties in etc/catalog/ and set the name of the connector in it with connector.name=jmx.
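
The same step can be scripted. The Python sketch below creates the etc/catalog directory if needed and writes one catalog file; the function name write_catalog is hypothetical, and the demo runs against a temporary directory rather than a real installation. Only the connector.name property is written, so connectors that need more settings would require extra lines.

```python
import tempfile
from pathlib import Path


def write_catalog(install_dir, catalog_name, connector_name):
    """Create etc/catalog/<catalog_name>.properties and return its path."""
    catalog_dir = Path(install_dir) / "etc" / "catalog"
    catalog_dir.mkdir(parents=True, exist_ok=True)
    path = catalog_dir / f"{catalog_name}.properties"
    path.write_text(f"connector.name={connector_name}\n")
    return path


if __name__ == "__main__":
    # Demonstrate against a throwaway directory instead of a real installation.
    with tempfile.TemporaryDirectory() as install_dir:
        created = write_catalog(install_dir, "jmx", "jmx")
        print(created.name, created.read_text().strip())
```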

Once you have completed these steps, you can start running Presto.

Running Presto

Inside the Presto installation directory, there is a launcher script at bin/launcher. Presto can be run either as a daemon or as a foreground process. The main difference between the two is that in foreground mode, the server's logs and other output are written to stdout/stderr.

To run as a daemon, use bin/launcher start. To run in the foreground, use bin/launcher run.

Once you run either of the above commands, you will be able to see the Presto server running at localhost:8080 (the port used in the examples above) or at localhost:<port> if you configured a different port.
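
If you start Presto from a script, you may want to wait until the server answers before doing anything else. Here is a minimal Python sketch under a few assumptions: the helper names launcher_command and wait_for_server are hypothetical, and polling the /v1/info endpoint of the server is my choice for the check, so treat it as one possible approach rather than the official one.

```python
import time
import urllib.error
import urllib.request


def launcher_command(install_dir, foreground=False):
    """Build the launcher command: 'run' stays in the foreground, 'start' daemonizes."""
    action = "run" if foreground else "start"
    return [f"{install_dir}/bin/launcher", action]


def wait_for_server(url, timeout=60.0, interval=2.0):
    """Poll url until it answers with HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # the server is not accepting connections yet
        time.sleep(interval)
    return False
```

For example, you could pass the list returned by launcher_command("/path/to/presto-server-0.175") to subprocess.run and then call wait_for_server("http://localhost:8080/v1/info"). The installation path here is a placeholder for wherever you unpacked the tarball.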

[Screenshot from 2017-05-15 16-38-22]

That's all you need to do to start running Presto! In my next blog, I will discuss how to use the Presto CLI and set up the Presto server programmatically for applications.

Published at DZone with permission of Pallavi Singh, DZone MVB. See the original article here.
