DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • Change Data Captures CDC from MySQL Database to Kafka with Kafka Connect and Debezium
  • Useful System Table Queries in Relational Databases
  • How to Repair Corrupt MySQL Database Tables Step-by-Step
  • Harnessing the Power of AWS Aurora for Scalable and Reliable Databases

Trending

  • Go 1.24+ Native FIPS Support for Easier Compliance
  • The Role of AI in Identity and Access Management for Organizations
  • Developers Beware: Slopsquatting and Vibe Coding Can Increase Risk of AI-Powered Attacks
  • Exploring Intercooler.js: Simplify AJAX With HTML Attributes
  1. DZone
  2. Data Engineering
  3. Databases
  4. Tutorial: How to Define SQL Functions With Presto Across All Connectors

Tutorial: How to Define SQL Functions With Presto Across All Connectors

Learn how to implement your own SQL function with Presto, the open-source SQL query engine.

By 
Rohan Pednekar user avatar
Rohan Pednekar
·
Updated Nov. 09, 21 · Tutorial
Likes (4)
Comment
Save
Tweet
Share
5.0K Views

Join the DZone community and get the full member experience.

Join For Free

Presto is the open-source SQL query engine for data lakes. It supports many native functions which are usually sufficient for most use cases. However, there is maybe a corner case where you need to implement your own function. To simplify this, Presto allows users to define expressions as SQL functions. These are dynamic functions separated from the Presto source code, managed by a functions namespace manager that you can set up with a MySQL database. In fact, this is one of the most widely used features of Presto at Facebook, with over 1000s of functions defined.

Function Namespace Manager

A function namespace is a special catalog.schema that stores functions in the format like mysql.test. Each catalog.schema can be a function namespace. A function namespace manager is a plugin that manages a set of these function catalog schemas. The catalog can be mapped to connectors in Presto (a connector for functions, no tables or view) and allows the Presto engine to perform actions such as creating, altering, and deleting functions.

This user-defined function management is separated from connector API for flexibility, hence these SQL functions can be used across all connectors. Further, the query is guaranteed to use the same version of the function throughout the execution and any modification to the functions is versioned. 

Implementation

Today, function namespace manager is implemented with the help of MySQL, so users need to have a running MySQL service to initialize the MySQL-based function namespace manager. 

Step 1: Provision MySQL Server and Generate JDBC URL for Further Access

Suppose the MySQL server can be reached at localhost:3306, for example database URL – 

jdbc:mysql://localhost:3306/presto?user=root&password=password

Step 2: Create Database and Tables in MySQL Database to Store Function Namespace Manager Related Data

 
 CREATE DATABASE presto;
 USE presto;


Step 3: Configure at Presto [2]

Create Function namespace manager configuration under etc/function-namespace/mysql.properties:

 
function-namespace-manager.name=mysql database-url=jdbc:mysql://localhost:3306/presto?user=root&password=password
function-namespaces-table-name=function_namespaces
functions-table-name=sql_functions


And restart the Presto Service.

Step 4: Create New Function Namescape

Now once the Presto server is started we will see below tables under presto database (which is being used to manage function namespace) in Mysql –

 
mysql> show tables;
+---------------------+
| Tables_in_presto    |
+---------------------+
| enum_types          |
| function_namespaces |
| sql_functions       |
+---------------------+
93 rows in set (0.00 sec)


To create a new function namespace ”ahana.default”, insert into the function_namespaces table:

 
INSERT INTO function_namespaces (catalog_name, schema_name)
    VALUES('ahana', 'default');


Step 5: Create a Function and Query From Presto [1]


SQL functions_blog

Here is simple example of SQL function for COSECANT: 

 
presto>CREATE OR REPLACE FUNCTION ahana.default.cosec(x double)
RETURNS double
COMMENT ‘Cosecant trigonometric function'
LANGUAGE SQL
DETERMINISTIC
RETURNS NULL ON NULL INPUT
RETURN 1 / sin(x);


More examples can be found at https://prestodb.io/docs/current/sql/create-function.html#examples [1]

Step 6: Apply the Newly Created Function and SQL Query


SQL functions_blog

It is required for users to use a fully qualified function name while using in SQL queries.

Following the the example of using cosec SQL function in the query. 

 
presto> select ahana.default.cosec (50) as Cosec_value;
     Cosec_value     
---------------------
 -3.8113408578721053 
(1 row)

Query 20211103_211533_00002_ajuyv, FINISHED, 1 node
Splits: 33 total, 33 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s


Here is another simple example of creating an EpochTimeToLocalDate function to convert Unix time to local timezone under ahana.default function namespace.

 
presto> CREATE FUNCTION ahana.default. EpochTimeToLocalDate (x bigint) 
     -> RETURNS timestamp 
     -> LANGUAGE SQL 
     -> DETERMINISTIC RETURNS NULL ON NULL INPUT 
     -> RETURN from_unixtime (x);
CREATE FUNCTION

presto> select ahana.default.EpochTimeToLocalDate(1629837828) as date;
     date      
-------------------------
 2021-08-24 13:43:48.000 
(1 row)

Query 20211101_230315_00043_ajuyv, FINISHED, 1 node
Splits: 33 total, 33 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]


Note

function-namespaces-table-name  <The name of the table that stores all the function namespaces managed by this manager> property can be used if there is a use case to instantiate multiple function namespace managers. Otherwise, if we can create functions in only one function namespace manager then it can be utilized across all different databases or connectors. [2]

Future Roadmap

Remote function Support with remote UDF thrift API 

Allows you to run arbitrary functions that are either not safe or not possible to run within worker JVM: unreliable Java functions, C++, Python, etc.

References

[1] DDL Syntax to use FUNCTIONS

[2] Function Namespace Manager Documentation

Presto (SQL query engine) MySQL Connector (mathematics) Database

Opinions expressed by DZone contributors are their own.

Related

  • Change Data Captures CDC from MySQL Database to Kafka with Kafka Connect and Debezium
  • Useful System Table Queries in Relational Databases
  • How to Repair Corrupt MySQL Database Tables Step-by-Step
  • Harnessing the Power of AWS Aurora for Scalable and Reliable Databases

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!