DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Non-blocking Database Migrations
  • How to Store Text in PostgreSQL: Tips, Tricks, and Traps
  • SQL Server to Postgres Database Migration
  • The Generic Way To Convert Between Java and PostgreSQL Enums

Trending

  • Ujorm3: A New Lightweight ORM for JavaBeans and Records
  • Multi-Scale Feature Learning in CNN and U-Net Architectures
  • End-to-End Event Streaming With Kafka, Spring Boot and AWS SQS/SNS (Production-Ready Code Guide)
  • RAG Done Right: When to Use SQL, Search, and Vector Retrieval and How To Combine Them
  1. DZone
  2. Data Engineering
  3. Databases
  4. Polymorphism in SQL Part One: Anyelement and Anyarray

Polymorphism in SQL Part One: Anyelement and Anyarray

By 
Bryn Llewellyn user avatar
Bryn Llewellyn
·
Jul. 16, 20 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
6.3K Views

Join the DZone community and get the full member experience.

Join For Free

The term “polymorphism”, and its inflections, has various meanings in the context of programming languages in general. Roughly, it means “chameleon-like”. Of course, Wikipedia has something to say about it, here. PostgreSQL uses it with this meaning: “Parametric polymorphism: when one or more types are not specified by name but by… symbols that can represent any type”.

The term “variadic” has a universal meaning, and again Wikipedia has something to say about it, here. “A variadic [subprogram] has indefinite “arity”, [that is, it] accepts a variable number of arguments”.

In this two part post, I explain how YugabyteDB stored procedures implement these two central programming language notions.

  • Part One (this post) explains how polymorphism is implemented by declaring a subprogram with formal parameters of the pseudo-types “anyelement” and “anyarray”.
  • Part Two explains how marking a subprogram’s formal parameter with the keyword “variadic” implements that notion. I also bring the two notions together to show you how to implement a non-aggregate version of any built-in aggregate function so that you can use it outside the context of a SQL subquery, just as the “least()” and “greatest()” built-ins already (but uniquely) do this for the “min()” and “max()” aggregate functions.

You can see, then, that I’m stretching the technical meaning of “polymorphism” in the title of this post series. I’m focusing on the chameleon metaphor to capture the idea that I’m showing two powerful techniques that let functions automagically adapt to the way in which they’re invoked.

YugabyteDB is an open source, high-performance distributed SQL database built on a scalable and fault-tolerant design inspired by Google Spanner. YugabyteDB uses its own special distributed document store called DocDB. But it provides SQL and stored procedure functionality by re-using the “upper half” of the standard PostgreSQL source code. This is explained in the two part blog post “Distributed PostgreSQL on a Google Spanner Architecture”: (1) Storage Layer; and (2) Query Layer.

If you’re already used to writing PostgreSQL subprograms in your day job, then I recommend that you read the two parts of my essay in a single short sitting.

Everything that I say in this post applies in the same way to vanilla PostgreSQL, as it does to YugabyteDB. And all the code examples that I describe have identical outcomes in the two environments. There are no special distributed SQL considerations.

Understanding “anyelement” and “anyarray”

A user-defined function is usually created like this:

SQL
 




xxxxxxxxxx
1


 
1
create function f(p1 in int, p2 in boolean, ...)
2
  returns text
3
  immutable
4
  language plpgsql
5
as $body$
6
begin
7
  return <a text value>;
8
end;
9
$body$;



The function’s requirements are stated by specifying how the output value is determined by its input values. And commonly the rules can be stated only in terms of inputs of specified data types. It’s the same story for a user-defined procedure — and the same if the implementation uses “language plpgsql” or “language sql”.

Note: My earlier post, Four Compelling Use Cases for PostgreSQL Table Functions, shows examples of these two ways to implement user-defined functions.

Sometimes, though, the requirements reflect the same general rule with a specialization for each of several different input data types. Here’s a compelling example:

  • Return an N-character text value that represents values of various data types according to rules that are specific to each different data type. The output should be right-justified, and should represent NULL as “<null>”.

A proper requirements statement would specify more detail. But I’ll simply show you two implementations that treat some interesting cases in a natural way. User-defined functions and procedures can be created in overload sets, where each set is for a function or procedure with a certain name and members of a set are distinguished from each other by the the data types of their formal parameters and/or by the number of these. So you could meet the present requirements by creating lots (and lots) of functions all called, say, “display()”. Here’s an example for a “boolean” input:

PLSQL
 




xxxxxxxxxx
1
14


 
1
create function display(val in boolean)
2
  returns text
3
  immutable
4
  language plpgsql
5
as $body$
6
begin
7
  return
8
    case val
9
      when true  then '  true'
10
      when false then ' false'
11
      else            '<null>'
12
    end::text;
13
end;
14
$body$;



And here’s an example for a “bigint” input:

PLSQL
 




xxxxxxxxxx
1
17


 
1
create function display(val in bigint)
2
  returns text
3
  immutable
4
  language plpgsql
5
as $body$
6
begin
7
  return
8
    case
9
      when val is null                     then lpad('<null>', 7)
10
      when abs(val) < 1000                 then lpad(val::text, 7)
11
      when abs(val) between 10^3 and 10^6  then lpad(round(val/10^3)::text||'K', 7)
12
      when abs(val) between 10^6 and 10^9  then lpad(round(val/10^6)::text||'M', 7)
13
      when abs(val) between 10^9 and 10^12 then lpad(round(val/10^9)::text||'G', 7)
14
      else                                      lpad('too big', 7)
15
    end::text;
16
end;
17
$body$;



You can imagine similar overloads for other primitive scalar data types like “numeric”, “double precision”, “money”, “timestamp”, “timestamptz”, “interval” and so on. Maintenance would be simpler if the code for each data type were implemented in a single function. This would optimally self-document, in a single place, which data types it supported. And it could, for example, define a single manifest constant “width” to be used as the second argument for “lpad()” for values of all the supported data types. This is where “anyelement” comes to the rescue.

The anyelement Pseudo-Type

YSQL inherits the PostgreSQL type system. This implements several so-called pseudo-types. A pseudo-type cannot be used as a column data type, or for a local PL/pgSQL variable, but it can be used to declare a function’s formal parameters and result type. This blog post focuses on “anyelement” and “anyarray”.

Here’s the basic proof of concept. Make sure that you first drop any previously existing versions of the function “display()”.

PLSQL
 




xxxxxxxxxx
1
21


 
1
-- Such a user-defined type is known as a "row" type.
2
create type rt as (b boolean, i int);
3

          
4
create function display(val in anyelement)
5
  returns text
6
  immutable
7
  language plpgsql
8
as $body$
9
declare
10
  val_type constant regtype := pg_typeof(val);
11
begin
12
  return
13
    case val_type
14
      when pg_typeof(null::boolean) then '"boolean" input'
15
      when pg_typeof(null::int)     then '"int" input'
16
      when pg_typeof(null::bigint)  then '"bigint" input'
17
      when pg_typeof(null::rt)      then '"rt" input'
18
      else                               'unsupported data type'
19
    end::text;
20
end;
21
$body$;



Test it like this:

PLSQL
 




xxxxxxxxxx
1


 
1
select display(true)
2
union all
3
select display(43)
4
union all
5
select display(567890123456789)
6
union all
7
select display((true, 42)::rt)
8
union all
9
select display(now()::timestamp);



This is the result:

Plain Text
 




xxxxxxxxxx
1


 
1
        display        
2
-----------------------
3
 "boolean" input
4
 "int" input
5
 "bigint" input
6
 "rt" input
7
 unsupported data type



We see, then, that “anyelement” stands for any scalar (that is non-array) data type — simple (like “boolean” or “bigint”) or composite (like a user-defined “row” type). A function with a pseudo-type formal parameter like “anyelement” (or “anyarray”) is called a polymorphic function. We can now use the approach demonstrated by the proof of concept to encapsulate “boolean”, “bigint”, and “rt” implementations of “display()” into a single polymorphic function, like this. As before, make sure that you first drop any previously existing versions of the function “display()”. And if the type “rt” already exists, you can leave in in place.

PLSQL
 




xxxxxxxxxx
1
45


 
1
create type rt as (b boolean, i int);
2

          
3
create function display(val in anyelement)
4
  returns text
5
  immutable
6
  language plpgsql
7
as $body$
8
declare
9
  val_type constant regtype := pg_typeof(val);
10
  pad constant int := 22;
11
begin
12
  case val_type
13
    when pg_typeof(null::boolean) then
14
      return
15
        case val
16
          when true  then lpad('true',   pad)
17
          when false then lpad('false',  pad)
18
          else            lpad('<null>', pad)
19
        end::text;
20

          
21
    when pg_typeof(null::bigint) then
22
      return
23
        case
24
          when val is null then
25
            lpad('<null>', pad)
26
          when abs(val) < 1000 then
27
            lpad(val::text, pad)
28
          when abs(val) between 10^3 and 10^6 then
29
            lpad(round(val/10^3)::text||'K', pad)
30
          when abs(val) between 10^6 and 10^9 then
31
            lpad(round(val/10^6)::text||'M', pad)
32
          when abs(val) between 10^9 and 10^12 then
33
            lpad(round(val/10^9)::text||'G', pad)
34
          else
35
            lpad('too big', pad)
36
        end::text;
37

          
38
    when pg_typeof(null::rt) then
39
      return lpad(val::text, pad);
40

          
41
    else
42
      return lpad('unsupported data type'::text, pad);
43
  end case;
44
end;
45
$body$;



Test it like this:

PLSQL
 




xxxxxxxxxx
1
25


 
1
(with v as (values
2
             (true),
3
            (false),
4
    (null::boolean))
5
select
6
display(column1)
7
from v)
8

          
9
union all
10
(with v as (values
11
              (123),
12
           (234567),
13
        (345678901),
14
     (456789012345),
15
  (567890123456789),
16
     (null::bigint))
17
select
18
display(column1)
19
from v)
20

          
21
union all
22
select display((true, 42)::rt)
23

          
24
union all
25
select display(now());



This is the result:

Plain Text
 




xxxxxxxxxx
1
13


 
1
        display         
2
------------------------
3
                   true
4
                  false
5
                 <null>
6
                    123
7
                   235K
8
                   346M
9
                   457G
10
                too big
11
                 <null>
12
                 (t,42)
13
  unsupported data type



The anyarray Pseudo-Type

This is the natural partner to “anyelement”. It stands for an array of any data type. This simple example demonstrates the idea:

PLSQL
 




xxxxxxxxxx
1


 
1
create function array_display(arr anyarray)
2
  returns setof text
3
  immutable
4
  language sql
5
as $body$
6
  select display(unnest(arr));
7
$body$;



You take advantage, here, of a very useful built-in that materializes the elements of an array as a regular SQL table. Read about this, and more, in the “Array data types and functionality” major section of the YugabyteDB documentation—and, in particular, read about the “unnest()” built-in.

Test it like this:

PLSQL
 




xxxxxxxxxx
1


 
1
select array_display(array[true, false, null]::boolean[])
2
union all
3
select array_display(array[
4
  123, 234567, 345678901, 456789012345, 567890123456789, null]::bigint[])
5
union all
6
select array_display(array[now()]::timestamp[]);



This is the result:

Plain Text
 




xxxxxxxxxx
1
12


 
1
     array_display      
2
------------------------
3
                   true
4
                  false
5
                 <null>
6
                    123
7
                   235K
8
                   346M
9
                   457G
10
                too big
11
                 <null>
12
  unsupported data type



Conclusion

So far, I’ve shown you just one of the two interesting and useful features that this two-part post explains:

  • How to use the “anyelement” and “anyarray” pseudo-types to write a polymorphic subprogram with a dynamically determined polymorphic return value.

YugabyteDB inherits both features from PostgreSQL thanks to its unique architecture that simply uses its SQL processing code as is.

I can’t resist saying here (but I’ll say it again in the conclusion to Part Two) that Oracle Database currently supports neither anything equivalent to the pseudo-types like “anyelement” and “anyarray”, nor the ability to create variadic user-defined subprograms. And, as I never miss a chance to point out, the other distributed SQL databases with which we compare YugabyteDB (CockroachDB, TiDB, and our common inspiration Google Cloud Spanner) don’t support stored procedures at all.

Please go on now to read Part Two. It explains how to write a variadic subprogram.

sql Database Polymorphism (computer science) Data (computing) Plain text Data Types PostgreSQL POST (HTTP)

Published at DZone with permission of Bryn Llewellyn. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Non-blocking Database Migrations
  • How to Store Text in PostgreSQL: Tips, Tricks, and Traps
  • SQL Server to Postgres Database Migration
  • The Generic Way To Convert Between Java and PostgreSQL Enums

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook