{{announcement.body}}
{{announcement.title}}

Same Dog, Different Leash — Functions in SQL

DZone 's Guide to

Same Dog, Different Leash — Functions in SQL

Let's explore functions in SQL.

· Database Zone ·
Free Resource

Let’s start with this somewhat odd-looking result. I have an inline function that returns a random number between 0 and 20, and I call that for each row in ALL_OBJECTS and then I extract just those rows for which the generated random number is 10. Seems simple enough….but why do I get results for which the value of the second column is most certainly not 10?

SQL> with
  2    function
  3      getrand(pval in number) return number is
  4    begin
  5      return round(dbms_random.value(0,20));
  6    end;
  7  my_rnd as
  8    ( select object_id, getrand(object_id) getrnd from all_objects  )
  9  select *
 10  from my_rnd
 11  where getrnd = 10;
 12  /

 OBJECT_ID     GETRND
---------- ----------
        29          1
        53         10
        42          8
        30          3
        78         16
        87         18
        89          1
       145         12
       155         13
       175         15
       183         12
       198         15
       ...
       ...

This is just a new variation of the same old theme — determinism.

There is no guarantee when and how many times the database will execute a function within a single invocation of a SQL statement, or even how many times it will be executed for a single candidate row the SQL statement is processing. The optimizer is totally entitled to shift it to anywhere in the plan. What does this mean? It means only deterministic functions make sense in SQL. Let’s model the above with a simpler example:

SQL> create table t as select * from dba_objects where rownum <= 20;

Table created.

SQL> create or replace
  2  function xgetrand(pval in number) return number is
  3    begin
  4      return round(dbms_random.value(0,20));
  5    end;
  6  /

Function created.

SQL> select *
  2  from (
  3    select object_id, xgetrand(object_id) gg from t
  4  )
  5  where gg = 10;

 OBJECT_ID         GG
---------- ----------
        38          2
         6          8

Immediately that looks odd because the SQL reads like “I only want GG=10” yet I got back “2” and “8”. But if we drill into the execution plan, we’ll see this:

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     3 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |     3 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("XGETRAND"("OBJECT_ID")=10)

Notice the FILTER information. The query has been transformed to be a simple:

SQL> select object_id, xgetrand(object_id) gg from t
  2  where xgetrand(object_id) = 10;

If you’re doubtful of that claim, we can validate it by dumping a 10053 trace. In that you’ll find the following:

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "T"."OBJECT_ID" "OBJECT_ID","XGETRAND"("T"."OBJECT_ID") "GG" FROM "T" "T" WHERE "XGETRAND"("T"."OBJECT_ID")=10

Already you can see the scope for the function being called twice per row — once for the WHERE clause, and once for the SELECT part. In fact, for all we know, it could be called three times or four times. You (the coder) does not have control over that decision. For the sake of discussion, let’s assume it is called only twice. The first execution of the function (on line 2 above) returned 10 twice across the 20 rows (because we got 2 rows back in the result set), but then we ran the function again (on line1) as we gave back each row, hence the counter-intuitive output.

In the original example, ALL_OBJECTS is being used as a source for rows, which is a complex view. The function call could be pushed anywhere deep into this view, which means it might be executed tens of thousands of times, and many of those executions might return a result of “10”.

The bottom line remains the same: non-deterministic means trouble for functions in SQL.

Topics:
database ,tutorial ,sql ,functions in sql

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}