{{announcement.body}}
{{announcement.title}}

Attribute Clustering for Performance

DZone 's Guide to

Attribute Clustering for Performance

In this article, take a look at an example of instructing the database to co-locate data during loading to facilitate reduced resource consumption for subsequent queries.

· Database Zone ·
Free Resource

One of the nice facilities in 12c is the ability to instruct the database to co-locate data during loading in order to facilitate reduced resource consumption for subsequent queries.

Here’s an example of it in use. First I’ll create table SOURCE_DATA to mimic some randomized incoming data feed

SQL
 




x
19


1
SQL> exec dbms_random.seed(0)
2
 
          
3
PL/SQL procedure successfully completed.
4
 
          
5
SQL> drop table source_data purge;
6
drop table source_data purge
7
           *
8
ERROR at line 1:
9
ORA-00942: table or view does not exist
10
 
          
11
 
          
12
SQL>
13
SQL> create table source_data  as
14
  2      select d.*
15
  3      from dba_objects d
16
  4      where object_id is not null
17
  5      order by dbms_random.random;
18
 
          
19
Table created.
15
  3      from dba_objects d



So let’s now populate (create) our table T with a straight copy of the data from SOURCE_DATA and index a column of interest

SQL
 







You can see the data is in pseudo-random order on both OBJECT_ID and OWNER. (We only care about OBJECT_ID for the sake of this demo). Let us now see what the I/O of an index range scan query on OBJECT_ID costs us.

SQL
 




xxxxxxxxxx
1
23


 
1
SQL> set autotrace traceonly stat
2
SQL> select /*+ index(t) */ * from t where object_id between 10000 and 15000;
3
 
          
4
4997 rows selected.
5
 
          
6
 
          
7
Statistics
8
----------------------------------------------------------
9
          1  recursive calls
10
          0  db block gets
11
       5339  consistent gets
12
         12  physical reads
13
          0  redo size
14
     629403  bytes sent via SQL*Net to client
15
       4214  bytes received via SQL*Net from client
16
        335  SQL*Net roundtrips to/from client
17
          0  sorts (memory)
18
          0  sorts (disk)
19
       4997  rows processed
20
 
          
21
SQL> set autotrace off
22
SQL>
23
SQL>



Just under 5400 consistent gets, which is not dissimilar from the rows we fetched because those rows from OBJECT_ID 10,000 to 15,000 they are scattered throughout the table. Let us now repeat the exercise, this time using the clustering facilities available under 12c.

SQL
 




xxxxxxxxxx
1
92


 
1
SQL>
2
SQL> drop table t purge;
3
 
          
4
Table dropped.
5
 
          
6
SQL>
7
SQL> create table t
8
  2  clustering by linear order(object_id)
9
  3  as select * from source_data;
10
 
          
11
Table created.
12
 
          
13
SQL> select object_id, owner from t where rownum < 50;
14
 
          
15
 OBJECT_ID OWNER
16
---------- ------------------------------
17
         2 SYS
18
         3 SYS
19
         4 SYS
20
         5 SYS
21
         6 SYS
22
         7 SYS
23
         8 SYS
24
         9 SYS
25
        10 SYS
26
        11 SYS
27
        12 SYS
28
        13 SYS
29
        14 SYS
30
        15 SYS
31
        16 SYS
32
        17 SYS
33
        18 SYS
34
        19 SYS
35
        20 SYS
36
        21 SYS
37
        22 SYS
38
        23 SYS
39
        24 SYS
40
        25 SYS
41
        26 SYS
42
        27 SYS
43
        28 SYS
44
        29 SYS
45
        30 SYS
46
        31 SYS
47
        32 SYS
48
        33 SYS
49
        34 SYS
50
        35 SYS
51
        36 SYS
52
        37 SYS
53
        38 SYS
54
        39 SYS
55
        40 SYS
56
        41 SYS
57
        42 SYS
58
        43 SYS
59
        44 SYS
60
        45 SYS
61
        46 SYS
62
...
63
...
64
 
          
65
SQL> create index t_idx on t(object_id);
66
 
          
67
Index created.
68
 
          
69
SQL>
70
SQL> set autotrace traceonly stat
71
SQL> select /*+ index(t) */ * from t where object_id between 10000 and 15000;
72
 
          
73
4997 rows selected.
74
 
          
75
 
          
76
Statistics
77
----------------------------------------------------------
78
          1  recursive calls
79
          0  db block gets
80
        756  consistent gets
81
         12  physical reads
82
          0  redo size
83
     629403  bytes sent via SQL*Net to client
84
       4214  bytes received via SQL*Net from client
85
        335  SQL*Net roundtrips to/from client
86
          0  sorts (memory)
87
          0  sorts (disk)
88
       4997  rows processed
89
 
          
90
SQL> set autotrace off
91
SQL>
92
SQL>



Before we even used the index, you could see from the sample of rows we queried from the table, the data has been stored in an ordered sequence. As a result, our index lookup is dramatically improved, dropping down to just ~750 consistent gets. The full details on Attribute Clustering is covered in great detail in the Data Warehousing Guide.

Topics:
clustering, database, database administration and management, oracle, oracle database, performance and scalability

Published at DZone with permission of Connor McDonald , DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}