
Oracle In-Memory in Practice


In this post, we go over the basic architecture of an in-memory database, and then give a quick tutorial on how to use it with SQL.



What’s the Benefit?

Faster query performance. Queries that include GROUP BY operations (analytic queries) ran 4-27 times faster with zero code changes. Highly selective queries, which include a predicate in the WHERE clause, gain the biggest benefits. In one case, a complex query intended for online analytics which took 11 seconds completed in 399 milliseconds using Oracle In-Memory (OIM).

How Does it Work?

The diagram below illustrates how it works. On a system without OIM (on the left), the user submits a query which reads from the Buffer Cache. If the data is not available in memory, it triggers physical read operations, and results are returned to the user.

With OIM (on the right), if all the data needed is held in memory, it's executed entirely against the in-memory cache. Queries can transparently combine data from the In-Memory and Buffer cache.

[Diagram: query execution paths with and without Oracle In-Memory]

Where to Start?

  1. Set up Oracle in-memory: A simple Oracle parameter described below.
  2. Decide which Tables, Partitions and Columns to keep in memory: Typically those most often queried by analytic queries. The Oracle Diagnostics & Tuning Pack may help to identify these. Initially, it’s sensible to hold entire tables in memory, but later remove infrequently used columns to save space. Assume around 50% compression on Fact/Transaction tables with much better on Dimension tables.
  3. Decide upon a refresh policy: By default data is cached upon the first read, although it’s possible to set one of five priorities to load data into memory (see below).
  4. Decide upon a data compression level: Which varies from Query Low (default) to Capacity High to save space at a slight performance loss.
  5. Test and Deploy: Enable in memory against selected tables, columns and partitions, and test performance gains with real data before deploying to production.

In reality, only steps 1 and 5 are needed, as defaults are acceptable for an initial deployment. Like any performance tuning exercise, the process should be iterative until the necessary improvements are gained.
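In SQL terms, a minimal first pass covering just steps 1 and 5 might look like the following sketch; the 20G size and the f_sales table name are illustrative, and the parameter change requires a restart to take effect:

```sql
-- Step 1: reserve an in-memory area (size illustrative; takes effect after restart)
alter system set inmemory_size = 20G scope=spfile;

-- Step 5: enable the in-memory store for a candidate table with defaults,
-- then run a representative analytic query to test the gain
alter table f_sales inmemory;

select count(*) from f_sales;  -- the first read populates the in-memory store
```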

Setting up Oracle In-Memory

Assuming you have the necessary license, you can set it up (in this case with a 20GB in-memory area) using:

alter system set inmemory_size = 20G scope=spfile;

You can verify you have OIM working by checking the Oracle parameters:

show parameter inmemory

NAME                              VALUE
-------------------------------------------
inmemory_size                     20G
inmemory_query                    ENABLE

To mark tables, partitions, and columns in memory use:

-- Hold the F_SALES table entirely in memory
alter table f_sales INMEMORY;

-- Load F_SALES in memory but exclude a partition
alter table f_sales INMEMORY modify partition YR_2011 no inmemory; 

-- Load D_CUSTOMER table in memory excluding columns
alter table d_customer INMEMORY no inmemory (cust_class, cust_name);

It's also possible to alter both the compression method and refresh priority:

alter table f_sales
    inmemory
    memcompress for capacity high
    priority low;

Verify Data in Memory

You can list the in-memory segments, including the in-memory and on-disk storage used and the compression ratio, using the SQL below. This shows that the data you're requesting is actually in memory and verifies its size. You can also check for specific columns in memory using the V$IM_COLUMN_LEVEL view.

select segment_name
,    partition_name
,    inmemory_size / 1024 / 1024 as inmemory_size_mb
,    bytes / 1024 / 1024 as bytes_mb
,    populate_status
,    trunc(bytes / inmemory_size, 1) * 100 as compression_ratio
from v$im_segments
order by segment_name, partition_name;
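For column-level detail, a query against the V$IM_COLUMN_LEVEL view mentioned above shows which columns are included or excluded and at what compression level. The query below is a sketch, reusing the D_CUSTOMER table from the earlier examples:

```sql
select table_name
,    column_name
,    inmemory_compression
from v$im_column_level
where table_name = 'D_CUSTOMER'
order by column_name;
```

Columns excluded with NO INMEMORY appear here with an inmemory_compression value of NO INMEMORY, confirming they won't occupy space in the column store.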

The query above lists the segments (table or partitions) currently in memory along with the actual physical space used, current status (whether it's being refreshed), and the compression ratio. For example:

Segment    Partition    Total (MB)  Used (MB)  Status     Compress %
F_SALES    SALES_2012          326        258  COMPLETED          70
F_SALES    SALES_2013          321        200  COMPLETED          60
F_SALES    SALES_2014           50        100  COMPLETED          50

Controlling Compression

The biggest single way to save memory is to exclude tables or columns. For large partitioned tables, consider holding only the most frequently queried partitions in memory, and, for very wide tables, excluding infrequently queried columns. Be aware, however, that if any column referenced in your query is not in memory, Oracle falls back to the buffer cache.

Finally, you should consider adjusting the compression ratio. Increasing the compression ratio saves space at a relatively minor cost of additional processing.

Options include:

  • No memcompress: Data is held without compression.
  • Memcompress for DML: Mainly for DML performance, with minimal compression.
  • Memcompress for Query Low: Optimized for query performance (the default).
  • Memcompress for Query High: A balance of query performance and space.
  • Memcompress for Capacity Low: Additional compression.
  • Memcompress for Capacity High: Optimized for space over performance.
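For example, a rarely queried archive table might be held at the highest compression level while a hot table keeps the query-optimized default. The f_sales_archive table below is hypothetical, used only to illustrate the clause:

```sql
-- Hypothetical archive table: maximize space savings over query speed
alter table f_sales_archive
    inmemory
    memcompress for capacity high;

-- Frequently queried table: query-optimized compression (the default,
-- stated explicitly here for clarity)
alter table f_sales
    inmemory
    memcompress for query low;
```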

What's the Load Priority?

The load priority determines the sequence in which tables are loaded into memory upon start-up or when tables are pinned or removed. Similarly, while Oracle guarantees read consistency, modified data will be refreshed in the in-memory store based upon the load priority, which can be set at the table or partition level.

Options include:

  • Critical: Data is populated immediately.
  • High: After all critical data is loaded.
  • Medium: After high priority data is loaded.
  • Low: After medium data is loaded.
  • None: Data is only populated when it's read for the first time.
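For instance, a small dimension table that must be queryable immediately after start-up could be marked critical, while a large fact table is allowed to load later. The table names follow the earlier examples:

```sql
-- Populate the customer dimension immediately at instance start-up
alter table d_customer inmemory priority critical;

-- Populate the sales fact table only after higher-priority segments
alter table f_sales inmemory priority low;
```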

Did My Query Use In-Memory?

The best way to check whether a given query used the in-memory store is Oracle's AUTOTRACE feature: check the query plan. You may be forgiven for thinking zero physical reads indicates success, but this is potentially misleading.

Query 1: The Default Situation

The example below shows the result of counting a 20 million row table without OIM. Notice the physical read count is zero, as the data was already in the buffer cache, yet the elapsed time was still over 13 seconds. Zero physical reads alone, therefore, doesn't demonstrate in-memory execution.

set autotrace traceonly

select count(*)
from f_sales

elapsed 00:00:13.59

PLAN TABLE OUTPUT
——————————————————————————————————————
1     SELECT STATEMENT
2          SORT AGGREGATE
3          TABLE ACCESS FULL F_SALES
——————————————————————————————————————
Statistics
     0 recursive calls
     0 db block gets
70,108 consistent gets
     0 physical reads
   349 bytes sent via SQL*Net to client
   372 bytes received via SQL*Net from client
     2 SQL*Net roundtrips to/from client
     0 sorts (memory)
     0 sorts (disk)
     1 rows processed

Query 2: Using Oracle In-Memory

The query below was executed using OIM, and similarly shows zero physical reads, but is nearly four times faster. In this case, the entire query used just 3 consistent gets (in-memory reads), compared to over 70,000 without OIM.

select count(*)
from f_sales

elapsed 00:00:03.83

PLAN TABLE OUTPUT
——————————————————————————————————————
1     SELECT STATEMENT
2          SORT AGGREGATE
3          TABLE ACCESS INMEMORY FULL F_SALES
——————————————————————————————————————
Statistics
     0 recursive calls
     0 db block gets
     3 consistent gets
     0 physical reads
   349 bytes sent via SQL*Net to client
   372 bytes received via SQL*Net from client
     2 SQL*Net roundtrips to/from client
     0 sorts (memory)
     0 sorts (disk)
     1 rows processed

Comparing the two examples above, the second used OIM, demonstrating that:

  • Both queries had zero physical reads.
  • Both performed a FULL TABLE SCAN, but the second was INMEMORY.
  • The in-memory query completed in 3.8 seconds - around 4 times faster.
  • The estimated cost of the in-memory query is 23 times lower, at 822 instead of nearly 20,000.

Impact of the Where Clause

The table below shows how the elapsed time varies while querying a 20 million row table with increasing levels of selectivity.

It shows that the more selective the query, the greater the speed boost provided by OIM. For example, when a highly selective query was executed (returning just 275 rows), the result returned nearly 42 times faster using OIM. The test used the following query:

select count(*), max(item_value)
from f_sales
where sale_type = 'RETAIL'
and  region = 'UK'
and date between '01-JAN-16' and '31-JAN-16'
and item_price > 10

Repeatedly executing the above query to increase the selectivity produced the following results:

Rows Selected   Default Elapsed (Secs)   OIM Elapsed (Secs)   Difference
20m                              19.81                18.89   5%
8.5m                              9.38                 7.78   17%
2.8m                              3.54                 1.89   46%
1.3m                              3.02                 1.15   2.6 times
224k                              2.33                 0.21   11 times
50k                               2.20                 0.07   31 times
8k                                1.59                 0.05   31.8 times
275                               2.08                 0.05   41.6 times
75                                2.07                 0.03   69 times

As the graph below illustrates, as queries are more selective, the performance gains of OIM increase. For example, a query which returned 75 rows from a 20m row table ran 69 times faster using Oracle In-Memory.

[Graph: OIM performance gain increasing with query selectivity]

Summary

Clearly, if you're managing an Oracle-based Business Intelligence system, the Oracle In-Memory option gives some compelling results on analytic queries. With performance gains of up to 69 times faster (on highly selective queries), and the ability to tune both memory usage and compression, it's an impressive tool to maximize query performance. Best of all, it's also entirely transparent to the application; it's simply faster.

Be aware, however, it's not a panacea for performance problems, and poorly designed systems may not gain any significant benefits, but on a well-designed data warehouse solution, it may well be worth the investment.

Keep in mind that you may need to upgrade the physical memory on the machine; that aside, it's a great leap forward for sub-second analytics.

Thank You

Thanks for reading this article. If you found it helpful, you can view more articles on Big Data, Cloud Computing, Database Architecture, and the future of data warehousing on my web site www.Analytics.Today.



Published at DZone with permission. Opinions expressed by DZone contributors are their own.
