Oracle In-Memory in Practice
Oracle In-Memory in Practice
In this post, we go over the basic of architecture of an in-memory database, and then give a quick tutorial on how to use it with SQL.
Join the DZone community and get the full member experience.Join For Free
Built by the engineers behind Netezza and the technology behind Amazon Redshift, AnzoGraph™ is a native, Massively Parallel Processing (MPP) distributed Graph OLAP (GOLAP) database that executes queries more than 100x faster than other vendors.
What’s the Benefit?
Faster query performance. Queries which include group by operations (analytic queries) ran 4-27 times faster than before with zero code changes. Queries which are highly selective, and include a predicate in the WHERE clause, gain the biggest benefits. In one case, a complex query intended for online analytics which took 11 seconds was completed in 399 milliseconds using OIM.
How Does it Work?
The diagram below illustrates how it works. On a system without OIM (on the left), the user submits a query which reads from the Buffer Cache. If the data is not available in memory, it triggers physical read operations, and results are returned to the user.
With OIM (on the right), if all the data needed is held in memory, it's executed entirely against the in-memory cache. Queries can transparently combine data from the In-Memory and Buffer cache.
Where to Start?
- Set up Oracle in-memory: A simple Oracle parameter described below.
- Decide which Tables, Partitions and Columns to keep in memory: Typically those most often queried by analytic queries. The Oracle Diagnostics & Tuning Pack may help to identify these. Initially, it’s sensible to hold entire tables in memory, but later remove infrequently used columns to save space. Assume around 50% compression on Fact/Transaction tables with much better on Dimension tables.
- Decide upon a refresh policy: By default data is cached upon the first read, although it’s possible to set one of five priorities to load data into memory (see below).
- Decide upon a data compression level: Which varies from Query Low (default) to Capacity High to save space at a slight performance loss.
- Test and Deploy: Enable in memory against selected tables, columns and partitions, and test performance gains with real data before deploying to production.
In reality, only steps 1 and 5 are needed, as defaults are acceptable for an initial deployment. Like any performance tuning exercise, the process should be iterative until the necessary improvements are gained.
Setting up Oracle In-Memory
Assuming you have the necessary license, you can set it up (in this case with a 20Mb space) using:
Alter system set inmemory size = 20G scope=spfile;.
You can verify you have OIM working by checking the Oracle parameters:
show parameter inmemory NAME VALUE ------------------------------------------- inmemory_size 20G inmemory_query ENABLE
To mark tables, partitions, and columns in memory use:
-- Hold the F_SALES table entirely in memory
alter table f_sales INMEMORY;
-- Load F_SALES in memory but exclude a partition
alter table f_sales INMEMORY modify partition YR_2011 no inmemory;
-- Load D_CUSTOMER table in memory excluding columns
alter table d_customer inmemory no inmemory (cust_class, cust_name)
-- Hold the F_SALES table entirely in memory alter table f_sales INMEMORY; -- Load F_SALES in memory but exclude a partition alter table f_sales INMEMORY modify partition YR_2011 no inmemory; -- Load D_CUSTOMER table in memory excluding columns alter table d_customer inmemory no inmemory (cust_class, cust_name)
It's also possible to alter both the compression method and refresh priority:
alter table f_sales inmemory memcompress for capacity high priority low;
Verify Data in Memory
You can list the segments in-memory including the actual and physical storage used, including the compression ratio, using the SQL below. This shows the data you're requesting is actually in memory and verifies it's size. You can also check for specific columns in memory using the V$IM_COLUMN_LEVEL view.
select segment_name , partition_name , inmemory_size / 1024 / 1024 as inmemory_size_mb , bytes / 1024 / 1024 as bytes_mb , populate_status , trunc(bytes / inmemory_size, 1) * 100 as compression_ratio from v$im_segments order by segment_name, partition_name;
The query above lists the segments (table or partitions) currently in memory along with the actual physical space used, current status (whether it's being refreshed), and the compression ratio. For example:
Segment Partition Total Used Status Compress % F_SALES SALES_2012 326 258 COMPLETED 70 F_SALES SALES_2013 321 200 COMPLETED 60 F_SALES SALES_2014 50 100 COMPLETED 50
The biggest single way to save memory is to exclude tables or columns. For large partitioned tables you could consider holding the most frequent queried partitions in-memory, and, for very wide tables, exclude infrequently queried columns. Be aware, however, if every column in your query is not "in-memory," Oracle defaults to the buffer cache.
Finally, you should consider adjusting the compression ratio. Increasing the compression ratio saves space at a relatively minor cost of additional processing.
- No memcompress: Data is held without compression.
- Memcompress for DML: Is mainly for DML performance and minimum compression.
- Memcompress for Query Low: Is optimized for query performance (Default).
- Memcompress for Query High: Is a balance of query performance and space.
- Memcompress for Capacity Low: For additional compression.
- Memcompress for Capacity High: Optimized for space over performance.
What's the Load Priority?
The load priority determines the sequence in which tables are loaded into memory upon start-up or when tables are pinned or removed. Similarly, while Oracle guarantees read consistency, modified data will be refreshed in the in-memory store based upon the Load Priority which can be set at table or column level.
- Critical: Data is populated immediately.
- High: After all critical data is loaded.
- Medium: After high priority data is loaded.
- Low: After medium data is loaded.
- None: Data is only populated when it's read for the first time.
Did My Query Use In-Memory?
The best way to check if a given query used in memory is to use the ORACLE AUTOTRACE feature, and check the Query Plan. You may be forgiven for thinking zero physical reads indicates success, but this is potentially misleading.
Query 1: The Default Situation
The example below shows the result of counting a 20 million row table. Notice the physical read count is zero as the data was already in the buffer cache, but the elapsed time was still over 13 seconds. This demonstrates the true benefit of OIM to speed performance.
set autotrace traceonly select count(*) from f_sales elapsed 00:00:13.59 PLAN TABLE OUTPUT —————————————————————————————————————— 1 SELECT STATEMENT 2 SORT AGGREGATE 3 TABLE ACCESS FULL F_SALES —————————————————————————————————————— Statistics 0 recursive reads 0 db block gets 70,108 Consistent reads 0 physical reads 349 bytes sent to client 372 bytes received from client 2 SQL*Net Round Trips 0 sorts (memory) 0 sorts (disk) 1 rows processed
Query 2: Using Oracle In-Memory
The query below was executed using OIM, and similarly demonstrates zero physical reads, but is nearly four times faster. In this case, the entire query used just 3 Consistent (in-memory) reads, compared to over 70,000 without OIM.
select count(*) from f_sales elapsed 00:00:03.83 PLAN TABLE OUTPUT —————————————————————————————————————— 1 SELECT STATEMENT 2 SORT AGGREGATE 3 TABLE ACCESS INMEMORY FULL F_SALES —————————————————————————————————————— Statistics 0 recursive reads 0 db block gets 3 Consistent reads 0 physical reads 349 bytes sent to client 372 bytes received from client 2 SQL*Net Round Trips 0 sorts (memory) 0 sorts (disk) 1 rows processed
In the above examples, the second used OIM demonstrating:
- Both queries had zero physical reads.
- Both performed a FULL TABLE SCAN, but the second was INMEMORY.
- The in-memory query completed in 3.8 seconds - around 4 times faster.
- The estimated cost of the in-memory query is 23 times lower, at 822 instead of nearly 20,000.
Impact of the Where Clause
The table below shows how the elapsed time varies while querying a 20m table with increasing levels of selectivity.
It shows that the more selective the query, the greater the speed boost provided by OIM. For example, when a highly selective query is executed (returning just 270 rows), the result returns 42 times faster using OIM. This is because of:
select count(*), max(item_value) from f_sales where sale_type = 'RETAIL' and region = 'UK' and date between '01-JAN-16' and '31-JAN-16' and item_price > 10
Repeatedly executing the above query to increase the selectivity produced the following results:
|Row Selectivity||Default Elapsed (Secs)||OIM Elapsed (Secs)||
As the graph below illustrates, as queries are more selective, the performance gains of OIM increase. For example, a query which returned 75 rows from a 20m row table ran 69 times faster using Oracle In-Memory.
Clearly, if you're managing an Oracle-based Business Intelligence system, Oracle in-memory option gives some compelling results on analytic queries. With performance gains of up to 67 times faster (on highly selective queries), and the ability to tune both memory usage and compression, it's an impressive tool to maximize query performance. Best of all, it's also entirely transparent to the application — except it's faster.
Be aware, however, it's not a panacea for performance problems, and poorly designed systems may not gain any significant benefits, but on a well-designed data warehouse solution, it may well be worth the investment.
Keep in mind, you may need to upgrade your physical memory on the machine, otherwise, it's a great leap forward for sub-second analytics.
Thanks for reading this article. If you found this helpful, you can view more articles on Big Data, Cloud Computing, Database Architecture and the future of data warehousing on my web site www.Analytics.Today.
Published at DZone with permission of John Ryan , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.