Oracle Turf

Rajiv Iyer's observations on Oracle Performance

Cardinality Estimation, precompute_subquery hint

with 2 comments

Recently, I posted in Oracle List to check if somebody could help me with a workaround for a query where cardinality estimates were significantly different from the actual rows processed. Jonathan Lewis suggested rewriting the sql into a subquery and using the hint “precompute_subquery”. Not surprising, it worked like a charm.

Here is a small test case in 11.2.0.1 that demonstrates the problem & workaround.

Create BILLING and LOCATION tables

--create billing
create table billing as
with vw as
(
select rownum id from dual connect by level <= 1000
)
select
rownum bill_id,
'active',
case
when rownum <= 50000 then
996
else
case when mod(rownum,50000)=0 then
990
when mod(rownum,5000)=0 then
991
when mod(rownum,500)=0 then
992
when mod(rownum,50)=0 then
993
when mod(rownum,5)=0 then
994
else
995
end
end location_id
from vw, vw
where rownum <= 100000;

--create location
create table location as
select location_id, 'X-'||rownum location_name
from
(
select distinct location_id from billing
) order by location_id;

--Gather Stats
exec dbms_stats.gather_table_stats(user,'BILLING');
exec dbms_stats.gather_table_stats(user,'LOCATION');
 

Data distribution

select a.location_id,b.location_name,count(*) from billing a, location b
where a.location_id = b.location_id
group by a.location_id,b.location_name
order by 3;

LOCATION_ID LOCATION_NAME   COUNT(*)
----------- ------------- ----------
        990 X-4                    1
        991 X-3                    9
        992 X-2                   90
        993 X-1                  900
        994 X-6                 9000
        995 X-5                40000
        996 X-7                50000

Plan for a query looking for billing details related to location ‘X-3’.

select * from billing a, location b where a.location_id = b.location_id
and b.location_name = 'X-3'

Plan hash value: 725447293

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |      1 |        |      9 |00:00:00.03 |     228 |       |       |          |
|*  1 |  HASH JOIN         |          |      1 |  14286 |      9 |00:00:00.03 |     228 |  1236K|  1236K|  414K (0)|
|*  2 |   TABLE ACCESS FULL| LOCATION |      1 |      1 |      1 |00:00:00.01 |       3 |       |       |          |
|   3 |   TABLE ACCESS FULL| BILLING  |      1 |    100K|    100K|00:00:00.01 |     225 |       |       |          |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."LOCATION_ID"="B"."LOCATION_ID")
2 - filter("B"."LOCATION_NAME"='X-3')

The “Hash Join” operation shows that the optimizer estimated 14286 rows whereas the actual rows processed were just 9. That’s because the optimizer has its limitations. It can estimate only based on the available statistics. In this case it calculated
Cardinality = num of rows/num of distinct values = 100000/7 = 14286 (rounded)

Rewriting the query and using the “precompute_subquery” hint and additionally creating a histogram gives us a much better cardinality estimation.

exec dbms_stats.gather_table_stats(user,'BILLING',method_opt='FOR COLUMNS LOCATION_ID SIZE AUTO');

explain plan for
select * from billing where location_id in (select /*+ precompute_subquery */ location_id from location where location_name = 'X-3');

Plan hash value: 2877307530

-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |    53 |   583 |    66   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BILLING |    53 |   583 |    66   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("LOCATION_ID"=991)

Our test case dealt with only 2 tables but with many tables such similar scenarios can lead to less desirable execution plans. The “precompute_subquery” offers a good workaround. It is not clear yet if the hint is documented. Checking with Oracle Support may help.

Advertisements

Written by rajiviyer

August 30, 2013 at 10:46 pm

Posted in Optimizer

2 Responses

Subscribe to comments with RSS.

  1. Quite interesting finding on undocumented hint

    Harmandeep Singh

    October 14, 2013 at 11:30 am

  2. Yes Harman. Indeed there are tons of unexplored options which can help us during tricky situations.

    rajiviyer

    October 14, 2013 at 12:19 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s

%d bloggers like this: