对Hash Join的一次优化 [转帖]_MySQL, Oracle及数据库讨论区_Weblogic技术|Tuxedo技术|中间件技术|Oracle论坛|JAVA论坛|Linux/Unix技术|hadoop论坛

总帖数

每页帖数

1/1页

返回列表

发起投票

查看: 3380 | 回复: 0

主题： 对Hash Join的一次优化 [转帖]

Gavin

注册用户

等级：少校
经验：878
发帖：130
精华：1
注册：2011-7-21
状态：离线
发送短消息息给Gavin

加好友发送短消息息给Gavin

发消息

发表于：

2011-8-2 9:06:52 | [全部帖] [楼主帖]

楼主

转自：http://www.itpub.net/viewthread.php?tid=955209&extra=page%3D5%26amp%3Bfilter%3Ddigest

对Hash Join的一次优化

前两天解决了一个优化SQL的case，SQL语句如下，big_table为150G大小，small_table很小，9000多条记录，不到1M大小

hash_area_size, sort_area_size均设置足够大，可以进行optimal hash join和memory sort

select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where (a.category = b.from_cat or
a.category2 = b.from_cat) and
a.site_id = b.site_id and
a.sale_end >= sysdate;

PHP code:

--------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

--------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 2 | 174 | 18 (17)|

| 1 | SORT UNIQUE | | 2 | 174 | 18 (17)|

|* 2 | HASH JOIN | | 2 | 174 | 17 (12)|

| 3 | TABLE ACCESS FULL | SMALL_TABLE | 1879 | 48854 | 14 (8)|

|* 4 | TABLE ACCESS FULL | BIG_TABLE | 4 | 244 | 3 (34)|

--------------------------------------------------------------------------

粗略来看，PLAN非常的完美，SQL HINT写的也很到位，小表在内build hash table,大表在外进行probe操作，

根据经验来看，整个SQL执行的时间应该和FTS BIG_TABLE的时间差不多

但是FTS BIG_TABLE的时间大约是8分钟，而真个SQL执行的时间长达3~4小时

那么问题究竟出在哪里?

FTS时间应该不会有太大变化，那么问题应该在hash join，设置event来trace一下hash join的过程。

SQL> alter session set events '10104 trace name context forever, level 2';
Session altered.
select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where (a.category = b.from_cat or
a.category2 = b.from_cat) and
a.site_id = b.site_id and
a.sale_end >= sysdate;

从trace file中Hash Table中这一段找出了问题所在：

### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
# than the true number.
Number of buckets with 0 rows: 16373
Number of buckets with 1 rows: 0
Number of buckets with 2 rows: 0
Number of buckets with 3 rows: 1
Number of buckets with 4 rows: 0
Number of buckets with 5 rows: 0
Number of buckets with 6 rows: 0
Number of buckets with 7 rows: 1
Number of buckets with 8 rows: 0
Number of buckets with 9 rows: 0
Number of buckets with between 10 and 19 rows: 1
Number of buckets with between 20 and 29 rows: 1
Number of buckets with between 30 and 39 rows: 3
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 4
### Hash table overall statistics ###
Total buckets: 16384 Empty buckets: 16373 Non-empty buckets: 11
Total number of rows: 9232
Maximum number of rows in a bucket: 2531
Average number of rows in non-empty buckets: 839.272705

仔细看，在一个bucket中最多的行数竟然有2531行，因为bucket中是一个链表的结构，所以这几千行都是串在一个链表上。

由这一点想到这个Hash Table所依赖的hash key的distinct value可能太少，重复值太多。否则不应该会有这么多行在同一个bucket里面。

因为Join条件里面有两个列from_cat和site_id,穷举法有三种情况

1. Build hash table based on (from_cat,site_id):

SQL> select site_id,from_cat,count(*) from SMALL_TABLE group by site_id,from_cat having count(*)>100;
no rows selected
2. Build hash table based on (from_cat):
SQL> select from_cat,count(*) from SMALL_TABLE group by from_cat having count(*)>100;
no rows selected
3. Build hash table based on (site_id):
SQL> select site_id,count(*) from SMALL_TABLE group by site_id having count(*)>100;
SITE_ID COUNT(*)
---------- ----------
0 2531
2 2527
146 1490
210 2526

到这里可以发现，基于site_id这种情况和trace file中这两行很相符：

Number of buckets with 100 or more rows: 4
Maximum number of rows in a bucket: 2531

所以推断这个hash table是基于site_id而建的，而Big_Table中大量的行site_id=0，都落在这个linked list最长的bucket中.

而大部分行都会扫描完整个链表而最后被丢弃掉,所以这个Hash Join的操作效率非常差，几乎变为了Nest Loop操作

找到了根本原因，问题也就迎刃而解了。

理想状况下，hash table应当建立于(site_id,from_cat)上，那么问题肯定出在这个OR上，把OR用UNION改写

select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where a.category = b.from_cat and
a.site_id = b.site_id and
a.sale_end >= sysdate
UNION
select /*+ leading(b) use_hash(a b) */ distinct a.ID
from BIG_TABLE a, SMALL_TABLE b
where a.category2 = b.from_cat and
a.site_id = b.site_id and
a.sale_end >= sysdate;

PHP code:

--------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

--------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 2 | 148 | 36 (59)|

| 1 | SORT UNIQUE | | 2 | 148 | 36 (59)|

| 2 | UNION-ALL | | | | |

|* 3 | HASH JOIN | | 1 | 74 | 17 (12)|

| 4 | TABLE ACCESS FULL| SMALL_TABLE | 1879 | 48854 | 14 (8)|

|* 5 | TABLE ACCESS FULL| BIG_TABLE | 4 | 192 | 3 (34)|

|* 6 | HASH JOIN | | 1 | 74 | 17 (12)|

| 7 | TABLE ACCESS FULL| SMALL_TABLE | 1879 | 48854 | 14 (8)|

|* 8 | TABLE ACCESS FULL| BIG_TABLE | 4 | 192 | 3 (34)|

--------------------------------------------------------------------------

初看这个PLAN好像不如第一个PLAN，因为执行了两次BIG_TABLE的FTS，但是让我们在来看看HASH TABLE的结构

### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
# than the true number.
Number of buckets with 0 rows: 9306
Number of buckets with 1 rows: 5310
Number of buckets with 2 rows: 1436
Number of buckets with 3 rows: 285
Number of buckets with 4 rows: 43
Number of buckets with 5 rows: 4
Number of buckets with 6 rows: 0
Number of buckets with 7 rows: 0
Number of buckets with 8 rows: 0
Number of buckets with 9 rows: 0
Number of buckets with between 10 and 19 rows: 0
Number of buckets with between 20 and 29 rows: 0
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
### Hash table overall statistics ###
Total buckets: 16384 Empty buckets: 9306 Non-empty buckets: 7078
Total number of rows: 9232
Maximum number of rows in a bucket: 5
Average number of rows in non-empty buckets: 1.304323

这就是我们所需要的Hash Table，最长的链表只有五行数据

整个SQL的执行时间从三四个小时缩短为16分钟，大大超出了developer的预期

这个SQL单纯从PLAN上很难看出问题所在，需要了解Hash Join的机制，进行更深一步的分析

本版精华
热门帖子

操作引用/回复

总帖数

每页帖数

1/1页

返回列表

用户登录

Weblogic中间件技术论坛

Tuxedo中间件技术论坛

数据库论坛

Java论坛

Linux/unix论坛

网站地图