oracle非相关子查询处理方式

50阅读 0评论2025-04-26 dingjun123
分类:Oracle

对子查询的优化处理能力,oracle > pg > mysql,mysql的子查询有不少优化,但是还是有很多问题。


子查询的原始语义是:
只返回主表结果,子查询只是用来进行存在性判断。
从这点看,如果子查询要转为普通join,要么join key有唯一键,要么子查询结果按照join key剔重,当然anti join不需要,
一般anti join不会自动改成left join+inner.id is null,但是可以手动实现,这种只查找不匹配的,不需要子查询join key唯一。
像ORACLE的子查询做成view然后hash unqiue就是剔重转为普通join,像MySQL的materialization,loose scan,pull out,dupsweedout都有各种
按照join key对子查询剔重的动作。




子查询原始执行顺序:(只是原始执行顺序,实际上有查询转换)
1)非相关子查询,子查询可以单独执行,然后做成view来驱动主表(做普通join,有剔重)
2)相关子查询,主表的每1行驱动执行子查询,类似循环。当然,这种一般在CBO里是可以转为semi join,join等,
如果不能unnest就是原始语义执行,也有subquery cache等优化。


从原始语义上看,非相关子查询有一定优势,如果子查询不能unnest, 则非相关子查询可以单独执行,这对join reorder就有很大操作空间,
可以子查询单独执行后做成view,结果按照join key剔重,然后将子查询转为普通join,对于子查询不能unnest的有一定优势。


另外子查询如果走nested loops semi/anti,则必须是主表驱动子查询,否则因为子查询join key可能重复,那么结果就不对了。
像hash join,oracle和postgres有专门的hash join right semi/anti,那么就可以让子查询表来做驱动表,MySQL没有这种功能,
MySQL如果要子查询驱动主表,则是内部剔重后改成普通join,比如materialization,loose scan,pull out,dupsweedout,像
firtmatch是主表驱动子查询的,anti join也是。


当然,对于semi join只要将子查询的join key剔重,转为普通join是可以的,{BANNED}最佳好是非相关子查询,相关子查询触发限制的时候则不能转。
对于anti join,则不需要子查询join key必须是唯一的,因为查找不存在的,可以直接用
out_table left join subquery_tab on ..where subquery_tab.id is null来改写。

1)对于非相关子查询,oracle,pg有多种方式优化,就算含有聚合函数等不能unnest,也可以单独执行,这样就可以做成view,然后与主表join来驱动主表,不能unnest的写成非相关子查询有一定优势。


如果返回单行或= subquery,则传值给父查询,
对于pg会有initplan,如果多行,可以单独执行结果集变成view剔重,与主表做普通join优化。


像mysql的非相关子查询,不能unnest的,如果是in,则还是走FILTER主查询循环驱动子查询
(走materiation则有subquery cache,可以减少子查询执行次数)。
如果是=,>,>=,<,<=也可以单独执行,执行计划有subquery in condition; run only once。


##非相关子查询在mysql,pg里挺重要,如果可能,尽量写成非相关子查询。特别是子查询返回1行的,写成=,不要写成in


2)对于相关子查询,不能unnest的,基本都差不多,必须主表循环驱动子查询,只不过ORACLE走FILTER,有subquery cache,
MySQL的相关子查询不能unnest,走intoexists,不能减少子查询执行次数,但是对子查询索引访问的有cache,也可以一定程度提高效率。


相关子查询如果不能unnest,效率低,subquery cache起不到减少子查询执行次数的目的,则{BANNED}最佳好改为非相关子查询,join,left join等


###
当相关子查询有聚合函数,having, set操作,rownum/limit,connect by,分析函数等复杂语法,则相关子查询不能unnest.


###
对于相关子查询包含rownum/limit,pg和mysql可以消除,因为这个本身就没有意义,本身子查询就是firtmatch,而oracle不能消除


子查询只要有聚合函数,having之类的,都不能unnest,就算聚合函数,having其实可以忽略,但是查询转换检查条件没有做这种改写。


单纯的group by,没有聚合函数,相当于distinct,这种本来子查询就是distinct结果,所以可以忽略,不影响unnest。


=========ORACLE子查询
对于ORACLE,非相关子查询,可以单独执行,如果SQL结构确定{BANNED}最佳多只返回一行,则可以作为变量传给父查询,这种不是unnest,
是非相关子查询的原始语义。


在O里,子查询有聚合函数,则相关子查询可能不能unnest,像相关子查询select部分有聚合函数,直接根据语义去掉子查询,比如having有
聚合函数可能不能unnest。


但是如果子查询是非相关的,可以单独执行,如果是确定单行,则可以作为变量传给父查询,这种不是unnest,是非相关子查询的原始语义:
如果返回多行,则可以将子查询改写为view与原表做join。


ORACLE不管in或=,CBO会判断子查询是否返回单行,然后将in改写为=,所以是一样的,这个和MySQL,postgresql不同。

--非相关子查询单行子查询,子查询直接查询出结果给父查询

点击(此处)折叠或打开

  1. select * from a where a.object_id in (select max(b.object_id) from b);

  2. @display

  3. PLAN_TABLE_OUTPUT
  4. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  5. SQL_ID 522hg1mmrz0a4, child number 1
  6. -------------------------------------
  7. select * from a where a.object_id in (select max(b.object_id) from b)

  8. Plan hash value: 742149035

  9. -------------------------------------------------------------------------------------------------------
  10. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
  11. -------------------------------------------------------------------------------------------------------
  12. | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 5 |
  13. | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| A | 1 | 4 | 0 |00:00:00.01 | 5 |
  14. |* 2 | INDEX RANGE SCAN | IDX_A | 1 | 4 | 0 |00:00:00.01 | 5 |
  15. | 3 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 2 |
  16. | 4 | INDEX FULL SCAN (MIN/MAX) | IDX_B | 1 | 1 | 1 |00:00:00.01 | 2 |
  17. -------------------------------------------------------------------------------------------------------

  18. Predicate Information (identified by operation id):
  19. ---------------------------------------------------

  20.    2 - access("A"."OBJECT_ID"=)

##对于这种,子查询返回不能确定是一行,则可以单独改写为view,然后与a表做join:


点击(此处)折叠或打开

  1. select * from a where a.object_id in (select max(b.object_id) from b group by object_type);

  2. Elapsed: 00:00:00.02

  3. Execution Plan
  4. ----------------------------------------------------------
  5. Plan hash value: 1790892509

  6. -----------------------------------------------------------------------------------------
  7. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  8. -----------------------------------------------------------------------------------------
  9. | 0 | SELECT STATEMENT | | 177 | 31683 | 672 (2)| 00:00:01 |
  10. | 1 | NESTED LOOPS | | 177 | 31683 | 672 (2)| 00:00:01 |
  11. | 2 | NESTED LOOPS | | 180 | 31683 | 672 (2)| 00:00:01 |
  12. | 3 | VIEW | VW_NSO_1 | 45 | 585 | 401 (2)| 00:00:01 |
  13. | 4 | HASH UNIQUE | | 45 | 675 | 401 (2)| 00:00:01 |
  14. | 5 | HASH GROUP BY | | 45 | 675 | 401 (2)| 00:00:01 |
  15. | 6 | TABLE ACCESS FULL | B | 73044 | 1069K| 396 (1)| 00:00:01 |
  16. |* 7 | INDEX RANGE SCAN | IDX_A | 4 | | 2 (0)| 00:00:01 |
  17. | 8 | TABLE ACCESS BY INDEX ROWID| A | 4 | 664 | 6 (0)| 00:00:01 |
  18. -----------------------------------------------------------------------------------------

  19. Predicate Information (identified by operation id):
  20. ---------------------------------------------------

  21.    7 - access("A"."OBJECT_ID"="MAX(B.OBJECT_ID)")

这种非相关子查询<,>的,估算不准,因为不知道子查询返回的具体值,按照默认选择性5%估算,如果是join,貌似oracle可以计算了,可能通过直方图计算的,算法未知

点击(此处)折叠或打开

  1. select num_rows from dba_tab_statistics where table_name='A' and owner=user;

  2.   NUM_ROWS
  3. ----------
  4.     292172
  5.  
  6.   估算为: 292172*0.05= 14609

  7. select count(*) from a where a.object_id <= (select max(b.object_id) from b where object_name like 'A%');

  8. PLAN_TABLE_OUTPUT
  9. ----------------------------------------------------------------------------------------------------
  10. SQL_ID cw3jcfgarmpuq, child number 1
  11. -------------------------------------
  12. select count(*) from a where a.object_id <= (select max(b.object_id)
  13. from b where object_name like 'A%')

  14. Plan hash value: 3936981519

  15. ----------------------------------------------------------------------------------------
  16. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
  17. ----------------------------------------------------------------------------------------
  18. | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.05 | 4152 |
  19. | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.05 | 4152 |
  20. |* 2 | INDEX RANGE SCAN | IDX_A | 1 | 14609 | 292K|00:00:00.04 | 4152 |
  21. | 3 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1422 |
  22. |* 4 | TABLE ACCESS FULL| B | 1 | 1448 | 2716 |00:00:00.01 | 1422 |
  23. ----------------------------------------------------------------------------------------

  24. Predicate Information (identified by operation id):
  25. ---------------------------------------------------

  26.    2 - access("A"."OBJECT_ID"<=)
  27.    4 - filter("OBJECT_NAME" LIKE 'A%')
  28.    
  29.    
  30.  select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

  31. PLAN_TABLE_OUTPUT
  32. ----------------------------------------------------------------------------------------------------
  33. SQL_ID f6nhfvnrqb479, child number 0
  34. -------------------------------------
  35. select count(*) from a where a.object_id <= (select b.object_id from b
  36. where object_name like 'A%' and rownum=1)

  37. Plan hash value: 914045866

  38. ----------------------------------------------------------------------------------------
  39. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
  40. ----------------------------------------------------------------------------------------
  41. | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 8 |
  42. | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 8 |
  43. |* 2 | INDEX RANGE SCAN | IDX_A | 1 | 14609 | 332 |00:00:00.01 | 8 |
  44. |* 3 | COUNT STOPKEY | | 1 | | 1 |00:00:00.01 | 4 |
  45. |* 4 | TABLE ACCESS FULL| B | 1 | 1 | 1 |00:00:00.01 | 4 |
  46. ----------------------------------------------------------------------------------------

  47. Predicate Information (identified by operation id):
  48. ---------------------------------------------------

  49.    2 - access("A"."OBJECT_ID"<=)
  50.    3 - filter(ROWNUM=1)
  51.    4 - filter("OBJECT_NAME" LIKE 'A%')

等值的按照均匀分布计算

点击(此处)折叠或打开

  1. select count(*) from a where a.object_id = (select max(b.object_id) from b where object_name like 'A%');

  2.   COUNT(*)
  3. ----------
  4.          4

  5. 1 row selected.

  6. Elapsed: 00:00:00.01
  7. dingjun123@ORCLPDB> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

  8. PLAN_TABLE_OUTPUT
  9. ----------------------------------------------------------------------------------------------------
  10. SQL_ID a2v0fwb12rqt9, child number 0
  11. -------------------------------------
  12. select count(*) from a where a.object_id = (select max(b.object_id)
  13. from b where object_name like 'A%')

  14. Plan hash value: 3936981519

  15. -------------------------------------------------------------------------------------------------
  16. | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
  17. -------------------------------------------------------------------------------------------------
  18. | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 1425 | 1 |
  19. | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1425 | 1 |
  20. |* 2 | INDEX RANGE SCAN | IDX_A | 1 | 4 | 4 |00:00:00.01 | 1425 | 1 |
  21. | 3 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1422 | 0 |
  22. |* 4 | TABLE ACCESS FULL| B | 1 | 1448 | 2716 |00:00:00.01 | 1422 | 0 |
  23. -------------------------------------------------------------------------------------------------

  24. Predicate Information (identified by operation id):
  25. ---------------------------------------------------

  26.    2 - access("A"."OBJECT_ID"=)
  27.    4 - filter("OBJECT_NAME" LIKE 'A%')


  28. 23 rows selected.

  29. 计算方式:
  30.  select num_rows from dba_tab_statistics where table_name='A' and owner=user;

  31.   NUM_ROWS
  32. ----------
  33.     292172
  34.       
  35. select num_distinct,density from dba_tab_col_statistics where table_name='A' and column_name='OBJECT_ID' and owner=user;

  36. NUM_DISTINCT DENSITY
  37. ------------ ----------
  38.        74144 .000013487

  39. 估算为4行
  40.  select 292172 * .000013487 from dual;

  41. 292172*.000013487
  42. -----------------
  43.        3.94052376





上一篇:Oracle 12c index by rowid batched特性
下一篇:MySQL不支持view jppd下推