为了减少回表过滤次数,减少逐行回表,减少表数据块访问次数和index clustering factor影响,
oracle 12c推出了table access by index rowid batched,可以批量rowid回表,先索引访问,查出rowid,然后排序,批量回表查找,避免多次回表过滤和index clustering factor影响多次查询表数据块...
这时候,就不是逐行回表了,而且批量回表可能会导致索引不能消除排序,还有批量回表,索引查找可以使用并行,下面分析12c的batched回表技术,解析其作用,和单行回表相比的优缺点
一、 技术原理与作用
index rowid batched机制
传统单行回表:12c之前,通过索引获取ROWID后,每次仅处理一个ROWID并立即回表读取对应数据块。若多个ROWID指向同一数据块,会导致该块被多次访问,增加I/O开销(尤其是索引聚集因子低时)。
批量回表:12c后,数据库会先收集一批ROWID(如通过索引范围扫描),按数据块号排序后批量访问表数据块。例如,若一批ROWID中包含多个指向同一数据块的记录,只需一次物理读取即可获取所有相关行,减少重复访问。
优化数据块访问顺序:
通过ROWID排序,使回表操作按数据块物理存储顺序进行,提升I/O效率(接近顺序读而非随机读),尤其适合数据分散的场景。
并行化支持:
批量回表可结合并行查询(Parallel Query),允许多个进程同时处理不同批次的ROWID,进一步提升大数据量查询的效率。
二、与单行回表的对比分析
优点:
减少物理I/O次数:
通过合并同一数据块的访问,显著降低回表时的块读取次数。例如,若10个ROWID指向同一块,单行回表需10次I/O,而批量回表仅需1次。
降低CPU资源消耗:
批量处理减少了上下文切换和锁竞争,提升CPU利用率。
优化器灵活性增强:
即使索引的聚集因子较低(数据分散),优化器仍可能选择索引扫描而非全表扫描,避免全表扫描的高成本。
缺点:
额外排序开销:
对ROWID按数据块排序需要消耗CPU资源,可能抵消部分性能收益,尤其在数据本身已有序时(如高聚集因子索引)。
可能无法消除排序操作:
若查询需要按特定顺序返回结果(如ORDER BY),批量回表可能导致索引无法直接消除排序,需额外排序步骤。
内存占用增加:
批量处理需在PGA中暂存ROWID,可能增加内存压力,尤其在处理海量数据时。
三、适用场景与优化建议
适用场景
低聚集因子索引:数据分散导致单行回表I/O开销高时,批量回表效果显著。
大范围查询:需回表大量数据时(如数万行以上),批量处理优势更明显。
OLAP场景:结合并行查询处理复杂分析任务。
优化建议
覆盖索引:若查询列全部包含在索引中,可避免回表(无需使用批量技术)。
调整隐藏参数:通过_optimizer_batch_table_access_by_rowid控制是否启用批量回表(默认true),在特定场景下关闭以测试性能差异。
监控执行计划:关注TABLE ACCESS BY INDEX ROWID BATCHED的成本和物理读次数,结合AWR报告分析实际收益。
四、与其他数据库的对比
MySQL MRR(Multi-Range Read):类似Oracle的批量回表,MySQL通过MRR将索引扫描得到的无序主键排序后批量回表,减少随机I/O。
PostgreSQL:目前无内置批量回表功能,需依赖索引覆盖或手动优化。
下面看看oracle下的index rowid batched具体表现:
建表如下:
点击(此处)折叠或打开
-
drop table a;
-
create table a as select * from dba_objects order by dbms_random.value;
- create index idx_a on a(object_id);
-- 默认全表
select * from a where object_id<1000 and status='VALID';
------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 997 |00:00:00.01 | 1500 |
|* 1 | TABLE ACCESS FULL| A | 1 | 997 | 997 |00:00:00.01 | 1500 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("OBJECT_ID"<1000 AND "STATUS"='VALID'))
Note
-----
- statistics feedback used for this statement
-- 强制走索引可以看到有:TABLE ACCESS BY INDEX ROWID BATCHED
点击(此处)折叠或打开
-
select/*+index(a)*/ * from a where object_id<1000 and status='VALID';
-
-
-
997 rows selected.
-
-
-
Plan hash value: 1989604121
-
-
----------------------------------------------------------------------------------------------------------------
-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-
----------------------------------------------------------------------------------------------------------------
-
| 0 | SELECT STATEMENT | | 1 | | 997 |00:00:00.01 | 1067 | 4 |
-
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| A | 1 | 452 | 997 |00:00:00.01 | 1067 | 4 |
-
|* 2 | INDEX RANGE SCAN | IDX_A | 1 | 904 | 997 |00:00:00.01 | 71 | 4 |
-
----------------------------------------------------------------------------------------------------------------
-
-
Predicate Information (identified by operation id):
-
---------------------------------------------------
-
-
1 - filter("STATUS"='VALID')
- 2 - access("OBJECT_ID"<1000)
-- 11g没有这个特性:
点击(此处)折叠或打开
-
select/*+opt_param('optimizer_features_enable' '11.2.0.4') index(a)*/
-
* from a where object_id<1000 and status='VALID';
-
-
Plan hash value: 1100842037
-
-
-----------------------------------------------------------------------------------------------
-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-
-----------------------------------------------------------------------------------------------
-
| 0 | SELECT STATEMENT | | 1 | | 997 |00:00:00.01 | 1067 |
-
|* 1 | TABLE ACCESS BY INDEX ROWID| A | 1 | 452 | 997 |00:00:00.01 | 1067 |
-
|* 2 | INDEX RANGE SCAN | IDX_A | 1 | 904 | 997 |00:00:00.01 | 71 |
-
-----------------------------------------------------------------------------------------------
-
-
Predicate Information (identified by operation id):
-
---------------------------------------------------
-
-
1 - filter("STATUS"='VALID')
-
2 - access("OBJECT_ID"<1000)
-
- 21 rows selected.
-- index batched 不能消除排序,所以如果有排序,一般计划不走index batched
点击(此处)折叠或打开
-
select * from a where object_id<1000 and status='VALID'
-
order by object_id desc;
-
-
Plan hash value: 782229976
-
-
------------------------------------------------------------------------------------------------
-
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-
------------------------------------------------------------------------------------------------
-
| 0 | SELECT STATEMENT | | 1 | | 997 |00:00:00.01 | 156 |
-
|* 1 | TABLE ACCESS BY INDEX ROWID | A | 1 | 452 | 997 |00:00:00.01 | 156 |
-
|* 2 | INDEX RANGE SCAN DESCENDING| IDX_A | 1 | 904 | 997 |00:00:00.01 | 71 |
-
------------------------------------------------------------------------------------------------
-
-
Predicate Information (identified by operation id):
-
---------------------------------------------------
-
-
1 - filter("STATUS"='VALID')
- 2 - access("OBJECT_ID"<1000)
-
-- batch不能消除排序,下面的SQL使用hints:BATCH_TABLE_ACCESS_BY_ROWID强制走batched,有sort order by
点击(此处)折叠或打开
-
select/*+ BATCH_TABLE_ACCESS_BY_ROWID(a) */ * from a where object_id<1000 and status='VALID'
- order by object_id desc;
Plan hash value: 2038020579
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 997 |00:00:00.01 | 23 | | | |
| 1 | SORT ORDER BY | | 1 | 452 | 997 |00:00:00.01 | 23 | 178K| 178K| 158K (0)|
|* 2 | TABLE ACCESS BY INDEX ROWID BATCHED| A | 1 | 452 | 997 |00:00:00.01 | 23 | | | |
|* 3 | INDEX RANGE SCAN DESCENDING | IDX_A | 1 | 904 | 997 |00:00:00.01 | 4 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("STATUS"='VALID')
3 - access("OBJECT_ID"<1000)
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 997 |00:00:00.01 | 23 | | | |
| 1 | SORT ORDER BY | | 1 | 452 | 997 |00:00:00.01 | 23 | 178K| 178K| 158K (0)|
|* 2 | TABLE ACCESS BY INDEX ROWID BATCHED| A | 1 | 452 | 997 |00:00:00.01 | 23 | | | |
|* 3 | INDEX RANGE SCAN DESCENDING | IDX_A | 1 | 904 | 997 |00:00:00.01 | 4 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("STATUS"='VALID')
3 - access("OBJECT_ID"<1000)
-- 使用batched回表,在回表阶段可以并行。。。索引访问还是顺序查找
点击(此处)折叠或打开
-
select/*+ parallel(8)*/ * from a where object_id<1000 and status='VALID'
-
;
-
-
-
Elapsed: 00:00:00.09
-
-
Execution Plan
-
----------------------------------------------------------
-
Plan hash value: 1337928157
-
-
-------------------------------------------------------------------------------------------------------------------------------
-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-
-------------------------------------------------------------------------------------------------------------------------------
-
| 0 | SELECT STATEMENT | | 452 | 59664 | 24 (0)| 00:00:01 | | | |
-
| 1 | PX COORDINATOR | | | | | | | | |
-
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 452 | 59664 | 24 (0)| 00:00:01 | Q1,01 | P->S | QC (RAND) |
-
|* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| A | 452 | 59664 | 24 (0)| 00:00:01 | Q1,01 | PCWP | |
-
| 4 | BUFFER SORT | | | | | | Q1,01 | PCWC | |
-
| 5 | PX RECEIVE | | 904 | | 3 (0)| 00:00:01 | Q1,01 | PCWP | |
-
| 6 | PX SEND HASH (BLOCK ADDRESS) | :TQ10000 | 904 | | 3 (0)| 00:00:01 | Q1,00 | S->P | HASH (BLOCK|
-
| 7 | PX SELECTOR | | | | | | Q1,00 | SCWC | |
-
|* 8 | INDEX RANGE SCAN | IDX_A | 904 | | 3 (0)| 00:00:01 | Q1,00 | SCWP | |
-
-------------------------------------------------------------------------------------------------------------------------------
-
-
Predicate Information (identified by operation id):
-
---------------------------------------------------
-
-
3 - filter("STATUS"='VALID')
- 8 - access("OBJECT_ID"<1000)
总结
Oracle 12c的批量回表技术通过ROWID排序和批量访问,显著降低了回表操作的I/O开销,尤其适用于低聚集因子索引和大范围查询场景。尽管可能引入额外排序成本,但其整体性能提升在多数场景下仍优于传统单行回表。开发者和DBA应结合执行计划分析与参数调优,{BANNED}最佳大化利用该特性优势。