8.2.1.10 Nested-Loop Join Algorithms Explained - gladness - ChinaUnix Blog

8.2.1.10 Nested-Loop Join Algorithms

MySQL executes joins between tables using a nested-loop algorithm or variations on it.

Nested-Loop Join Algorithm

A simple nested-loop join (NLJ) algorithm reads rows from the first table in a loop one at a time, passing each row to a nested loop that processes the next table in the join. This process is repeated as many times as there remain tables to be joined.

Assume that a join between three tables t1, t2, and t3 is to be executed using the following join types:

Table   Join Type
t1      range
t2      ref
t3      ALL

If a simple NLJ algorithm is used, the join is processed like this:

for each row in t1 matching range {
  for each row in t2 matching reference key {
    for each row in t3 {
      if row satisfies join conditions,
      send to client
    }
  }
}

Because the NLJ algorithm passes rows one at a time from outer loops to inner loops, it typically reads tables processed in the inner loops many times.
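As the name suggests, a nested-loop join is a set of nested loops: a join across N tables produces N levels of loops.
Here is an example with a three-table join: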
mysql> show create table one\G
*************************** 1. row ***************************
Table: one
Create Table: CREATE TABLE `one` (
`id` int(11) DEFAULT NULL,
`name` varchar(10) DEFAULT NULL,
KEY `i_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> show create table two\G
*************************** 1. row ***************************
Table: two
Create Table: CREATE TABLE `two` (
`id2` int(11) DEFAULT NULL,
`name2` varchar(10) DEFAULT NULL,
KEY `i_name2` (`name2`),
KEY `i_id2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> show create table three\G
*************************** 1. row ***************************
Table: three
Create Table: CREATE TABLE `three` (
`id3` int(11) DEFAULT NULL,
`name3` varchar(10) DEFAULT NULL,
KEY `i_id3` (`id3`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> select * from one
-> ;
+------+--------+
| id | name |
+------+--------+
| 1 | abcdef |
| 2 | cc |
| 3 | dd |
+------+--------+
3 rows in set (0.00 sec)

mysql> select * from two;
+------+-------+
| id2 | name2 |
+------+-------+
| 1 | dd |
| 2 | sss |
| 3 | cc |
+------+-------+
3 rows in set (0.00 sec)

mysql> select * from three;
+------+------------+
| id3 | name3 |
+------+------------+
| 3 | sss |
| 1 | rrrr |
| 2 | aaaaaaa |
| 4 | accccaaaaa |
| 5 | acddddccca |
| 6 | a11111 |
| 7 | a111112222 |
+------+------------+
7 rows in set (0.00 sec)

mysql> explain select id3,name3 from one,two,three where one.name=two.name2 and two.id2=three.id3;
+----+-------------+-------+------+---------------+--------+---------+------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+--------+---------+------------------+------+-------------+
| 1 | SIMPLE | two | ALL | i_name2,i_id2 | NULL | NULL | NULL | 2 | Using where |
| 1 | SIMPLE | one | ref | i_name | i_name | 13 | testdb.two.name2 | 1 | Using index |
| 1 | SIMPLE | three | ref | i_id3 | i_id3 | 5 | testdb.two.id2 | 1 | NULL |
+----+-------------+-------+------+---------------+--------+---------+------------------+------+-------------+
3 rows in set (0.00 sec)
There are three levels of loops: two is the outermost, then one, and three is the innermost.

mysql> select id3,name3 from one,two,three where one.name=two.name2 and two.id2=three.id3;
+------+-------+
| id3 | name3 |
+------+-------+
| 1 | rrrr |
| 3 | sss |
+------+-------+
2 rows in set (0.04 sec)
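
The plan above can be traced by hand. The following is a minimal Python sketch, not MySQL's implementation: the rows are copied from the sample tables, and the dictionaries one_by_name and three_by_id3 are hypothetical stand-ins for lookups through the i_name and i_id3 indexes.

```python
from collections import defaultdict

# Rows copied from the sample tables above.
one = [(1, "abcdef"), (2, "cc"), (3, "dd")]
two = [(1, "dd"), (2, "sss"), (3, "cc")]
three = [(3, "sss"), (1, "rrrr"), (2, "aaaaaaa"), (4, "accccaaaaa"),
         (5, "acddddccca"), (6, "a11111"), (7, "a111112222")]

# Hypothetical stand-ins for the i_name and i_id3 indexes (join type "ref").
one_by_name = defaultdict(list)
for row in one:
    one_by_name[row[1]].append(row)
three_by_id3 = defaultdict(list)
for row in three:
    three_by_id3[row[0]].append(row)


def nested_loop_join():
    """Follow the EXPLAIN order: two (ALL), then one (ref), then three (ref)."""
    result = []
    for id2, name2 in two:                                # outermost loop: full scan of two
        for _one_row in one_by_name.get(name2, []):       # rows with one.name = two.name2
            for id3, name3 in three_by_id3.get(id2, []):  # rows with three.id3 = two.id2
                result.append((id3, name3))
    return result


print(nested_loop_join())  # [(1, 'rrrr'), (3, 'sss')]
```

Each row of two drives one lookup into one and, for every match, one lookup into three, which is why the inner tables are visited once per outer row.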

Block Nested-Loop Join Algorithm

A Block Nested-Loop (BNL) join algorithm uses buffering of rows read in outer loops to reduce the number of times that tables in inner loops must be read. For example, if 10 rows are read into a buffer and the buffer is passed to the next inner loop, each row read in the inner loop can be compared against all 10 rows in the buffer. This reduces the number of times the inner table must be read by an order of magnitude.

MySQL uses join buffering under these conditions:

The join_buffer_size system variable determines the size of each join buffer.
Join buffering can be used when the join is of type ALL or index (in other words, when no possible keys can be used, and a full scan is done, of either the data or index rows, respectively), or range. In MySQL 5.6, use of buffering is extended to be applicable to outer joins, as described in Section 8.2.1.14, “Block Nested-Loop and Batched Key Access Joins”.
One buffer is allocated for each join that can be buffered, so a given query might be processed using multiple join buffers.
A join buffer is never allocated for the first nonconst table, even if it would be of type ALL or index.
A join buffer is allocated prior to executing the join and freed after the query is done.
Only columns of interest to the join are stored in the join buffer, not whole rows.

For the example join described previously for the NLJ algorithm (without buffering), the join is done as follows using join buffering:

for each row in t1 matching range {
  for each row in t2 matching reference key {
    store used columns from t1, t2 in join buffer
    if buffer is full {
      for each row in t3 {
        for each t1, t2 combination in join buffer {
          if row satisfies join conditions,
          send to client
        }
      }
      empty buffer
    }
  }
}

if buffer is not empty {
  for each row in t3 {
    for each t1, t2 combination in join buffer {
      if row satisfies join conditions,
      send to client
    }
  }
}

If S is the size of each stored t1, t2 combination in the join buffer and C is the number of combinations in the buffer, the number of times table t3 is scanned is:

(S * C)/join_buffer_size + 1

The number of t3 scans decreases as the value of join_buffer_size increases, up to the point when join_buffer_size is large enough to hold all previous row combinations. At that point, there is no speed to be gained by making it larger.
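
The effect of the buffer size on the number of inner-table scans can be checked with a small simulation. This is a sketch under stated assumptions, not MySQL's code: the function bnl_join and its parameters are hypothetical, and the buffer capacity is counted in row combinations (roughly join_buffer_size divided by S) rather than in bytes.

```python
def bnl_join(outer_rows, inner_rows, buffer_rows, matches):
    """Block nested-loop join; returns (joined rows, number of inner scans)."""
    result, buffer, scans = [], [], 0

    def flush():
        nonlocal scans
        scans += 1                        # one full pass over the inner table
        for inner in inner_rows:
            for outer in buffer:          # compare against every buffered row
                if matches(outer, inner):
                    result.append((outer, inner))
        buffer.clear()                    # empty buffer

    for outer in outer_rows:
        buffer.append(outer)              # store the outer combination
        if len(buffer) == buffer_rows:    # buffer is full
            flush()
    if buffer:                            # leftover combinations
        flush()
    return result, scans


outer = list(range(25))         # stands in for the t1, t2 combinations (C = 25)
inner = list(range(0, 50, 5))   # stands in for the rows of t3
for capacity in (1, 10, 25, 100):
    rows, scans = bnl_join(outer, inner, capacity, lambda o, i: o == i)
    print(capacity, scans, len(rows))
```

With 25 outer combinations, the sketch scans the inner table 25, 3, 1, and 1 times for capacities of 1, 10, 25, and 100. Once the buffer holds every combination, a larger buffer changes nothing, which matches the statement above.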

Continuing from the example above, here is another one:
mysql> explain select id3,name3 from one,two,three where one.name=two.name2 and two.id2=three.id3;
+----+-------------+-------+------+---------------+--------+---------+------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+--------+---------+------------------+------+-------------+
| 1 | SIMPLE | two | ALL | i_name2,i_id2 | NULL | NULL | NULL | 2 | Using where |
| 1 | SIMPLE | one | ref | i_name | i_name | 13 | testdb.two.name2 | 1 | Using index |
| 1 | SIMPLE | three | ref | i_id3 | i_id3 | 5 | testdb.two.id2 | 1 | NULL |
+----+-------------+-------+------+---------------+--------+---------+------------------+------+-------------+
3 rows in set (0.13 sec)

mysql> alter table one drop key i_name;
Query OK, 0 rows affected (0.59 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> explain select id3,name3 from one,two,three where one.name=two.name2 and two.id2=three.id3;
+----+-------------+-------+------+---------------+-------+---------+----------------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+----------------+------+----------------------------------------------------+
| 1 | SIMPLE | one | ALL | NULL | NULL | NULL | NULL | 2 | NULL |
| 1 | SIMPLE | two | ALL | i_name2,i_id2 | NULL | NULL | NULL | 2 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | three | ref | i_id3 | i_id3 | 5 | testdb.two.id2 | 1 | NULL |
+----+-------------+-------+------+---------------+-------+---------+----------------+------+----------------------------------------------------+
3 rows in set (0.11 sec)
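All rows of table one are read and placed in the join buffer.
The rows of table two are then read in a loop; for each row read, the join buffer is searched for matching rows.
When a match is found in the join buffer, execution enters the next loop level, which reads table three through the index i_id3.
When a matching row is found in three, that result row is sent to the network-layer interface.
The loops at each level then continue.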
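
The steps above can be sketched in Python. This is only an illustration of the described flow, assuming the whole of table one fits in the buffer; the list join_buffer and the dictionary three_by_id3 are hypothetical stand-ins for MySQL's join buffer and the i_id3 index.

```python
from collections import defaultdict

# Rows copied from the sample tables shown earlier.
one = [(1, "abcdef"), (2, "cc"), (3, "dd")]
two = [(1, "dd"), (2, "sss"), (3, "cc")]
three = [(3, "sss"), (1, "rrrr"), (2, "aaaaaaa"), (4, "accccaaaaa"),
         (5, "acddddccca"), (6, "a11111"), (7, "a111112222")]

# Hypothetical stand-in for the i_id3 index on three.
three_by_id3 = defaultdict(list)
for row in three:
    three_by_id3[row[0]].append(row)


def buffered_plan():
    # Only the column the join needs (one.name) goes into the buffer.
    join_buffer = [name for _id, name in one]
    result = []
    for id2, name2 in two:                  # a single scan of two
        for name in join_buffer:            # search the buffer for matches
            if name == name2:
                # Next loop level: index lookup on three.id3 = two.id2.
                for id3, name3 in three_by_id3.get(id2, []):
                    result.append((id3, name3))
    return result


print(buffered_plan())  # [(1, 'rrrr'), (3, 'sss')]
```

Because table one no longer has an index on name, buffering its rows lets table two be scanned once instead of once per row of one.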