Problem description
See the question "This SELECT query takes 180 seconds to finish" (check the comments on the question itself).
The IN is compared against only one value, yet the time difference is still enormous.
Why is it like that?
Summary: This is a known problem in MySQL that was fixed in MySQL 5.6.x. It is caused by a missing optimization: a subquery using IN is incorrectly identified as a dependent subquery instead of an independent subquery.
When you run EXPLAIN on the original query it returns this:
1 'PRIMARY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where'
2 'DEPENDENT SUBQUERY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where'
3 'DEPENDENT SUBQUERY' 'question_law' 'ALL' '' '' '' '' 10040 'Using where'
When you change IN to = you get this:
1 'PRIMARY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where'
2 'SUBQUERY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where'
3 'SUBQUERY' 'question_law' 'ALL' '' '' '' '' 10040 'Using where'
Each dependent subquery is run once per row of the query that contains it, whereas an independent (non-dependent) subquery is run only once. MySQL can sometimes optimize dependent subqueries when there is a condition that can be converted to a join, but that is not the case here.
This of course leaves the question of why MySQL believes that the IN version needs to be a dependent subquery. I have made a simplified version of the query to help investigate this. I created two tables, 'foo' and 'bar', where the former contains only an id column and the latter contains both an id and a foo_id (though I didn't create a foreign key constraint). Then I populated both tables with 1000 rows:
CREATE TABLE foo (id INT PRIMARY KEY NOT NULL);
CREATE TABLE bar (id INT PRIMARY KEY, foo_id INT NOT NULL);
-- populate tables with 1000 rows in each
SELECT id
FROM foo
WHERE id IN
(
SELECT MAX(foo_id)
FROM bar
);
This simplified query has the same problem as before: the inner select is treated as a dependent subquery and no optimization is performed, causing the inner query to be run once per row. The query takes almost one second to run. Changing the IN to = again allows the query to run almost instantly.
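To compare the two plans on the simplified tables, EXPLAIN can be run on both variants. This is a minimal sketch that assumes the foo and bar tables defined above and a MySQL version before 5.6. The select_type values in the comments are what the reasoning above predicts, not output captured from a run. The last query is a commonly used rewrite, not part of the original answer, that moves the aggregate into a derived table so it is evaluated once. The names m and max_foo_id are illustrative.

```sql
-- IN variant: on affected versions (before MySQL 5.6) the inner SELECT
-- is expected to be reported as DEPENDENT SUBQUERY.
EXPLAIN
SELECT id
FROM foo
WHERE id IN (SELECT MAX(foo_id) FROM bar);

-- = variant: the inner SELECT is expected to be reported as SUBQUERY.
EXPLAIN
SELECT id
FROM foo
WHERE id = (SELECT MAX(foo_id) FROM bar);

-- Possible rewrite (an assumption, not from the original answer):
-- compute the aggregate in a derived table and join against it.
SELECT foo.id
FROM foo
JOIN (SELECT MAX(foo_id) AS max_foo_id FROM bar) AS m
  ON foo.id = m.max_foo_id;
```

The = form only works because the subquery returns a single row. The join form also works if the subquery is later changed to return several values.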
The code I used to populate the tables is below, in case anyone wishes to reproduce the results.
CREATE TABLE filler (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=Memory;
DELIMITER $$
CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
    DECLARE _cnt INT;
    SET _cnt = 1;
    WHILE _cnt <= cnt DO
        INSERT
        INTO filler
        SELECT _cnt;
        SET _cnt = _cnt + 1;
    END WHILE;
END
$$
DELIMITER ;
CALL prc_filler(1000);
INSERT foo SELECT id FROM filler;
INSERT bar SELECT id, id FROM filler;
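After running the population script, a quick sanity check can confirm that both tables hold the expected rows before timing anything. This sketch assumes the script above ran once against empty tables. The cleanup statements are optional and not part of the original answer.

```sql
-- Each count should be 1000 if prc_filler(1000) ran once on an empty filler table.
SELECT COUNT(*) FROM foo;
SELECT COUNT(*) FROM bar;

-- Optional cleanup of the helper objects once the test tables are populated.
DROP PROCEDURE IF EXISTS prc_filler;
DROP TABLE IF EXISTS filler;
```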