MySQL: how can the query select * from a where id in (select id from b) be made more efficient?

For this SQL statement, the execution plan does not actually fetch all the ids from table b first and then compare them with the ids in table a.
MySQL converts the in subquery into a correlated exists subquery, so the statement is effectively equivalent to: select * from a where exists (select * from b where b.id = a.id);

How a correlated exists subquery executes: MySQL loops over table a, takes one record at a time, and checks it against table b using the condition a.id = b.id, to see whether the id of each record in a exists in b. If it does, that row of a is returned.
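A minimal sketch of the two forms side by side, assuming hypothetical tables a and b that each have an integer id column (the val column is made up for illustration). On older MySQL versions where the rewrite described above applies, explain typically labels the subquery as DEPENDENT SUBQUERY; newer versions may pick a semi-join strategy instead, so the actual output depends on your server.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE IF NOT EXISTS a (id INT NOT NULL, val VARCHAR(32));
CREATE TABLE IF NOT EXISTS b (id INT NOT NULL);

-- Original form from the question.
EXPLAIN SELECT * FROM a WHERE id IN (SELECT id FROM b);

-- Correlated form that the text describes as equivalent.
EXPLAIN SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE b.id = a.id);
```

Comparing the select_type and table columns of the two plans is a quick way to check whether your version treats them the same.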

What are the drawbacks of the exists query?
From the way exists executes, table a (the outer table) cannot use an index and must be scanned in full, because each row of a is taken and looked up in b. And it is always a's rows that are looked up in b (from the outer table to the inner table); this order is fixed.

How to optimize?
Build an index. But by the analysis above, the index is only useful on the id column of table b, not on the id column of table a, because MySQL would not use it there.
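As a small sketch of that step, assuming the same hypothetical tables a and b (the index name idx_b_id is made up):

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE IF NOT EXISTS a (id INT NOT NULL, val VARCHAR(32));
CREATE TABLE IF NOT EXISTS b (id INT NOT NULL);

-- Index the inner table's lookup column so each probe from a
-- can use an index lookup instead of scanning b.
CREATE INDEX idx_b_id ON b (id);

-- Check whether the plan now uses idx_b_id for table b.
EXPLAIN SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE b.id = a.id);
```

If the key column of the explain output shows idx_b_id for table b, the index is being used for the lookups.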

Is this optimization enough? Not quite.
The execution plan of an exists query can only take rows from table a and look them up in table b (from the outer table to the inner table). An index on the id column of b does improve query efficiency,
but the order cannot be reversed so that rows from b are looked up in a. The lookup order of an exists subquery is fixed.

Why would we want to reverse it?
First of all, reversing the order is certain to give the same result. That leads to a more specific question: when the id column is indexed in both tables, is it more efficient to take rows from a and look them up in b, or to take rows from b and look them up in a?

How to optimize further?
Rewrite the query as an inner join: select * from a inner join b on a.id = b.id; (this alone is still not enough, so read on)

Why not left join or right join?
Because with those, the join order between the tables is fixed.

For example, a left join must first do a full scan of the left table and then look up each row in the other table; a right join works the same way in the other direction. Still not the best choice.

Why does inner join work?
With an inner join of two tables, such as a inner join b, the actual execution order has nothing to do with the order in which the tables are written. The final execution may well be b joined to a; the order is not fixed. If the join condition is on an indexed column, the index can be used as well.

Then how do we know which execution order of a and b is more efficient?
A: You do not know, and I do not know. Who knows? MySQL knows. Let MySQL decide for itself (the query optimizer). For each possible join order and index choice, the MySQL query optimizer estimates the cost and picks the best execution plan.

With an inner join, MySQL evaluates for itself whether it is more efficient to take rows from a and look them up in b, or to take rows from b and look them up in a. If both tables are indexed, MySQL also evaluates whether using the index on a or the index on b for the join condition is more efficient.

What we have to do is: create an index on the join column in each of the two tables, then use explain to view the execution plan and see which index MySQL actually uses, and finally drop the index on the table whose index is not used.
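The procedure above might look like the following sketch, again assuming the hypothetical tables a and b; the index names are made up, and which index ends up unused depends on your data and MySQL version.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE IF NOT EXISTS a (id INT NOT NULL, val VARCHAR(32));
CREATE TABLE IF NOT EXISTS b (id INT NOT NULL);

-- Step 1: index the join column in both tables.
CREATE INDEX ix_join_a_id ON a (id);
CREATE INDEX ix_join_b_id ON b (id);

-- Step 2: inspect the plan. The row order of the output reflects the
-- join order, and the key column shows the index chosen per table.
EXPLAIN SELECT * FROM a INNER JOIN b ON a.id = b.id;

-- Step 3: drop the index the plan does not use. For example, if
-- ix_join_a_id never appears in the key column:
-- DROP INDEX ix_join_a_id ON a;
```

The drop statement is left commented out because the choice should follow from what explain reports on real data, not from the empty tables in this sketch.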

 

Follow-up question:

It seems we cannot say that inner join is always more efficient. I just ran a test; here are my two SQL statements:
select * from elanw_client where cUser in (select name from userinf)
select * from elanw_client a inner join userinf b on a.cUser = b.name
The first statement took 5.4s, yet the second took 8.7s.

Answer:

It is not that inner join is more efficient than in. Your in statement is rewritten as a semi join, and a semi join and an inner join do not return equivalent results in all cases. A semi join returns a left-table row as long as the right table has at least one matching row; an inner join returns the left-table row once for every matching row in the right table. If there is no index on the join key, it is normal for the semi join to be faster.

So what is the benefit of rewriting to an inner join? With an inner join the left and right tables can be swapped, so the optimizer can use statistics to choose the best join order. The join order of a semi join is fixed, and the optimizer cannot choose it. If your query happens to have a large table on the left and a small table on the right, that is a bad join order, and the optimizer cannot swap it.

The precondition for rewriting a semi join as an inner join is that the join key is a primary key or has a unique index.
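A small sketch of why the two forms are not always equivalent, using made-up tables demo_left and demo_right whose join column is deliberately not unique on the right side:

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE demo_left  (id INT NOT NULL);
CREATE TABLE demo_right (id INT NOT NULL);  -- no primary key or unique index on id

INSERT INTO demo_left  VALUES (1), (2);
INSERT INTO demo_right VALUES (1), (1), (3);

-- Semi join semantics: id 1 is returned once, however many matches exist.
SELECT l.id FROM demo_left l WHERE l.id IN (SELECT r.id FROM demo_right r);

-- Inner join: id 1 is returned twice, once per matching row in demo_right.
SELECT l.id FROM demo_left l INNER JOIN demo_right r ON l.id = r.id;
```

With a primary key or unique index on demo_right.id, the duplicate value could not be inserted, each left row would match at most one right row, and the two queries would return the same rows.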

 

Reference source, Zhihu question: https://www.zhihu.com/question/20699147

Origin blog.csdn.net/Alen_xiaoxin/article/details/104773638