In MySQL, LIMIT makes fast paging easy, but once a table holds several million rows, LIMIT must be optimized or paging becomes unreasonably slow and can even bog down the server.
Paging turns into a real problem when a table holds millions of rows.
A query like select * from table limit 0,10 is fine, but at limit 200000,10 reading the data becomes very slow. The first page is always fast; deep pages are the problem. The methods below address this.
At the Percona Performance Conference 2009, several engineers from Yahoo presented a report titled "Efficient Pagination Using MySQL".
limit 10000,20 means: scan 10020 rows that meet the conditions, throw away the first 10000, and return the last 20. That is exactly where the problem lies.
LIMIT 451350, 30 scans more than 450,000 rows; no wonder it is slow enough to block.
By contrast, a statement like limit 30 scans only 30 rows.
So if we record the maximum ID of the previous page, we can exploit it.
For example, the usual paging SQL:
select id,name,content from users order by id asc limit 100000,20
scans 100020 rows.
If the maximum ID of the last page was recorded:
select id,name,content from users where id>100073 order by id asc limit 20
scans only 20 rows.
The table holds about 5 million rows in total.
For example, select * from wl_tagindex where byname='f' order by id limit 300000,10 takes 3.21s to execute.
Optimized:
select * from (
select id from wl_tagindex
where byname='f' order by id limit 300000,10
) a
left join wl_tagindex b on a.id=b.id
Execution time drops to 0.11s, a significant improvement.
Note that the fields used here are byname and id; these two columns need to be in a composite index, otherwise the improvement will not be significant.
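For reference, such a composite index could be created like this (the index name is only illustrative):

```sql
-- Composite index on (byname, id) so the subquery can be served from the index
ALTER TABLE wl_tagindex ADD INDEX idx_byname_id (byname, id);
```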
Summary
When a table is too large and the offset in LIMIT offset, length is too large, the query becomes very slow. You need an ORDER BY, and the ORDER BY column needs an index.
If you optimize LIMIT with a subquery, the rows must be continuous; in effect the subquery should not carry a WHERE condition, because WHERE filters rows and destroys continuity.
If the rows you fetch are wide and the transferred volume is large (for example, columns of TEXT type), you can use a subquery to fetch the ids first.
SELECT id,title,content FROM items WHERE id IN (SELECT id FROM (SELECT id FROM items ORDER BY id LIMIT 900000, 10) AS t);
(MySQL does not allow LIMIT directly inside an IN subquery, so the inner query is wrapped in a derived table.)
If the LIMIT offset is large, you can reduce it to offset = 0 by passing the primary key value; the primary key is ideally an auto_increment int.
SELECT * FROM users WHERE uid > 456891 ORDER BY uid LIMIT 0, 10;
which is roughly equivalent to:
SELECT * FROM users WHERE uid >= (SELECT uid FROM users ORDER BY uid limit 895682, 1) limit 0, 10;
If the LIMIT offset is very large, the user is also tired of paging that deep. You can set a maximum offset and handle anything beyond it separately; endless deep paging makes for a poor experience, so offer the user a better one.
LIMIT paging optimization methods
1. Subquery optimization method
First locate the first row of the page; the rows to fetch are those with ids greater than or equal to it.
Disadvantage: the data must be continuous, which effectively rules out a WHERE condition; WHERE filters rows and destroys continuity.
An experiment:
mysql> set profiling=1;
Query OK, 0 rows affected (0.00 sec)
mysql> select count(*) from Member;
+----------+
| count(*) |
+----------+
|   169566 |
+----------+
1 row in set (0.00 sec)
mysql> pager grep !~-
PAGER set to 'grep !~-'
mysql> select * from Member limit 10, 100;
100 rows in set (0.00 sec)
mysql> select * from Member where MemberID >= (select MemberID from Member limit 10,1) limit 100;
100 rows in set (0.00 sec)
mysql> select * from Member limit 1000, 100;
100 rows in set (0.01 sec)
mysql> select * from Member where MemberID >= (select MemberID from Member limit 1000,1) limit 100;
100 rows in set (0.00 sec)
mysql> select * from Member limit 100000, 100;
100 rows in set (0.10 sec)
mysql> select * from Member where MemberID >= (select MemberID from Member limit 100000,1) limit 100;
100 rows in set (0.02 sec)
mysql> nopager
PAGER set to stdout
mysql> show profiles\G
*************************** 1. row ***************************
Query_ID: 1
Duration: 0.00003300
Query: select count(*) from Member
*************************** 2. row ***************************
Query_ID: 2
Duration: 0.00167000
Query: select * from Member limit 10, 100
*************************** 3. row ***************************
Query_ID: 3
Duration: 0.00112400
Query: select * from Member where MemberID >= (select MemberID from Member limit 10,1) limit 100
*************************** 4. row ***************************
Query_ID: 4
Duration: 0.00263200
Query: select * from Member limit 1000, 100
*************************** 5. row ***************************
Query_ID: 5
Duration: 0.00134000
Query: select * from Member where MemberID >= (select MemberID from Member limit 1000,1) limit 100
*************************** 6. row ***************************
Query_ID: 6
Duration: 0.09956700
Query: select * from Member limit 100000, 100
*************************** 7. row ***************************
Query_ID: 7
Duration: 0.02447700
Query: select * from Member where MemberID >= (select MemberID from Member limit 100000,1) limit 100
The results show that once the offset grows beyond roughly 1000, the subquery form improves performance noticeably.
2. Inverted table optimization method
The inverted-table method resembles building an index: a separate table maintains the page-to-row mapping, and the data is then fetched through an efficient join.
Disadvantages: only suits a fixed amount of data; rows cannot be deleted; the page table is hard to maintain.
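As a sketch of the idea (table and column names are hypothetical), a page table stores the first id of each page, and a join replaces the large offset:

```sql
-- Hypothetical page table: one row per page, recording the first id on that page
CREATE TABLE page_index (
  page     INT NOT NULL PRIMARY KEY,
  first_id INT NOT NULL
);

-- Fetch page 7501 (40 rows per page) without a large offset
SELECT t.*
FROM page_index p
JOIN wl_tagindex t ON t.id >= p.first_id
WHERE p.page = 7501
ORDER BY t.id
LIMIT 40;
```

The page table must be rebuilt whenever rows are inserted or removed, which is why this method only suits static data.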
3. Reverse search optimization method
When the offset exceeds half the record count, sort in the opposite direction first, so the offset is counted from the other end.
Disadvantages: ORDER BY optimization is fiddly; it requires an index, the index slows data modification, and you must know the total record count and that the offset exceeds half of it.
The limit offset algorithm:
Forward lookup: (current page - 1) * page length
Reverse lookup: total records - current page * page length
An experiment to see how it performs:
Total records: 1,628,775
Records per page: 40
Total pages: 1,628,775 / 40 ≈ 40720
Middle page: 40720 / 2 = 20360
Page 21000:
Forward lookup SQL:
SELECT * FROM `abc` WHERE `BatchID` = 123 LIMIT 839960, 40
Time: 1.8696 seconds
Reverse lookup SQL:
SELECT * FROM `abc` WHERE `BatchID` = 123 ORDER BY InputDate DESC LIMIT 788775, 40
Time: 1.8336 seconds
Page 30000:
Forward lookup SQL:
SELECT * FROM `abc` WHERE `BatchID` = 123 LIMIT 1199960, 40
Time: 2.6493 seconds
Reverse lookup sql:
SELECT * FROM `abc` WHERE `BatchID` = 123 ORDER BY InputDate DESC LIMIT 428775, 40
Time: 1.0035 seconds
Note that the reverse lookup returns results in descending order (DESC), and InputDate is the record's insertion time. The primary key could serve as the sort index instead, though that is less convenient.
4. Offset-cap optimization method
Cap the LIMIT offset below some threshold; beyond it, simply return no data. Reportedly the DBAs at Alibaba said they did exactly this.
5. Index-only lookup method
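The idea is to page through the index alone and fetch the full rows afterward, much like the wl_tagindex example above. A minimal sketch, assuming a table items with primary key id:

```sql
-- Page through the primary-key index only, then join back for the full rows
SELECT i.*
FROM (SELECT id FROM items ORDER BY id LIMIT 900000, 10) AS page
JOIN items i ON i.id = page.id;
```

The inner query can be satisfied entirely from the index, so the 900,000 skipped rows are never read from the table itself.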