SQL Optimization, Chapter 1: Data Storage and Indexes

References:
The main references for this blog series are the database teaching notes of teacher Rannai Gang at CUUG, "SQL optimization core idea" (by Luobing Sen, Huang Chao, and Zhong Jiao), and "PostgreSQL Inside: query optimization Depth" (by Zhang Shujie), listed in no particular order.

 

1 Foreword

When we do SQL optimization, the most direct tool is the index, although it is also a double-edged sword. Since we are going to talk about indexes, we first need to look at how data is stored in tables. Only then do we cover the basic principles of indexes: why we build one and how to use one. This also lays the groundwork for the later material on table joins.

2 Data storage

Oracle has a complete set of physical and logical storage structures for data. We will not go into them here; if there is demand, we will cover them in a separate topic. Here we focus on three points.

(1) Data is stored unordered. The reason is simple: storing rows in order costs time and resources and works against storage optimization, for example by creating hot blocks. (If hot blocks are unfamiliar, skip over them for now; they are the DBA's concern, no offense to the DBAs.)

(2) Data is read from disk into memory by block, not by row. The reason is simple: the block size is fixed, so reads are fast. Sometimes a single block is read at a time and sometimes several blocks are read at once; we will analyze this later together with indexes.

(3) Each record in the database has exactly one fixed pseudo-column called rowid. Access by rowid is the fastest way to locate data, because rowid = data object number + file number + block number + row number. Put simply, we can think of the rowid as the physical address of the record.
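To make point (3) concrete, here is a minimal sketch that splits a rowid into its four parts. It assumes Oracle's 18-character extended rowid layout (6 characters for the data object number, 3 for the file number, 6 for the block number, 3 for the row number, each in a base-64 alphabet). The function names are made up for illustration and are not an Oracle API.

```python
import string

# Base-64 alphabet assumed for extended rowids: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def _decode(chunk: str) -> int:
    """Interpret a run of base-64 digits as an unsigned integer."""
    value = 0
    for ch in chunk:
        value = value * 64 + ALPHABET.index(ch)
    return value


def _encode(value: int, width: int) -> str:
    """Encode an integer as exactly `width` base-64 digits."""
    digits = []
    for _ in range(width):
        value, rem = divmod(value, 64)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def split_rowid(rowid: str) -> dict:
    """Split an 18-character rowid into its four components."""
    if len(rowid) != 18:
        raise ValueError("expected an 18-character extended rowid")
    return {
        "object_number": _decode(rowid[0:6]),
        "file_number": _decode(rowid[6:9]),
        "block_number": _decode(rowid[9:15]),
        "row_number": _decode(rowid[15:18]),
    }


def make_rowid(object_number: int, file_number: int,
               block_number: int, row_number: int) -> str:
    """Hypothetical helper: build a rowid string from its four parts."""
    return (_encode(object_number, 6) + _encode(file_number, 3)
            + _encode(block_number, 6) + _encode(row_number, 3))


print(split_rowid(make_rowid(1, 4, 151, 0)))
```

The point of the sketch is only that a rowid already names the file, block, and row, so no searching is needed once it is known.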

Putting these three points together: given a table, how do we find the data we want? We can only read it from start to finish and pick out what we want. That is a full table scan.

So here is the problem: suppose the table is large and is queried many times a day. If every query is a full table scan, the I/O cost alone is enough to bring the system down. The solution is what we discuss next: the index.
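A toy model of the situation described above, as a sketch only: rows are packed into fixed-size blocks in arrival order, and a lookup without an index has to read every block. `ROWS_PER_BLOCK`, the sample values, and the (block number, row number) pair standing in for a rowid are all assumptions made for illustration.

```python
ROWS_PER_BLOCK = 4  # hypothetical; a real block holds as many rows as fit


def make_heap(values):
    """Pack rows into fixed-size blocks in arrival order (no sorting)."""
    return [values[i:i + ROWS_PER_BLOCK]
            for i in range(0, len(values), ROWS_PER_BLOCK)]


def full_table_scan(blocks, wanted):
    """Read every block and return (matching toy rowids, blocks read).

    A toy rowid here is just (block number, row number within the block).
    """
    hits = []
    blocks_read = 0
    for block_no, block in enumerate(blocks):
        blocks_read += 1  # the whole block is read, not a single row
        for row_no, value in enumerate(block):
            if value == wanted:
                hits.append((block_no, row_no))
    return hits, blocks_read


heap = make_heap([9, 3, 7, 1, 10, 5, 2, 8, 6, 4])
hits, blocks_read = full_table_scan(heap, 7)
print(hits, blocks_read)  # [(0, 2)] 3
```

Even though the match sits in the first block, the scan cannot stop early, because unordered storage gives no guarantee that later blocks hold no further matches.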

3 Index

3.1 A simple explanation of indexes: balanced tree + doubly linked list

We only discuss B-tree indexes here; for bitmap indexes, you can look them up online yourself.

First question: what is an index? Briefly, an index is a tree structure plus a doubly linked list. If that does not make sense yet, it does not matter; we will peel back the terms layer by layer.

Assume the data is stored as a balanced tree plus a doubly linked list. For data lookups, we have the following analysis:

(1) Suppose we are looking for the number 7. Start at the root, 5: 7 is greater than 5, so take the right branch to 8; 7 is less than 8, so take the left branch to 7. We locate the data in only three steps. This is in effect an index unique scan;

(2) Suppose we are looking for data between 7 and 9. Locate 7 as in (1), then scan to the right along the doubly linked list at the bottom: 8, 9, and then 10, which is greater than 9, so the scan stops. This is in effect an index range scan;

(3) The index skip scan cannot be explained with this structure; we will explain it later with a case study of index usage.
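The two scans in (1) and (2) can be sketched as follows. This is a simplification under stated assumptions: the keys are 1 to 10, the rowid strings are made up, `bisect_left` stands in for the descent through the tree, and stepping through a Python list stands in for following the linked list to the right.

```python
from bisect import bisect_left

# Sorted leaf entries: (key, rowid). The rowid strings are made up.
LEAVES = [(k, f"rowid-{k}") for k in range(1, 11)]
KEYS = [k for k, _ in LEAVES]


def unique_scan(key):
    """Binary-search for one key; return its rowid, or None if absent."""
    i = bisect_left(KEYS, key)
    if i < len(KEYS) and KEYS[i] == key:
        return LEAVES[i][1]
    return None


def range_scan(low, high):
    """Locate `low`, then walk right until a key exceeds `high`.

    Returns (rowids found, keys visited) so the stop condition is visible.
    """
    i = bisect_left(KEYS, low)
    found, visited = [], []
    while i < len(LEAVES):
        key, rowid = LEAVES[i]
        visited.append(key)
        if key > high:
            break  # first key past the range: stop scanning
        found.append(rowid)
        i += 1
    return found, visited


print(unique_scan(7))    # rowid-7
print(range_scan(7, 9))  # (['rowid-7', 'rowid-8', 'rowid-9'], [7, 8, 9, 10])
```

Note that the range scan touches key 10 only to learn that it can stop, which mirrors the description in (2).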

Building the balanced tree and doubly linked list comes down to two key ideas: sorting + binary search.

(1) First, sort the numbers in some order; here we use ascending order.

(2) Then bisect: take the value near the middle as the root, here 5; then take the value near the middle of the left and right segments as the first-level branches, here 2 and 8; and so on until every value has been placed.
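The two construction steps can be sketched like this, assuming the keys are the numbers 1 to 10 and that the lower of the two middle values is chosen for even-length segments. Under those assumptions the root is 5 and the first-level branches are 2 and 8, as in the text; below the first level the shape depends on that tie-breaking choice, so deeper nodes may not match a hand-drawn tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(sorted_keys, lo=0, hi=None):
    """Recursively pick the lower-middle key of each segment as its root."""
    if hi is None:
        hi = len(sorted_keys) - 1
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return Node(
        key=sorted_keys[mid],
        left=build(sorted_keys, lo, mid - 1),
        right=build(sorted_keys, mid + 1, hi),
    )


def height(node):
    """Number of levels in the tree rooted at `node`."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


keys = sorted([9, 3, 7, 1, 10, 5, 2, 8, 6, 4])  # step (1): sort
root = build(keys)                               # step (2): bisect
print(root.key, root.left.key, root.right.key)   # 5 2 8
print(height(root))                              # 4
```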

So the question is: is an index ordered?

The answer is yes; the very first step is sorting!

From this, the pros and cons of an index follow:

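(1) The benefit is fast lookups. Think of the saying that no newspaper can be folded in half 20 times in a row: each fold doubles the number of layers, so 20 folds give about a million layers. In the same way, about 20 halvings are enough to find one value among a million sorted values;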

(2) The drawback is also obvious: the data must be sorted, and every time the data changes the index structure and order must be maintained. This is why indexes slow down insert, update, and delete operations;

(3) An index can sometimes hurt SELECT operations as well. It is too early to go into that here; we explain it in Chapter 3, "Full table scans and index scans in SELECT".
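The "fold 20 times" intuition in (1) can be checked with a small sketch that counts the halving steps of a plain binary search. The key set (the integers 0 to 2**20 - 1) and the function name are assumptions chosen for illustration; this is not how Oracle walks an index internally.

```python
def binary_search_steps(sorted_keys, target):
    """Return (index of target or -1, number of halving steps taken)."""
    lo, hi = 0, len(sorted_keys)
    steps = 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return (lo if found else -1), steps


keys = list(range(2 ** 20))  # about a million sorted keys
index, steps = binary_search_steps(keys, 777_777)
print(index, steps)  # the step count comes out at roughly 20
```

A scan over the same unsorted data would have to compare against up to a million values, which is the gap that makes the maintenance cost in (2) worth paying for frequently queried columns.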

3.2 Index storage structure

Above we used a balanced binary tree only as a simple way to understand indexes. In reality an index is not a binary tree: if it were, the index would be very tall, that is, it would have many levels, and descending level by level would slow down lookups.

In the actual data structure, each node holds key-value pairs: the key is the value of the indexed column and the value is the ROWID, which allows the table row to be located quickly.

This is the index structure.
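To see why a binary tree would be too tall, the following sketch counts how many levels of branching are needed to tell a given number of entries apart for different fanouts (children per node). The fanout of 100 is a hypothetical figure picked for illustration, not a measured Oracle value; real fanout depends on block size and key length.

```python
def levels_needed(entries: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout ** h >= entries."""
    if entries < 1 or fanout < 2:
        raise ValueError("entries must be >= 1 and fanout >= 2")
    levels, capacity = 0, 1
    while capacity < entries:
        capacity *= fanout
        levels += 1
    return levels


for fanout in (2, 100):  # 100 is a hypothetical fanout, not an Oracle figure
    print(fanout, levels_needed(1_000_000, fanout))
# 2 20
# 100 3
```

With a million entries, a binary tree needs about 20 levels while a node with 100 children per level needs 3, and each level descended is typically one more block to read.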

3.3 Summary

(1) For data storage, it is enough to understand three concepts: unordered storage, reading by block, and ROWID;

(2) For the index structure, it is enough to understand that an index is a tree structure plus a doubly linked list whose nodes store the column value and the rowid. We are doing SQL optimization, not studying index data structures specifically, so a working understanding is enough; there is no need to get to the bottom of it.


Origin blog.csdn.net/songjian1104/article/details/91349914