15. ClustrixDB: Managing Data Distribution

 

 

Key terms used in this section:

 

Relation  -  In ClustrixDB, each table is called a "relation".

Representation  -  In ClustrixDB, each index is called a "representation". Table data is stored in the "base representation", an index that covers all columns of the table. For a table with a primary key, the base representation is keyed by that primary key and holds the row data.

Distribution key  -  Each representation uses a consistent hashing algorithm to hash all or part of its index. The "distribution key" defines which columns of the index are used to build the hash. The default distribution is 1, which means that the first column of the index is hashed and becomes the distribution key of the representation.

Slices  -  ClustrixDB breaks each representation into smaller, more manageable pieces called "slices". The slices are then distributed across the cluster so that query processing is spread evenly.

Replicas  -  ClustrixDB maintains multiple copies of each slice to provide fault tolerance and high availability. The replicas are distributed across the cluster to optimize performance and to ensure that all data stays protected if a node fails.

 

DISTRIBUTE

ClustrixDB hashes each index to determine where in the cluster a given row of a table or index (representation) should be located. The hashed columns are referred to as the "distribution key" of the representation. Each index carries a setting that says how many of its columns are included in the distribution key.

By default, the distribution key uses only the first column of the index, regardless of how many columns the index has. This applies to all indexes, including the primary key.
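The default rule described above can be sketched in a few lines of Python. This is only an illustration: the hash function (SHA-256), the equal-width hash ranges, and the names `distribution_key` and `slice_for` are assumptions made for this sketch and are not ClustrixDB's actual implementation.

```python
import hashlib

HASH_SPACE = 2 ** 32  # illustrative 32-bit hash space (assumption)


def distribution_key(index_columns, row, distribute=1):
    """Return the values of the first `distribute` index columns of a row."""
    return tuple(row[col] for col in index_columns[:distribute])


def slice_for(key, num_slices):
    """Map a distribution key to a slice number.

    The hash space is split into `num_slices` equal ranges and the key is
    placed in the range that contains its hash value. This stands in for
    the real hash-range lookup, which is not described in this article.
    """
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")  # first 32 bits of the digest
    return value * num_slices // HASH_SPACE


# Hypothetical row and index, modeled on the user_posts table used below.
row = {"post_id": 42, "user_id": 7, "posted_on": "2019-12-02 10:00:00"}
index_columns = ["user_id", "posted_on"]

key = distribution_key(index_columns, row)  # default: first column only
print(key, "-> slice", slice_for(key, num_slices=4))
```

With the default setting only `user_id` enters the key, so every index entry for the same user hashes to the same slice in this sketch.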

The DISTRIBUTE = clause can be used to override the single-column default and define how many columns of the index are hashed.

  • Using the default distribution
  1. Expanding the distribution key to include more index columns
  2. Expanding the distribution key to include primary key columns
  • Modifying the distribution
  1. Modifying the distribution - primary key
  2. Modifying the distribution - alternate key

 

Using the default distribution

In this example, the value of the primary key post_id is hashed to distribute the table data, and the distribution is left at the default, DISTRIBUTE = 1.

sql> CREATE TABLE user_posts (
     post_id int AUTO_INCREMENT, 
     user_id int,  
     posted_on timestamp, 
     data blob, 
     PRIMARY KEY (`post_id`) /*$ DISTRIBUTE=1 */,
     KEY `user_id_posted_on_idx` (`user_id`,`posted_on`) /*$ DISTRIBUTE=1 */
     );

In some cases, distributing data and indexes on a single column can result in a poor or "lumpy" distribution. To address this, we recommend either placing the most unique (selective) column first in a composite index, or expanding the distribution key from a single column to multiple columns. There are two ways to expand the distribution key.

1. Expanding the distribution key to include more index columns

The following example shows the multi-column alternate key user_id_posted_on_idx distributed on two of its index columns instead of only the first.

sql> CREATE TABLE user_posts (
     post_id int AUTO_INCREMENT, 
     user_id int,  
     posted_on timestamp, 
     data blob, 
     PRIMARY KEY (`post_id`) /*$ DISTRIBUTE=1 */,
     KEY `user_id_posted_on_idx` (`user_id`,`posted_on`) /*$ DISTRIBUTE=2 */
     );
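To see why the wider key helps, the following minimal sketch compares how many slices receive index entries when hashing one column versus two. The data (two users with 500 posts each), the slice count of 8, the SHA-256 hash, and the helper names are all assumptions chosen for illustration; they do not reflect ClustrixDB internals.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 32  # illustrative 32-bit hash space (assumption)


def slice_for(key, num_slices):
    """Place a key in one of `num_slices` equal ranges of the hash space."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") * num_slices // HASH_SPACE


def rows_per_slice(rows, index_columns, distribute, num_slices):
    """Count index entries per slice for a given DISTRIBUTE setting."""
    counts = Counter()
    for row in rows:
        key = tuple(row[col] for col in index_columns[:distribute])
        counts[slice_for(key, num_slices)] += 1
    return counts


# Hypothetical skewed data: only two distinct user_id values.
rows = [{"user_id": u, "posted_on": t} for u in (1, 2) for t in range(500)]
index_columns = ["user_id", "posted_on"]

one_column = rows_per_slice(rows, index_columns, distribute=1, num_slices=8)
two_columns = rows_per_slice(rows, index_columns, distribute=2, num_slices=8)

print("DISTRIBUTE=1:", dict(one_column))
print("DISTRIBUTE=2:", dict(two_columns))
```

With one hashed column there are only two distinct keys, so at most two slices can ever hold entries. With two hashed columns every (user_id, posted_on) pair is a separate key and the entries spread over many more slices.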

 

 

2. Expanding the distribution key to include primary key columns

The following example shows the alternate key user_id_posted_on_idx with a distribution of 3. This means that the index is distributed on its two columns (user_id, posted_on) plus the primary key (post_id). If the primary key is a composite key, the distribution can be expanded further to include the other columns of the primary key.

sql> CREATE TABLE user_posts (
     post_id int AUTO_INCREMENT, 
     user_id int,  
     posted_on timestamp, 
     data blob, 
     PRIMARY KEY (`post_id`) /*$ DISTRIBUTE=1 */,
     KEY `user_id_posted_on_idx` (`user_id`,`posted_on`) /*$ DISTRIBUTE=3 */
     );

Modifying the distribution

Modifying the distribution - primary key

To modify the distribution of the primary key after the table has been created, use the following ALTER TABLE syntax:

ALTER TABLE tbl_name PRIMARY KEY [DISTRIBUTE = n]

The distribution count of the primary key cannot exceed the number of columns in the primary key.

Modifying the distribution - alternate key

To modify the distribution of other (non-primary-key) indexes after the table has been created, specify the index name in ALTER TABLE using the following syntax:

ALTER TABLE tbl_name [, INDEX index_name [DISTRIBUTE = n]] [, INDEX index_name [DISTRIBUTE = n]]

The maximum distribution value for a non-primary-key index is the number of columns in the alternate key plus the number of columns in the primary key.
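The two limits stated in this section can be checked with a small helper before issuing an ALTER TABLE. This is a sketch under stated assumptions: the function names are hypothetical, and the lower bound of 1 is an assumption based on the default value, not something the syntax above spells out.

```python
def max_distribute(index_columns, primary_key_columns, is_primary_key=False):
    """Return the largest DISTRIBUTE value allowed by the limits above.

    Primary key: the number of primary key columns.
    Alternate key: its own columns plus the primary key columns.
    """
    if is_primary_key:
        return len(primary_key_columns)
    return len(index_columns) + len(primary_key_columns)


def check_distribute(n, index_columns, primary_key_columns, is_primary_key=False):
    """Raise ValueError if n is outside the allowed range (lower bound 1 assumed)."""
    limit = max_distribute(index_columns, primary_key_columns, is_primary_key)
    if not 1 <= n <= limit:
        raise ValueError(f"DISTRIBUTE = {n} is outside the range 1..{limit}")
    return n


# Columns of the user_posts example table.
primary_key = ["post_id"]
alternate_key = ["user_id", "posted_on"]

print(max_distribute(primary_key, primary_key, is_primary_key=True))
print(max_distribute(alternate_key, primary_key))
```

For user_posts this gives a maximum of 1 for the primary key and 3 for user_id_posted_on_idx, which matches the DISTRIBUTE=3 example earlier in this section.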

 

 

 

Origin www.cnblogs.com/yuxiaohao/p/11972058.html