The Road to Learning Elasticsearch: Day 08

This article is reprinted from: https://www.elastic.co/guide/cn/elasticsearch/guide/current/index.html , ES version 6.3.0

Routing a document to a shard
When you index a document, it is stored on a single primary shard. How does Elasticsearch know which shard a document belongs to? When you create a new document, how does it know whether to store that document on shard 1 or shard 2?
The process cannot be random, because the document has to be retrieved again later. In fact, the shard is determined by a simple formula:

shard = hash(routing) % number_of_primary_shards

The routing value is an arbitrary string that defaults to the document's _id but can also be set to a custom value. The routing string is passed through a hash function to produce a number, which is divided by the number of primary shards to obtain a remainder. The remainder always falls in the range 0 to number_of_primary_shards - 1, and it is the number of the shard on which a particular document lives. This also explains why the number of primary shards can be set only when an index is created and cannot be changed afterwards: if the number of primary shards changed later, all previous routing values would become invalid and the documents could never be found again.
All document APIs (get, index, delete, bulk, update, and mget) accept a routing parameter that customizes the mapping from a document to a shard. A custom routing value can ensure that all related documents, for example all documents belonging to the same user, are stored on the same shard. The "Designing for Scale" section explains why you might want to do this.
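
The routing formula can be tried out with a small sketch. This is only an illustration under stated assumptions: pick_shard is a hypothetical helper, and zlib.crc32 is a stand-in hash, not the hash function Elasticsearch uses internally. It shows that the result always falls in the valid range, that a custom routing value always lands on the same shard, and that changing the number of primary shards moves documents to different shard numbers.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Hypothetical helper: shard = hash(routing) % number_of_primary_shards.

    zlib.crc32 is a stand-in hash chosen for illustration only; it is
    NOT the hash function Elasticsearch uses internally.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


# By default the routing value is the document's _id.
doc_ids = ["1", "2", "3", "4", "5", "6", "7", "8"]
with_two_shards = {doc_id: pick_shard(doc_id, 2) for doc_id in doc_ids}
with_three_shards = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}

# Documents whose shard number differs after changing the shard count
# would be looked up on the wrong shard and never found.
moved = [d for d in doc_ids if with_two_shards[d] != with_three_shards[d]]
print("shard per _id with 2 primaries:", with_two_shards)
print("documents that would move with 3 primaries:", moved)

# A custom routing value (here a made-up user key) always maps to the
# same shard, so all documents routed by it are stored together.
print("shard for routing 'user_42':", pick_shard("user_42", 2))
```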

Note: Users sometimes think that a fixed number of primary shards makes it very difficult to scale out later. In reality, there are techniques that make scaling out easy when you need it. They are discussed in the "Designing for Scale" section.

How primary and replica shards interact
For demonstration purposes, assume a cluster with three nodes. It contains one index called bblogs that has two primary shards. Each primary shard has two replicas. Copies of the same shard are never placed on the same node, so every node ends up holding one copy of each shard.

We can send a request to any node in the cluster. Every node is capable of handling any request. Every node knows the location of every document in the cluster, so it can forward a request directly to the node that holds it. In the following examples, we send all requests to Node1, which we will call the requesting node.

Note: When sending requests, the best practice is to round-robin through all the nodes in the cluster in order to balance the load.
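
A client-side round-robin can be sketched in a few lines. This is a minimal illustration, not a real client: the node names come from the example cluster, and choose_node is a hypothetical helper.

```python
from itertools import cycle

# Node names taken from the three-node example cluster.
nodes = ["Node1", "Node2", "Node3"]
next_node = cycle(nodes)


def choose_node() -> str:
    """Hypothetical helper: return the node that receives the next request."""
    return next(next_node)


targets = [choose_node() for _ in range(6)]
print(targets)
# → ['Node1', 'Node2', 'Node3', 'Node1', 'Node2', 'Node3']
```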

Creating, indexing, and deleting a document
Create, index, and delete requests are write operations. They must complete successfully on the primary shard before they can be copied to the associated replica shards.

The following steps are needed, in order, to successfully create, index, or delete a document on both the primary shard and its replica shards:

  • 1. The client sends a create, index, or delete request to Node1.
  • 2. The node uses the document's _id to determine that the document belongs to shard 0. It forwards the request to Node3, where the primary copy of shard 0 is located.
  • 3. Node3 executes the request on the primary shard. If it succeeds, Node3 forwards the request to the replica shards on Node1 and Node2. Once all replica shards report success, Node3 reports success to the requesting node, which in turn reports success to the client.
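
The ordering in step 3 (primary first, then every replica, then the success report) can be sketched as an in-memory simulation. This is only an illustration of the sequence described above, not how Elasticsearch is implemented; ShardCopy and write are hypothetical names, and shard copies are modeled as plain dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class ShardCopy:
    """Hypothetical in-memory stand-in for one copy of shard 0."""
    node: str
    docs: dict = field(default_factory=dict)

    def apply(self, op: str, doc_id: str, body=None) -> bool:
        if op == "delete":
            self.docs.pop(doc_id, None)
        else:  # "create" or "index"
            self.docs[doc_id] = body
        return True


# Layout from the example: primary on Node3, replicas on Node1 and Node2.
primary = ShardCopy("Node3")
replicas = [ShardCopy("Node1"), ShardCopy("Node2")]


def write(op: str, doc_id: str, body=None) -> bool:
    # The primary shard executes the request first.
    if not primary.apply(op, doc_id, body):
        return False
    # Only after the primary succeeds is the request forwarded to replicas.
    acks = [replica.apply(op, doc_id, body) for replica in replicas]
    # Success is reported only when every replica has acknowledged.
    return all(acks)


ok = write("index", "1", {"title": "hello"})
print(ok, [copy.node for copy in [primary, *replicas] if "1" in copy.docs])
# → True ['Node3', 'Node1', 'Node2']
```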

By the time the client receives a successful response, the document change has been applied to the primary shard and to all replica shards. Your change is safe.
A number of optional request parameters let you alter this process, possibly trading some data safety for better performance. These options are rarely used because Elasticsearch is already fast, but they are explained here for the sake of completeness.
replication
The default value is sync. This causes the primary shard to wait for successful responses from the replica shards before returning.
If you set replication to async, success is returned to the client as soon as the request has been executed on the primary shard. The request is still forwarded to the replicas, but you will not know whether they succeeded. This option is not recommended. The default sync replication allows Elasticsearch to exert back pressure on whatever is sending it data. With async replication, it is possible to overload Elasticsearch by sending too many requests without waiting for the other shards to be ready.
consistency
By default, the primary shard requires a quorum, or majority, of shard copies (where a shard copy can be a primary or a replica shard) to be available before it even attempts a write operation. This prevents data from being written to the wrong side of a network partition. A quorum is calculated as follows:
int( (primary + number_of_replicas) / 2 ) + 1
The allowed values for consistency are one (just the primary shard), all (the primary shard and all replica shards), or the default quorum (a majority of shard copies).
Note that number_of_replicas is the number of replicas specified in the index settings, not the number of replicas that are currently active. If you have specified that an index should have three replicas, the quorum is:
int( (primary + 3 replicas) / 2 ) + 1 = 3
But if you have only two nodes, there are not enough active shard copies to satisfy the quorum, and you cannot index or delete any documents.
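
The quorum arithmetic can be checked with a short sketch. The function names quorum and required_copies are hypothetical and only mirror the formula and the three consistency values described above.

```python
def quorum(number_of_replicas: int, primary: int = 1) -> int:
    """int( (primary + number_of_replicas) / 2 ) + 1"""
    return int((primary + number_of_replicas) / 2) + 1


def required_copies(consistency: str, number_of_replicas: int) -> int:
    """Hypothetical helper: shard copies that must be available for a write."""
    if consistency == "one":
        return 1                       # just the primary shard
    if consistency == "all":
        return 1 + number_of_replicas  # the primary and every replica
    if consistency == "quorum":
        return quorum(number_of_replicas)
    raise ValueError(f"unknown consistency value: {consistency}")


print(quorum(3))                        # → 3
print(required_copies("all", 3))        # → 4

# With only two nodes, at most two copies of a shard can be active,
# because copies of the same shard never share a node.
active_copies = 2
print(active_copies >= required_copies("quorum", 3))  # → False
```
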
timeout
What happens if not enough shard copies are available? Elasticsearch waits, in the hope that more shards will appear. By default, it waits up to one minute. If you need to, you can use the timeout parameter to make it abort sooner: 100 means 100 milliseconds, and 30s means 30 seconds.
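
To make the two timeout formats concrete, here is a minimal sketch of how such values could be interpreted. The helper timeout_to_ms is hypothetical and handles only the cases mentioned above (a bare number, a value ending in s, and the one-minute default); it is not the parser Elasticsearch uses.

```python
DEFAULT_TIMEOUT_MS = 60_000  # the default wait of one minute


def timeout_to_ms(value=None) -> int:
    """Hypothetical helper: convert a timeout value to milliseconds."""
    if value is None:
        return DEFAULT_TIMEOUT_MS
    if value.endswith("s"):
        return int(value[:-1]) * 1000  # "30s" means 30 seconds
    return int(value)                  # a bare number means milliseconds


print(timeout_to_ms("100"))  # → 100
print(timeout_to_ms("30s"))  # → 30000
print(timeout_to_ms())       # → 60000
```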

Note: A new index has one replica by default, which means that two active shard copies would be needed to satisfy the quorum. However, this default would prevent us from doing anything on a single-node cluster. To avoid this problem, the quorum requirement is enforced only when number_of_replicas is greater than 1.
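
The exception for small replica counts can be expressed as a small check. This is a sketch of the rule as stated in the note; write_allowed is a hypothetical name, and the assumption that at least the primary itself must be active when the quorum is not enforced is mine.

```python
def write_allowed(active_copies: int, number_of_replicas: int) -> bool:
    """Hypothetical check of the quorum rule with the single-replica exception."""
    quorum = int((1 + number_of_replicas) / 2) + 1
    if number_of_replicas <= 1:
        # Quorum is not enforced; assume only the primary must be active.
        return active_copies >= 1
    return active_copies >= quorum


print(write_allowed(active_copies=1, number_of_replicas=1))  # → True
print(write_allowed(active_copies=2, number_of_replicas=3))  # → False
print(write_allowed(active_copies=3, number_of_replicas=3))  # → True
```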

Retrieving a document
A document can be retrieved from the primary shard or from any of its replicas.

The following steps are needed, in order, to retrieve a document from either a primary or a replica shard:

  • 1. The client sends a get request to Node1.
  • 2. The node uses the document's _id to determine that the document belongs to shard 0. Copies of shard 0 exist on all three nodes. On this occasion, it forwards the request to Node2.
  • 3. Node2 returns the document to Node1, which returns it to the client.

For read requests, the requesting node chooses a different shard copy on every request in order to balance the load; it round-robins through all shard copies.
It is possible that, while a document is being indexed, it already exists on the primary shard but has not yet been copied to the replica shards. In that case, a replica might report that the document does not exist, while the primary would return the document successfully. Once the index request has returned success to the user, the document is available on the primary shard and on all replica shards.
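
The read path and the temporary replica lag can be sketched together. This is an in-memory illustration only: the copies dictionary, the rotation order, and the get helper are hypothetical, and the state shown is the moment while an index request is still in progress and the replica on Node1 has not yet received the document.

```python
from itertools import cycle

# Copies of shard 0 while an index request is still in progress:
# the primary (Node3) and one replica already hold document "1".
copies = {
    "Node3": {"1": {"title": "hello"}},  # primary
    "Node1": {},                         # replica not yet updated
    "Node2": {"1": {"title": "hello"}},  # replica already updated
}

# The requesting node round-robins through all shard copies.
rotation = cycle(["Node1", "Node2", "Node3"])


def get(doc_id: str):
    """Hypothetical helper: read the document from the next shard copy."""
    node = next(rotation)
    return node, copies[node].get(doc_id)


results = [get("1") for _ in range(3)]
for node, doc in results:
    print(node, "found" if doc is not None else "not found")
# → Node1 not found
# → Node2 found
# → Node3 found
```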

Origin blog.csdn.net/qq_23536449/article/details/90898718