Elasticsearch Discovery module: study notes

Discovery and cluster formation module
Goals
  • Node discovery
  • Master election
  • Cluster formation, and publishing the cluster state each time the master changes it
  • Fault detection
The module is broken down into several sub-modules.
Discovery module

Discovery is the process by which nodes find each other while the master is unknown, for example when a new node is added or when the previous master goes down. A node that is not master-eligible keeps searching until it finds the elected master. The retry interval is set by discovery.find_peers_interval, default 1s.

The official documentation defines master-eligible as follows: a node configured with node.master: true, which means the node is qualified to become the master.

First, unicast-based discovery

You can configure a static list of hosts in elasticsearch.yml with discovery.zen.ping.unicast.hosts:
discovery.zen.ping.unicast.hosts: ["host1", "host2"]
The value is either an array of hosts or a comma-delimited string. Each entry has the form host:port or host; when the port is omitted it defaults to the setting transport.profiles.default.port.

Deprecated setting                  New setting
discovery.zen.ping.unicast.hosts    discovery.seed_hosts
discovery.zen.hosts_provider        discovery.seed_providers
discovery.zen.no_master_block       cluster.no_master_block
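
The renames listed above can be applied mechanically when upgrading a configuration. The following is only a sketch: migrate_settings is a hypothetical helper written for this note, not something Elasticsearch ships, and it assumes the settings have already been loaded into a flat Python dict.

```python
# Mapping taken from the renames listed above: deprecated name -> new name.
RENAMED_SETTINGS = {
    "discovery.zen.ping.unicast.hosts": "discovery.seed_hosts",
    "discovery.zen.hosts_provider": "discovery.seed_providers",
    "discovery.zen.no_master_block": "cluster.no_master_block",
}


def migrate_settings(settings):
    """Return a copy of a flat settings dict with deprecated keys renamed.

    Hypothetical helper for illustration only.
    If both the old and the new key are present, the new key's value wins.
    """
    migrated = {}
    for key, value in settings.items():
        new_key = RENAMED_SETTINGS.get(key, key)
        if new_key != key and new_key in settings:
            continue  # the new-style key is set explicitly; drop the old one
        migrated[new_key] = value
    return migrated


old = {
    "cluster.name": "my-cluster",
    "discovery.zen.ping.unicast.hosts": ["host1", "host2"],
}
print(migrate_settings(old))
```

Letting the new key win when both are present is a design choice of this sketch, not documented Elasticsearch behaviour.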

Second, file-based discovery

Elasticsearch can also read the list of seed hosts from a file. This suits container environments that scale dynamically, because you can change the contents of the file at any time without restarting the node.
Each entry in the file is a host IP or host IP:port. A host name can be used as well, which triggers a DNS lookup; the wait for each DNS lookup is bounded by discovery.zen.ping.unicast.resolve_timeout, default 5s. If no port is given, the default is looked up in order from transport.profiles.default.port, then transport.port.
Note: if discovery.seed_hosts is also configured, the two lists are merged.
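
To make the entry format and the merge concrete, here is a rough sketch in Python. It is not how Elasticsearch implements this: parse_seed_host and merge_seed_hosts are hypothetical names, the default port of 9300 stands in for whatever transport.profiles.default.port / transport.port resolve to, skipping "#" comment lines is a choice made here, and IPv6 literals are not handled.

```python
def parse_seed_host(entry, default_port=9300):
    """Split a 'host' or 'host:port' entry into a (host, port) tuple.

    default_port=9300 is an assumption standing in for the configured
    transport port. IPv6 literals are deliberately not handled.
    """
    entry = entry.strip()
    host, sep, port = entry.rpartition(":")
    if not sep:  # no colon: the whole entry is the host
        return entry, default_port
    return host, int(port)


def merge_seed_hosts(static_hosts, file_lines, default_port=9300):
    """Merge the static list with entries read from a file.

    Blank lines and lines starting with '#' in the file are skipped,
    and duplicates are dropped while keeping first-seen order.
    """
    file_entries = [
        line for line in file_lines
        if line.strip() and not line.lstrip().startswith("#")
    ]
    merged = []
    for entry in list(static_hosts) + file_entries:
        parsed = parse_seed_host(entry, default_port)
        if parsed not in merged:
            merged.append(parsed)
    return merged


print(merge_seed_hosts(["host1", "host2:9301"], ["# extra nodes", "host1", "10.0.0.3"]))
```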

Election

First, electing the master
Electing a master requires all master-eligible nodes to work together, and the election must still succeed even if some of them fail. Elasticsearch therefore requires a quorum of master-eligible nodes before a master can be elected and a cluster formed, which avoids "split brain". Split brain means that more than one master exists at the same time: for example, after communication between nodes is cut, each master-eligible node may conclude that the others are down and promote itself to master, leaving the cluster state inconsistent. The number of master-eligible nodes that must be reachable before a master can be elected is set by discovery.zen.minimum_master_nodes, default 1. The rule is to set it to N / 2 + 1 (integer division), where N is the number of master-eligible nodes in the cluster.
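
The N / 2 + 1 rule is easy to check by hand. The sketch below is illustration only; the function names are made up for this note and are not Elasticsearch APIs.

```python
def minimum_master_nodes(master_eligible):
    """Quorum size N / 2 + 1, using integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect(reachable, master_eligible):
    """True if one side of a network partition sees enough
    master-eligible nodes to elect a master."""
    return reachable >= minimum_master_nodes(master_eligible)


for n in (1, 2, 3, 5, 7):
    print(n, "master-eligible nodes -> quorum of", minimum_master_nodes(n))

# Three master-eligible nodes split 2 / 1 by a network partition:
print(can_elect(2, 3), can_elect(1, 3))
```

With three master-eligible nodes split 2 / 1, only the side with two nodes can elect a master, so two masters can never coexist. With the default value of 1, both sides could elect one.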

As the analysis above shows, whether an election takes place depends on what the nodes perceive when they communicate with one another, so network communication between nodes matters a great deal. As with any API call there is a timeout, and in a poor network environment the timeout setting is very important. discovery.zen.ping.timeout sets the communication timeout between two nodes, default 3s. Adjust it to suit your network so that network latency does not trigger unnecessary elections.

Second, changing the cluster state

Voting Configuration

In Elasticsearch 7, if half or more of the master-eligible nodes go down, the cluster cannot recover automatically from the remaining nodes. In this extreme case the simplest fix is to bring those nodes back online. A three-node cluster can usually tolerate the loss of one node. After nodes join or leave the cluster, Elasticsearch responds by adjusting the voting configuration automatically, so that the cluster stays as resilient as possible. A node can also be excluded from the voting configuration by hand, as follows:

# Add the node to the voting configuration exclusions list
# The default timeout is 30s; a different timeout can be specified
POST /_cluster/voting_config_exclusions/node_name?timeout=1m

Cluster startup

Cluster bootstrapping
The first time an Elasticsearch cluster starts, one or more master-eligible nodes must be given an explicitly defined initial set of master-eligible nodes. This is called bootstrapping the cluster.
The initial set of master-eligible nodes is configured in cluster.initial_master_nodes. Each node in it can be identified by one of the following:

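  • The node name of the node.
  • The node's host name, if node.name is not set, because node.name defaults to the node's host name. Depending on the system configuration, you must use either the fully qualified host name or the bare host name.
  • The IP address of the node's publish address, if the node's node.name cannot be used. This is normally the IP address that network.host resolves to, but it can be overridden.
  • The IP address and port of the node's publish address, in the form IP:PORT, if the node's node.name cannot be used and several nodes share one IP address.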

Note: when starting a master-eligible node, this setting can be supplied on the command line or in elasticsearch.yml. Once the cluster has formed, the setting is no longer needed and is ignored; in other words, it is only useful the first time the cluster starts. It does not need to be set on nodes that are not master-eligible.
Take particular care to configure master-eligible nodes persistently in the configuration file rather than on the command line: if a master-eligible node is ever restarted with a wrong value, two separate clusters may form, which is likely to cause data loss.

Cluster name

Setting cluster.name lets you create several clusters that are isolated from each other. When nodes first connect to one another they check that they agree on the cluster name, and a node only joins a cluster whose nodes have the same cluster name. The default cluster name is elasticsearch, but it is recommended to change it to reflect the logical name of the cluster.

Adding or removing nodes

Nodes can be added to or removed from an Elasticsearch cluster dynamically, so it is worth understanding what happens during the process. During master election, or when joining a cluster that has already formed, a node sends a join request to the master in order to be formally added to the cluster. cluster.join.timeout sets how long a node waits after sending a join request before it treats the request as failed; the default is 60s.

When removing master-eligible nodes, it is important not to remove too many at once. For example, if there are currently seven master-eligible nodes and you want to reduce them to three, you cannot simply stop four of them at once: that would leave only three nodes, fewer than half of the voting configuration, so the cluster could take no further action. As long as at least three master-eligible nodes remain in the cluster, it is usually best to remove nodes one at a time, giving the cluster enough time to adjust the voting configuration automatically and adapt its fault tolerance to the new set of nodes.
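
The arithmetic behind the seven-node example can be sketched as follows. This is a simplified model, not Elasticsearch code: both function names are hypothetical, and it assumes the voting configuration has fully caught up with the remaining nodes before the next node is stopped.

```python
def keeps_quorum(voting_size, stopped_at_once):
    """True if the nodes left running are still a strict majority
    of the current voting configuration."""
    remaining = voting_size - stopped_at_once
    return remaining * 2 > voting_size


def shrink_one_at_a_time(current, target):
    """Return the sequence of voting-set sizes when removing one node per step.

    Assumes the voting configuration has shrunk to match the running
    nodes before each further removal (a simplification).
    """
    sizes = [current]
    while current > target:
        if not keeps_quorum(current, 1):
            raise RuntimeError("removing one more node would lose the quorum")
        current -= 1
        sizes.append(current)
    return sizes


print(keeps_quorum(7, 4))          # stopping four of seven at once
print(shrink_one_at_a_time(7, 3))  # removing nodes one by one
```

Stopping four of seven nodes at once leaves three, which is not a majority of seven, whereas every single-node step on the way from seven to three keeps a majority.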
Note that when a node is forced offline you also need to review the setting that prevents "split brain". It can be changed persistently by calling the Elasticsearch API, without restarting any node.

curl -uelastic:passwd -XPUT "EsIP:9200/_cluster/settings" -H "Content-Type:application/json" -d '
{
    "persistent" : {
        "discovery.zen.minimum_master_nodes" : 2
    }
}
'

Publishing the cluster state

Only the master can change the cluster state. After a change, the master publishes the update to all nodes in the cluster; each node receives the message and acknowledges it (ack), but does not yet apply the update. The master must receive ack responses from at least discovery.zen.minimum_master_nodes nodes within the time set by discovery.zen.commit_timeout for the publication to count as successful. Otherwise the publication fails and the change is not applied.
Nodes that have not acknowledged are said to be lagging, because their cluster state has fallen behind the master's latest state. The master waits a further period for lagging nodes to catch up, set by cluster.follower_lag.timeout, default 90s. If a node still has not applied the cluster state update within this time, it is considered to have failed and is removed from the cluster.

Once the master has received enough acks, it sends a second message to all nodes telling them to commit, and only then do the nodes actually apply the new cluster state. This second phase is governed by discovery.zen.publish_timeout, default 30s; the timeout is measured from the moment publishing started.
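
The two phases described above can be pictured with a toy model. This is a sketch only: publish is a hypothetical function, timeouts are not modelled (each node either acknowledged in time or did not), and nothing here reflects Elasticsearch's actual transport code.

```python
def publish(acks, minimum_master_nodes):
    """Simulate one publication round.

    acks maps node name -> True if the node acknowledged the new state
    within the commit timeout. Returns (committed, lagging_nodes):
    committed is True when enough acks arrived for the master to send
    the commit message; lagging_nodes lists nodes that have not acked
    a committed state and must catch up or be removed.
    """
    ack_count = sum(1 for acked in acks.values() if acked)
    committed = ack_count >= minimum_master_nodes
    lagging = sorted(n for n, acked in acks.items() if not acked) if committed else []
    return committed, lagging


print(publish({"node-a": True, "node-b": True, "node-c": False}, 2))
print(publish({"node-a": True, "node-b": False, "node-c": False}, 2))
```

In the first call two acks are enough to commit and node-c is left lagging; in the second call the publication fails and no node applies the change.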

As described above, the number of acks from master-eligible nodes required when publishing a cluster state is set by discovery.zen.minimum_master_nodes. You also need to understand the setting that applies when there is no master: discovery.zen.no_master_block.
discovery.zen.no_master_block controls which operations are rejected when the cluster has no master:
  • all: every operation is rejected, including both read and write API calls.
  • write: the default. Only write operations are rejected. Note that this setting does not apply to node-level APIs.
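
The effect of the two values can be summarised in a few lines of Python. This is a sketch of the rule as stated above, with a hypothetical is_rejected function; it does not model node-level APIs, which the setting does not cover.

```python
def is_rejected(operation, no_master_block="write"):
    """Return True if an operation is rejected while there is no master.

    operation is "read" or "write"; no_master_block is "all" or "write"
    ("write" being the default described above).
    """
    if operation not in ("read", "write"):
        raise ValueError("operation must be 'read' or 'write'")
    if no_master_block == "all":
        return True
    if no_master_block == "write":
        return operation == "write"
    raise ValueError("no_master_block must be 'all' or 'write'")


for block in ("all", "write"):
    for op in ("read", "write"):
        print(block, op, "rejected" if is_rejected(op, block) else "allowed")
```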

Cluster fault detection

The elected master periodically checks every node in the cluster to make sure it is still connected and healthy. Every node in the cluster also periodically checks the health of the elected master. These checks are called follower checks and leader checks respectively.
The related settings start with cluster.fault_detection. Changing their defaults may make the cluster unstable, so changing them is not recommended.

Configure discovery and cluster formation

Here are the few settings that matter most.

  • discovery.seed_hosts
    Provides the list of master-eligible nodes in the cluster. Each value has the form host:port or host; the port defaults to the setting transport.profiles.default.port.

  • discovery.seed_providers
    Provides the list of hosts from a file, which can be changed dynamically without restarting the node (suited to container environments).

  • cluster.initial_master_nodes
    Sets the initial set of master-eligible nodes in a brand-new cluster. By default the list is empty, which means the node expects to join a cluster that has already been bootstrapped.

  • discovery.find_peers_interval
    How long a node waits between attempts to discover the elected master, default 1s.


Origin www.cnblogs.com/hyq0823/p/11569606.html