1. Default port allocation when SolrCloud is started with the embedded ZooKeeper
When Solr runs the embedded ZooKeeper service, the Solr port + 1000 is used as the ZooKeeper client port by default. In addition, the client port + 1 is used as the ZooKeeper peer (quorum) port, and the client port + 2 is used as the leader-election port. So in the first example, Solr runs on port 8983, and the embedded ZooKeeper uses 9983 as the client port and 9984 and 9985 as the peer and election ports.
These are the corresponding entries in the ZooKeeper configuration:
clientPort=9983
server.1=192.168.238.133:9984:9985
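As a sketch, the derivation above can be written out (a hypothetical helper; it only mirrors the default scheme described here):

```python
def embedded_zk_ports(solr_port: int) -> dict:
    """Derive the embedded ZooKeeper ports from the Solr port.

    Default scheme: client port = Solr port + 1000; the peer (quorum)
    and leader-election ports follow the client port.
    """
    client = solr_port + 1000
    return {"clientPort": client, "peerPort": client + 1, "electionPort": client + 2}

# Solr on 8983 -> clientPort 9983, peerPort 9984, electionPort 9985
print(embedded_zk_ports(8983))
```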
For an example of starting SolrCloud with the embedded ZooKeeper, see: http://wiki.apache.org/solr/SolrCloud
2. Manage the cluster through the Collections API (Core Admin)
1). Create interface, method 1 (automatic allocation)
http://192.168.66.128:8081/solr/admin/collections?action=CREATE&name=collection1&numShards=3&replicationFactor=2&maxShardsPerNode=2&collection.configName=myconf
http://192.168.66.128:8081/solr/admin/collections?action=CREATE&name=collection1&numShards=3&replicationFactor=2&maxShardsPerNode=2&collection.configName=myconf&createNodeSet=192.168.66.128:8083_solr,192.168.66.128:8081_solr,192.168.66.128:8082_solr
This creates a collection with 3 shards; each shard has one leader core and one replica core, so the collection has 6 cores in total.
Parameters:
name : the name of the collection to be created
numShards : the number of logical shards to create for the collection
replicationFactor : the number of copies of each shard. A replicationFactor of 3 means each logical shard will have 3 copies.
maxShardsPerNode : the maximum number of shards per Solr server node; the default value is 1 (new in 4.2)
Pay attention to three values: numShards, replicationFactor, and liveSolrNode (the number of currently live Solr nodes). A normal SolrCloud cluster does not allow multiple replicas of the same shard to be deployed on the same node. Therefore, when maxShardsPerNode=1 and numShards*replicationFactor > liveSolrNode, an error is reported. A request succeeds when the following condition holds: numShards*replicationFactor <= liveSolrNode*maxShardsPerNode
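The condition above is easy to check programmatically; a minimal sketch (the helper name is my own):

```python
def create_is_allowed(num_shards: int, replication_factor: int,
                      live_solr_nodes: int, max_shards_per_node: int = 1) -> bool:
    """True when numShards * replicationFactor cores fit on the live nodes,
    i.e. numShards * replicationFactor <= liveSolrNode * maxShardsPerNode."""
    return num_shards * replication_factor <= live_solr_nodes * max_shards_per_node

# 3 shards x 2 replicas = 6 cores; on 3 live nodes this needs maxShardsPerNode >= 2
print(create_is_allowed(3, 2, 3, max_shards_per_node=2))  # True
print(create_is_allowed(3, 2, 3, max_shards_per_node=1))  # False
```

This mirrors why a CREATE request with numShards=3 and replicationFactor=2 on a three-node cluster needs maxShardsPerNode=2.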
createNodeSet : if this parameter is not provided, the cores will be created on all live nodes; if it is provided, the cores will be created only on the specified Solr nodes
For example, creating 3 shards with 1 replica each on 5 Tomcat nodes: if this parameter is not provided, the cores may land on any of the five live nodes (screenshot omitted). Providing the parameter, for example createNodeSet=192.168.66.128:8083_solr,192.168.66.128:8081_solr,192.168.66.128:8082_solr, places the cores only on those three nodes (screenshot omitted).
collection.configName : the name of the configuration to use for the new collection. If this parameter is not provided, the collection name will be used as the configuration name.
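A CREATE request like the ones above can be assembled programmatically; a minimal sketch using only the parameters just described (the helper name and its defaults are my own, not part of Solr):

```python
from urllib.parse import urlencode

def collection_create_url(base, name, num_shards, replication_factor,
                          max_shards_per_node=1, config_name=None,
                          create_node_set=None):
    """Build a Collections API CREATE URL; `base` is the Solr root,
    e.g. "http://192.168.66.128:8081/solr"."""
    params = {"action": "CREATE", "name": name, "numShards": num_shards,
              "replicationFactor": replication_factor,
              "maxShardsPerNode": max_shards_per_node}
    if config_name:
        params["collection.configName"] = config_name
    if create_node_set:
        # keep commas literal, matching the createNodeSet examples above
        params["createNodeSet"] = ",".join(create_node_set)
    return base + "/admin/collections?" + urlencode(params, safe=",")

print(collection_create_url("http://192.168.66.128:8081/solr", "collection1",
                            3, 2, max_shards_per_node=2, config_name="myconf"))
```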
Create interface, method 2 (manual allocation): it is recommended to use the following series of requests (3 shards, one replica of each on a chosen node), because this way you control exactly where and how many cores are created.
http://192.168.66.128:8081/solr/admin/cores?action=CREATE&name=shard1_replica1&instanceDir=shard1_replica1&dataDir=data&collection=collection1&shard=shard1&collection.configName=myconf
http://192.168.66.128:8082/solr/admin/cores?action=CREATE&name=shard1_replica2&instanceDir=shard1_replica2&dataDir=data&collection=collection1&shard=shard1&collection.configName=myconf
http://192.168.66.128:8082/solr/admin/cores?action=CREATE&name=shard2_replica1&instanceDir=shard2_replica1&dataDir=data&collection=collection1&shard=shard2&collection.configName=myconf
http://192.168.66.128:8083/solr/admin/cores?action=CREATE&name=shard2_replica2&instanceDir=shard2_replica2&dataDir=data&collection=collection1&shard=shard2&collection.configName=myconf
http://192.168.66.128:8083/solr/admin/cores?action=CREATE&name=shard3_replica1&instanceDir=shard3_replica1&dataDir=data&collection=collection1&shard=shard3&collection.configName=myconf
http://192.168.66.128:8081/solr/admin/cores?action=CREATE&name=shard3_replica2&instanceDir=shard3_replica2&dataDir=data&collection=collection1&shard=shard3&collection.configName=myconf
Parameter meaning:
name : the name of the new core
The naming rule for the created cores:
collectionName_shardName_replicaN
For example, creating a collection named pscp with 2 shards and 2 copies per shard produces cores named as follows:
pscp_shard1_replica1
pscp_shard1_replica2
pscp_shard2_replica1
pscp_shard2_replica2
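The naming rule can be sketched as a small enumerator (a hypothetical helper, reproducing the pscp example above):

```python
def core_names(collection: str, num_shards: int, replicas_per_shard: int) -> list:
    """Enumerate core names following the collectionName_shardName_replicaN rule."""
    return [f"{collection}_shard{s}_replica{r}"
            for s in range(1, num_shards + 1)
            for r in range(1, replicas_per_shard + 1)]

print(core_names("pscp", 2, 2))
# ['pscp_shard1_replica1', 'pscp_shard1_replica2',
#  'pscp_shard2_replica1', 'pscp_shard2_replica2']
```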
shard : specify a shard id; the core will be attached to that shard (you may write any id — if it does not exist yet, it will be created on first use)
collection.configName : specify a configuration stored in ZooKeeper
instanceDir and dataDir : their meaning is shown in the figure below (figure omitted)
Naming convention: instanceDir has the same name as name; for dataDir the name data is recommended
Summary 1: Two ways to add a replica to a cluster
http://192.168.66.128:8081/solr/admin/collections?action=ADDREPLICA&collection=collection1&shard=shard2&node=192.168.66.128:8085_solr
The request above adds a replica to shard2 of collection1 and places it on the node 192.168.66.128:8085_solr. The same can be done with the Core Admin API:
http://192.168.66.128:8083/solr/admin/cores?action=CREATE&name=shard3_replica1&instanceDir=shard3_replica1&dataDir=data&collection=collection1&shard=shard3&collection.configName=myconf
2). Delete interface
http://localhost:8983/solr/admin/collections?action=DELETE&name=mycollection
Parameters:
name : the name of the collection to be deleted
3). Reload interface. The corresponding cores will reload their configuration files.
http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection
Parameters:
name : the name of the collection to be reloaded
4). Split shard interface
http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=<collection_name>&shard=shardId
collection : the name of the collection
shard : the ID of the shard to be split
This command cannot be used on clusters that use custom hashing, because such clusters do not have an explicit hash range; it works only with collections using the plain or compositeId router. The command splits the shard with the given ID into two new shards by dividing the parent's hash range into two equal partitions and distributing the parent shard's documents according to the new ranges. The new shards are named by appending _0 and _1 to the parent's name: for example, splitting shard=shard1 produces shard1_0 and shard1_1. Once the new shards are created they are activated and the parent shard (the one being split) is suspended, so no new requests go to it. This allows seamless splitting with no downtime. The original shard's data is not deleted; removing the parent shard with the API is left to the user's discretion. This feature was released in Solr 4.3; since some bugs were found in the 4.3 release, it is recommended to wait for 4.3.1 before using it.
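The naming and range arithmetic described above can be sketched as follows (hypothetical helpers; SPLITSHARD itself does all of this server-side):

```python
def split_shard_names(parent: str) -> tuple:
    """SPLITSHARD names the two sub-shards by appending _0 and _1."""
    return (parent + "_0", parent + "_1")

def split_hash_range(lo: int, hi: int) -> tuple:
    """Divide a shard's hash range into two (roughly) equal partitions."""
    mid = (lo + hi) // 2
    return ((lo, mid), (mid + 1, hi))

print(split_shard_names("shard1"))   # ('shard1_0', 'shard1_1')
print(split_hash_range(0, 99))       # ((0, 49), (50, 99))
```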
3. Upload files to Zookeeper for management through command line tools
SolrCloud can be distributed because ZooKeeper is introduced to store the configuration files centrally. Therefore the SolrCloud configuration files must be uploaded to ZooKeeper; the upload via the command line is shown here.
To use the command-line management tool, you must first have the required packages; these are the jar packages under /WEB-INF/lib in solr.war.
Step 1: Create a new folder
On any machine that can reach the ZooKeeper cluster, create two new folders. For example, mine are:
/usr/solrCloud/conf/files
/usr/solrCloud/conf/lib
files : holds the configuration files; lib : holds the jar packages
Step 2: Upload the jar and configuration files you need to use
Upload the jars to the lib directory: copy all jar packages from the Solr distribution (everything under solr-4.8.0\example\solr-webapp\webapp\WEB-INF\lib\ and solr-4.8.0\example\lib\ext\) into the lib directory above.
Upload the Solr configuration files to the files directory above.
Step 3: Upload files to Zookeeper for unified management
java -classpath .:/usr/solrCloud/conf/lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost 192.168.27.18:2181,192.168.27.18:2182,192.168.27.18:2183 -confdir /usr/solrCloud/conf/files -confname myconf
-cmd upconfig: upload a configuration
-confdir: the local directory containing the configuration files
-confname: the name to store the configuration under in ZooKeeper
Check if the file has been uploaded to the Zookeeper server:
sh zkCli.sh -server localhost:2181 ls /configs/myconf
Step 4: Associate the configuration file uploaded to ZooKeeper with the collection
java -classpath .:/usr/solrCloud/conf/lib/* org.apache.solr.cloud.ZkCLI -cmd linkconfig -collection collection1 -confname myconf -zkhost 192.168.27.18:2181,192.168.27.18:2182,192.168.27.18:2183
-cmd linkconfig: "bind" the configuration to the specified collection
-collection: the name of the collection
-confname: the name of the configuration in ZooKeeper
The meaning of the above command: the collection (collection1) will use the myconf configuration.
For example, executing the following request creates a collection named collection1, which then uses the myconf configuration in ZooKeeper:
http://localhost:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=3&replicationFactor=1
Note also: if ZooKeeper manages only one configuration for the cluster, a newly created collection will use that configuration by default. If there are multiple configurations and step 4 is not performed, creating a collection throws an exception and the creation fails!
For example execute:
http://192.168.66.128:8081/solr/admin/collections?action=CREATE&name=sdf&numShards=3&replicationFactor=1
It will throw the following, because there are two configurations in ZooKeeper but step 4 was not performed to associate a configuration with the collection being created (name=sdf):
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">16563</int>
  </lst>
  <lst name="failure">
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'sdf_shard2_replica1': Unable to create core: sdf_shard2_replica1 Caused by: Could not find configName for collection sdf found:[conf1, myconf]</str>
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'sdf_shard1_replica1': Unable to create core: sdf_shard1_replica1 Caused by: Could not find configName for collection sdf found:[conf1, myconf]</str>
    <str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'sdf_shard3_replica1': Unable to create core: sdf_shard3_replica1 Caused by: Could not find configName for collection sdf found:[conf1, myconf]</str>
  </lst>
</response>
Of course, step 4 can also be replaced with the following, which is more flexible and recommended (with this approach, step 4 can be omitted entirely):
http://192.168.66.128:8081/solr/admin/collections?action=CREATE&name=conf2&numShards=3&replicationFactor=1&collection.configName=myconf
collection.configName=myconf : specifies the ZooKeeper configuration for the collection being created
That covers creation; now let's see how to modify and delete files already uploaded to ZooKeeper.
The common way to modify is to re-upload: re-uploading overwrites the existing files, which accomplishes the modification.
The way to delete files or directories in zookeeper is as follows:
[zk: 192.168.66.128:2181(CONNECTED) 7] delete /configs/conf1/schema.xml
[zk: 192.168.66.128:2181(CONNECTED) 10] ls /configs/conf1
[solrconfig.xml]
[zk: 192.168.66.128:2181(CONNECTED) 11]
After uploading the configuration to ZooKeeper, if you want running Solr instances to pick up the new files, simply have Solr reload the configuration by entering the following in the browser:
http://192.168.27.18:8081/solr/admin/collections?action=RELOAD&name=collection1
References:
Collections API for managing the whole cluster (official wiki):
https://cwiki.apache.org/confluence/display/solr/Collections+API
Core Admin API (official wiki):
http://wiki.apache.org/solr/CoreAdmin
Deploying SolrCloud on Tomcat (official wiki):
http://wiki.apache.org/solr/SolrCloudTomcat
Deploying Solr on Tomcat (official wiki):
http://wiki.apache.org/solr/SolrTomcat
Blogs worth referring to:
http://blog.csdn.net/xyls12345/article/details/27504965
http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html#deploying-solrcloud
http://blog.csdn.net/woshiwanxin102213/article/details/18793271
http://blog.csdn.net/natureice/article/details/9109351
SolrCloud terminology explained
http://www.solr.cc/blog/?p=99
solr.xml explained
http://www.abyssss.com/?p=415