This article explains how to manually re-shard a Collection in a SolrCloud cluster. The procedure is simple, fast, and practical; let's walk through it step by step.
The search cluster is built on SolrCloud 4.3.1 + Tomcat 7. One Collection is distributed as three shards (Shard) across three nodes, together with replicas (Replica) of the corresponding shards. At this point the Collection holds about 60 million Documents in total, roughly 20 million per shard.
The distribution of the SolrCloud cluster nodes is shown in the figure. Only shard1 has a replica, and it is located on a different node.
As the index grows, if each shard of the Collection keeps growing, a single shard can become the bottleneck for query response time, so we should consider splitting each shard again. Because the number of shards was fixed when the system was first planned, each shard holds roughly the same number of Documents; after a second split, the number of shards will be twice the original.
At present SolrCloud does not support automatic re-sharding, but it does support splitting a shard manually. The sub-shards produced by a manual split can contain very different numbers of Documents (it is unclear whether SolrCloud can split a shard into roughly equal halves). Next, let's look at what needs to be done during a manual split and how to re-plan the whole SolrCloud cluster.
First, I added a node (slave6, 10.95.3.67) and copied the existing configuration files, solr-cloud.war, and its Tomcat server from the cluster onto this new node, so that shard1 on 10.95.3.62 can be split and the resulting shards can also run on the new 10.95.3.67 node. Start the Tomcat server on the new node; it automatically connects to the ZooKeeper cluster, and the ZooKeeper cluster's live_nodes count increases. The key is to add the following to the Tomcat startup script:
JAVA_OPTS= "- server-Xmx4096m-Xms1024m-verbose:gc-Xloggc:solr_gc.log-Dsolr.solr.home=/home/hadoop/applications/solr/cloud/multicore-DzkHost=master:2188,slave1:2188,slave4:2188"
This tells the ZooKeeper cluster that a new node has joined the SolrCloud cluster.
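To confirm that the new node has registered, the live nodes can be listed directly in ZooKeeper. A minimal sketch, assuming ZooKeeper's zkCli.sh is on the PATH and Solr is not using a chroot path (otherwise prefix the path accordingly):

# connect to one of the ZooKeeper servers and list the registered Solr nodes
zkCli.sh -server master:2188 ls /live_nodes
# each live node entry is named after its host, port and web context, e.g. something like 10.95.3.67:8888_solr-cloud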
As shown in the figure above, we plan to manually split shard1 into two, executing the following command:
curl 'http://master:8888/solr-cloud/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1'
This process takes quite a long time and may return an exception response like the following:
status: 500, QTime: 300138
splitshard the collection time out:300s
org.apache.solr.common.SolrException: splitshard the collection time out:300s
    at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:166)
    at org.apache.solr.handler.admin.CollectionsHandler.handleSplitShardAction(CollectionsHandler.java:300)
    at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:136)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:608)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:215)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
error code: 500
In this version of Solr, the split has in fact already been carried out inside the SolrCloud cluster, and this timeout exception can be ignored. Note, however, that a manual split should not be performed while the cluster is under heavy load. For example, I ran the split while keeping 10 indexing threads running at the same time; observing the servers, the resource consumption was quite high and the split was very slow.
While the manual split is in progress, we can watch the state changes of the cluster nodes through the web admin page.
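Besides the admin page, the same information can be read directly from ZooKeeper, where SolrCloud 4.x keeps the collection layout in clusterstate.json. A minimal sketch, again assuming zkCli.sh is available and no chroot is configured:

# dump the cluster state; once the split has started, the sub-shards shard1_0 and shard1_1 appear here
zkCli.sh -server master:2188 get /clusterstate.json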
After the command has been submitted, the status of the new node in the cluster can be seen, as shown in the figure:
The status above is "Recovering": shard1 is being divided into two sub-shards, and the new node has joined the cluster to receive shards (or the corresponding replicas). As can be seen in the figure above, shard3 and shard1 have each added a replica on the new node.
Continuing to watch, the cluster status changes as shown in the figure:
On the node where shard1 is located (10.95.3.62), shard1 has been divided into two sub-shards: shard1_0 and shard1_1. At this point three shards on the 10.95.3.62 node are in the "Active" state. In fact, shard1_0 and shard1_1 have by now completely taken over from shard1, but shard1 does not disappear from the graph automatically; we have to "unload" the unwanted shard manually on the admin page.
At this point the two new sub-shards do not yet cover the two previous replicas of shard1; those replicas also have to be rebuilt (in practice, by copying the new sub-shard indexes), as shown in the figure:
After the "Recovering" phase completes, the node graph shows them entering the "Active" state, as shown in the figure:
The manual split is now basically complete. If new data continues to be indexed at this point, shard1 and its replicas no longer receive requests; the data goes to the sub-shards produced by the split, and queries are likewise routed to the nodes holding those sub-shards. What remains is to unload the old shard1 and its replicas, i.e. remove them from the cluster (see the example command after the list). The cores to be processed are mainly the following:
mycollection_shard1_replica1
mycollection_shard1_replica_2
mycollection_shard1_replica_3
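The unload is done here through the admin page, but the same thing can be done with Solr's CoreAdmin API. A minimal sketch, assuming each node's Solr runs at port 8888 under the solr-cloud context, as in the earlier commands (add deleteIndex=true only if the old index files should also be removed):

# unload the old shard1 core on the node that hosts it; repeat for each replica on its own node
curl 'http://10.95.3.62:8888/solr-cloud/admin/cores?action=UNLOAD&core=mycollection_shard1_replica1'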
Be careful not to make an operational mistake here, or index data may be lost. After the unload, the node layout of the new cluster is as shown in the figure:
shard1_0 and shard1_1 are the two new shards, and their replicas are as follows:
mycollection_shard1_0_replica1
mycollection_shard1_0_replica2
mycollection_shard1_0_replica3
mycollection_shard1_1_replica1
mycollection_shard1_1_replica2
mycollection_shard1_1_replica3
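Which cores each node ends up hosting can also be confirmed from the command line with the CoreAdmin STATUS action; a minimal sketch under the same host/port/context assumptions as above:

# list all cores (and their index stats) hosted by the new node
curl 'http://10.95.3.67:8888/solr-cloud/admin/cores?action=STATUS'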
Next, let's compare the number of Documents on each node after the second, manual split, as shown in the following table:
Shard/replica name                 Node          Documents
mycollection_shard1_0_replica1     10.95.3.62    18839290
mycollection_shard1_0_replica2     10.95.3.67    18839290
mycollection_shard1_0_replica3     10.95.3.61    18839290
mycollection_shard1_1_replica1     10.95.3.62    957980
mycollection_shard1_1_replica2     10.95.3.61    957980
mycollection_shard1_1_replica3     10.95.3.67    957980
mycollection_shard2_replica1       10.95.3.62    23719916
mycollection_shard3_replica1       10.95.3.61    23719739
mycollection_shard3_replica1       10.95.3.67    23719739
As can be seen, the Document counts after the second split are very uneven: shard1_1 holds far fewer Documents than the other shards.
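These per-core counts can be read directly with a non-distributed query against each core; a minimal sketch under the same host/port/context assumptions as above:

# distrib=false restricts the count to this core only; rows=0 skips returning documents
curl 'http://10.95.3.62:8888/solr-cloud/mycollection_shard1_0_replica1/select?q=*:*&rows=0&distrib=false'
# the numFound field in the response is the Document count for this core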
SolrCloud is still evolving, and later versions may handle re-sharding better. In addition, if the shards on a node become so large that search performance suffers, another option is to rebuild the index: add the new nodes first, then re-index so the data is re-sharded and distributed evenly across all nodes.
At this point you should have a deeper understanding of how to manually re-shard a SolrCloud cluster Collection. Try it out in practice.