This article is mainly translated from the official Solr documentation.
There is an install_solr_service.sh script in the bin directory of the archive to help you install Solr as a service. Currently CentOS, Debian, Red Hat, SUSE and Ubuntu Linux systems are supported.
To make later upgrades easier, it is recommended to keep the installation directory separate from the Solr data directory and to create a symbolic link for the installation directory (install_solr_service.sh does all of this for you; you only need to specify the directories, and the -i option specifies the installation directory).
If version 7.7.0 is used with the default installation directory (/opt), the symbolic link layout looks like this:
/opt/solr-7.7.0
/opt/solr -> /opt/solr-7.7.0
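install_solr_service.sh normally creates this layout for you; done by hand (assuming the archive is in the current directory), a rough equivalent would be:
sudo tar -C /opt -xzf solr-7.7.0.tgz    # unpacks the release into /opt/solr-7.7.0
sudo ln -s /opt/solr-7.7.0 /opt/solr    # stable path pointing at the versioned directory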
It is not recommended to run Solr as root, so install_solr_service.sh creates a default user named solr; this user starts the Solr process and owns the Solr files (the process owner can be verified with ps -ef | grep java, and the file owner with ls -l). The -u option specifies a different user and overrides the default solr.
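For example, with the default installation the ownership can be checked like this:
ps -ef | grep java    # the Solr JVM process should be running as the solr user
ls -l /var/solr       # the data directory and its files should be owned by the solr user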
4. Installation steps (in practice, installation only requires this step; the rest of this article explains it)
Extract the install script: tar xzf solr-7.7.0.tgz solr-7.7.0/bin/install_solr_service.sh --strip-components=2 (run as root)
Install: sudo bash ./install_solr_service.sh solr-7.7.0.tgz (run as root). This command is equivalent to sudo bash ./install_solr_service.sh solr-7.7.0.tgz -i /opt -d /var/solr -u solr -s solr -p 8983 (-i specifies the Solr installation directory, -d the data directory, -u the user that starts the service and owns the files, -s the service name, -p the port Solr listens on); a combined example follows below.
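Putting the two commands together (assuming solr-7.7.0.tgz is in the current working directory; the option values shown are simply the defaults spelled out), a typical session would be:
tar xzf solr-7.7.0.tgz solr-7.7.0/bin/install_solr_service.sh --strip-components=2
sudo bash ./install_solr_service.sh solr-7.7.0.tgz -i /opt -d /var/solr -u solr -s solr -p 8983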
5. Start, stop, restart and status commands
service solr start
service solr stop
service solr restart
service solr status
6. The bin/solr script in the installation directory contains the default startup parameters, such as the location of the data directory, and you can pass parameters on the command line to override that configuration (for example, -Dsolr.solr.home=… specifies the data directory). However, it is strongly recommended to use the Solr system configuration (include) file instead, usually /etc/default/solr.in.sh (its name follows the service name given with the -s option when install_solr_service.sh is executed), which holds all the parameters involved.
The configuration file contains at least the following:
SOLR_PID_DIR="/var/solr"
SOLR_HOME="/var/solr/data"
LOG4J_PROPS="/var/solr/log4j2.xml"
SOLR_LOGS_DIR="/var/solr/logs"    # directory of the log files
SOLR_PORT="8983"
7. The init.d script. Installing Solr as a service (that is, installing it with install_solr_service.sh) generates /etc/init.d/solr, which helps you manage the Solr service (that is, you can operate it with the service solr commands). The script holds the location information needed to start Solr; with the default installation it contains the following:
SOLR_INSTALL_DIR=/opt/solr          # the directory where Solr is installed
SOLR_ENV=/etc/default/solr.in.sh    # configuration file for the Solr service
RUNAS=solr                          # the operating system user the Solr service runs as
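To confirm what your own installation uses, these variables can be inspected directly, for example:
grep -E '(SOLR_INSTALL_DIR|SOLR_ENV|RUNAS)=' /etc/init.d/solr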
8. Error log file: /var/solr/logs/solr.log
9. Tuning
Dynamic defaults for ConcurrentMergeScheduler (the concurrent merge scheduler)
The merge scheduler is configured in solrconfig.xml (each core has its own copy of this file, e.g. /var/solr/data/test_core/conf/solrconfig.xml). The merge scheduler starts several background threads that merge Lucene segments.
By default ConcurrentMergeScheduler is used, and the type of hard drive is detected automatically:
If it is a mechanical hard drive (rotational disk): maxThreadCount=1, maxMergeCount=6
If it is a solid state drive (SSD): maxThreadCount is 4 or half the number of CPU cores available to the JVM, whichever is smaller (but at least 1), and maxMergeCount=maxThreadCount+5
On Linux the system detects the disk type automatically, but even so the detection is not guaranteed to be correct, and other operating systems are always treated as having mechanical drives. The default values of these two settings may therefore be wrong, and because they have a large impact on performance it is best to set them manually (page 73 of the official documentation).
The value the system auto-detected can be obtained through the Metrics API (the metric solr.node:CONTAINER.fs.coreRoot.spins is true for a mechanical hard disk).
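For example, one way to query the Metrics API for this value (assuming Solr is listening on the default port 8983 on localhost):
curl 'http://localhost:8983/solr/admin/metrics?group=node&prefix=CONTAINER.fs.coreRoot'
# look for CONTAINER.fs.coreRoot.spins in the response; true indicates a rotational disk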
Suggestion: (1) It is best to specify the maxThreadCount and maxMergeCount values in the solrconfig.xml configuration file according to the hard drive you actually use.
For example, add a mergeScheduler entry along these lines inside the indexConfig section of solrconfig.xml (9 and 4 are the lines that need to be added, i.e. maxMergeCount and maxThreadCount):
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <int name="maxMergeCount">9</int>
  <int name="maxThreadCount">4</int>
</mergeScheduler>
(2) Alternatively, the boolean system property lucene.cms.override_spins can be set in the SOLR_OPTS variable in the include file to override the auto-detected value. Similarly, the system property lucene.cms.override_core_count can be set to the number of CPU cores to override the auto-detected processor count.
Note: this is the optional method; it means you can manually specify the disk type and the number of CPUs in the /etc/default/solr.in.sh file. The addition looks roughly as follows (since this has not been located or tried here, the method above is recommended):
SOLR_OPTS= "$SOLR_OPTS-Dsolr.clustering.enabled=true"-existing contents of the file
SOLR_OPTS= "$SOLR_OPTS-Dsolr.lucene.cms.override_spins=true"-add it yourself
SOLR_OPTS= "$SOLR_OPTS-Dsolr.lucene.cms.override_core_count=2"-add it yourself
Memory and garbage collection settings (Memory and GC Settings)
These are set in the /etc/default/solr.in.sh file (the default is 512m):
SOLR_JAVA_MEM= "- Xms512m-Xmx512m"-default content, which can be modified according to the actual situation, such as 10G: SOLR_JAVA_MEM= "- Xms10g-Xmx10g".
Solr ships with a set of Java garbage collection parameters that are reasonable for many cases, but some special Solr workloads may require adjusting them; to do so, modify the GC_TUNE variable in the /etc/default/solr.in.sh file. (For JVM garbage collection configuration you can consult other specialists.)
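For illustration only (these particular flags are not from the official documentation quoted here; they are just one possible G1 setup that you would adapt to your own workload), an override in /etc/default/solr.in.sh could look like:
GC_TUNE="-XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:+ParallelRefProcEnabled"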
When an OutOfMemoryError is thrown, the JVM calls /opt/solr/bin/oom_solr.sh, and that script issues a kill -9 to kill the Solr process. This behavior is recommended in SolrCloud mode. You can look at the contents of the script to see exactly what happens when the JVM runs out of memory.
General principles for setting JVM parameters:
-Xms specifies the initial heap size; -Xmx specifies the maximum heap size. When the memory required exceeds the initial size, the heap grows automatically.
It is reasonable to increase the initial heap size to match the memory the application actually uses. A large initial size only affects startup speed (initializing the memory at startup takes longer), but it avoids later heap expansion, during which the application has to wait.
The -Xmx setting is more critical: if the memory the program needs exceeds this value, object creation can fail with an OutOfMemoryError. Setting it too large also has drawbacks.
When memory usage reaches the maximum, the garbage collector reclaims free memory, and an error is thrown to the application only if that attempt fails. As long as the maximum is large enough, the application runs without error, but it may run much more slowly if forced garbage collection kicks in frequently.
The larger the heap, the longer garbage collection takes. Worse, collections cause seemingly random pauses, and in bad cases the system can pause for a minute or more. (Java developers note that the JVM reclaims memory automatically at regular intervals.)
This becomes a problem once the heap grows beyond two gigabytes, even if the operating system still has plenty of memory available.
If operating system resources allow, it is generally recommended to run multiple JVMs rather than a single JVM with a very large heap (of course, some JVM vendors may provide specialized collection mechanisms for large heaps).
Do not let the JVM use all the memory available to the operating system, which still needs memory to cache file handles and do other work. Also try to keep the operating system from swapping memory to disk, which greatly hurts performance.
For a Solr installation with frequent reads and writes, leave enough memory for the operating system; the critical value for this configuration is best found through repeated real-world experiments. (page 1308)