2025-04-06 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/01 Report
This article explains how to implement Redis data sharding (fragmentation) with Twemproxy. The material is detailed and organized step by step. Many people are still unfamiliar with this topic, so the article is offered as a reference. Let's take a look.
Introduction to Twemproxy
Twitter's Twemproxy is one of the most widely used Redis clustering solutions, partly because Redis is single-threaded and partly because the official Redis Cluster was, at the time of writing, not yet very stable or widely adopted. Twemproxy is a proxy-based sharding mechanism: as a proxy it accepts connections from multiple applications, forwards each request to one of the backend Redis servers according to its routing rules, and returns the response along the same path. This solves the capacity limit of a single Redis instance very well. Twemproxy itself is a single point, so it needs Keepalived (or LVS) in front of it to be highly available. With Twemproxy, a Redis service can be scaled horizontally across multiple servers, which also reduces the impact of a single failing server. Using Twemproxy requires more hardware and costs some Redis performance (about 20% according to Twitter's tests), but the gain in availability for the whole system makes it cost-effective. Twemproxy implements not only the Redis protocol but also the memcached protocol, so it can proxy memcached as well as Redis.
Advantages of Twemproxy:
1) It exposes a single access point, which reduces application complexity.
2) It supports automatic ejection of failed nodes. You can set the reconnection interval and the number of consecutive failures after which a node is ejected. This is suitable only when Redis is used as a cache; otherwise keys will be lost.
3) It supports hash tags. With a hash tag you can make two keys hash to the same instance.
4) It offers a variety of hash algorithms and lets you set weights for backend instances.
5) It reduces the number of direct connections to Redis: the proxy keeps persistent connections to Redis, you can set the number of connections between the proxy and each backend Redis, and requests are sharded automatically across the backend instances.
6) It avoids a single point of failure at the proxy tier: multiple proxies can be deployed in parallel, and the client automatically selects an available one.
7) High throughput: connections and memory are reused, and multiple client requests are combined into a pipelined request to the same Redis instance.
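To make the hash tag in advantage 3 concrete, here is a minimal Python sketch, assuming the tag is the two characters "{}" and that only the text between the first "{" and the next "}" is hashed. The helper names (hash_tag_part, pick_server) are hypothetical, the FNV-1a routine is the textbook 64-bit variant rather than twemproxy's internal implementation, and the modulo server choice is a stand-in for the real distribution logic.

```python
from typing import Sequence

FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def hash_tag_part(key: str, hash_tag: str = "{}") -> str:
    """Return the part of the key that is hashed when a hash tag is configured."""
    start_char, end_char = hash_tag[0], hash_tag[1]
    start = key.find(start_char)
    if start == -1:
        return key
    end = key.find(end_char, start + 1)
    # Fall back to the whole key when the tag is unclosed or empty ("{}").
    if end == -1 or end == start + 1:
        return key
    return key[start + 1:end]


def fnv1a_64(data: bytes) -> int:
    """Textbook 64-bit FNV-1a hash (an assumption, not twemproxy's exact code)."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def pick_server(key: str, servers: Sequence[str], hash_tag: str = "{}") -> str:
    """Hypothetical helper: choose a server by simple modulo of the hash."""
    part = hash_tag_part(key, hash_tag)
    return servers[fnv1a_64(part.encode()) % len(servers)]


if __name__ == "__main__":
    servers = ["redis01", "redis02"]
    for key in ("user:{user1}:ids", "user:{user1}:tweets", "user:user1:ids"):
        print(key, "->", hash_tag_part(key), "->", pick_server(key, servers))
```

The first two keys share the tagged part "user1" and therefore always land on the same server; the third key has no tag, so the whole key is hashed and it may land elsewhere.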
Disadvantages of Twemproxy:
1) Operations that span multiple keys are not supported, such as the intersection, union, and difference of sets.
2) Redis transactions are not supported.
3) Allocated memory is not released, so every machine needs plenty of memory and the proxy must be restarted regularly; otherwise client connection errors will occur.
4) Nodes cannot be added or removed dynamically. You must restart after modifying the configuration.
5) When nodes change, existing data is not redistributed. Unless you write your own migration script, some keys will be "lost": the key still exists on one Redis instance, but it now hashes to a different node.
6) Weights directly affect how keys are hashed, so changing a node's weight causes some keys to be lost.
7) Twemproxy is single-threaded by default, although many companies that use it do their own secondary development to make it multithreaded.
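The "loss" described in point 5 can be illustrated with a small Python sketch. It assumes plain modulo distribution and uses MD5 purely as a stable demo hash (neither is claimed to match a particular twemproxy configuration); the function names are hypothetical.

```python
import hashlib


def key_hash(key: str) -> int:
    """Stable hash for the demo (MD5 is an arbitrary choice for illustration)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def modula_node(key: str, node_count: int) -> int:
    """Index of the node a key maps to under plain modulo distribution."""
    return key_hash(key) % node_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose target node changes when the node count changes."""
    moved = sum(
        1 for k in keys if modula_node(k, old_count) != modula_node(k, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"key:{i}" for i in range(10000)]
    fraction = moved_fraction(keys, 2, 3)
    print(f"2 -> 3 nodes: {fraction:.1%} of keys now map to a different node")
```

Under modulo distribution, going from two nodes to three redirects roughly two thirds of the keys. The data is still on the old node, but the proxy now looks for it on another one, which is why a migration script is needed.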
Overall, twemproxy is quite reliable. It costs some performance, but the trade-off is worthwhile, and it is time-tested and widely used. Please refer to the official documentation for more detail. Note, however, that twemproxy only suits static clusters; it does not suit scenarios that require dynamically adding or removing nodes or manually rebalancing load. Using it in such scenarios requires additional development work. The Codis system (https://github.com/wandoulabs/codis) takes a similar proxy-based approach and adds features such as dynamic data migration; how well it works in practice needs further testing.
Twemproxy deployment architectures
The first kind: single-node Twemproxy
PS: This saves hardware resources but is prone to a single point of failure.
The second kind: highly available Twemproxy
PS: Half of the resources sit idle, but the proxy nodes are highly available.
The third kind: load-balanced Twemproxy
PS: In a large-scale Redis or Memcached deployment you can load-balance Twemproxy: on top of the highly available Twemproxy setup, add LVS (Linux Virtual Server) nodes to balance load across the Twemproxy instances. LVS is a layer-4 load-balancing technology with strong forwarding capacity; for more information, see the LVS chapter of this blog. Introducing LVS removes Twemproxy as a single point of failure but makes LVS itself one, so LVS must in turn be made highly available. LVS itself can also be load-balanced, which can be done with OSPF routing. This is the architecture I am currently using at work.
Whichever of the above architectures is used, the single point of failure of an individual Redis node cannot be avoided, and Redis persistence does not protect against hardware failure. If access to Redis data must be uninterrupted, use Redis Cluster mode, which currently has good Java support and is widely used in production.
Install Twemproxy
1. Download Twemproxy
git clone https://github.com/twitter/twemproxy.git
2. Install Twemproxy
Twemproxy's build is configured with Autoconf. GNU Autoconf is a tool that generates configuration scripts, run under the Bourne shell, for compiling, installing, and packaging software. Autoconf is not tied to a programming language and is commonly used with C, C++, Erlang, and Objective-C. The configuration script controls how a package is installed on a specific system: after running a series of tests, it generates the makefile and header files from templates so that the package is adapted to that system. Autoconf, Automake, Libtool, and related tools together make up the GNU build system. Autoconf was written by David MacKenzie in the summer of 1991 to support his programming work at the Free Software Foundation. Since then it has absorbed improvements from many contributors and has become the most widely used free build-configuration software.
Let's run autoreconf to generate the build configuration for twemproxy:
[root@www twemproxy]# autoreconf
configure.ac:8: error: Autoconf version 2.64 or higher is required
configure.ac:8: the top level
autom4te: /usr/bin/m4 failed with exit status: 63
aclocal: autom4te failed with exit status: 63
autoreconf: aclocal failed with exit status: 63
[root@www twemproxy]# autoconf --version
autoconf (GNU Autoconf) 2.63
The error shows that the installed Autoconf is too old: the version here is 2.63, so download Autoconf 2.69 and compile and install it. Note that the default version is 2.63 on CentOS 6, and should be 2.69 on CentOS 7, Debian 8, and Ubuntu 16. In any case, if autoreconf fails with this error, the installed version is too old and a newer one needs to be compiled and installed.
Compile and install Autoconf
[root@www ~]# wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.69.tar.gz
[root@www ~]# tar xvf autoconf-2.69.tar.gz
[root@www ~]# cd autoconf-2.69
[root@www autoconf-2.69]# ./configure --prefix=/usr
[root@www autoconf-2.69]# make && make install
[root@www autoconf-2.69]# autoconf --version
autoconf (GNU Autoconf) 2.69
Compile and install Twemproxy (with --enable-debug=full)
[root@www ~]# cd /root/twemproxy/
[root@www twemproxy]# autoreconf -fvi
[root@www twemproxy]# ./configure --prefix=/etc/twemproxy CFLAGS="-DGRACEFUL -g -O2" --enable-debug=full
[root@www twemproxy]# make && make install
If autoreconf -fvi reports the following error, libtool is missing and must be installed (on CentOS use yum install libtool; on Debian use apt-get install libtool).
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.ac: tracing
autoreconf: configure.ac: adding subdirectory contrib/yaml-0.1.4 to autoreconf
autoreconf: Entering directory `contrib/yaml-0.1.4'
autoreconf: configure.ac: not using Autoconf
autoreconf: Leaving directory `contrib/yaml-0.1.4'
autoreconf: configure.ac: not using Libtool
autoreconf: running: /usr/bin/autoconf --force
configure.ac:36: error: possibly undefined macro: AC_PROG_LIBTOOL
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.
autoreconf: /usr/bin/autoconf failed with exit status: 1
Add the Twemproxy configuration file
[root@www twemproxy]# mkdir /etc/twemproxy/conf
[root@www twemproxy]# cat /etc/twemproxy/conf/nutcracker.yml
redis-cluster:
  listen: 0.0.0.0:22122
  hash: fnv1a_64
  distribution: ketama
  timeout: 400
  backlog: 65535
  preconnect: true
  redis: true
  server_connections: 1
  auto_eject_hosts: true
  server_retry_timeout: 60000
  server_failure_limit: 3
  servers:
   - 172.16.0.172:6546:1 redis01
   - 172.16.0.172:6547:1 redis02
Introduction to configuration options:
redis-cluster: the name of this configuration section (server pool). A file can contain multiple sections.
listen: the IP address and port that twemproxy listens on.
hash: the hash function to use. More than ten are supported, including md5, crc16, crc32, fnv1a_32, fnv1a_64, hsieh, murmur, and jenkins. fnv1a_64 is the usual choice and is also the default.
hash_tag: lets the hash of a key be computed from only part of the key. The hash_tag consists of two characters: one marks the beginning of the tag and the other marks its end. The text between them is the part used to compute the key's hash, and the result is used to select the server. For example, if hash_tag is defined as "{}", the hashes of "user:{user1}:ids" and "user:{user1}:tweets" are both computed from "user1", so the two keys are mapped to the same server. "user:user1:ids" has no tag, so the entire key is hashed and it may be mapped to a different server.
distribution: the key distribution mode, which determines how the hashed keys are spread across the servers. The default is ketama (consistent hashing).
  ketama: the ketama consistent-hashing algorithm builds a hash ring from the servers and assigns each node on the ring a range of hash values. Its advantage is that when a single node is added or removed, most of the keys cached in the cluster remain usable.
  modula: very simple. It takes the hash of the key modulo the number of servers and selects the server by the result.
  random: ignores the hash of the key and picks a server at random as the target of the operation.
timeout: the timeout, in milliseconds, that twemproxy applies to backend servers. If no response is received from the server within this time, the error SERVER_ERROR Connection timeout is sent to the client.
backlog: the TCP listen backlog (the length of the queue of pending connections). The default is 512.
preconnect: a Boolean that specifies whether twemproxy connects to all Redis servers at startup. The default is false.
redis: a Boolean that specifies whether this section proxies Redis. If it is not set to true, the section proxies a memcached cluster (this is the only configuration difference between proxying Redis and proxying memcached).
redis_auth: if the backend Redis requires authentication, set the password here.
server_connections: the number of connections between twemproxy and each Redis server. The default is 1. If it is greater than 1, a client's commands may be sent over different connections, so the actual execution order may differ from the order the client issued them in (as with any concurrency).
auto_eject_hosts: a Boolean that controls whether a node that stops responding is temporarily ejected. The default is false (the example configuration above sets it to true). Note that after a node is ejected the number of servers decreases and hash positions change, so some keys will miss; if the node is not ejected, however, requests routed to it return errors to the application.
server_retry_timeout: the interval, in milliseconds, before twemproxy retries a connection to an ejected server. It takes effect when auto_eject_hosts is true. The default is 30000 milliseconds.
server_failure_limit: the number of consecutive failures after which a Redis server is ejected when auto_eject_hosts is true. The default is 2.
servers: the list of addresses, ports, and weights of the servers in the pool, each with an optional server name. If a name is given, it is used to determine the order of the servers and hence the layout of the consistent-hash ring; otherwise the order in which the servers are defined is used. Two string formats are accepted: 'host:port:weight' and 'host:port:weight name'. The second form, with a name, is generally preferred: when one Redis node fails, you can put a new Redis node in its place under the same server name, so twemproxy builds the same hash ring and the data on the other nodes is unaffected (only the data on the failed machine is lost).
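As an illustration of the two servers formats described above, the following Python sketch parses a single entry. The Server tuple and parse_server function are hypothetical names introduced for this example only; this is not how twemproxy itself parses its configuration.

```python
from typing import NamedTuple, Optional


class Server(NamedTuple):
    host: str
    port: int
    weight: int
    name: Optional[str]


def parse_server(entry: str) -> Server:
    """Parse 'host:port:weight' or 'host:port:weight name'."""
    address, _, name = entry.strip().partition(" ")
    # Split from the right so only the last two colons act as separators.
    host, port, weight = address.rsplit(":", 2)
    return Server(host, int(port), int(weight), name.strip() or None)


if __name__ == "__main__":
    print(parse_server("172.16.0.172:6546:1 redis01"))
    print(parse_server("172.16.0.172:6547:1"))
```

An entry without a name yields name=None, which corresponds to the case where the server definition itself, rather than an alias, identifies the node.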
PS: Follow the format of the Twemproxy configuration file strictly, otherwise you will get syntax errors. Also, a single Twemproxy configuration file can proxy a Redis cluster and a Memcached cluster at the same time; just define a separate configuration section for each.
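The ketama distribution mentioned in the options can be sketched in Python as a simplified hash ring. This is an illustration of consistent hashing in general, not twemproxy's actual ketama code: the MD5-based point function, the 160 virtual points per node, and the HashRing class are all assumptions made for the example, and weights are ignored.

```python
import bisect
import hashlib


def point(value: str) -> int:
    """Position on the ring (MD5-based; a simplification of real ketama)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, points_per_node: int = 160):
        # Each node gets many virtual points so that load spreads evenly.
        self._ring = sorted(
            (point(f"{node}-{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._points = [p for p, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First virtual point clockwise from the key's position, wrapping around.
        idx = bisect.bisect_right(self._points, point(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    keys = [f"key:{i}" for i in range(10000)]
    before = HashRing(["redis01", "redis02"])
    after = HashRing(["redis01", "redis02", "redis03"])
    moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
    print(f"{len(moved) / len(keys):.1%} of keys moved after adding redis03")
```

Because the ring is built from server names, replacing a failed machine under the same name leaves every point in place. Adding a node only moves the keys that fall into the new node's ranges; all other keys keep their server.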
Start twemproxy (nutcracker)
Now that the configuration file has been added, test it:
[root@www twemproxy]# /etc/twemproxy/sbin/nutcracker -t
nutcracker: configuration file 'conf/nutcracker.yml' syntax is ok
The configuration file is valid, so now start nutcracker:
[root@www ~]# /etc/twemproxy/sbin/nutcracker -c /etc/twemproxy/conf/nutcracker.yml -p /var/run/nutcracker.pid -o /var/log/nutcracker.log -d
Option description:
  -h, --help              # show the help text and command options
  -V, --version           # show the nutcracker version
  -c, --conf-file=S       # set the configuration file path (default: conf/nutcracker.yml)
  -p, --pid-file=S        # set the pid file path (default: off)
  -o, --output=S          # set the log output path (default: stderr)
  -d, --daemonize         # run as a daemon
  -t, --test-conf         # test the configuration file for syntax errors
  -D, --describe-stats    # print the description of the stats
  -v, --verbosity=N       # set the log level (default: 5, min: 0, max: 11)
  -s, --stats-port=N      # set the stats monitoring port (default: 22222)
  -a, --stats-addr=S      # set the stats monitoring IP (default: 0.0.0.0)
  -i, --stats-interval=N  # set the stats aggregation interval (default: 30000 msec)
  -m, --mbuf-size=N       # set the mbuf chunk size in bytes (default: 16384 bytes)
PS: In a production environment, twemproxy is generally started under a process manager such as supervisor or pm2, so that the process is restarted automatically if it dies.
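The stats port listed among the options (22222 by default) can be polled from a monitoring script. The sketch below assumes that the stats listener writes one JSON document to each new connection and then closes it; the read_stats function name is hypothetical, and the exact fields in the document depend on the twemproxy version, so none are assumed here.

```python
import json
import socket


def read_stats(host: str = "127.0.0.1", port: int = 22222, timeout: float = 3.0) -> dict:
    """Connect to the stats port, read until the server closes, and parse the JSON."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))
```

Calling read_stats() with no arguments targets a proxy on the local machine with the default stats port; printing sorted(read_stats().keys()) is a quick way to see which counters a given build exposes.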
Verify that twemproxy started normally
[root@www ~]# ps aux | grep nutcracker
root     20002  0.0  0.0  19312   916 ?        Sl   18:48   0:00 /etc/twemproxy/sbin/nutcracker -c /etc/twemproxy/conf/nutcracker.yml -p /var/run/nutcracker.pid -o /var/log/nutcracker.log -d
root     20006  0.0  0.0 103252   832 pts/0    S+   18:48   0:00 grep nutcracker
[root@www ~]# netstat -nplt | grep 22122
tcp        0      0 0.0.0.0:22122        0.0.0.0:*        LISTEN      20002/nutcracker
Twemproxy proxying a Redis cluster
Here we use the first architecture (single-node Twemproxy) to test Twemproxy proxying a Redis cluster: one twemproxy and two Redis nodes on the same host (you can add more if you want). Twemproxy uses the configuration above, so only the two Redis nodes still need to be set up.
Install and configure Redis
Before installing Redis, install tcl, which the Redis test suite depends on. Without tcl, make test fails with an error.
[root@www ~]# yum install -y tcl
[root@www ~]# wget https://github.com/antirez/redis/archive/3.2.0.tar.gz
[root@www ~]# tar xvf 3.2.0.tar.gz -C /usr/local
[root@www ~]# cd /usr/local/
[root@www local]# mv redis-3.2.0 redis
[root@www local]# cd redis
[root@www redis]# make
[root@www redis]# make test
[root@www redis]# make install
Configure two Redis nodes
[root@www ~]# mkdir /data/redis-6546
[root@www ~]# mkdir /data/redis-6547
[root@www ~]# cat /data/redis-6546/redis.conf
daemonize yes
pidfile /var/run/redis/redis-server.pid
port 6546
bind 0.0.0.0
loglevel notice
logfile /var/log/redis/redis-6546.log
[root@www ~]# cat /data/redis-6547/redis.conf
daemonize yes
pidfile /var/run/redis/redis-server.pid
port 6547
bind 0.0.0.0
loglevel notice
logfile /var/log/redis/redis-6547.log
PS: These are two minimal Redis configuration files. If Redis authentication is enabled, the Redis password must also be set in the twemproxy configuration (redis_auth).
Start two Redis nodes
[root@www ~]# /usr/local/redis/src/redis-server /data/redis-6546/redis.conf
[root@www ~]# /usr/local/redis/src/redis-server /data/redis-6547/redis.conf
[root@www ~]# ps aux | grep redis
root     23656  0.0  0.0  40204  3332 ?        Ssl  20:14   0:00 redis-server 0.0.0.0:6546
root     24263  0.0  0.0  40204  3332 ?        Ssl  20:16   0:00 redis-server 0.0.0.0:6547
Verify reading and writing data through Twemproxy
First, make sure the hosts in the servers entries of the twemproxy configuration are correct; then test by connecting to Twemproxy on port 22122.
[root@www ~]# redis-cli -p 22122
127.0.0.1:22122> set key vlaue
OK
127.0.0.1:22122> get key
"vlaue"
127.0.0.1:22122> FLUSHALL
Error: Server closed the connection
127.0.0.1:22122> quit
Above we set a key through twemproxy and read it back through twemproxy, and everything works. The FLUSHALL command, however, is not supported by twemproxy, which closes the connection instead.
Next, connect to each of the two Redis nodes directly to see on which node the data appears. If it appears on one of them, twemproxy is working properly.
[root@www ~]# redis-cli -p 6546
127.0.0.1:6546> get key
(nil)
127.0.0.1:6546>
From the above results, we can see that the key is not on node 6546, so the data is stored on node 6547. Twemproxy currently offers no convenient way to find out which backend node a given key is stored on.
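Since the proxy does not report where a key lives, one workaround is to ask each backend directly. The Python sketch below does this with a raw socket and the Redis EXISTS command. It assumes the backends are reachable without authentication; encode_command, key_exists, and locate_key are hypothetical helper names, and a real deployment would normally use a Redis client library instead.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        raw = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


def key_exists(host: str, port: int, key: str, timeout: float = 3.0) -> bool:
    """Ask one backend directly whether it holds the key (EXISTS replies ':1' or ':0')."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command("EXISTS", key))
        reply = conn.recv(64)
    return reply.startswith(b":1")


def locate_key(key: str, backends) -> list:
    """Return the (host, port) pairs of all backends that hold the key."""
    return [(host, port) for host, port in backends if key_exists(host, port, key)]
```

With the two nodes from this article, locate_key("key", [("127.0.0.1", 6546), ("127.0.0.1", 6547)]) would return the node or nodes that currently hold the key. A key that appears on more than one node usually means it was written before a change in the node list.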
How to reload twemproxy?
Twemproxy does not ship a startup script; it is started with command-line arguments, and it has no reload command. In a production environment, an application that cannot reload its configuration file is a serious problem: when you add or remove twemproxy nodes, a plain restart is bound to affect online business. The workaround usually suggested is to send a signal to the twemproxy main process with the kill command, followed by the process ID:
kill -SIGHUP PID
Note that PID is the process ID of the twemproxy main process. Be aware that upstream twemproxy handles SIGHUP by reopening its log file, not by re-reading nutcracker.yml, so unless you run a fork that adds configuration reload, configuration changes still require a restart.
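Sending the signal can also be scripted. The Python sketch below reads the PID from the pid file written by the -p option (the path /var/run/nutcracker.pid is the one used in the start command earlier) and sends SIGHUP to it. The function names are hypothetical, and what nutcracker does on SIGHUP depends on the build you run, so verify the behaviour before relying on it.

```python
import os
import signal


def read_pid(pid_file: str) -> int:
    """Read the process ID that nutcracker wrote via its -p option."""
    with open(pid_file) as fh:
        return int(fh.read().strip())


def send_sighup(pid_file: str = "/var/run/nutcracker.pid") -> int:
    """Send SIGHUP to the process named in the pid file and return its PID."""
    pid = read_pid(pid_file)
    os.kill(pid, signal.SIGHUP)
    return pid
```

Reading the PID from the pid file avoids parsing ps output and ensures the signal goes to the main process that was started with -p.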