The function and implementation of sharding in Redis

2025-07-04 Update From: SLTechnology News&Howtos


Partitioning is the process of splitting your data into multiple Redis instances so that each instance will contain only a subset of all keys. The first part of this article will introduce you to the concept of sharding, and the second part will show you the options for Redis sharding.

What can sharding do?

Sharding in Redis serves two main goals:

1. It allows much larger databases, using the sum of the memory of many computers. Without sharding, you are limited to the amount of memory that a single computer can support.

2. It allows computing power to scale across multiple cores and multiple servers, and network bandwidth to scale across multiple servers and network adapters.

Sharding basics

There are many different sharding criteria. Suppose we have four Redis instances, R0, R1, R2, and R3, and many keys that represent users, such as user:1, user:2, and so on. We can find different ways to choose which instance a given key is stored in. In other words, there are many different ways to map a key to a given Redis server.

One of the simplest ways to shard is range sharding (range partitioning), which maps ranges of objects to specific Redis instances. For example, I could say that users with IDs from 0 to 10000 go to instance R0, users with IDs from 10001 to 20000 go to instance R1, and so on.

This approach works and is actually used in practice, but it has the drawback of requiring a table that maps ranges to instances. This table needs to be managed, and a separate table is needed for each type of object, so range sharding is often undesirable in Redis because it is much less efficient than the alternative sharding approaches.
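As a rough illustration of range sharding, the following Python sketch looks up a user ID in a small range table. The table name, the function name, and the boundaries for R2 and R3 are assumptions made for this example; only the R0 and R1 ranges come from the text above.

```python
# Hypothetical range table: (first_id, last_id, instance_name).
# R0 and R1 follow the example in the text; R2 and R3 are assumed
# to continue the same pattern.
USER_RANGES = [
    (0, 10000, "R0"),
    (10001, 20000, "R1"),
    (20001, 30000, "R2"),
    (30001, 40000, "R3"),
]


def instance_for_user(user_id):
    """Return the name of the instance whose ID range contains user_id."""
    for first_id, last_id, instance in USER_RANGES:
        if first_id <= user_id <= last_id:
            return instance
    raise KeyError(f"no range configured for user ID {user_id}")


print(instance_for_user(42))     # R0
print(instance_for_user(10001))  # R1
```

The table itself is the maintenance burden mentioned above: every new range, and every new object type, needs another entry that all clients must agree on.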

An alternative to range sharding is hash sharding (hash partitioning). This scheme works with any key and does not require keys of the form object_name:<id>. It is as simple as this:

1. Use a hash function (for example, the crc32 hash function) to convert the key name to a number. For example, if the key is foobar, crc32(foobar) will output something like 93024922.

2. Apply a modulo operation to this number to turn it into a number between 0 and 3, so that it can be mapped to one of my four Redis instances. 93024922 modulo 4 equals 2, so I know that my key foobar should be stored in the R2 instance. Note: the modulo operation returns the remainder of a division, and is implemented as the % operator in many programming languages.

There are many other ways to shard, but these two examples should give you the idea. An advanced form of hash sharding is called consistent hashing, which is implemented by some Redis clients and proxies.
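The two steps above can be sketched in a few lines of Python. This is a minimal sketch that assumes the four instances are simply named R0 to R3 and uses zlib.crc32 from the standard library; the function name is made up for the example, and the real CRC32 of a key will differ from the illustrative number used in the text.

```python
import zlib

# Assumed instance names, matching the four instances in the text.
INSTANCES = ["R0", "R1", "R2", "R3"]


def instance_for_key(key):
    """Map a key to an instance with hash sharding (crc32, then modulo)."""
    # Step 1: hash the key name to a number (unsigned 32-bit in Python 3).
    number = zlib.crc32(key.encode("utf-8"))
    # Step 2: reduce the number to an index between 0 and 3.
    return INSTANCES[number % len(INSTANCES)]


# The same key always maps to the same instance.
print(instance_for_key("foobar"))
print(instance_for_key("user:1"))
```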
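To make the idea of consistent hashing more concrete, here is a minimal sketch of a hash ring in Python. It is not the algorithm of any particular Redis client or proxy: the class name, the use of MD5, and the number of virtual points per node are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual points per node."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node (assumed value)
        self._points = []         # sorted positions on the ring
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Use the first 8 bytes of an MD5 digest as a 64-bit ring position.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        # Walk clockwise to the first point after the key, wrapping around.
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["R0", "R1", "R2", "R3"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("R4")
moved = [k for k in keys if ring.node_for(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

In this sketch, adding a node only takes keys away from existing nodes and hands them to the new one; keys never shuffle between the old nodes, which is the property that makes the approach attractive when the set of nodes changes.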

Different implementations of sharding

Sharding can be handled by different parts of the software stack.

1. Client-side sharding (client side partitioning) means that the client directly selects the correct node to write and read a given key. Many Redis clients implement client-side sharding.

2. Proxy-assisted sharding (proxy assisted partitioning) means that our clients send requests to a proxy that understands the Redis protocol, rather than sending them directly to the Redis instances. According to the configured sharding scheme, the proxy makes sure that each request is forwarded to the correct Redis instance and that the response is returned to the client. Twemproxy, a proxy for Redis and Memcached, implements proxy-assisted sharding.

3. Query routing means that you can send your query to a random instance, and that instance makes sure your query reaches the correct node. Redis Cluster implements a hybrid form of query routing with the help of the client (the request is not forwarded directly from one Redis instance to another; instead, the client is redirected to the correct node).
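The client-assisted form of query routing described in item 3 can be illustrated with a toy in-memory simulation. This is not the real Redis Cluster protocol: the Node and RoutingClient classes, the reply tuples, and the crc32-modulo ownership rule are all assumptions made for this sketch.

```python
import zlib


class Node:
    """Toy in-memory node that redirects requests for keys it does not own."""

    def __init__(self, index, total):
        self.index = index
        self.total = total
        self.data = {}

    def _owner(self, key):
        # Assumed ownership rule for the sketch: crc32 modulo node count.
        return zlib.crc32(key.encode("utf-8")) % self.total

    def execute(self, op, key, value=None):
        owner = self._owner(key)
        if owner != self.index:
            # Do not forward the request; tell the client where to go.
            return ("REDIRECT", owner)
        if op == "SET":
            self.data[key] = value
            return ("OK", None)
        if op == "GET":
            return ("VALUE", self.data.get(key))
        raise ValueError(f"unsupported operation: {op}")


class RoutingClient:
    """Client that follows a redirection to the owning node."""

    def __init__(self, nodes):
        self.nodes = nodes

    def request(self, op, key, value=None, start=0):
        reply = self.nodes[start].execute(op, key, value)
        if reply[0] == "REDIRECT":
            reply = self.nodes[reply[1]].execute(op, key, value)
        return reply


nodes = [Node(i, 4) for i in range(4)]
client = RoutingClient(nodes)
client.request("SET", "user:1", "alice", start=0)
print(client.request("GET", "user:1", start=3))  # ('VALUE', 'alice')
```

Whichever node the client contacts first, the request ends up on the node that owns the key, at the cost of at most one extra round trip in this sketch.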

Disadvantages of sharding

Some of the features of Redis don't work very well with sharding:

1. Operations involving multiple keys are usually not supported. For example, you cannot directly compute the intersection of two sets whose keys are mapped to different Redis instances (there are ways to do it, but not directly).

2. Transactions involving multiple keys cannot be used.

3. The granularity of sharding is the key, so a dataset stored under a single huge key, such as a very large sorted set, cannot be sharded.

4. When sharding is used, data handling becomes more complex. For example, you need to deal with multiple RDB/AOF files, and to back up your data you need to aggregate the persistence files from multiple instances and hosts.

5. Adding and removing capacity is also complex. For example, Redis Cluster can dynamically add and remove nodes at run time and supports mostly transparent rebalancing of data, but other approaches, such as client-side sharding and proxies, do not support this feature. However, a technique called presharding can help.

Data store or cache

Although sharding in Redis is conceptually the same whether Redis is used as a data store or as a cache, there is an important limitation when it is used as a data store. When Redis is used as a data store, a given key must always map to the same Redis instance. When Redis is used as a cache, it is not a big problem if a node is unavailable and a different node is used instead; we can change the key-to-instance mapping as we wish in order to improve the availability of the system (that is, the ability of the system to reply to our queries).

Consistent hashing implementations are often able to switch to other nodes when the preferred node for a given key is unavailable. Similarly, if you add a new node, part of the data will start to be stored on the new node.

The main concepts here are as follows:

1. If Redis is used as a cache, scaling up and down with consistent hashing is easy.

2. If Redis is used as a data store, a fixed key-to-node mapping is used, so the number of nodes must be fixed and cannot change. Otherwise, you need a system that can rebalance keys between nodes when nodes are added or removed. Currently, only Redis Cluster can do this, but Redis Cluster is still in beta and is not yet considered ready for production use.
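A short experiment shows why a fixed key-to-node mapping cannot simply be resized. The sketch below assumes the plain crc32-modulo scheme described earlier and made-up keys of the form user:N, and counts how many keys would land on a different instance if a fifth instance were added to the original four.

```python
import zlib


def instance_index(key, instance_count):
    """Plain hash sharding: crc32 of the key, modulo the instance count."""
    return zlib.crc32(key.encode("utf-8")) % instance_count


keys = [f"user:{i}" for i in range(10000)]

# Compare the mapping with 4 instances against the mapping with 5.
moved = sum(1 for k in keys if instance_index(k, 4) != instance_index(k, 5))
print(f"{moved / len(keys):.0%} of keys map to a different instance")
```

With a data store, every one of those remapped keys would have to be physically moved, otherwise reads would look for it on the wrong instance. With a cache, the same remapping only costs cache misses.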

Presharding

We already know that one problem with sharding is that, unless we use Redis as a cache, adding and removing nodes is tricky, and it is much simpler to use a fixed key-to-instance mapping.

However, data storage requirements may change over time. Today I can get by with 10 Redis nodes (instances), but tomorrow I may need 50.

Because Redis has a very small footprint and is lightweight (an idle instance uses only about 1 MB of memory), a simple solution is to start with many instances from the beginning. Even if you start with only one server, you can decide to live in a distributed world from day one and run multiple Redis instances on that single server using sharding.

You can choose a large number of instances from the start. For example, 32 or 64 instances can satisfy most users and provide enough space for future growth.

So, when your data storage needs grow and you need more Redis servers, all you have to do is move instances from one server to another. When you add the first additional server, you move half of the Redis instances from the first server to the second, and so on.

Using Redis replication, you can move the data with little or no downtime:
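The key point of presharding is that the key-to-instance mapping never changes; only the table that says which server hosts each instance does. The following sketch assumes 64 instances, crc32-modulo hashing, and hypothetical host addresses and ports, none of which come from the original text beyond the instance count.

```python
import zlib

INSTANCE_COUNT = 64  # chosen on day one and never changed


def instance_for_key(key):
    """Fixed key-to-instance mapping; independent of the server layout."""
    return zlib.crc32(key.encode("utf-8")) % INSTANCE_COUNT


# Hypothetical placement table: instance number -> (host, port).
# Initially every instance runs on a single server.
placement = {i: ("10.0.0.1", 6379 + i) for i in range(INSTANCE_COUNT)}


def move_instances(placement, instance_ids, new_host):
    """Record that the given instances now run on new_host (same ports)."""
    for i in instance_ids:
        _, port = placement[i]
        placement[i] = (new_host, port)


def address_for_key(key):
    """Resolve a key to the server address currently hosting its instance."""
    return placement[instance_for_key(key)]


# Adding a second server: move half of the instances to it.
move_instances(placement, range(INSTANCE_COUNT // 2, INSTANCE_COUNT), "10.0.0.2")
print(address_for_key("user:1"))
```

Because instance_for_key does not depend on the placement table, no key ever changes instance; growing the deployment only means updating addresses in the client configuration.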

1. Start an empty instance on your new server.

2. Move the data by configuring the new instance as a slave of the source instance.

3. Stop your clients.

4. Update the server IP address in the configuration for the moved instance.

5. Send the SLAVEOF NO ONE command to the slave on the new server.

6. Start your clients with the new, updated configuration.

7. Finally, close the instances that are no longer in use on the old server.
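Steps 2 and 5 of the list above come down to sending two SLAVEOF commands to the new instance. The sketch below is one possible way to do that from Python with only the standard library, by encoding the command in the Redis protocol and writing it to a socket. The function names, hosts, and ports are hypothetical, replies are returned raw without parsing, and in practice a Redis client library or redis-cli would normally be used instead.

```python
import socket


def encode_command(*args):
    """Encode a command as a Redis protocol array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def send_command(host, port, *args, timeout=5.0):
    """Send one command and return the first chunk of the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command(*args))
        return conn.recv(4096)


def start_replication(new_host, new_port, source_host, source_port):
    """Step 2: make the new instance a slave of the source instance."""
    return send_command(new_host, new_port, "SLAVEOF", source_host, source_port)


def promote(new_host, new_port):
    """Step 5: detach the new instance so it serves as a master."""
    return send_command(new_host, new_port, "SLAVEOF", "NO", "ONE")


# Shows the bytes that step 5 would put on the wire.
print(encode_command("SLAVEOF", "NO", "ONE"))
```

Between the two calls, the clients are stopped and reconfigured (steps 3 and 4), so that no writes reach the old instance after the new one has been promoted.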

Implementations of Redis sharding

Redis Cluster is the preferred way to get automatic sharding and high availability. Currently, it is not fully ready for production use, but it has entered the beta phase.

Once Redis Cluster is available, and clients that support Redis Cluster are available, it will become the de facto standard for Redis sharding.

Redis Cluster is a mix of query routing and client-side sharding.

Twemproxy is a proxy developed by Twitter that supports the Memcached ASCII protocol and the Redis protocol. It is single-threaded, written in C, and very fast. It is an open source project released under the Apache 2.0 license.

Twemproxy supports automatic sharding among multiple Redis instances, with optional node ejection when a node is unavailable (this changes the key-to-instance mapping, so you should only use this feature when using Redis as a cache).

Twemproxy is not a single point of failure, because you can start multiple proxies and have your clients connect to the first one that accepts the connection.

The alternative to Twemproxy is to use a client that implements client-side sharding through consistent hashing or a similar algorithm. Several Redis clients support consistent hashing, such as Redis-rb and Predis.
