
Analyzing the deletion mechanism and development direction of memcached


This article looks at how memcached deletes data and at the directions in which it is currently being developed.

Memcached is a cache, so data is not stored permanently on the server; this is a premise to keep in mind when introducing memcached into a system. The following describes memcached's data deletion mechanism, as well as its latest development directions: the binary protocol (Binary Protocol) and external engine support.

Memcached uses resources effectively when deleting data

Data does not really disappear from memcached

As mentioned last time, memcached does not release allocated memory. Once a record times out, the client can no longer see it (it becomes invisible, transparent), and its storage space can be reused.

Lazy Expiration

Memcached does not internally monitor whether records have expired. Instead, it checks the record's timestamp on get and verifies at that moment whether the record has expired. This technique is called lazy expiration. Because of it, memcached spends no CPU time on expiration monitoring.
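The mechanism can be illustrated with a minimal sketch, assuming a simplified item structure (the names and fields below are invented for illustration and are not memcached's actual source): the expiration timestamp is only ever examined on the get path, and an expired item is simply treated as a miss whose space can be reused.

#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified item structure used only for this sketch. */
typedef struct item {
    time_t exptime;   /* absolute expiration time; 0 means "never expires" */
    void  *data;
    size_t nbytes;
} item;

/* Lazy expiration: no background thread scans for expired records.
 * The check happens only when the record is looked up. */
static bool item_is_expired(const item *it, time_t now) {
    return it->exptime != 0 && it->exptime <= now;
}

/* Sketch of a get path: an expired item is treated as a cache miss,
 * and its storage becomes reusable for new records. */
item *cache_get(item *it) {
    if (it != NULL && item_is_expired(it, time(NULL))) {
        return NULL;   /* invisible to the client from now on */
    }
    return it;
}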

LRU: the mechanism for effectively deleting data from the cache

Memcached preferentially reuses the space of records that have timed out. Even so, space can still run short when new records are appended, so memcached falls back on a mechanism called Least Recently Used (LRU) to allocate space. As the name implies, this mechanism deletes the "least recently used" records. When memcached runs out of memory (that is, when new space cannot be obtained from a slab class), it searches for records that have not been used recently and allocates their space to the new record. From a practical caching standpoint, this model is ideal.
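As a rough sketch of that allocation path, assuming a doubly linked list ordered by recency of use (an illustration, not memcached's implementation): every access moves the record to the head of the list, and when no free space can be obtained, the record at the tail is evicted.

#include <stddef.h>

/* Hypothetical LRU list node, for illustration only. */
typedef struct lru_item {
    struct lru_item *prev, *next;
    /* ... key, value, expiration time, and so on ... */
} lru_item;

typedef struct {
    lru_item *head;   /* most recently used */
    lru_item *tail;   /* least recently used: eviction candidate */
} lru_list;

/* Unlink an item from wherever it currently sits in the list. */
static void lru_unlink(lru_list *l, lru_item *it) {
    if (it->prev) it->prev->next = it->next; else l->head = it->next;
    if (it->next) it->next->prev = it->prev; else l->tail = it->prev;
    it->prev = it->next = NULL;
}

/* On every access, move the item to the head ("recently used"). */
void lru_touch(lru_list *l, lru_item *it) {
    lru_unlink(l, it);
    it->next = l->head;
    if (l->head) l->head->prev = it;
    l->head = it;
    if (l->tail == NULL) l->tail = it;
}

/* When no free space can be obtained from the slab class,
 * evict the least recently used record at the tail. */
lru_item *lru_evict(lru_list *l) {
    lru_item *victim = l->tail;
    if (victim) lru_unlink(l, victim);
    return victim;
}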

However, in some cases the LRU mechanism can cause trouble. LRU can be disabled by starting memcached with the "-M" option, as shown below:

$ memcached -M -m 1024

Note that at startup the lowercase "-m" option specifies the maximum memory size. If no value is specified, the default of 64MB is used.

When started with the "-M" option, memcached returns an error once its memory is exhausted. That said, memcached is not storage but a cache, so using LRU is recommended.
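For example, when memcached is started with "-M" and its memory is exhausted, a store over the text protocol fails with an error response instead of evicting an older record; the exchange looks roughly like this (the exact error text may differ between versions):

set somekey 0 0 10
0123456789
SERVER_ERROR out of memory storing object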

The latest development direction of memcached

There are two big goals on memcached's roadmap. One is the planning and implementation of the binary protocol, and the other is loadable external engine support.

About the binary protocol

The reason for adopting the binary protocol is that it removes the parsing required by the text protocol, pushing memcached's already high performance even higher, and it also reduces the vulnerabilities of the text protocol. Most of it has already been implemented, and the feature is included in the code repository used for development. There is a link to the repository on memcached's download page.

http://danga.com/memcached/download.bml

Format of the binary protocol

A packet in this protocol is a 24-byte frame followed by the key and unstructured data (Unstructured Data). The actual format is as follows (quoted from the protocol document):

  Byte/     0       |       1       |       2       |       3       |
     /              |               |               |               |
    |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
    +---------------+---------------+---------------+---------------+
   0/ HEADER                                                        /
    /                                                               /
    /                                                               /
    /                                                               /
    +---------------+---------------+---------------+---------------+
  24/ COMMAND-SPECIFIC EXTRAS (as needed)                           /
   +/  (note length in the extras length header field)              /
    +---------------+---------------+---------------+---------------+
   m/ Key (as needed)                                               /
   +/  (note length in key length header field)                     /
    +---------------+---------------+---------------+---------------+
   n/ Value (as needed)                                             /
   +/  (note length is total body length header field, minus        /
   +/   sum of the extras and key length body fields)               /
    +---------------+---------------+---------------+---------------+
    Total 24 bytes

As shown above, the packet format is very simple. Note that the 24-byte header (HEADER) comes in two forms, a request header (Request Header) and a response header (Response Header). The header contains the Magic byte indicating the validity of the packet, the command type, the key length, the total body length, and other information. The formats are as follows:

Request Header

  Byte/     0       |       1       |       2       |       3       |
     /              |               |               |               |
    |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
    +---------------+---------------+---------------+---------------+
   0| Magic         | Opcode        | Key length                    |
    +---------------+---------------+---------------+---------------+
   4| Extras length | Data type     | Reserved                      |
    +---------------+---------------+---------------+---------------+
   8| Total body length                                             |
    +---------------+---------------+---------------+---------------+
  12| Opaque                                                        |
    +---------------+---------------+---------------+---------------+
  16| CAS                                                           |
    |                                                               |
    +---------------+---------------+---------------+---------------+
    Total 24 bytes

Response Header

  Byte/     0       |       1       |       2       |       3       |
     /              |               |               |               |
    |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
    +---------------+---------------+---------------+---------------+
   0| Magic         | Opcode        | Key Length                    |
    +---------------+---------------+---------------+---------------+
   4| Extras length | Data type     | Status                        |
    +---------------+---------------+---------------+---------------+
   8| Total body length                                             |
    +---------------+---------------+---------------+---------------+
  12| Opaque                                                        |
    +---------------+---------------+---------------+---------------+
  16| CAS                                                           |
    |                                                               |
    +---------------+---------------+---------------+---------------+
    Total 24 bytes

If you want to know the details of each field, you can check out memcached's binary protocol code tree and refer to the protocol_binary.txt document in the docs folder.
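For orientation, the 24-byte request header can also be written down as a C struct, following the diagram above (a simplified sketch, not the definitions shipped in memcached's headers; the multi-byte fields are transmitted in network byte order):

#include <stdint.h>

/* Sketch of the 24-byte binary-protocol request header, mirroring the
 * diagram above. Multi-byte fields are sent in network byte order. */
typedef struct {
    uint8_t  magic;             /* identifies the packet as a request   */
    uint8_t  opcode;            /* command type (get, set, ...)         */
    uint16_t key_length;        /* length of the key after the header   */
    uint8_t  extras_length;     /* length of command-specific extras    */
    uint8_t  data_type;         /* reserved for future use              */
    uint16_t reserved;          /* "Status" in the response header      */
    uint32_t total_body_length; /* extras + key + value                 */
    uint32_t opaque;            /* echoed back untouched in the reply   */
    uint64_t cas;               /* data version check value             */
} binary_request_header;        /* 24 bytes in total                    */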

What stands out in the HEADER

Looking at the HEADER format, my impression was that the upper limit on keys is surprisingly large. In the current memcached specification, the maximum key length is 250 bytes, but in the binary protocol the key size is expressed with 2 bytes, so keys of up to 65535 bytes can be used in theory. Keys longer than 250 bytes are not very common, but huge keys will become usable once the binary protocol is released.

The binary protocol will be supported starting from the next version, the 1.3 series.

External engine support

Last year, I experimentally made memcached's storage layer pluggable.

http://alpha.mixi.co.jp/blog/?p=129

Brian Aker of MySQL saw the modification and sent the code to memcached's mailing list. The memcached developers were also very interested and added it to the roadmap. It is now being developed jointly by me and Trond Norbye, a memcached developer (specification design, implementation, and testing). The time difference makes collaborating across countries difficult, but sharing the same vision, we were finally able to publish a prototype of the pluggable architecture. The code repository can be accessed from memcached's download page.

The need for external engine support

Much memcached-derived software exists in the world because people want to store data permanently, add data redundancy, and so on, even at the cost of some performance. Before developing memcached myself, I also considered reinventing memcached within mixi's R&D department.

The external engine loading mechanism encapsulates memcached's complex processing, such as networking and event handling. As a result, the current difficulty of combining memcached with storage engines by forcibly modifying or redesigning it will disappear, and trying out various engines will become easy.

Simple API design is the key to success

In this project, we paid the most attention to API design. Too many functions make engine developers' work tedious, and if the API is too complex, the barrier to implementing an engine becomes too high. As a result, the first version has only 13 interface functions. Space does not allow going into detail here, so I will just list what an engine has to do (a rough illustrative sketch of such an interface appears after the list):

Engine information (version, etc.)

Engine initialization

Engine shutdown

Engine statistics

Capacity check: whether a given record can be stored

Allocate memory for an item (record) structure

Free the memory of an item (record)

Delete a record

Store a record

Retrieve a record

Update a record's timestamp

Arithmetic operation processing (increment/decrement)

Flush the data

Readers interested in the detailed specification can check out the code of the engine project and read engine.h.
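To give a feel for what such an interface looks like, here is a minimal sketch of a pluggable engine API covering the 13 responsibilities above (this is an invented illustration, not the actual engine.h; the function names, signatures, and vtable layout are assumptions):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified engine interface for illustration only.
 * A real engine would be loaded as a shared object, and the core server
 * would call it exclusively through a function table like this one. */
typedef struct engine_item {
    const void *key;
    size_t      nkey;
    void       *data;
    size_t      nbytes;
    uint32_t    exptime;
} engine_item;

typedef struct engine_interface {
    const char *(*get_info)(void *handle);              /* version, name, ... */
    bool  (*initialize)(void *handle, const char *cfg); /* engine startup     */
    void  (*destroy)(void *handle);                     /* engine shutdown    */
    void  (*get_stats)(void *handle, void *stats_buf);  /* statistics         */
    bool  (*has_capacity)(void *handle, size_t nbytes); /* can this record fit? */
    engine_item *(*allocate)(void *handle, const void *key, size_t nkey,
                             size_t nbytes, uint32_t exptime);
    void  (*release)(void *handle, engine_item *item);  /* free item memory   */
    bool  (*remove)(void *handle, const void *key, size_t nkey);
    bool  (*store)(void *handle, engine_item *item);
    engine_item *(*get)(void *handle, const void *key, size_t nkey);
    void  (*touch)(void *handle, engine_item *item);    /* update timestamp   */
    bool  (*arithmetic)(void *handle, const void *key, size_t nkey,
                        bool increment, uint64_t delta, uint64_t *result);
    bool  (*flush)(void *handle, uint32_t when);        /* flush all data     */
} engine_interface;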

Re-examining the current system

The difficulty in making memcached support external storage is that the networking and event-handling code (the core server) is tightly intertwined with the in-memory storage code. This situation is known as tight coupling. The in-memory storage code has to be separated from the core server in order to support external engines flexibly. Based on the API we designed, memcached was therefore restructured so that the core server and the storage engine sit on opposite sides of that API.

After the refactoring, we compared performance with version 1.2.5 and with the binary-protocol-capable version, and confirmed that the change does not hurt performance.

When considering how to support loadable external engines, the easiest approach is for memcached itself to handle concurrency control; for an engine, however, concurrency control is the very essence of performance, so we adopted a design that hands multithreading support over to the engine entirely.
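As a small illustration of that design choice (again an invented sketch, not the actual code), the engine below protects its data with its own lock, while the core server simply calls through the interface and never locks on the engine's behalf:

#include <pthread.h>
#include <stddef.h>

/* Illustrative engine-side structure: the engine owns its lock(s), so it
 * is free to choose coarse locking, per-bucket locking, lock-free
 * structures, and so on. The core server does no locking for it. */
typedef struct {
    pthread_mutex_t lock;
    /* ... hash table, LRU lists, and so on ... */
} my_engine;

void *my_engine_get(my_engine *e, const void *key, size_t nkey) {
    (void)key;
    (void)nkey;
    void *item = NULL;
    pthread_mutex_lock(&e->lock);
    /* look the item up in the engine's own data structures */
    pthread_mutex_unlock(&e->lock);
    return item;
}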

Future improvements should make memcached usable in an even wider range of applications.

That covers memcached's deletion mechanism and the direction of its current development.
