What is the memcache cache server? 07/02 Update SLTechnology News&Howtos

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

1. A brief introduction to memcache

Memcache is a distributed cache system that can speed up website access, especially for large sites that query the database frequently. It is free, open source software. Memcache caches data in memory as key-value pairs, reducing the number of reads from the back-end database.

2. The memcache distributed caching system

Distributed: data is stored separately on different servers.

Caching system: strictly speaking, memcache is not a NoSQL database but a system that provides in-memory caching. So in what sense can it be called a NoSQL database? A relational database stores data in two-dimensional tables and offers features such as transactions, tuples, and data persistence; put simply, persistence means the data is not lost after a power outage. A non-relational database does not query data with SQL statements but stores and reads it by key and value. Because memcache works this way, it can loosely be regarded as a NoSQL database.

3. The difference between MySQL and memcache

1) Memcache does not use SQL statements to query or store data.

2) Memcache has no concept of tables as MySQL does; data is saved as key-value pairs.

3) Memcache data is stored in memory, so reads are fast, but the data is lost on a power outage.

4. The principle of distributed deployment of memcache

Although memcache is called a "distributed cache", the memcache server itself has no distributed functions, and the servers in a memcache cluster do not communicate with each other. The so-called distribution depends entirely on the client program, as in the flow shown in the following figure.

Based on this diagram, here is how a single cache write to memcache proceeds:

① The application passes in the data to be cached through the write API.

② The API passes the key to the routing algorithm module, which derives a server number from the key and the list of servers in the memcache cluster.

③ The server number is used to look up the IP address and port of the corresponding memcache server.

④ The API calls the communication module to talk to the server with that number and writes the data to it, completing one write to the distributed cache.

A read works the same way as a write. As long as the same routing algorithm and server list are used, an application querying the same key always causes the memcache client to access the same server. As long as the data is still cached on that server, a cache hit is guaranteed.
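The write and read paths described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the server addresses are placeholders, plain dictionaries stand in for the real memcache servers, and the routing module is assumed to be a simple CRC32-modulo function. The names `route`, `cache_set`, and `cache_get` are hypothetical and not part of any memcache client library.

```python
import zlib

# Hypothetical server list; the addresses are placeholders.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

# Stand-ins for the real memcache servers: one dict per server address.
NODES = {addr: {} for addr in SERVERS}


def route(key, servers):
    """Routing module: map a key to one server in the list."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


def cache_set(key, value):
    """Write path: route the key, then write to the chosen server."""
    addr = route(key, SERVERS)
    NODES[addr][key] = value
    return addr


def cache_get(key):
    """Read path: the same routing, so the same server is consulted."""
    addr = route(key, SERVERS)
    return NODES[addr].get(key)


written_to = cache_set("user:42", "Alice")
print(written_to, cache_get("user:42"))
```

Because `cache_get` reuses the same routing function and server list as `cache_set`, the read lands on the server that received the write.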

This style of clustering also takes partition fault tolerance into account. Suppose the cluster consists of node0, node1, and node2, and node2 goes down: the data stored on node2 is no longer available. Because node0 and node1 are still in the cluster, the next request for a key that was stored on node2 will definitely miss. The application then first fetches the data from the database, and the routing algorithm module selects either node0 or node1 according to the key and stores the data there, so the next request can be served from the cache. The cost of running such a cluster still has to be considered.
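The failure scenario can be simulated with a small Python sketch. It assumes dictionaries in place of real nodes and of the database, and a CRC32-modulo routing function over the nodes that are still alive; all names are hypothetical. To keep the demonstration independent of the hash value, it takes down whichever node happens to own the key rather than a fixed node.

```python
import zlib

# In-memory stand-ins for three memcache nodes and the database.
nodes = {"node0": {}, "node1": {}, "node2": {}}
database = {"user:42": "Alice"}


def route(key, alive):
    """Pick one of the nodes that are currently alive for this key."""
    names = sorted(alive)
    return names[zlib.crc32(key.encode("utf-8")) % len(names)]


def get(key, alive):
    """Return (node, value, hit). On a miss, reload from the database."""
    node = route(key, alive)
    if key in nodes[node]:
        return node, nodes[node][key], True
    value = database[key]          # cache miss: go to the database
    nodes[node][key] = value       # repopulate the cache on the chosen node
    return node, value, False


everyone = set(nodes)
owner, _, _ = get("user:42", everyone)             # first read fills the cache
survivors = everyone - {owner}                     # the owning node goes down
new_owner, value, hit = get("user:42", survivors)  # miss, reloaded from the database
_, _, hit_again = get("user:42", survivors)        # now served from the cache
print(owner, "->", new_owner, value, hit, hit_again)
```

The first request after the failure misses and reads the database; the second one is a hit on a surviving node.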

5. Routing algorithm

As the figure above shows, the routing algorithm is central to managing the server cluster. Like a load-balancing algorithm, it determines which server in the cluster is accessed.

1) remainder hash algorithm:

A hash value is computed from the key of the stored key-value pair and divided by the number of memcache servers; the remainder determines which server the data is placed on. Because hash values are effectively random, the data is spread fairly evenly and a large share of it rarely ends up on a single server. There is a problem, however: when a node is added, the divisor changes, most keys map to a different server, and the data cached earlier can no longer be found.

Solutions:

(1) Expand capacity and restart the servers during periods of low website traffic, which usually means overtime for the technical team.

(2) Warm up the cache by simulating requests so that the data is redistributed across the servers.
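How severe the remapping problem of the remainder hash is can be estimated with a short Python experiment. The sketch assumes MD5 as the hash function and made-up key names; it counts how many of 10,000 keys map to a different server number when the cluster grows from 3 to 4 servers. For a uniform hash, a key keeps its server only when the two remainders coincide, which happens for about one quarter of the keys.

```python
import hashlib


def remainder_route(key, server_count):
    """Remainder hash: hash the key, then take it modulo the server count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


keys = [f"key-{i}" for i in range(10000)]   # made-up sample keys
moved = sum(1 for k in keys if remainder_route(k, 3) != remainder_route(k, 4))
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.0%} of keys map to a different server")
```

Roughly three quarters of the keys move, which is why adding one node invalidates most of the cache under this scheme.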

2) consistent hash algorithm:

The mapping from keys to cache servers is implemented with a data structure called a consistent hash ring. Put simply, consistent hashing organizes the entire hash value space into a virtual ring. Servers are hashed onto positions on the ring, and each key is stored on the first server found clockwise from the key's own position.

Disadvantage: when there are too few server nodes, the data tends to be distributed unevenly among them. Adding virtual nodes solves this problem.

More importantly, the more cache server nodes there are in the cluster, the smaller the impact of adding or removing a node. In other words, as the cluster grows, the probability of still hitting previously cached data increases. A small amount of cached data still becomes unreadable, but the proportion is small enough that the resulting database accesses do not put fatal load on the database.
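A consistent hash ring with virtual nodes can be sketched as follows. This is a minimal illustration, not the implementation used by any particular memcache client: the class name `HashRing`, the choice of MD5, the 2**32 ring size, and the figure of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def hash_value(text):
    """Map a string to a position on a ring of size 2**32."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self.ring = []                      # sorted list of (position, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        """Place `vnodes` virtual nodes for this server on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self.ring, (hash_value(f"{server}#{i}"), server))

    def lookup(self, key):
        """Return the first server clockwise from the key's position."""
        index = bisect.bisect_left(self.ring, (hash_value(key), ""))
        if index == len(self.ring):         # wrap around the ring
            index = 0
        return self.ring[index][1]


keys = [f"key-{i}" for i in range(10000)]
ring = HashRing(["node0", "node1", "node2"])
before = {k: ring.lookup(k) for k in keys}
ring.add("node3")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
moved_fraction = len(moved) / len(keys)
print(f"{moved_fraction:.0%} of keys moved, all of them to node3")
```

Only the keys that now fall to the new server move; every other key keeps its server, so the share of lost cache entries shrinks as the cluster grows.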

6. The implementation principle of memcache

First of all, let's make it clear that memcache's data is stored in memory, so it has the following characteristics:

1) Data access is faster than with a traditional relational database (MySQL, Oracle), which stores data on disk to guarantee persistence and is therefore limited by slow I/O.

2) Memcache data is stored in memory, so it is lost whenever memcache is restarted.

3) Because memcache data is stored in memory, it is bound by the machine's word size. A 32-bit machine can use at most 2GB of memory space, while a 64-bit machine can be considered to have no upper limit.

Next, let's look at how memcache works. The most important question for memcache is how memory is allocated. Memcache uses a fixed-size allocation scheme, as shown in the following figure:

This picture involves four concepts: slab_class, slab, page, and chunk. The relationship between them is:

1. MemCache divides the memory space into a set of slabs.

2. Each slab contains several pages, and each page is 1MB by default. If a slab occupies 100MB of memory, it contains 100 pages.

3. Each page contains a set of chunks. A chunk is where data is actually stored, and all chunks in the same slab have the same fixed size.

4. Slabs with the same chunk size are grouped together into a slab_class.

The component that allocates MemCache memory is called the allocator. The number of slabs is limited to a few, a dozen, or a few dozen, depending on the startup parameters.

Where a value is stored in memcache is determined by its size: the value always goes into the slab whose chunk size is the closest one that can hold it. Suppose the chunk size of slab[1] is 80 bytes, slab[2] is 100 bytes, and slab[3] is 128 bytes (chunk sizes in adjacent slabs grow by a factor of 1.25 by default, which can be changed with -f when MemCache starts). An 88-byte value will then be placed in slab[2].
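The selection rule can be written as a small Python function. The sketch uses the three chunk sizes from the example and, like the text, treats the value size as the whole item size; the function name `pick_slab` is hypothetical.

```python
import bisect

# Chunk sizes of slab[1], slab[2], slab[3] from the example, in bytes.
CHUNK_SIZES = [80, 100, 128]


def pick_slab(value_size):
    """Return (slab number, wasted bytes) for the smallest chunk that fits."""
    index = bisect.bisect_left(CHUNK_SIZES, value_size)
    if index == len(CHUNK_SIZES):
        raise ValueError("value is larger than the largest chunk")
    return index + 1, CHUNK_SIZES[index] - value_size


slab_number, wasted = pick_slab(88)
print(f"88-byte value -> slab[{slab_number}], {wasted} bytes unused")
```

The unused bytes in the chunk are the price of fixed-size allocation: the 88-byte value occupies a full 100-byte chunk.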

Before storing the value, the slab must first request memory, and memory is requested in units of pages. So when the first item is stored, a 1MB page is allocated to the slab no matter how small the item is. After obtaining the page, the slab splits the page's memory into an array of chunks of its chunk size and then picks one chunk from the array to store the data.
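The page-then-chunk allocation can be modelled with a toy class. This is a sketch under simplifying assumptions: chunks are represented by integer indexes rather than real memory, and the class name `Slab` and its methods are invented for the illustration.

```python
PAGE_SIZE = 1024 * 1024   # 1MB, the default page size mentioned in the text


class Slab:
    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.pages = 0
        self.free_chunks = []       # indexes of chunks not yet in use

    def _grow(self):
        """Request one more page and split it into fixed-size chunks."""
        per_page = PAGE_SIZE // self.chunk_size
        base = self.pages * per_page
        self.pages += 1
        self.free_chunks.extend(range(base, base + per_page))

    def allocate(self):
        """Hand out one free chunk, requesting a page first if needed."""
        if not self.free_chunks:
            self._grow()
        return self.free_chunks.pop()


slab = Slab(chunk_size=100)
slab.allocate()               # the first item triggers a whole 1MB page
print(slab.pages, len(slab.free_chunks))
```

With 100-byte chunks, one page yields 1048576 // 100 = 10485 chunks, so a single stored item leaves 10484 chunks free in the same page.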

What if the slab has no free chunk left? Unless memcache was started with -M (which disables LRU, so that running out of memory reports an Out Of Memory error), MemCache evicts the data in the least recently used chunk of that slab and stores the new data there.
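The eviction behaviour can be illustrated with a per-slab LRU built on `collections.OrderedDict`. This is a simplified model, not memcached's internal data structure: the class `SlabLRU` is hypothetical, and its `evict=False` option only mimics the effect the text attributes to the -M flag.

```python
from collections import OrderedDict


class SlabLRU:
    def __init__(self, max_chunks, evict=True):
        self.max_chunks = max_chunks
        self.evict = evict            # False mimics starting with -M
        self.items = OrderedDict()    # oldest entry first

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)   # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.max_chunks:
            if not self.evict:
                raise MemoryError("no free chunk and eviction is disabled")
            self.items.popitem(last=False)   # drop the least recently used
        self.items[key] = value


slab = SlabLRU(max_chunks=2)
slab.set("a", 1)
slab.set("b", 2)
slab.get("a")          # "a" becomes the most recently used entry
slab.set("c", 3)       # no free chunk: "b" is evicted
print(list(slab.items))
```

Reading "a" before the third write is what saves it: the entry that was touched least recently, "b", is the one replaced.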

7. The workflow of memcache

(1) Check whether the data requested by the client is in memcached. If it is, return it directly without touching the database; the path is ①②③⑦.

(2) If the requested data is not in memcached, query the database, return the result to the client, and cache a copy in memcached (the memcached client is not responsible for this; the application must implement it explicitly); the path is ①②④⑤⑦⑥.

(3) Update the data in memcached every time the database is updated, to keep the two consistent.

(4) When the memory allocated to memcached is used up, an LRU policy combined with an expiration policy applies: expired data is replaced first, then the least recently used data.
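Steps (1) to (3) of this workflow are application logic, and can be sketched as follows. Dictionaries stand in for memcached and for the database, and the function names `read` and `update` are hypothetical; a real application would call its memcache client and database driver at the marked places.

```python
cache = {}                          # stand-in for memcached
database = {"user:42": "Alice"}     # stand-in for the database
stats = {"db_reads": 0}


def read(key):
    """Steps (1) and (2): try the cache first, fall back to the database."""
    if key in cache:
        return cache[key]           # hit: the database is not touched
    stats["db_reads"] += 1
    value = database.get(key)       # miss: query the database
    if value is not None:
        cache[key] = value          # the application caches a copy itself
    return value


def update(key, value):
    """Step (3): update the cache whenever the database is updated."""
    database[key] = value
    cache[key] = value


first = read("user:42")             # miss, one database read
second = read("user:42")            # hit
update("user:42", "Bob")
third = read("user:42")             # hit, already consistent
print(first, second, third, stats["db_reads"])
```

Only the first read reaches the database; later reads, including the one after the update, are served from the cache.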

8. The characteristics of memcached

The protocol is simple:

Text-based protocol: like the common protocols HTTP, FTP, and SMTP, the memcached protocol is based on lines of text, which means that information is transmitted as text.
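To show how simple the line-based format is, the following Python sketch builds the bytes of a `set` and a `get` command in the memcached text protocol and parses a single-key reply. It only formats and parses messages and does not open a network connection; the helper names are hypothetical, and the parser assumes a reply for one key without the optional CAS field.

```python
def build_set(key, value, flags=0, exptime=0):
    """set <key> <flags> <exptime> <bytes>\\r\\n<data>\\r\\n"""
    data = value.encode("utf-8")
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


def build_get(key):
    """get <key>\\r\\n"""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw):
    """Parse a single-key reply; return None on a miss (just END)."""
    if raw == b"END\r\n":
        return None
    header, rest = raw.split(b"\r\n", 1)
    _, _key, _flags, size = header.split(b" ")   # VALUE <key> <flags> <bytes>
    return rest[: int(size)].decode("utf-8")


print(build_set("greeting", "hello"))
print(build_get("greeting"))
print(parse_get_response(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n"))
```

Because every message is plain text terminated by `\r\n`, the same exchange can be typed by hand over a telnet session for debugging.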

Based on libevent event handling:

Libevent is a library written in C. It wraps event handling facilities such as kqueue on BSD systems (BSD is a derivative of Unix) and epoll on Linux behind a single interface, and performs better than the traditional select.

Built-in memory management:

All data is stored in memory, so access is fast. However, there is no protection against a single point of failure: when the service restarts, all data is lost.

Distributed:

Memcached servers do not communicate with each other; each one accesses its data independently and shares no information. The server has no distributed function, so distribution depends on the memcache client.

Installing memcache involves two parts: installing the memcache server and installing the memcached client.
