This article explains how caching works and how microservice caches can be managed automatically. The content is fairly detailed; if you are interested, read on, and I hope it is helpful to you.
Why do we need caching?
Let's start with an old question: how does our program work?
The program is stored on disk
The program runs in RAM, which is what we call main memory
The program's computation logic executes in the CPU
Let's look at the simplest example: a = a + 1
load a -> x0      (read a from RAM into a CPU register)
x0 = x0 + 1       (compute in the CPU)
store x0 -> RAM   (write the result back to memory)
Three storage media are mentioned above, and their read/write speeds go hand in hand with their cost: the faster the medium, the more expensive it is per byte. To bridge the speed gap, we introduce a middle layer that is fast to access but whose cost is still acceptable. That middle layer is the Cache.
In a computer system, there are two caches by default:
The last-level cache (LLC) in the CPU, which caches data from main memory
The page cache in main memory, which caches data from disk
Cache read and write strategy
Now that Cache has been introduced, let's look at what happens when we operate on it. Because the access speeds of the cache and the underlying store differ greatly, delays or failures while manipulating data can leave the cache inconsistent with the actual storage layer.
Let's walk through the classic read/write strategies and their application scenarios with the standard Cache + DB setup.
Cache Aside
First, consider the simplest business scenario, such as a user table: userId (user id), phone (user phone number), avatar (user avatar URL). In the cache, we use phone as the key and store the user avatar. What should we do when the user modifies the avatar URL?
Update the DB data, then update the Cache data
Update the DB data, then delete the Cache data
First of all, changing the database and changing the cache are two independent operations, and we apply no concurrency control over them. When two threads update them concurrently, the different write orders can leave the data inconsistent.
So the better choice is option 2:
When updating the data, do not update the cache; delete the cached entry directly instead
A subsequent request finds the cache missing, goes back to query the DB, and loads the result into the cache
This is the most common caching strategy: Cache Aside. Under this policy, the data in the database is authoritative, and the cache is loaded on demand; it splits into a read strategy and a write strategy.
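To make the two paths concrete, here is a minimal Cache Aside sketch in Go, sticking with the avatar example. The Cache and Store interfaces, the ErrCacheMiss sentinel, and the method names are all hypothetical, invented for illustration rather than taken from any particular library:

```go
package cacheaside

import "errors"

var ErrCacheMiss = errors.New("cache miss")

// Cache and Store are hypothetical interfaces used only for this sketch.
type Cache interface {
	Get(key string) (string, error) // returns ErrCacheMiss when the key is absent
	Set(key, val string) error
	Del(key string) error
}

type Store interface {
	LoadAvatar(phone string) (string, error)
	SaveAvatar(phone, avatarURL string) error
}

// Read path: try the cache first, fall back to the DB, then load the result into the cache.
func GetAvatar(c Cache, s Store, phone string) (string, error) {
	if v, err := c.Get(phone); err == nil {
		return v, nil
	} else if !errors.Is(err, ErrCacheMiss) {
		return "", err
	}
	v, err := s.LoadAvatar(phone)
	if err != nil {
		return "", err
	}
	_ = c.Set(phone, v) // best-effort cache fill; a failure only costs a future miss
	return v, nil
}

// Write path: update the DB first, then delete the cache entry instead of updating it.
func UpdateAvatar(c Cache, s Store, phone, avatarURL string) error {
	if err := s.SaveAvatar(phone, avatarURL); err != nil {
		return err
	}
	return c.Del(phone)
}
```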
But there is an obvious problem: frequent writes cause the cached entry to be deleted and reloaded repeatedly, which lowers the cache hit ratio. If the business has monitoring and alerting on the hit rate, you can consider the following options:
Update the cache when you update the data, but take a distributed lock before updating the cache. That way only one thread operates on the cache at a time, which removes the concurrency problem, and subsequent reads see the latest cache, which removes the inconsistency.
Update the cache when you update the data, but give the cached entry a short TTL. (A combined sketch of these two options follows.)
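Here is a rough, single-process sketch that combines both mitigations: a mutex stands in for the distributed lock, and the cache entry gets a short TTL. TTLCache, UserStore, and the 30-second TTL are assumptions made up for this example:

```go
package cachettl

import (
	"sync"
	"time"
)

// TTLCache and UserStore are hypothetical interfaces for this sketch only.
type TTLCache interface {
	SetWithTTL(key, val string, ttl time.Duration) error
}

type UserStore interface {
	SaveAvatar(phone, avatarURL string) error
}

const shortTTL = 30 * time.Second // arbitrary example value

var cacheMu sync.Mutex // stands in for a distributed lock in this single-process sketch

// UpdateAvatarAndCache updates the DB, then writes the cache under a lock and with a
// short TTL: the lock keeps concurrent cache writers from interleaving, and the TTL
// bounds how long any remaining inconsistency can survive.
func UpdateAvatarAndCache(c TTLCache, s UserStore, phone, avatarURL string) error {
	if err := s.SaveAvatar(phone, avatarURL); err != nil {
		return err
	}
	cacheMu.Lock()
	defer cacheMu.Unlock()
	return c.SetWithTTL(phone, avatarURL, shortTTL)
}
```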
Of course, in addition to this strategy, there are several other classic caching strategies in the computer system, and they also have their own applicable usage scenarios.
Write Through
On a write, first check whether the key being written hits the cache. If it does, update the cache and have the cache component synchronously write the data through to the DB; if it does not, a Write Miss is triggered.
In general, there are two ways of Write Miss:
Write Allocate: allocate a cache line and write the value into the cache as well
No-write allocate: write directly to the DB and return, without writing the cache
Write Through generally adopts No-write allocate. Either way the data ends up persisted in the DB, so skipping the cache write saves a step and improves write performance; the data is then brought into the cache by Read Through on a later read.
The core of this strategy: the user only deals with the cache, and the cache component communicates with the DB to write or read the data. Some local, in-process cache components can consider this strategy.
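As an illustration only, a minimal Write Through sketch with No-write allocate might look like the following; the KV and DB interfaces are hypothetical, and a real cache component would hide this logic internally:

```go
package writethrough

import "errors"

var ErrCacheMiss = errors.New("cache miss")

// KV and DB are hypothetical interfaces used only for this sketch.
type KV interface {
	Get(key string) (string, error) // returns ErrCacheMiss when absent
	Set(key, val string) error
}

type DB interface {
	Write(key, val string) error
}

// Write implements Write Through with No-write allocate:
// if the key is already cached, update the cache and synchronously write the DB;
// on a write miss, write only the DB and let a later Read Through fill the cache.
func Write(c KV, db DB, key, val string) error {
	if _, err := c.Get(key); err == nil {
		if err := c.Set(key, val); err != nil {
			return err
		}
		return db.Write(key, val)
	} else if !errors.Is(err, ErrCacheMiss) {
		return err
	}
	// write miss: No-write allocate, go straight to the DB
	return db.Write(key, val)
}
```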
Write Back
You can probably see the shortcoming of the scheme above: when data is written, the cache and the database are updated synchronously, yet the speeds of the two storage media differ by several orders of magnitude, which badly hurts write performance. So can we update the database asynchronously?
Write Back updates only the corresponding cache line on a write and marks that line as Dirty. The dirty data is written to storage later, when the line is read again or when the replacement policy evicts it because the cache is full.
Note that on a Write Miss, Write Back takes the Write Allocate path: the written value also goes into the cache, so subsequent writes to the same key only need to update the cache.
This kind of asynchronous flushing exists throughout computer systems. Flushing dirty pages in MySQL is essentially the same idea: avoid random writes as much as possible and batch the disk writes at a unified point in time.
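For intuition, here is a toy write-back cache that marks entries dirty on write and flushes them only when they are evicted. The Flusher interface, the size limit, and the arbitrary eviction choice are all simplifications for this sketch, not a production replacement policy:

```go
package writeback

// Flusher persists evicted dirty entries; it is a hypothetical interface for this sketch.
type Flusher interface {
	Flush(key, val string) error
}

type entry struct {
	val   string
	dirty bool
}

// WriteBackCache is deliberately tiny: writes only touch memory and mark the entry
// dirty; the backing store is written only when a dirty entry is evicted.
type WriteBackCache struct {
	entries map[string]entry
	limit   int
	store   Flusher
}

func New(limit int, store Flusher) *WriteBackCache {
	return &WriteBackCache{entries: map[string]entry{}, limit: limit, store: store}
}

// Set takes the Write Allocate path on a miss: the value always lands in the cache,
// so subsequent writes to the same key only need to update the cache.
func (c *WriteBackCache) Set(key, val string) error {
	if _, ok := c.entries[key]; !ok && len(c.entries) >= c.limit {
		if err := c.evictOne(); err != nil {
			return err
		}
	}
	c.entries[key] = entry{val: val, dirty: true}
	return nil
}

// evictOne removes an arbitrary entry, flushing it to the store first if it is dirty.
func (c *WriteBackCache) evictOne() error {
	for k, e := range c.entries {
		if e.dirty {
			if err := c.store.Flush(k, e.val); err != nil {
				return err
			}
		}
		delete(c.entries, k)
		return nil
	}
	return nil
}
```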
Redis
Redis is a standalone piece of system software, and the business program we write is a separate one. Once we deploy a Redis instance, it just passively waits for clients to send requests and then processes them. So if an application wants to use Redis as a cache, we have to add the corresponding cache-operation code to the program ourselves. That is why Redis is also called a bypass (cache-aside) cache: reading the cache, reading the database, and updating the cache all have to be done in the application.
Redis, as a cache, also needs to face common problems:
The cache capacity is limited, after all
The impact of upstream concurrent requests
Consistency between the cache and the backend store
Replacement strategy
Generally speaking, a cache decides whether to simply drop an evicted entry or write it back to the database based on whether the entry is clean or dirty. In Redis, however, evicted data is deleted regardless of whether it is clean, so this is what we must pay special attention to when using Redis as a cache: when data becomes dirty, it also has to be modified in the database.
So no matter which replacement strategy is used, dirty data may be lost when entries are swapped in and out. Therefore, when we produce dirty data we should delete the cache rather than update it, and all data should be based on the database. Another way to put it: cache writes are left to read requests, while write requests focus on keeping the store consistent.
As for the replacement strategies, there have been many articles on the Internet to summarize the advantages and disadvantages, so I won't repeat them here.
SharedCalls
In concurrent scenarios, multiple threads (goroutines) may request the same resource at the same time. If every one of those requests goes through the full resource-fetching process, it is not only inefficient but also puts concurrent pressure on the resource service.
SharedCalls in go-zero lets multiple concurrent requests for the same key share a single actual call and its result, while the other requests simply reuse it. This design effectively reduces the concurrency pressure on the resource service and is a good way to prevent cache breakdown.
To keep a burst of interface requests from putting instantaneous high load on downstream services, you can wrap your query like this:
```go
fn := func() (interface{}, error) {
	// business query
}
data, err := g.Do(apiKey, fn)
// data is now available to the rest of the method or logic
```
In fact, the principle is also very simple:
```go
func (g *sharedGroup) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	// done == false: this is the first request for the key, run the business logic below
	// done == true: another in-flight request already obtained the data, return it directly
	c, done := g.createCall(key)
	if done {
		return c.val, c.err
	}

	// execute the business logic passed in by the caller
	g.makeCall(c, key, fn)
	return c.val, c.err
}

func (g *sharedGroup) createCall(key string) (c *call, done bool) {
	// only let one request in to operate at a time
	g.lock.Lock()
	// if the key already exists in the calls map, a request for it is in flight:
	// unlock, wait for that request to obtain the data, and return it
	if c, ok := g.calls[key]; ok {
		g.lock.Unlock()
		c.wg.Wait()
		return c, true
	}

	// this is the first request for the key, so mark it;
	// the lock is held, so there is no need to worry about concurrency
	c = new(call)
	c.wg.Add(1)
	g.calls[key] = c
	g.lock.Unlock()

	return c, false
}
```
This map + lock that records and deduplicates in-flight requests is similar to singleflight in groupcache, and it is a powerful tool for preventing cache breakdown.
> Source code address: sharedcalls.go
Cache and store update order
This is a question developers often go back and forth on: should we delete the cache first, or update the storage first?
> Case 1: delete the cache first, then update the storage
> - A deletes the cache, then hits network latency while updating the storage
> - B sends a read request, misses the cache, reads the storage, and gets the old data at this point
This creates two problems:
B reads the old value
B's concurrent read request writes the old value back into the cache, so subsequent read requests keep reading the old value
Since the cached value may be stale anyway, deleting it again does no harm. An inelegant workaround: after the write request updates the stored value, sleep() for a short period and then delete the cache once more.
The sleep is there to make sure the in-flight read request has finished, so the write request can delete the dirty cache entry that read may have left behind; it should also account for the latency of Redis master-slave synchronization. The exact delay still depends on the actual business.
Because the cache is deleted once up front and then deleted again after a delay, this scheme is called delayed double delete.
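A rough sketch of delayed double delete follows, assuming hypothetical Deleter and Updater interfaces; the second delete is scheduled asynchronously here instead of a literal blocking sleep, and the 500ms delay is a placeholder that must be tuned against real read latency and Redis replication lag:

```go
package doubledelete

import "time"

// Deleter and Updater are hypothetical interfaces used only in this sketch.
type Deleter interface {
	Del(key string) error
}

type Updater interface {
	UpdateAvatar(phone, avatarURL string) error
}

// UpdateWithDoubleDelete deletes the cache, updates the store, then deletes the
// cache again after a delay so that any stale value written back by a concurrent
// read is cleared. The 500ms delay is an arbitrary example value.
func UpdateWithDoubleDelete(c Deleter, s Updater, phone, avatarURL string) error {
	if err := c.Del(phone); err != nil {
		return err
	}
	if err := s.UpdateAvatar(phone, avatarURL); err != nil {
		return err
	}
	time.AfterFunc(500*time.Millisecond, func() {
		_ = c.Del(phone) // best-effort second delete; log the error in real code
	})
	return nil
}
```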
> Case 2: update the database value first, then delete the cache value
> - A updates the stored value, but hits network latency while deleting the cache
> - B sends a read request, hits the cache, and directly returns the old value
This situation has little impact on the business, and the vast majority of cache components adopt this update order to satisfy eventual-consistency requirements.
> Case 3: a new user registers and is written directly to the database, but the record is certainly not in the cache yet. If the program reads from a replica at this moment, master-slave replication lag means the user data cannot be read.
This case calls for special handling of operations like Insert: when inserting new data into the database, also write it to the cache. Subsequent read requests can then read the cache directly, and because the data was just inserted, it is unlikely to be modified for a while.
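A minimal sketch of that insert path, with hypothetical Setter and UserDB interfaces: the new row goes to the (master) database and to the cache in the same operation, so reads never have to touch a lagging replica:

```go
package insertcache

// Setter and UserDB are hypothetical interfaces used only for this sketch.
type Setter interface {
	Set(key, val string) error
}

type UserDB interface {
	InsertUser(phone, avatarURL string) error
}

// RegisterUser writes the new row to the (master) database and also writes it to the
// cache, so subsequent reads hit the cache instead of a possibly lagging replica.
func RegisterUser(c Setter, db UserDB, phone, avatarURL string) error {
	if err := db.InsertUser(phone, avatarURL); err != nil {
		return err
	}
	return c.Set(phone, avatarURL)
}
```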
All of the schemes above have potential problems to some degree in complex situations and need to be adapted to the business.
How to design a useful cache operation layer?
With all that said, back to our developer's point of view: it is obviously too troublesome to think through all of these issues every time. So how can these caching and replacement strategies be encapsulated to simplify development?
The following points are clear:
Separate the cache operations from the business logic, so developers only write the query logic
Make the cache operations handle traffic shocks, caching strategies, and similar concerns
Let's talk about how go-zero is encapsulated from both reading and writing perspectives.
QueryRow

```go
// res: query result
// cacheKey: redis key
err := m.QueryRow(&res, cacheKey, func(conn sqlx.SqlConn, v interface{}) error {
	querySQL := `select * from your_table where campus_id = ? and student_id = ?`
	return conn.QueryRow(v, querySQL, campusId, studentId)
})
```
The query business logic we write is encapsulated in func(conn sqlx.SqlConn, v interface{}) error. The caller does not need to care about writing the cache; they only pass in the cacheKey to be written, and the query result comes back in res.
So how is the cache operation encapsulated internally? Take a look inside the function:
```go
func (c cacheNode) QueryRow(v interface{}, key string, query func(conn sqlx.SqlConn, v interface{}) error) error {
	cacheVal := func(v interface{}) error {
		return c.SetCache(key, v)
	}
	// 1. cache hit -> return
	// 2. cache miss -> err
	if err := c.doGetCache(key, v); err != nil {
		// 2.1 err is the "not found" placeholder -> return the default not-found error
		if err == errPlaceholder {
			return c.errNotFound
		} else if err != c.errNotFound {
			return err
		}
		// 2.2 cache miss -> query db
		// 2.2.1 db returns NotFound -> cache the placeholder and return errNotFound (see 2.1)
		if err = query(c.db, v); err == c.errNotFound {
			if err = c.setCacheWithNotFound(key); err != nil {
				logx.Error(err)
			}
			return c.errNotFound
		} else if err != nil {
			c.stat.IncrementDbFails()
			return err
		}
		// 2.3 query db success -> set val to cache
		if err = cacheVal(v); err != nil {
			logx.Error(err)
			return err
		}
	}
	// 1.1 cache hit -> IncrementHit
	c.stat.IncrementHit()
	return nil
}
```
This flow corresponds exactly to Read Through from the caching strategies above.
> Source code address: cachedsql.go
Exec
On the write side, requests use the Cache Aside strategy from earlier: write to the database first, then delete the cache.
```go
_, err := m.Exec(func(conn sqlx.SqlConn) (result sql.Result, err error) {
	execSQL := fmt.Sprintf("update %s set %s where 1=1", m.table, AuthRows)
	return conn.Exec(execSQL, data.RangeId, data.AuthContentId)
}, keys...)

func (cc CachedConn) Exec(exec ExecFn, keys ...string) (sql.Result, error) {
	res, err := exec(cc.db)
	if err != nil {
		return nil, err
	}
	if err := cc.DelCache(keys...); err != nil {
		return nil, err
	}
	return res, nil
}
```
Like QueryRow, the caller is only responsible for the business logic; cache writes and deletions are completely transparent to the caller.
> Source code address: cachedsql.go
Online caching
As said at the very beginning, any discussion divorced from the actual business is pointless. Everything above analyzes caching patterns, but does the cache actually speed things up in the real business? The most intuitive measure is the cache hit rate, and observing a service's cache hits requires monitoring.
The following figure shows the cache records of a service in our online environment:
Recall QueryRow above: when a query hits the cache, c.stat.IncrementHit() is called. stat serves as the monitoring metric, continuously computing the hit rate and the failure rate.
> Source code address: cachestat.go
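The go-zero statistics live in cachestat.go linked above; as a simplified illustration of the idea (not the library's actual API), a hit/miss counter with periodic reporting could look like this:

```go
package cachestat

import (
	"log"
	"sync/atomic"
	"time"
)

// Stat is a minimal sketch of cache hit/miss counters; the field and method names
// are illustrative, not go-zero's real API.
type Stat struct {
	hits    uint64
	misses  uint64
	dbFails uint64
}

func (s *Stat) IncrementHit()     { atomic.AddUint64(&s.hits, 1) }
func (s *Stat) IncrementMiss()    { atomic.AddUint64(&s.misses, 1) }
func (s *Stat) IncrementDbFails() { atomic.AddUint64(&s.dbFails, 1) }

// Report periodically logs and resets the counters, so the hit ratio can be
// observed (and alerted on) per time window.
func (s *Stat) Report(interval time.Duration) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for range ticker.C {
			hits := atomic.SwapUint64(&s.hits, 0)
			misses := atomic.SwapUint64(&s.misses, 0)
			dbFails := atomic.SwapUint64(&s.dbFails, 0)
			total := hits + misses
			if total == 0 {
				continue
			}
			ratio := 100 * float64(hits) / float64(total)
			log.Printf("cache(requests: %d, hit_ratio: %.1f%%, db_fails: %d)", total, ratio, dbFails)
		}
	}()
}
```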
In other business scenarios, such as browsing the home page feed, a large volume of requests is inevitable, so caching the home page information matters a great deal for the user experience. But unlike the single keys mentioned earlier, a large number of messages may be involved, and other cache structures need to be added (a sketch follows the list below):
Split the cache: cache the list of message ids, then query each message by its id and insert it into the cache
Message expiration: set an expiration time on messages so they do not occupy the cache for too long
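A sketch of the split-cache idea for a feed: cache the id list and each message separately, each with its own TTL, so a change to one message only invalidates that message's entry rather than the whole page. MsgCache, MsgStore, and the 10-minute TTL are assumptions for illustration:

```go
package feedcache

import "time"

// MsgCache and MsgStore are hypothetical interfaces used only in this sketch.
type MsgCache interface {
	GetIDs(listKey string) ([]int64, bool)
	SetIDs(listKey string, ids []int64, ttl time.Duration)
	GetMsg(id int64) (string, bool)
	SetMsg(id int64, msg string, ttl time.Duration)
}

type MsgStore interface {
	LoadIDs(listKey string) ([]int64, error)
	LoadMsg(id int64) (string, error)
}

const msgTTL = 10 * time.Minute // example expiration so entries don't linger

// LoadFeed caches the id list and each message separately, filling gaps from the store.
func LoadFeed(c MsgCache, s MsgStore, listKey string) ([]string, error) {
	ids, ok := c.GetIDs(listKey)
	if !ok {
		var err error
		if ids, err = s.LoadIDs(listKey); err != nil {
			return nil, err
		}
		c.SetIDs(listKey, ids, msgTTL)
	}
	msgs := make([]string, 0, len(ids))
	for _, id := range ids {
		msg, ok := c.GetMsg(id)
		if !ok {
			var err error
			if msg, err = s.LoadMsg(id); err != nil {
				return nil, err
			}
			c.SetMsg(id, msg, msgTTL)
		}
		msgs = append(msgs, msg)
	}
	return msgs, nil
}
```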
Here are the best practices involving caching:
It is "particularly important" not to allow non-expired caches
Distributed cache, easy to scale
Automatic generation, with its own statistics
That is all for this analysis of caching principles and automatic cache management in microservices. I hope the content above is helpful and that you can learn something from it. If you found the article useful, feel free to share it so more people can see it.