Web architecture: an ultra-detailed analysis of the Varnish caching proxy server

2025-07-02 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Xiaosheng blog: http://xsboke.blog.51cto.com

-Thank you for your reference. If you have any questions, you are welcome to communicate.

Catalogue

Introduction to Varnish

Varnish configuration composition

-- VCL built-in preset variables

Functional statements and objects

-- Built-in subroutines in Varnish

-- The relationship between Varnish cache modes and subroutines

Installation of Varnish

Analysis of a Varnish example configuration

Start Varnish

Varnish VCL configuration parsing

A brief introduction to Varnish

1. Purpose

Varnish is a web application accelerator and an HTTP caching reverse proxy.

2. Characteristics

Varnish can cache data in memory or on disk

Supports the use of virtual memory

Has a precise time-management mechanism for cached objects

State engine architecture: the handling in each state is defined through a dedicated configuration language (VCL)

Manages cached data with a binary heap

3. Advantages of Varnish

Varnish is fast because it uses "Visual Page Cache" technology and reads cached data directly from memory.

Varnish supports more concurrent connections because it releases TCP connections faster than Squid.

Varnish can clear part of the cache in batches by regular expression through the management port.

4. Disadvantages of Varnish

Once the process crashes or is restarted, all cached data is released from memory.

When several Varnish servers sit behind a load balancer, requests for the same URL can land on different Varnish servers, so the requests pass through to the backend.

1) Solutions to these disadvantages

A. Add a Squid/nginx proxy behind Varnish to prevent a large number of requests from reaching the web server when the Varnish cache is cleared.

B. Hash on the URL at the load balancer, so that requests for a given URL are always routed to the same Varnish server.

5. Composition of Varnish

1) Management process

Manages the child process and compiles the VCL configuration

2) Child process

Creates the thread pools that process user requests

6. Varnish configuration composition

- Backend configuration: specifies the back-end servers

- ACL configuration: adds access control lists to Varnish for use in rules

- Probes configuration: implements health checks of the back-end servers

- Directors configuration: adds a cluster of backends to Varnish

- Core subroutines: add functions such as backend selection, caching, access control, and error handling
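
As a rough illustration of how the five parts listed above fit into one file, here is a minimal sketch in Varnish 4 syntax. Every name beginning with example_ and every address is a placeholder invented for this sketch, not part of the original configuration.

```vcl
vcl 4.0;

import directors;

# Probes: a health check shared by the backends below.
probe example_probe {
    .url = "/";
    .timeout = 2s;
}

# Backend: one back-end server (placeholder address).
backend example_web {
    .host = "192.0.2.10";
    .port = "80";
    .probe = example_probe;
}

# ACL: a list of addresses that rules can compare client.ip against.
acl example_trusted {
    "127.0.0.1";
}

# Directors: group the backends into a cluster.
sub vcl_init {
    new example_cluster = directors.round_robin();
    example_cluster.add_backend(example_web);
}

# Core subroutine: pick the backend and apply access control.
sub vcl_recv {
    set req.backend_hint = example_cluster.backend();
    if (req.url ~ "^/admin") {
        if (!client.ip ~ example_trusted) {
            return (synth(403, "Forbidden"));
        }
    }
}
```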

II. Introduction to Varnish configuration

1. VCL built-in preset variables

These variables are generally used to set the values of objects at each phase of request handling.

The preset variables are fixed by the system and are created once the request enters the corresponding VCL subroutine. Subroutines can easily read these variables or set them to custom values.

The general format is: phase.object operator value

1) Phase

req: used when processing requests sent by the client

bereq: used when processing requests sent by Varnish to the back-end server

beresp: used when processing responses from the back-end server, before Varnish caches them

resp: used when processing responses returned to the client

obj: used when working with objects stored in memory

2) Object

3) Operator
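
The phase.object operator value format can be sketched as follows, assuming Varnish 4 syntax. The backend address, the file extensions, and the X-Cache header name are illustrative assumptions only.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # phase req, object method, operator ==, value "GET"
    if (req.method == "GET" && req.url ~ "\.(png|jpg)$") {
        # phase req, object http.Cookie: remove the header from the client request
        unset req.http.Cookie;
    }
}

sub vcl_backend_response {
    # phase beresp, object ttl, operator =, value 1h
    set beresp.ttl = 1h;
}

sub vcl_deliver {
    # phase obj is read here; phase resp is written
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```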

2. Functional statements and objects

Functional statements are generally applied to objects; that is, they define which operation is performed on an object.

The format is: function(object)

1) Functional statements

ban(): invalidates the cached objects that match the given expression

call: calls a subroutine

hash_data(): adds data to the hash key; can only be used in the vcl_hash subroutine

new: creates a VCL object; can only be used in the vcl_init subroutine

return(): ends the current subroutine and selects the next action

rollback(): restores the HTTP headers to their original state; it has been replaced by std.rollback()

synthetic(): builds custom response content; can only be used in the vcl_synth and vcl_backend_error subroutines

regsub(string to process, regular expression, replacement): replaces the first match of the regular expression

regsuball(string to process, regular expression, replacement): replaces all matches of the regular expression
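
A few of these statements in use, as a minimal sketch in Varnish 4 syntax. The subroutine example_normalize and the header X-Example-Variant are hypothetical names chosen for the illustration.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# A custom subroutine (hypothetical name), invoked with call below.
sub example_normalize {
    # regsub(): replace only the first match; here, drop a leading "www."
    set req.http.Host = regsub(req.http.Host, "^www\.", "");

    # regsuball(): replace every match; here, collapse repeated slashes
    set req.url = regsuball(req.url, "//+", "/");
}

sub vcl_recv {
    call example_normalize;
}

sub vcl_hash {
    # hash_data(): add one more input to the hash key.
    # There is no return here, so the built-in vcl_hash still runs
    # afterwards and adds the URL and the host.
    hash_data(req.http.X-Example-Variant);
}
```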

2) Common actions of return

Syntax: return(action)

abandon: abandons processing and generates an error

deliver: delivers the response

fetch: fetches the response object from the backend

hash: looks the request up in the cache by its hash key

lookup: looks the object up in the cache and returns it; if it is not found, the data is fetched from the back-end server

ok: continues execution

pass: bypasses the cache and fetches the data directly from the back-end server

pipe: establishes a direct connection between the client and the back-end server, and the data comes straight from the back-end server

purge: clears the cached object and builds a response

restart: starts processing the request over

retry: retries the backend request

synth(status code, reason): builds a synthetic response and returns the status information to the client
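
The sketch below shows several of these actions in context. It is modeled loosely on the decisions the built-in vcl_recv makes in Varnish 4; the retry limit of 2 is an arbitrary value picked for the example.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.method == "CONNECT") {
        # Hand the connection over to the backend untouched.
        return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
        # Only GET and HEAD are looked up in the cache.
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        # Personalised requests bypass the cache.
        return (pass);
    }
    # Everything else goes through vcl_hash and a cache lookup.
    return (hash);
}

sub vcl_backend_response {
    if (beresp.status == 503 && bereq.retries < 2) {
        # Retry the backend request a limited number of times.
        return (retry);
    }
    return (deliver);
}
```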

3. Built-in subroutines in Varnish

A subroutine is also called a state engine. Each state engine has its own set of return actions, return(action), and each action hands the request over to a different next state engine.

A request can be divided into several phases, each handled by a different state engine, so every phase of a request can be controlled by writing the corresponding state engine.

Each built-in subroutine is defined with the keyword sub.

1) vcl_recv subroutine

2) vcl_pipe subroutine

3) vcl_pass subroutine

4) vcl_hit subroutine

5) vcl_miss subroutine

6) vcl_hash subroutine

7) vcl_purge subroutine

8) vcl_deliver subroutine

9) vcl_backend_fetch subroutine

10) vcl_backend_response subroutine

11) vcl_backend_error subroutine

12) vcl_synth subroutine

13) vcl_init subroutine

14) vcl_fini subroutine
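
To show how subroutines are declared with sub and how a return action moves the request to the next subroutine, here is a minimal sketch in Varnish 4 syntax. The /ping URL and the response text are made up for the example.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.url == "/ping") {
        # return (synth(...)) hands the request to vcl_synth.
        return (synth(200, "pong"));
    }
}

sub vcl_synth {
    # Build the response body here instead of contacting the backend.
    set resp.http.Content-Type = "text/plain; charset=utf-8";
    synthetic(resp.reason);
    # return (deliver) sends the synthetic response to the client.
    return (deliver);
}
```

Requests for any other URL fall through to the built-in vcl_recv, because this vcl_recv does not return for them.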

4. The relationship between Varnish cache modes and subroutines

The Varnish configuration file is made up of subroutines, and at run time Varnish carries out the corresponding operations according to how those subroutines are configured.

The relationship between the subroutines is as follows:

1) The two figures in the upper right corner represent the execution of the vcl_init subroutine when the VCL is loaded and of the vcl_fini subroutine when the VCL is unloaded.

2) When vcl_recv returns hash

In this state, the vcl_hash subroutine generates a hash key from the requested URL or other information.

Varnish then looks for cached data with the same hash key. If it is found, processing enters the vcl_hit state; otherwise it enters the vcl_miss state.

3) When vcl_recv returns pass

When vcl_recv returns pass, the current request is forwarded directly to the back-end server without a cache lookup. Subsequent requests are still processed by Varnish.

Varnish usually caches only static content, so only GET and HEAD requests are suitable for a cache lookup. A POST request normally sends data that the server has to receive, process, and answer; the response is dynamic and is not cached, which is why such requests bypass the cache lookup.

4) When vcl_recv returns pipe

When vcl_recv returns pipe, Varnish establishes a direct connection between the client and the server. All further requests from the client on that connection are sent straight to the server, and Varnish no longer inspects them until the connection is closed.

Pipe mode is typically used for POST requests, or when the client requests a video or another large file such as a .zip or .tar archive. These large files are not cached in Varnish.

5) When vcl_recv returns purge

Purge mode is used to clear the cache.
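
Purge mode is commonly tied to a dedicated request method and restricted by address. The following is a minimal sketch under that assumption, in Varnish 4 syntax; the ACL name example_purgers and its entries are placeholders.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Clients that are allowed to clear the cache (placeholder entries).
acl example_purgers {
    "localhost";
    "192.168.0.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ example_purgers) {
            return (synth(405, "Not allowed"));
        }
        # Remove the object that matches this request's hash key.
        return (purge);
    }
}
```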

5. Grace mode

1) Request coalescing

When several clients request the same page, Varnish sends only one request to the back-end server, suspends the other requests, and waits for the result to be returned.

2) Problem

If thousands or more of these requests arrive at the same time, the waiting queue becomes very large, which leads to two potential problems. The first is the thundering herd problem: a large number of threads are released at once to copy the result returned by the back end, causing a sharp increase in load. The second is that no user likes to wait.

3) Solution

Configure Varnish to keep a cached object for a period of time after it has expired, so that the stale content can be returned to the waiting requests.

Case study (Varnish 3.x syntax):

sub vcl_recv {
    if (!req.backend.healthy) {
        # The backend is unhealthy: serve content up to 5 minutes past its expiry
        set req.grace = 5m;
    } else {
        # The backend is healthy: serve content up to 15 seconds past its expiry
        set req.grace = 15s;
    }
}

sub vcl_fetch {
    # Keep expired cache objects for another 30 minutes
    set beresp.grace = 30m;
}
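
The case study uses variables and a subroutine (req.backend.healthy, vcl_fetch) from Varnish 3. For comparison, a roughly equivalent sketch in Varnish 4.0 syntax follows, based on the pattern of checking obj.ttl in vcl_hit. It assumes the std module is available; the actions allowed in vcl_hit changed in later releases, so check the documentation for the version in use.

```vcl
vcl 4.0;

import std;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # The object is still fresh: deliver it.
        return (deliver);
    }
    if (std.healthy(req.backend_hint)) {
        # Healthy backend: accept content up to 15 seconds past expiry.
        if (obj.ttl + 15s > 0s) {
            return (deliver);
        }
    } else {
        # Unhealthy backend: accept content up to 5 minutes past expiry.
        if (obj.ttl + 5m > 0s) {
            return (deliver);
        }
    }
    # Too old: fetch a new copy from the backend.
    return (fetch);
}

sub vcl_backend_response {
    # Keep expired objects for another 30 minutes.
    set beresp.grace = 30m;
}
```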

III. Installation of Varnish

1. Download the Varnish package

There are two places to download it:

1) The latest version can be downloaded from the official Varnish website, http://varnish-cache.org.

However, the official Varnish website is sometimes blocked.

2) Git: git clone https://github.com/varnish/Varnish-Cache /var/tmp/

In this case you need to run ./autogen.sh to generate the configure script before installing.

2. Installation of Varnish

Install the dependency packages first

Configure Varnish

Compile and install

Copy the VCL file

The official VCL configuration file contains very little configuration guidance, so in a production environment you still need to write the configuration yourself.

IV. Analysis of a Varnish VCL example configuration

Topology

V. Start Varnish

VI. Varnish VCL configuration parsing

Varnish has its own configuration language, VCL. When Varnish starts, it compiles the configuration file into C code and then executes it.

1. Back-end server address pool configuration and back-end server health check

1) backend server definition, which is used for varnish to connect to the specified backend server

Example: define the probe inside the backend definition

backend web {
    .host = "192.168.31.83";    # IP address of the back-end server
    .port = "80";               # port exposed by the back-end server
    .probe = {                  # inline probe; .probe is a backend attribute
        .url = "/";
        .timeout = 2s;
    }
}

Or: define the probe first, and then define the back-end server

# The probe must be defined before the backend that uses it;
# otherwise the backend cannot find it.
probe web_probe {
    .url = "/";
    .timeout = 2s;
}

backend web {
    .host = "192.168.31.83";
    .port = "80";
    .probe = web_probe;         # reference the shared probe defined above
}

2) Probe (health check) definition

Example: create a health check named backend_healthcheck

probe backend_healthcheck {
    .url = "/";         # URL requested by the health check
    .timeout = 1s;      # request timeout
    .interval = 5s;     # poll every 5 seconds
    .window = 5;        # look at the last 5 polls
    .threshold = 3;     # at least 3 of them must succeed for the node to count as healthy
}

3) Load-balancing cluster (directors)

A load-balancing cluster needs the directors module, which is loaded with import directors.

Algorithms supported by directors load balancing:

The random and hash directors require a weight for each backend; a higher weight raises the probability that the backend is selected.

Example:

# Load the directors module
import directors;

# Configure the back-end servers
backend web1 {
    .host = "192.168.0.10";
    .port = "80";
    .probe = backend_healthcheck;
}

backend web2 {
    .host = "192.168.0.11";
    .port = "80";
    .probe = backend_healthcheck;
}

# Initialization
sub vcl_init {
    # vcl_init runs when the VCL is loaded; create the back-end host group (director) here.
    # Use the new keyword to create a director object with the round_robin algorithm.
    new web_cluster = directors.round_robin();
    # Add the back-end server nodes
    web_cluster.add_backend(web1);
    web_cluster.add_backend(web2);
}

# Start processing the request
sub vcl_recv {
    # vcl_recv receives and processes requests.
    # Select the backend
    set req.backend_hint = web_cluster.backend();
}

Description:

The set command sets a variable

The unset command deletes a variable

web_cluster.add_backend(backend, real): adds a back-end server node. backend is the name of the backend definition and real is the weight. With the random director, the probability of a backend being selected is 100 * (its weight / total weight) percent.

req.backend_hint is a predefined Varnish variable that specifies the back-end node for the request.

VCL objects are created with the new keyword. All objects that can be created are provided by modules, which must be imported before use, and every new operation can only be performed in the vcl_init subroutine.
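
To make the weight formula concrete, here is a small sketch with two hypothetical backends (example_a and example_b, placeholder addresses) behind a random director. The percentages in the comments are the formula applied by hand, not measured output.

```vcl
vcl 4.0;

import directors;

backend example_a {
    .host = "192.0.2.21";
    .port = "80";
}

backend example_b {
    .host = "192.0.2.22";
    .port = "80";
}

sub vcl_init {
    new example_pool = directors.random();
    # Total weight = 1.0 + 3.0 = 4.0
    example_pool.add_backend(example_a, 1.0);   # 100 * (1 / 4) = 25%
    example_pool.add_backend(example_b, 3.0);   # 100 * (3 / 4) = 75%
}

sub vcl_recv {
    set req.backend_hint = example_pool.backend();
}
```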

2. Access control list (ACL)

An ACL creates a list of addresses that later rules can test against.

If the list contains a host name that cannot be resolved, that entry matches any address.

To exclude an IP address from the match, put ! in front of it.
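
A minimal sketch of an ACL with an excluded address, in Varnish 4 syntax. The list name example_internal, the addresses, and the /admin path are assumptions made for the illustration.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl example_internal {
    "localhost";            # a host name
    "192.168.0.0"/24;       # a whole subnet
    ! "192.168.0.23";       # except this one address
}

sub vcl_recv {
    if (req.url ~ "^/admin") {
        if (!client.ip ~ example_internal) {
            # Clients outside the list may not reach /admin.
            return (synth(403, "Forbidden"));
        }
    }
}
```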

3. Cache rule settings

Description:

X-Forwarded-For is an HTTP request header field used to identify the original IP address of a client that connects to a web server through an HTTP proxy or load balancer.

If you want the back-end server to record the real IP address of the client, setting the header in Varnish alone is not enough. You also need to modify the configuration of the back-end web server (in this case, Apache is used as the back-end web server):

Modify the variable in the box so that it refers to the variable set in Varnish.
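
On the Varnish side, the classic way to populate X-Forwarded-For looks like the sketch below. Treat it as an illustration rather than the author's exact rule: from Varnish 4.0 onward the client address is already appended to this header before vcl_recv runs, so the block is mainly relevant to older setups.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Only touch the header on the first pass, not after a restart.
    if (req.restarts == 0) {
        if (req.http.X-Forwarded-For) {
            # Another proxy already set the header: append this client.
            set req.http.X-Forwarded-For =
                req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
}
```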

VII. Varnish sends different URLs to different back-end servers

import directors;    # load the directors module

# Define back-end servers web1 and web2
backend web1 {
    .host = "192.168.0.10";
    .port = "80";
    .probe = backend_healthcheck;
}

backend web2 {
    .host = "192.168.0.11";
    .port = "80";
    .probe = backend_healthcheck;
}

# Define back-end servers img1 and img2
backend img1 {
    .host = "img1.lnmmp.com";
    .port = "80";
    .probe = backend_healthcheck;
}

backend img2 {
    .host = "img2.lnmmp.com";
    .port = "80";
    .probe = backend_healthcheck;
}

# Initialization
sub vcl_init {
    # vcl_init runs when the VCL is loaded; create the back-end host groups (directors) here.
    # Use the new keyword to create a director object with the round_robin algorithm.
    new web_cluster = directors.round_robin();
    # Add the back-end server nodes
    web_cluster.add_backend(web1);
    web_cluster.add_backend(web2);

    # Create a second cluster
    new img_cluster = directors.random();
    # Add the back-end server nodes and set their weights
    img_cluster.add_backend(img1, 2);
    img_cluster.add_backend(img2, 5);
}

# Distribute requests to different back-end host groups according to the requested domain name
sub vcl_recv {
    # If the Host header is www.benet.com or benet.com
    if (req.http.host ~ "(?i)^(www\.)?benet\.com$") {
        set req.http.host = "www.benet.com";
        set req.backend_hint = web_cluster.backend();    # select the backend
    } elsif (req.http.host ~ "(?i)^images\.benet\.com$") {
        set req.backend_hint = img_cluster.backend();
    }
}

Explanation: the i in (?i) means ignore case. (?i) turns case-insensitive matching on, while (?-i) turns it off.
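
The same (?i) flag also works when routing by URL path instead of by host name. The sketch below sends image requests to a separate backend; the backend names, addresses, and the list of extensions are assumptions chosen for the example.

```vcl
vcl 4.0;

backend example_web {
    .host = "192.0.2.31";
    .port = "80";
}

backend example_img {
    .host = "192.0.2.32";
    .port = "80";
}

sub vcl_recv {
    # (?i) makes the match case-insensitive, so /logo.PNG and /logo.png
    # are both routed to the image backend; an optional query string is allowed.
    if (req.url ~ "(?i)\.(jpg|jpeg|png|gif)(\?.*)?$") {
        set req.backend_hint = example_img;
    } else {
        set req.backend_hint = example_web;
    }
}
```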
