2025-01-21 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Blogger QQ:819594300
Blog address: http://zpf666.blog.51cto.com/
If you have any questions, feel free to contact the blogger, who will help you answer them. Thank you for your support!
1. Varnish principle:
1) introduction to Varnish:
Varnish Cache is a web application accelerator, also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents. Compared with the traditional squid, varnish offers higher performance, faster speed, and more convenient management, so some enterprises already use it in production as a replacement for older squid versions, getting a better caching effect at the same server cost. Varnish is also one of the optional services for a CDN caching server.
According to the official website, the main features of Varnish are as follows: https://www.varnish-cache.org/
1. Cache location: memory or disk can be used. If disk is used, SSD in RAID1 is recommended.
2. Log storage: logs are also stored in memory. Storage strategy: fixed size, recycled.
3. Supports the use of virtual memory.
4. Has a precise time management mechanism, that is, control over a cache's time attributes.
5. State engine architecture: different caches and proxy data are processed on different engines. Through a specific configuration language, control statements can be designed to decide where and how data is cached, and passing packets can be processed by specific rules at specific points.
6. Cache management: cache data is managed in a binary heap so that expired data can be cleaned up promptly.
2) comparison between Varnish and Squid
Similarities:
Both are reverse proxy servers.
Both are open source software.
Advantages of Varnish:
1. Varnish is very stable; under the same load, a Squid server fails more often than Varnish, because Squid requires frequent restarts.
2. Varnish access is faster: thanks to its "Visual Page Cache" technology, all cached data is read directly from memory, while squid reads from disk.
3. Varnish can support more concurrent connections, because Varnish releases TCP connections faster than Squid, so under high concurrency it can sustain more TCP connections.
4. Varnish can batch-purge parts of the cache through the management port using regular expressions; Squid cannot.
5. Squid is a single process using a single CPU core, while Varnish forks multiple processes to handle requests, so all cores can be used for the corresponding requests.
Varnish's disadvantages:
1. Once the varnish process crashes or is restarted, all cached data is released from memory and every request goes to the back-end server; under high concurrency this puts great pressure on the back end.
2. When requests for a single url are load-balanced through HA/F5, each request may land on a different varnish server, causing requests to penetrate to the back end; the same request being cached on several servers also wastes varnish cache resources and degrades performance.
Solutions to these disadvantages:
For disadvantage 1: under heavy traffic, it is recommended to keep varnish's in-memory cache in front and place several squid/nginx servers behind it. The main purpose is that when a front varnish service or server is restarted, the large number of requests penetrating varnish hit squid/nginx acting as a second cache layer, which also compensates for the in-memory varnish cache being released on restart.
For disadvantage 2: do url hashing on the load balancer, so that requests for a single url are always routed to one varnish server.
3) the principle of using varnish as web proxy cache:
Varnish is a cache of http reverse proxies. It receives the request from the client and then attempts to get the data from the cache to respond to the request from the client. If varnish cannot get the data from the cache to respond to the client, it will forward the request to the back end (backendservers), get the response and store it, and finally deliver it to the client.
If varnish has cached a response, it is much faster than your traditional back-end server, so you need to get as many requests as possible directly from varnish's cache.
Varnish decides whether to cache the content or fetch the response from the back-end server. The back-end server can influence what varnish caches through the Cache-Control header in the http response. Under certain conditions varnish will not cache content, the most common being the use of cookies: when a client web request carries a cookie, varnish does not cache it by default. Most of this behavior can be changed by writing VCL.
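As a sketch of that last point, the default cookie behavior can be overridden in VCL, for example by stripping cookies from requests for static files so they become cacheable (the extension list here is illustrative):

```
# VCL 4.0 sketch (illustrative): strip cookies for static assets so
# varnish will cache them despite the default "don't cache with cookie" rule
sub vcl_recv {
    if (req.url ~ "\.(css|js|png|jpg|gif)$") {
        unset req.http.Cookie;
    }
}
```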
5) simple architecture:
Varnish is divided into management process and child process.
Management process: manages the child processes, compiles the VCL configuration, and applies it to different state engines.
Child process: generates a thread pool that processes user requests and returns user results through hash lookups.
6) the main configuration part of varnish:
Varnish configuration is mainly divided into: back-end configuration, ACL configuration, probes configuration, directors configuration, and core subroutine configuration. The back-end configuration is required; directors configuration and core subroutine configuration come into play when there are multiple servers.
Backend configuration: adds a reverse-proxied server node to varnish; at least one must be configured.
ACL configuration: that is, add access control lists to the varnish, which can be specified or prohibited.
Probes configuration: add a rule to varnish to detect whether the backend server is normal, so that it is convenient to switch or disable the corresponding backend server.
Directors configuration: add load balancing mode to varnish to manage multiple backend servers.
Core subroutine configuration: add back-end server switching, request caching, access control, error handling and other rules to varnish.
7) built-in default variables in VCL: variables (also known as object):
Req: The request object; variables available when the request arrives (the request object sent by the client)
Bereq: The backend request object; variables available when requesting from the backend host
Beresp: The backend response object; variables available when getting content from the backend host (backend response object)
Resp: The HTTP response object; variables available when responding to the client (the response object returned to the client)
Obj: variables related to the object's properties while stored in memory (the cached object, i.e. the cached backend response content)
The preset variables are fixed by the system and are generated after the request is entered into the corresponding VCL subroutine. These variables can be easily extracted by the subroutine, and of course, some global variables can be customized.
Current time:
Now: function: returns the current timestamp.
Client: (client basic information)
Client.ip: returns the client IP address.
Note: the original client.port has been deprecated. To obtain the client request port number use std.port (client.ip), which requires import std; first.
Client.identity: used to load the client identification code.
Server: (server basic information)
Note: the original server.port has been deprecated. To obtain the server port number use std.port (server.ip), which requires import std; first.
Server.hostname: server hostname.
Server.identity: server identity.
Server.ip: returns the server-side IP address.
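A minimal sketch of the std.port() usage mentioned in the notes above (the X-Client-Port header name is made up for illustration):

```
vcl 4.0;
import std;    # required before std.port() can be used

sub vcl_recv {
    # record the client's source port in a (hypothetical) request header
    set req.http.X-Client-Port = std.port(client.ip);
}
```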
Req: (request object sent by the client)
Req: the entire HTTP request data structure
Req.backend_hint: specify the request backend node, and bereq.backend can obtain the configuration data of the backend node only after setting it.
Req.can_gzip: whether the client accepts GZIP transport encoding.
Req.hash_always_miss: whether to force not to hit the cache. If set to true, the cache will not be hit and new data will always be fetched from the backend.
Req.hash_ignore_busy: ignore busy objects in the cache and avoid deadlocks when there are multiple caches.
Req.http: the header corresponding to the request HTTP.
Req.method: request type (such as GET, POST).
Req.proto: the version of HTTP protocol used by the client.
Req.restarts: the number of restarts. The default maximum value is 4
Req.ttl: the cache has time remaining.
Req.url: the requested URL.
Req.xid: unique ID.
Bereq: (request object sent to the backend, based on req object)
Bereq: the entire backend post-request data structure.
Bereq.backend: the requested backend node configuration.
Bereq.between_bytes_timeout: the wait time (in seconds) between each byte received from the backend.
Bereq.connect_timeout: connection backend wait time (seconds), maximum wait time.
Bereq.first_byte_timeout: wait for the first byte of the backend (in seconds), the maximum waiting time.
Bereq.http: corresponds to the header information sent to the backend HTTP.
Bereq.method: the type of request sent to the backend (e.g. GET, POST).
Bereq.proto: the HTTP version of the request sent to the backend.
Bereq.retries: the same request retry count.
Bereq.uncacheable: no cache for this request.
Bereq.url: the URL sent to the backend request.
Bereq.xid: request a unique ID.
Beresp: (backend response request object)
Beresp: the entire backend responds to the HTTP data structure.
Beresp.backend.ip: the IP of the backend response.
Beresp.backend.name: the name of the configuration node in response to the backend.
Beresp.do_gunzip: default is false; uncompress (gunzip) the object before caching.
Beresp.do_gzip: default is false; compress (gzip) the object before caching.
Beresp.grace: sets an extra grace period after the current object's cache expires, used for special requests to extend the cache time. When concurrency is high, do not set it too large or stale content may accumulate; generally about 1m is enough. This value has no effect when beresp.ttl = 0s.
Beresp.http: the corresponding HTTP request header
Beresp.keep: object cache with hold time
Beresp.proto: the HTTP version of the response
Beresp.reason: HTTP status information returned by the server
Beresp.status: the status code returned by the server
Beresp.storage_hint: specifies the specific storage to save
Beresp.ttl: the remaining time of the object cache, specifying the remaining time of the unified cache.
Beresp.uncacheable: inherits bereq.uncacheable and does not cache
OBJ: (cache object, cache backend response request content)
Obj.grace: extra grace time for this object
Obj.hits: the number of cache hits. The counter starts at 1 when the object is first served from cache, so it can be used to determine whether a response came from cache: if the value is greater than 0, it was a cache hit.
Obj.http: the header of the corresponding HTTP
Obj.proto:HTTP version
Obj.reason: HTTP status information returned by the server
Obj.status: the status code returned by the server
Obj.ttl: time remaining for this object cache (seconds)
Obj.uncacheable: objects are not cached
Resp: (response object returned to the client)
Resp: the entire response HTTP data structure.
Resp.http: the header of the corresponding HTTP.
Resp.proto: edit the HTTP protocol version of the response.
Resp.reason: the HTTP status information to be returned.
Resp.status: the HTTP status code to be returned.
Storage:
Storage.<name>.free_space: storage free space (in bytes).
Storage.<name>.used_space: storage used space (in bytes).
Storage.<name>.happy: storage health status.
8. Specific functional statements
Ban (expression): clears the specified object cache
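A hedged sketch of ban() in VCL (the custom BAN request method is illustrative; the same thing can be done from the management port, and in production the method should be restricted to trusted clients, e.g. via an ACL):

```
# VCL 4.0 sketch: clear cached objects whose URL starts with the requested URL
sub vcl_recv {
    if (req.method == "BAN") {
        # add a ban expression matching all URLs under this prefix
        ban("req.url ~ ^" + req.url);
        return (synth(200, "Ban added"));
    }
}
```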
Call (subroutine): call subroutines, such as call (name)
Hash_data (input): generates the hash key; it defines the structure from which the cache key is built and can only be used in the vcl_hash subroutine. When hash_data (input) is called, the hash becomes the cache key of the current page, and no further action is needed. For example:
sub vcl_hash {
    hash_data(client.ip);
    return (lookup);
}
Note: return (lookup); is the default return value, so it does not have to be written.
New (): creates a vcl object; can only be used in the vcl_init subroutine.
Return (): ends the current subroutine and specifies to continue with the next action, such as return (ok); each subroutine can specify a different action.
Rollback (): restores the HTTP header to its original state; it has been deprecated, use std.rollback () instead.
Synthetic (STRING): a synthesizer used to define custom response content. For example, when a request fails, it can return custom 404 content instead of only the default header information. It can only be used in the vcl_synth and vcl_backend_error subroutines, for example:
sub vcl_synth {
    // custom content
    synthetic({"
        Error
        This is just a test of custom response exception content
    "});
    // deliver only the custom content
    return (deliver);
}
Regsub (str, regex, sub): replaces the first match of the regular expression in the string; the first parameter is the string to process, the second is the regular expression, and the third is the replacement string.
Regsuball (str, regex, sub): replaces all matches of the regular expression. The parameters are the same as regsub.
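For example, regsub() is commonly used to normalize the Host header so that two hostnames share one cache entry (the www-stripping here is illustrative):

```
sub vcl_recv {
    # strip a leading "www." so www.example.com and example.com hash to one object
    set req.http.host = regsub(req.http.host, "^www\.", "");
}
```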
For more information on specific variables, please see:
https://www.varnish-cache.org/docs/4.0/reference/vcl.html#reference-vcl
(9) return statement:
The return statement terminates the subroutine and returns the action, and all actions are selected according to different VCL subroutine restrictions.
https://www.varnish-cache.org/docs/4.0/users-guide/vcl-built-in-subs.html
Syntax: return (action)
Common actions:
Abandon discards processing and generates an error.
Deliver delivery processing
Fetch fetches the response object from the backend
Hash hash cache processing
Lookup lookup cache object
Ok continues to execute
Pass enters pass non-cache mode
Pipe enters pipe non-cache mode
Purge clears the cache object and builds the response
Restart starts over
Retry retry backend processing
Synth (statuscode, reason) synthesizes and returns status information to the client
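As a sketch of how a return action is written in practice (the /admin path is illustrative):

```
sub vcl_recv {
    if (req.url ~ "^/admin") {
        return (pass);   // never cache the admin area
    }
    return (hash);       // default: look the request up in the cache
}
```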
10) the built-in subprograms in varnish include:
Note: all varnish built-in subroutines have their own defined return action return (action); different actions will call the corresponding next subroutine.
vcl_recv subroutine:
Begins processing the request. Through return (action); it selects the varnish processing mode; by default it enters hash cache mode (that is, return (hash);). The cache time is the configuration item default_ttl (default 120s) and the expiration retention time is default_grace (default 10 seconds). This subroutine is generally used for mode selection, request object caching and information modification, back-end node changes, terminating requests, and similar operations.
Operable objects: (some or all values)
Read: client,server,req,storage
Write: client,req
Return value:
Synth (statuscode,reason); defines the response content.
Pass enters pass mode and enters the vcl_pass subroutine.
Pipe enters pipe mode and enters the vcl_pipe subroutine.
Hash enters hash cache mode and enters the vcl_hash subroutine; this is the default return value.
Purge clears the cache and related data; the subroutine flow goes through vcl_hash and then to vcl_purge.
vcl_pipe subroutine:
Pipe mode processing, mainly used to fetch the back-end response content directly and return it to the client; the response content returned to the client can be defined here. This subroutine is generally used for back-end information that must be timely and unprocessed: the back-end response is taken and delivered directly to the client without entering the vcl_deliver subroutine.
Operable objects: (some or all values)
Read: client,server,bereq,req,storage
Write: client,bereq,req
Return value:
Synth (statuscode,reason); defines the response content.
Pipe continues pipe mode and enters the back-end vcl_backend_fetch subroutine; this is the default return value.
vcl_pass subroutine:
Pass mode processing, similar to hash cache mode except that nothing is cached.
Operable objects: (some or all values)
Read: client,server,req,storage
Write: client,req
Return value:
Synth (statuscode,reason); defines the response content.
Fetch continues pass mode and enters the back-end vcl_backend_fetch subroutine; this is the default return value.
vcl_hit subroutine:
Called in hash cache mode when there is a cache hit, for cache processing; the cache can be discarded or modified here.
Operable objects: (some or all values)
Read: client,server,obj,req,storage
Write: client,req
Return value:
Restart restart request.
Deliver delivers the cached content, processed by the vcl_deliver subroutine; this is the default return value.
Synth (statuscode,reason); defines the response content.
vcl_miss subroutine:
Called in hash cache mode when there is no cache hit. Used to decide whether to go to the back end to fetch the response content; the mode can be changed to pass here.
Operable objects: (some or all values)
Read: client,server,req,storage
Write: client,req
Return value:
Restart restart request.
Synth (statuscode,reason); defines the response content.
Pass switches to pass mode and enters the vcl_ pass subroutine.
Fetch fetches the back-end content normally and then caches it, entering the vcl_backend_fetch subroutine; this is the default return value.
vcl_hash subroutine:
Hash cache mode: generates the hash value used as the cache lookup key for extracting cached content. Mainly used for cache key processing; hash_data (string) can specify the key's composition, so different cache keys can be generated for the same page based on IP or cookie.
Operable objects: (some or all values)
Read: client,server,req,storage
Write: client,req
Return value:
Lookup looks up the cache object: on a hit it enters the vcl_hit subroutine, on a miss it enters the vcl_miss subroutine, and in purge mode it enters the vcl_purge subroutine. This is the default return value.
vcl_purge subroutine:
Purge mode; called once the corresponding cache has been found and cleared. Used by the purge request method to clear caches and report the result.
Operable objects: (some or all values)
Read: client,server,req,storage
Write: client,req
Return value:
Synth (statuscode,reason); defines the response content.
Restart restart request.
vcl_deliver subroutine:
The client delivery subroutine, called after the vcl_backend_response subroutine (non-pipe mode) or after the vcl_hit subroutine. It can be used to append response header information, cookies, and so on.
Operable objects: (some or all values)
Read: client,server,req,resp,obj,storage
Write: client,req,resp
Return value:
Deliver delivers the backend or cached response content normally, and the default value is returned.
Restart restart request.
vcl_backend_fetch subroutine:
Called before a back-end request is sent; can be used to change the request address or other information, or to abandon the request.
Operable objects: (some or all values)
Read: server,bereq,storage
Write: bereq
Return value:
Fetch sends the request to the backend normally to retrieve the response content and enters the vcl_backend_response subroutine; this is the default return value.
Abandon abandons the back-end request and generates an error, entering the vcl_backend_error subroutine.
vcl_backend_response subroutine:
Called after the backend responds, which can be used to modify the cache time and cache related information.
Operable objects: (some or all values)
Read: server,bereq,beresp,storage
Write: bereq,beresp
Return value:
Deliver delivers the back-end response content normally and enters the vcl_deliver subroutine; this is the default return value.
Abandon abandons the back-end request and generates an error, entering the vcl_backend_error subroutine.
Retry retries the back-end request and increments the retry counter by 1; when max_retries in the configuration is exceeded, an error is reported and the vcl_backend_error subroutine is entered.
vcl_backend_error subroutine:
Called when back-end processing fails, for handling the display of exception pages. The error response content can be customized here, or beresp.status and beresp.http.Location can be modified for redirection.
Operable objects: (some or all values)
Read: server,bereq,beresp,storage
Write: bereq,beresp
Return value:
Deliver delivers only the synthetic (string) custom content; by default the standard back-end error content is returned.
Retry retries the back-end request and increments the retry counter by 1; when max_retries in the configuration is exceeded, an error is reported and the vcl_backend_error subroutine is entered.
vcl_synth subroutine:
Defines custom response content. It can be invoked via synthetic () and the return value synth; here the exception display can be customized, or resp.status and resp.http.Location can be modified for redirection.
Operable objects: (some or all values)
Read: client,server,req,resp,storage
Write: req,resp
Return value:
Deliver delivers only the synthetic (string) custom content; by default the status code and error content specified by synth are returned.
Restart restart request.
vcl_init subroutine:
Called first when the vcl is loaded, to initialize VMODs. This subroutine does not participate in request processing and is called only once, when the vcl is loaded.
Operable objects: (some or all values)
Read: server
Write: none
Return value:
Ok returns normally and processing proceeds to the vcl_recv subroutine; this is the default return value.
vcl_fini subroutine:
Called when the current vcl configuration is unloaded, to clean up VMODs. This subroutine does not participate in request processing and is called only after the vcl has been discarded normally.
Operable objects: (some or all values)
Read: server
Write: none
Return value:
Ok returns normally. Vcl will be released this time. The default value is returned.
(Figure: the varnish subroutine call flow; most subroutines proceed to the next step through their return value.)
11) Grace mode
Request merge in Varnish
When several clients request the same page, varnish sends only one request to the back-end server and suspends the other requests while waiting for the result; once the result is returned, the other requests are answered with a copy of it.
But if thousands of requests arrive at the same time, the waiting queue becomes large, which leads to two potential problems:
the thundering herd problem, where a large number of threads are suddenly released to copy the result returned by the backend, causing the load to rise rapidly; and the fact that no user likes to wait.
To solve this problem, you can configure varnish to retain the cache object for a period of time after the cache object expires due to a timeout, so as to return the past file contents (stalecontent) to those waiting requests. The configuration example is as follows:
sub vcl_recv {
    if (!req.backend.healthy) {
        set req.grace = 5m;
    } else {
        set req.grace = 15s;
    }
}
sub vcl_fetch {
    set beresp.grace = 30m;
}
The above configuration means that varnish keeps an expired cached object for up to another 30 minutes, equal to the maximum req.grace value.
Depending on the health of the backend server, varnish serves expired content to front-end requests for up to 5 minutes (backend sick) or 15 seconds (backend healthy).
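Note that the example above uses VCL 3 syntax (req.backend.healthy, vcl_fetch). In VCL 4.0 the equivalent logic moves into vcl_hit and vcl_backend_response; a rough sketch, assuming import std; and a backend hint set in vcl_recv (the 15s healthy-grace limit is illustrative):

```
vcl 4.0;
import std;

sub vcl_backend_response {
    # keep objects for 30 minutes past their TTL so grace can use them
    set beresp.grace = 30m;
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        return (deliver);          # fresh object: normal hit
    }
    if (std.healthy(req.backend_hint)) {
        if (obj.ttl + 15s > 0s) {
            return (deliver);      # healthy backend: serve slightly stale content
        }
    } else {
        if (obj.ttl + obj.grace > 0s) {
            return (deliver);      # sick backend: use the full grace period
        }
    }
    return (fetch);                # otherwise fetch from the backend
}
```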
Second, install varnish
(The original post shows the virtual machine environment in a screenshot here.)
1. Install the dependency software package (Note: use centos online yum source)
2. Install varnish
The official website of varnish is http://varnish-cache.org, where you can download the latest version of the software.
Download address:
https://www.varnish-cache.org/content/varnish-cache-403
Note: Varnish sites are sometimes walled.
Git download: git clone https://github.com/varnish/Varnish-Cache /var/tmp/
Decompress and enter the decompression directory to compile and install:
Note:
./autogen.sh
This step is only needed when building from the Git repository; it generates the configure build files.
Configure, compile, and install:
Note: no installation path is specified, so it installs under the /usr/local directory by default.
Upload the prepared default.vcl file (its contents are below).
Varnish does not ship with a ready-made configuration file by default, so we need to add one manually.
The contents of the default.vcl file are as follows:
vcl 4.0;

import directors;
import std;

probe backend_healthcheck {
    .url = "/";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;
    .threshold = 3;
}

backend web_app_01 {
    .host = "192.168.1.11";
    .port = "80";
    .first_byte_timeout = 9s;
    .connect_timeout = 3s;
    .between_bytes_timeout = 1s;
    .probe = backend_healthcheck;
}

backend web_app_02 {
    .host = "192.168.1.12";
    .port = "80";
    .first_byte_timeout = 9s;
    .connect_timeout = 3s;
    .between_bytes_timeout = 1s;
    .probe = backend_healthcheck;
}

acl purgers {
    "127.0.0.1";
    "localhost";
    "192.168.1.0"/24;
}

sub vcl_init {
    new web = directors.round_robin();
    web.add_backend(web_app_01);
    web.add_backend(web_app_02);
}

sub vcl_recv {
    set req.backend_hint = web.backend();
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Not Allowed."));
        }
        return (purge);
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "PATCH" &&
        req.method != "DELETE") {
        return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    if (req.url ~ "\.(php|asp|aspx|jsp|do|ashx|shtml)($|\?)") {
        return (pass);
    }
    if (req.http.Authorization) {
        return (pass);
    }
    if (req.http.Accept-Encoding) {
        if (req.url ~ "\.(bmp|png|gif|jpg|jpeg|ico|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)$") {
            unset req.http.Accept-Encoding;
        } elsif (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            unset req.http.Accept-Encoding;
        }
    }
    if (req.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico|gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
        unset req.http.cookie;
        return (hash);
    }
    if (req.restarts == 0) {
        if (req.http.X-Forwarded-For) {
            set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
    return (hash);
}

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

sub vcl_hit {
    if (req.method == "PURGE") {
        return (synth(200, "Purged."));
    }
    return (deliver);
}

sub vcl_miss {
    if (req.method == "PURGE") {
        return (synth(404, "Purged."));
    }
    return (fetch);
}

sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
        set resp.http.X-Cache-Hits = obj.hits;
    } else {
        set resp.http.X-Cache = "MISS";
    }
    unset resp.http.X-Powered-By;
    unset resp.http.Server;
    unset resp.http.X-Drupal-Cache;
    unset resp.http.Via;
    unset resp.http.Link;
    unset resp.http.X-Varnish;
    set resp.http.xx_restarts_count = req.restarts;
    set resp.http.xx_Age = resp.http.Age;
    set resp.http.hit_count = obj.hits;
    unset resp.http.Age;
    return (deliver);
}

sub vcl_pass {
    return (fetch);
}

sub vcl_backend_response {
    set beresp.grace = 5m;
    if (beresp.status == 499 || beresp.status == 404 || beresp.status == 502) {
        set beresp.uncacheable = true;
    }
    if (bereq.url ~ "\.(php|jsp)(\?|$)") {
        set beresp.uncacheable = true;
    } else {
        if (bereq.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico)($|\?)") {
            set beresp.ttl = 15m;
            unset beresp.http.Set-Cookie;
        } elsif (bereq.url ~ "\.(gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
            set beresp.ttl = 30m;
            unset beresp.http.Set-Cookie;
        } else {
            set beresp.ttl = 10m;
            unset beresp.http.Set-Cookie;
        }
    }
    return (deliver);
}

sub vcl_purge {
    return (synth(200, "success"));
}

sub vcl_backend_error {
    if (beresp.status == 500 ||
        beresp.status == 501 ||
        beresp.status == 502 ||
        beresp.status == 503 ||
        beresp.status == 504) {
        return (retry);
    }
}

sub vcl_fini {
    return (ok);
}
III. Analysis of varnish examples
Varnish configuration is basically editing the VCL (Varnish Configuration Language) file. Varnish has a set of custom VCL syntax. When starting, the configuration file will be compiled into C language and then executed.
Starting with varnish 4. 0, each VCL file must declare its version "vcl 4.0;" on the start line.
Blocks (subroutines) are separated by curly braces and statements end with semicolons. All keywords and preset subroutine names are all lowercase.
Note: multi-line comments are also supported, using /* ... */.
1. Backend server address pool configuration and backend server health check
Varnish has the concept of a "back end" or "source" server. Backend server provides accelerated content to varnish. In effect, you add an accessible web server to varnish, and if you have multiple web servers, you can add multiple backend blocks.
1) backend server definition:
Command: backend. This is the most basic reverse proxy entry definition, used for varnish's connection to the corresponding server; if it is missing or wrong, users cannot access normal pages.
Syntax format:
backend name {
    .attribute = "value";
}
Note: backend is the keyword, and name is the alias of this backend node. When there are multiple backend nodes the names must not be duplicated, otherwise one definition overwrites another. The curly braces define the attributes of the current node (key = value). Any backend other than the default one must be referenced somewhere after it is defined, otherwise varnish will not start. Whether a backend is healthy can be checked with std.healthy (backend).
Supported operators:
= (assignment)
== (equality comparison)
~ (match; regular expressions or access control lists can be used)
!~ (non-match; regular expressions or access control lists can be used)
! (not)
&& (logical and)
|| (logical or)
List of attributes:
.host = "xxx.xxx.xxx.xxx"; // IP or domain name of the backend host; a required key/value pair.
.port = "8080"; // host connection port number or protocol name (http, etc.); default is 80.
.host_header = ""; // extra content for the Host header of requests.
.connect_timeout = 1s; // timeout for connecting to the backend.
.first_byte_timeout = 5s; // wait for the first byte returned from the backend.
.between_bytes_timeout = 2s; // wait time per byte received.
.probe = probe_name; // monitors the backend host's status; name an external probe or add one inline.
.max_connections = 200; // maximum number of concurrent connections; beyond this, connections fail.
Example: (the results of the following two examples are the same, but the second example is more suitable for clusters and can be easily modified in batches.)
backend web {
.host = "192.168.1.11";
.port = "80";
.probe = { // append the probe block inline; here probe is an attribute
.url = "/";
.timeout = 2s;
}
}
or:
probe web_probe { // the probe must be defined before the backend, otherwise the backend call cannot find it
.url = "/";
.timeout = 2s;
}
backend web {
.host = "192.168.1.12";
.port = "80";
.probe = web_probe; // reference the shared external probe
}
2) Monitor definition:
Command: probe. A probe polls a specified address at intervals and judges from the response whether the server is healthy. It is ideal for clusters: when a node server crashes or is overloaded, the probe lets varnish stop sending requests to that node.
Syntax format:
probe name {
.attribute = "value";
}
Note: probe is the keyword and name is the alias of the probe; when there are several probes the names must not repeat, or later definitions overwrite earlier ones. The curly braces hold the probe's attributes (key = value pairs).
No attribute is mandatory, because the defaults are enough for a working probe.
List of attributes:
.url = "/"; // URL to request when probing; default is "/"
.request = ""; // full request text to send when probing; takes precedence over .url
.expected_response = 200; // expected response code; default is 200
.timeout = 2s; // probe request timeout
.interval = 5s; // interval between probe requests; default is 5s
.initial = -1; // how many probes count as good when varnish first starts; with -1, all probes in .window are assumed good so the backend can be used immediately
.window = 8; // how many of the most recent probes to consider when judging the backend; default is 8
.threshold = 3; // how many probes within .window must succeed for the backend to be judged healthy; default is 3
Example: create a health check named backend_healthcheck:
probe backend_healthcheck {
.url = "/";
.timeout = 1s;
.interval = 5s;
.window = 5;
.threshold = 3;
}
In the example above, varnish probes the backend every 5 seconds with a 1-second timeout; each probe sends a GET / request. If at least 3 of the last 5 probes succeed, varnish considers the backend healthy; otherwise it marks the backend as sick.
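The window/threshold decision described above can be sketched in Python (an illustrative model of the rule, not varnish's actual implementation; `is_healthy` is a hypothetical helper):

```python
# Sketch of the probe window/threshold rule: keep the results of the
# last `window` probes and count how many succeeded.
from collections import deque

def is_healthy(results, window=5, threshold=3):
    """results: probe outcomes, oldest first; True = probe succeeded."""
    recent = deque(results, maxlen=window)  # only the last `window` probes count
    return sum(recent) >= threshold

# 4 of the last 5 probes succeeded -> healthy
print(is_healthy([True, True, False, True, True]))   # True
# only 2 of the last 5 succeeded -> sick
print(is_healthy([False, True, False, True, False])) # False
```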
3) Cluster load balancer directors:
Varnish can define multiple backends, and it can also group several backends into a cluster for load balancing.
Such a group of backends is called a director; it improves both performance and resilience.
directors is varnish's load-balancing module and must be imported before use; it provides the round_robin, random, hash and fallback balancing modes.
round_robin: selects the backend servers one by one in turn.
random: selects a backend at random; each backend can be given a weight to raise its probability.
hash: selects a backend by hashing and keeps the mapping, so the same request goes straight to the same backend next time.
fallback: tries the backends in order and uses the first healthy one (backup).
Note: random and hash support weight values used to adjust selection probability. It is best to attach a probe to every backend so that the director automatically removes unhealthy backends from the balancing queue.
These operations require you to load VMOD (varnish module) and then call the VMOD in vcl_init.
import directors; # load the directors module
backend web1 {
.host = "192.168.1.11";
.port = "80";
.probe = backend_healthcheck;
}
backend web2 {
.host = "192.168.1.12";
.port = "80";
.probe = backend_healthcheck;
}
// initialization
sub vcl_init { // the vcl_init subroutine creates the backend host group, i.e. the director
new web_cluster = directors.round_robin(); // create a director object with the new keyword, using the round_robin algorithm
web_cluster.add_backend(web1); // add a backend server node
web_cluster.add_backend(web2);
}
// request processing starts here
sub vcl_recv { // the vcl_recv subroutine receives and processes requests
set req.backend_hint = web_cluster.backend(); // select a backend
}
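The round_robin director's selection behavior can be modeled with a short Python sketch (illustrative only; the backend names are placeholders):

```python
# Round-robin selection: each call hands back the next backend
# in the list, wrapping around at the end.
import itertools

backends = ["web1", "web2"]
rr = itertools.cycle(backends)

picks = [next(rr) for _ in range(4)]
print(picks)  # ['web1', 'web2', 'web1', 'web2']
```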
Description:
The set command assigns a variable; the unset command deletes one.
web_cluster.add_backend(backend, weight); adds a backend node: backend is the backend's configuration alias and weight is its weight value; for the random director the selection probability is 100 * (this backend's weight / total weight).
req.backend_hint is a predefined varnish variable that designates the backend for the request.
VCL objects are created with the new keyword; all creatable object types come from modules, so import is required before use, and new may only be used inside the vcl_init subroutine.
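The weight formula above can be checked with a small Python sketch (illustrative; `selection_percentages` is a hypothetical helper, and the weights match the img_cluster example later in this section):

```python
# Random-director weights: selection probability = 100 * (weight / total weight).
def selection_percentages(weights):
    total = sum(weights.values())
    return {name: 100 * w / total for name, w in weights.items()}

# weights 2 and 5 give roughly 28.6% and 71.4%
print(selection_percentages({"img1": 2, "img2": 5}))
```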
Extension: varnish sends different URLs to different backend servers.
import directors; # load the directors module
backend web1 {
.host = "192.168.1.11";
.port = "80";
.probe = backend_healthcheck;
}
backend web2 {
.host = "192.168.1.12";
.port = "80";
.probe = backend_healthcheck;
}
backend img1 {
.host = "img1.lnmmp.com";
.port = "80";
.probe = backend_healthcheck;
}
backend img2 {
.host = "img2.lnmmp.com";
.port = "80";
.probe = backend_healthcheck;
}
// initialization
sub vcl_init { // the vcl_init subroutine creates the backend host groups, i.e. the directors
new web_cluster = directors.round_robin(); // create a director object with the new keyword, using the round_robin algorithm
web_cluster.add_backend(web1); // add a backend server node
web_cluster.add_backend(web2);
new img_cluster = directors.random();
img_cluster.add_backend(img1, 2); // add a backend node and set its weight
img_cluster.add_backend(img2, 5);
}
// dispatch to different backend host groups according to the requested domain name
sub vcl_recv {
if (req.http.host ~ "(?i)^(www\.)?benet.com$") {
set req.http.host = "www.benet.com";
set req.backend_hint = web_cluster.backend(); // select a backend
} elsif (req.http.host ~ "(?i)^images\.benet\.com$") {
set req.backend_hint = img_cluster.backend();
}
}
Explanation: the "i" in "(?i)" means ignore case: (?i) turns case-insensitive matching on, and (?-i) turns it off again.
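Python's re module supports the same inline flag, which makes it easy to experiment with these host patterns before putting them into VCL:

```python
# (?i) turns on case-insensitive matching inside the pattern itself.
import re

pattern = r"(?i)^(www\.)?benet\.com$"
print(bool(re.match(pattern, "WWW.BENET.COM")))  # True: (?i) ignores case
print(bool(re.match(pattern, "benet.com")))      # True: the www. prefix is optional
print(bool(re.match(pattern, "example.com")))    # False: different domain
```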
4) Access control list (ACL)
An ACL creates an address list for later matching; entries can be domain names or an IP set. ACLs can be used to restrict certain request entries to specific addresses, block malicious requests, and so on.
Syntax format:
acl purgers {
"127.0.0.1";
"localhost";
"192.168.1.0"/24;
!"192.168.1.100";
}
Note: acl is the access-list keyword (must be lowercase), name is the alias of the list used when calling it, and the curly braces contain the address set.
Note: if the list contains a host name that cannot be resolved, it will match any address.
To exclude an entry, prefix it with the ! symbol, as with !"192.168.1.100" above.
To use an ACL, just apply the match operators ~ or !~, for example:
sub vcl_recv {
if (req.method == "PURGE") { // handle PURGE requests
if (client.ip ~ purgers) {
return (purge);
} else {
return (synth(403, "Access denied."));
}
}
}
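The ACL matching semantics can be sketched with Python's standard ipaddress module (an illustrative model of the rules above, not varnish code; `acl_match` is a hypothetical helper):

```python
# An IP matches the ACL if it falls inside any allowed network
# and is not in any excluded (!-prefixed) entry.
import ipaddress

allowed = [ipaddress.ip_network("127.0.0.1/32"),
           ipaddress.ip_network("192.168.1.0/24")]
excluded = [ipaddress.ip_network("192.168.1.100/32")]

def acl_match(ip):
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in excluded):  # ! entries win
        return False
    return any(addr in net for net in allowed)

print(acl_match("192.168.1.50"))   # True: inside 192.168.1.0/24
print(acl_match("192.168.1.100"))  # False: explicitly excluded
print(acl_match("10.0.0.1"))       # False: not in any allowed network
```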
5) configuration of cache rules:
sub vcl_recv {
// handle PURGE requests
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return (synth(405, "Not Allowed."));
}
return (purge);
}
set req.backend_hint = web.backend();
// pass requests for php, asp and other dynamic content straight to the backend without caching
if (req.url ~ "\.(php|asp|aspx|jsp|do|ashx|shtml)($|\?)") {
return (pass);
}
// pass non-GET/HEAD requests (for example POST) straight to the backend without caching
if (req.method != "GET" && req.method != "HEAD") {
return (pass);
}
// if the request carries an Authorization header, pass it
if (req.http.Authorization) {
return (pass);
}
// cache GET requests for static files even when they carry a Cookie header
if (req.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico|gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
unset req.http.cookie;
return (hash);
}
Note: by default, varnish does not cache a response whose http header contains Set-Cookie, and if the client request carries a Cookie header, varnish skips the cache and passes the request straight to the backend.
// add the X-Forwarded-For header to requests sent to the backend so that the backend application can log the client IP instead of the varnish address
if (req.restarts == 0) {
if (req.http.X-Forwarded-For) { // if the header is already set, append the client IP separated by a comma
set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
} else { // first hop: just set it
set req.http.X-Forwarded-For = client.ip;
}
}
Note: X-Forwarded-For is an HTTP request header used to identify the original IP address of a client whose request reaches the web server through an HTTP proxy or load balancer.
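The header logic above can be modeled in a few lines of Python (illustrative; `add_xff` is a hypothetical helper operating on a plain dict of headers):

```python
# Append the client IP to an existing X-Forwarded-For header,
# or set the header on the first hop.
def add_xff(headers, client_ip):
    if headers.get("X-Forwarded-For"):
        headers["X-Forwarded-For"] += ", " + client_ip
    else:
        headers["X-Forwarded-For"] = client_ip
    return headers

print(add_xff({}, "192.168.1.5"))
# {'X-Forwarded-For': '192.168.1.5'}
print(add_xff({"X-Forwarded-For": "10.0.0.9"}, "192.168.1.5"))
# {'X-Forwarded-For': '10.0.0.9, 192.168.1.5'}
```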
Subroutines:
A VCL subroutine resembles a C function, but takes no arguments; subroutines are defined with the sub keyword and are where VCL's processing logic lives.
Note: all built-in subroutines have names starting with vcl_ and are predefined; simply declaring one in the VCL file causes it to be called at the corresponding stage of request processing.
3. Complete configuration example of varnish
1. Topological environment
Varnish:192.168.1.9
Web01:192.168.1.11
Web02:192.168.1.12
Configure web01 and web02 as backend servers.
Configure apache2 the same way; to make verification easier, give the two servers different page content.
Make sure the varnish server can reach web01 and web02 normally.
Varnish cache proxy server configuration:
2. The configuration content of vcl file:
[root@varnish ~]# cat /usr/local/var/varnish/default.vcl
# use the varnish version 4 VCL format
vcl 4.0;
# load the directors load-balancing module
import directors;
# load the std module
import std;
# create a health-check policy named backend_healthcheck
probe backend_healthcheck {
.url = "/";
.interval = 5s;
.timeout = 1s;
.window = 5;
.threshold = 3;
}
# define the backend servers
backend web_app_01 {
.host = "192.168.1.11";
.port = "80";
.first_byte_timeout = 9s;
.connect_timeout = 3s;
.between_bytes_timeout = 1s;
.probe = backend_healthcheck;
}
backend web_app_02 {
.host = "192.168.1.12";
.port = "80";
.first_byte_timeout = 9s;
.connect_timeout = 3s;
.between_bytes_timeout = 1s;
.probe = backend_healthcheck;
}
# define the IPs allowed to purge the cache
acl purgers {
"127.0.0.1";
"localhost";
"192.168.1.0"/24;
}
# the vcl_init subroutine creates the backend host group
sub vcl_init {
new web = directors.round_robin();
web.add_backend(web_app_01);
web.add_backend(web_app_02);
}
# request entry point for receiving and processing requests; routing happens here: whether to read the cache and which backend the request should use
sub vcl_recv {
# direct the request to the web backend cluster; append .backend() to the cluster name
set req.backend_hint = web.backend();
# match cache-purge requests
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return (synth(405, "Not Allowed."));
}
# allowed: perform the purge
return (purge);
}
# if the method is not a normal HTTP method, pipe the request straight through
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "PATCH" &&
req.method != "DELETE") {
return (pipe);
}
# if the method is neither GET nor HEAD, jump to pass
if (req.method != "GET" && req.method != "HEAD") {
return (pass);
}
# dynamic content requests jump to pass
if (req.url ~ "\.(php|asp|aspx|jsp|do|ashx|shtml)($|\?)") {
return (pass);
}
# requests carrying authentication jump to pass
if (req.http.Authorization) {
return (pass);
}
# normalize the Accept-Encoding header
if (req.http.Accept-Encoding) {
if (req.url ~ "\.(bmp|png|gif|jpg|jpeg|ico|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)$") {
unset req.http.Accept-Encoding;
} elsif (req.http.Accept-Encoding ~ "gzip") {
set req.http.Accept-Encoding = "gzip";
} elsif (req.http.Accept-Encoding ~ "deflate") {
set req.http.Accept-Encoding = "deflate";
} else {
unset req.http.Accept-Encoding;
}
}
# strip cookies from static-file requests and look them up in the cache
if (req.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico|gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
unset req.http.cookie;
return (hash);
}
# pass the real client IP to the backend in X-Forwarded-For so backend logs can record it
if (req.restarts == 0) {
if (req.http.X-Forwarded-For) {
set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
} else {
set req.http.X-Forwarded-For = client.ip;
}
}
return (hash);
}
# hash subroutine (builds the cache key)
sub vcl_hash {
hash_data(req.url);
if (req.http.host) {
hash_data(req.http.host);
} else {
hash_data(server.ip);
}
return (lookup);
}
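The cache-key idea in vcl_hash can be sketched in Python (illustrative only: varnish uses its own internal hashing, sha256 here merely stands in for it, and `cache_key` is a hypothetical helper):

```python
# Build a cache key from the URL plus the Host header,
# falling back to the server IP when no Host header is present.
import hashlib

def cache_key(url, host=None, server_ip="192.168.1.9"):
    h = hashlib.sha256()
    h.update(url.encode())                  # hash_data(req.url)
    h.update((host or server_ip).encode())  # hash_data(req.http.host) or server.ip
    return h.hexdigest()

# the same URL under different Host headers maps to different cache objects
print(cache_key("/index.html", "www.benet.com")
      == cache_key("/index.html", "images.benet.com"))  # False
```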
# cache-hit subroutine
sub vcl_hit {
if (req.method == "PURGE") {
return (synth(200, "Purged."));
}
return (deliver);
}
# cache-miss subroutine
sub vcl_miss {
if (req.method == "PURGE") {
return (synth(404, "Purged."));
}
return (fetch);
}
# last subroutine before the response is returned to the user, usually used to add or remove headers
sub vcl_deliver {
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
set resp.http.X-Cache-Hits = obj.hits;
} else {
set resp.http.X-Cache = "MISS";
}
# remove the header that reveals the php framework version
unset resp.http.X-Powered-By;
# for security, remove headers that reveal the web software version, Via (from varnish), and so on
unset resp.http.Server;
unset resp.http.X-Drupal-Cache;
unset resp.http.Via;
unset resp.http.Link;
unset resp.http.X-Varnish;
# show how many restarts the request has gone through
set resp.http.xx_restarts_count = req.restarts;
# show the age of the cached resource, in seconds
set resp.http.xx_Age = resp.http.Age;
# show the number of cache hits for the resource
set resp.http.hit_count = obj.hits;
# remove Age so that it does not conflict with a CDN
unset resp.http.Age;
# return to the user
return (deliver);
}
# pass subroutine
sub vcl_pass {
return (fetch);
}
# process the response returned by the backend (set cache times, remove cookie information, set headers, etc.); called automatically after the fetch
sub vcl_backend_response {
# enable grace mode: if the backend is down, cached resources may still be served to users for up to 5 minutes after they expire (exceed their cache time)
set beresp.grace = 5m;
# do not cache responses with the following error status codes
if (beresp.status == 499 || beresp.status == 404 || beresp.status == 502) {
set beresp.uncacheable = true;
}
# do not cache php or jsp responses
if (bereq.url ~ "\.(php|jsp)(\?|$)") {
set beresp.uncacheable = true;
} else { // customize the cache duration (TTL value) per file type
if (bereq.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico)($|\?)") {
set beresp.ttl = 15m;
unset beresp.http.Set-Cookie;
} elsif (bereq.url ~ "\.(gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
set beresp.ttl = 30m;
unset beresp.http.Set-Cookie;
} else {
set beresp.ttl = 10m;
unset beresp.http.Set-Cookie;
}
}
# return to the user
return (deliver);
}
sub vcl_purge {
return (synth(200, "success"));
}
sub vcl_backend_error {
if (beresp.status == 500 ||
beresp.status == 501 ||
beresp.status == 502 ||
beresp.status == 503 ||
beresp.status == 504) {
return (retry);
}
}
sub vcl_fini {
return (ok);
}
3. Start varnish
There are two important parameters you must set when starting varnish: one is the tcp listening port that handles http requests, and the other is the back-end server that handles real requests.
Note: if you install varnish using the package management tool that comes with the operating system, you will find the startup parameters in the following file:
RedHat, Centos: / etc/sysconfig/varnish
1) The -a parameter defines the address on which varnish listens for and handles http requests. You will usually want to set this to the well-known http port 80.
Examples:
-a :80
-a localhost:80
-a 192.168.1.100:8080
-a '[fe80::1]:80'
-a '0.0.0.0:8080,[::]:8081'
If your webserver and varnish are running on the same machine, you must change the listening address.
2) -f VCL-file or -b backend
-f loads a vcl file; -b defines the backend server.
Varnish needs to know the address of the http server whose content is to be cached. You can specify it with the -b parameter, or put it in the vcl file and load that file with the -f parameter.
Using -b at startup is the quickest way:
-b 192.168.1.2:80
Note: if you specify a name, the name must resolve to an IPv4 or IPv6 address.
If you use the -f parameter, point -f at the vcl file when you start.
By default varnish uses 100 MB of memory to cache objects; to cache more, use the -s parameter.
Note: Varnish has a large number of useful command line arguments, so it is recommended to check its help
[root@varnish ~]# /usr/local/sbin/varnishd -h
Start varnish
Open an exception for port 80 in the firewall:
2) now that varnish is up and running, you can access your Web application through varnish.
Open the Firefox browser
First visit:
Second visit:
Third visit:
3) varnish5 configuration manually clears the cache
Varnish 5 configures cache purging through vcl.
The vcl configuration lets a client manually request that a cache entry be purged, so stale data is refreshed promptly without restarting the varnish server.
Configuration method:
# IP set allowed to purge the cache
acl purgers {
"127.0.0.1";
"localhost";
"192.168.1.0"/24;
}
sub vcl_recv {
……
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return (synth(405, "Not Allowed."));
}
return (purge); // purge the cache
}
……
}
sub vcl_purge {
return (synth(200, "success"));
}
Open the Firefox browser and enter any cache page, as shown in the following figure.
Just use the screenshot above:
Click Edit and resend, modify the request type to PURGE, and then click send:
Check the return status, and if the cache is cleared successfully, you can press F5 to refresh the page and view the new content.
4) verify the health check
We simulate an outage of the backend server apache1:
Then browse to 192.168.1.9:
No matter how often the page is refreshed, it is always served by the apache2 web server, with no interruption, which proves the failover works.
Now turn on apache1's httpd service again:
Check again:
From the picture above, you can see that apache1's web page has been browsed again, which once again shows that the health check is successful!
5) Check the httpd logs on the backend web servers to see whether the recorded client address is the varnish proxy server or the real client IP.
As you can see from the above figure, the recorded ip is the ip of the varnish proxy server, so what if I want to see the real client ip from the log?
The solution is as follows:
Similarly, apache2 does the same thing.
Access the varnish from the client again, and then view the access log again:
Note: 192.168.1.5 is the ip address of my client.
As you can see from the above figure, what is recorded in the access log is no longer the ip address of varnish, but the real client ip address.