How to implement regular expression matching in nginx

2025-03-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to implement regular expression matching in nginx: it starts from a real redirect requirement, then summarizes the matching operators, the rewrite flags, the related directives, and a set of common rewrite examples.

Some of the site's old URLs had already been crawled by the Baidu search engine, so the server needs to answer them with 301 redirects. For example, www.baidu.com/kenni-1 and www.baidu.com/kenni-1/ should both redirect to www.baidu.com/kenni-1.html, and www.baidu.com/kenni-1?page=11 should redirect to www.baidu.com/kenni-1.html?page=11 (kenni- is always followed by digits).

Because the server runs nginx, add the following configuration to nginx:

    # domain/kenni-10 redirects to domain/kenni-10.html
    rewrite ^/kenni-([0-9]+)$ /kenni-$1.html permanent;

    # domain/kenni-10/ redirects to domain/kenni-10.html
    rewrite ^/kenni-([0-9]+)/$ /kenni-$1.html permanent;
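
The two rules can probably be folded into one by making the trailing slash optional. The following is only a minimal sketch: the server block and the host name example.com are assumptions added for illustration and are not part of the original setup. When the replacement contains no query string of its own, rewrite appends the original request arguments to it, which is why ?page=11 survives the redirect.

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical host name, not from the article

    # /kenni-10 and /kenni-10/ both redirect to /kenni-10.html.
    # "/?" makes the trailing slash optional. The original query string
    # (for example ?page=11) is appended to the redirect target by nginx.
    rewrite ^/kenni-([0-9]+)/?$ /kenni-$1.html permanent;
}
```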

Other reference material:

1. ^ : matches the start of the string

2. $ : matches the end of the string

3. .* : . matches any single character, and * repeats the preceding item zero or more times

4. The backslash is used for escaping: \. matches a literal dot. This is a special usage worth remembering

5. (value1|value2|value3|value4) : alternation, for example (jpg|gif|png|bmp) matches jpg, gif, png or bmp

6. i : case-insensitive matching
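
As a rough illustration of these building blocks, the sketch below uses them in two location blocks (to be placed inside a server block). The paths and the expires and access_log settings are assumptions chosen only to make the fragment complete.

```nginx
# \.                  a literal dot (escaped with a backslash)
# (jpg|gif|png|bmp)   any one of the listed extensions
# $                   anchors the match at the end of the URI
# ~*                  makes the whole match case-insensitive
location ~* \.(jpg|gif|png|bmp)$ {
    expires 30d;
}

# ^ anchors the match at the start of the URI;
# .* matches any run of characters, including none.
location ~ ^/download/.*\.zip$ {
    access_log off;
}
```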

I. Regular expression matching operators:

* ~ for case-sensitive matching

* ~* for case-insensitive matching

* !~ and !~* for case-sensitive non-matching and case-insensitive non-matching, respectively
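
A short sketch of the three operator forms inside if conditions (valid in a server or location block). The status codes and the host name example.com are placeholders, not values from the article.

```nginx
# ~ : case-sensitive match
if ($http_user_agent ~ MSIE) {
    return 403;
}

# ~* : case-insensitive match
if ($request_uri ~* "\.(bak|sql)$") {
    return 404;
}

# !~* : true when the case-insensitive match fails
if ($host !~* "^(www\.)?example\.com$") {   # example.com is a placeholder
    return 403;
}
```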

II. File and directory tests:

* -f and !-f test whether a file exists

* -d and !-d test whether a directory exists

* -e and !-e test whether a file or directory exists

* -x and !-x test whether a file is executable
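
These tests are used as if conditions. A minimal sketch, assuming a document root of /var/www/html and a flag file named maintenance.flag (both made up for this example):

```nginx
location / {
    root /var/www/html;  # hypothetical document root

    # -f: the flag file exists, so answer every request with 503
    if (-f $document_root/maintenance.flag) {
        return 503;
    }

    # !-e: nothing (file or directory) exists at the mapped path
    if (!-e $request_filename) {
        return 404;
    }
}
```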

III. The last parameter of the rewrite directive is the flag. The flags are:

1. last: equivalent to the [L] flag in Apache; it marks the rewrite as complete.

2. break: once this rule has matched, matching stops and the rules after it are not processed.

3. redirect: returns a 302 temporary redirect; the browser address bar shows the URL after the redirect.

4. permanent: returns a 301 permanent redirect; the browser address bar shows the URL after the redirect.

last and break rewrite the URI internally, so the browser address bar does not change. There is a subtle difference between the two: with the alias directive you must use last, and with the proxy_pass directive you need break. After a rule with the last flag has run, nginx re-processes the request within the enclosing server {...} block, searching for a location that matches the new URI, whereas break stops matching as soon as the current rule has been applied.
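
The difference can be sketched as follows. The host name, paths, and the backend address 127.0.0.1:8080 are all assumptions for illustration only.

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical

    # last: after the rewrite, nginx searches again for a location matching
    # the new URI, so the request is finally handled by "location /new/".
    location /old/ {
        rewrite ^/old/(.*)$ /new/$1 last;
    }

    location /new/ {
        root /var/www/html;  # hypothetical document root
    }

    # break: the URI is rewritten but processing stays in this location,
    # so the rewritten URI is what gets passed to the backend.
    location /api/ {
        rewrite ^/api/(.*)$ /$1 break;
        proxy_pass http://127.0.0.1:8080;  # hypothetical backend
    }
}
```

In both cases the client sees no redirect; only the URI used inside nginx changes.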

For example, to rewrite a URL such as /photo/123456 to /path/to/photo/12/1234/123456.png (the pattern is quoted because it contains curly braces):

    rewrite "/photo/([0-9]{2})([0-9]{2})([0-9]{2})" /path/to/photo/$1/$1$2/$1$2$3.png;

IV. Directives related to Nginx rewrite rules

1. break directive

Context: server, location, if

This directive ends processing of the current rule set; no further rewrite directives are processed.

2. if directive

Context: server, location

This directive checks whether a condition is true and, if so, executes the statements inside the curly braces. The if directive does not support nesting, nor combining multiple conditions with && or ||.
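
Because conditions cannot be combined with &&, a common workaround is to accumulate markers in a variable and test the result once. The sketch below is one way to do that; the variable name $deny_flag and the rule itself (reject POST requests from curl) are invented for the example.

```nginx
# Hypothetical goal: reject POST requests whose User-Agent contains "curl".
set $deny_flag "";

if ($request_method = POST) {
    set $deny_flag "P";
}

if ($http_user_agent ~* curl) {
    set $deny_flag "${deny_flag}C";
}

# The value is "PC" only when both conditions above were true.
if ($deny_flag = "PC") {
    return 403;
}
```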

3. return directive

Syntax: return code;

Context: server, location, if

This directive ends rule processing and returns the given status code to the client.

Example: if the requested URL ends with ".sh" or ".bash", return a 403 status code

    location ~ .*\.(sh|bash)$ {
        return 403;
    }
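
Besides a bare status code, return also accepts a URL together with a redirect code. A minimal sketch, assuming a site that should always be served over HTTPS (the host name is a placeholder):

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical

    # Redirect every plain-HTTP request to HTTPS,
    # keeping the original path and query string.
    return 301 https://$host$request_uri;
}
```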

4. rewrite directive

Syntax: rewrite regex replacement flag;

Context: server, location, if

This directive rewrites the URI according to the regular expression, or modifies the string. Directives are executed in the order in which they appear in the configuration file. Note that the rewrite expression is matched only against the relative path. To match on the host name, use an if statement, as in the following example:

    if ($host ~* www\.(.*)) {
        set $host_without_www $1;
        rewrite ^(.*)$ http://$host_without_www$1 permanent;
    }
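
Host-name matching can also be expressed without if, by giving the unwanted name its own server block. This is an alternative sketch rather than the article's method, and example.com is a placeholder domain.

```nginx
# Requests for the www name are redirected to the bare domain.
server {
    listen 80;
    server_name www.example.com;   # placeholder
    return 301 $scheme://example.com$request_uri;
}

# The bare domain serves the site itself.
server {
    listen 80;
    server_name example.com;       # placeholder
    root /var/www/html;            # hypothetical document root
}
```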

5. set directive

Syntax: set variable value; Default: none; Context: server, location, if

This directive defines a variable and assigns a value to it. The value can be text, variables, or a combination of text and variables.

Example: set $varname "hello world";
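
The three kinds of value might look like this in practice. The location path, the variable names, and the X-Debug-Label header are all made up for this sketch.

```nginx
location /report/ {
    # Plain text
    set $section "report";

    # Text combined with built-in variables
    set $origin "$scheme://$host";

    # Text combined with variables defined above
    set $label "${section}:${origin}";

    add_header X-Debug-Label $label;   # header name invented for this sketch
}
```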

6. uninitialized_variable_warn directive

Syntax: uninitialized_variable_warn on | off;

Context: http, server, location, if

This directive turns warnings about uninitialized variables on or off. The default value is on.

V. Nginx rewrite rule examples

1. Rewrite to a PHP file when the requested file or directory does not exist

    if (!-e $request_filename) {
        rewrite ^/(.*)$ /index.php last;
    }
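
The same fallback is often written with try_files instead of if. A minimal sketch, assuming a document root of /var/www/html and that a PHP handler (for example a location ~ \.php$ block with fastcgi_pass) is configured elsewhere:

```nginx
location / {
    root /var/www/html;          # hypothetical document root
    index index.php index.html;

    # Try the file, then the directory, then fall back to /index.php
    # while keeping the original query string.
    try_files $uri $uri/ /index.php?$args;
}
```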

2. Directory swap: /123456/xxxx => /xxxx?id=123456 (as written, the pattern expects a slash after xxxx)

    rewrite ^/(\d+)/(.+)/ /$2?id=$1 last;

3. If the client is using an IE browser, rewrite to the /ie directory

    if ($http_user_agent ~ MSIE) {
        rewrite ^(.*)$ /ie/$1 break;
    }

4. Deny access to multiple directories

    location ~ ^/(cron|templates)/ {
        deny all;
        break;
    }

5. Deny access to files whose path starts with /data

    location ~ ^/data {
        deny all;
    }

6. Deny access to files with the .sh, .flv, or .mp3 suffix

    location ~ .*\.(sh|flv|mp3)$ {
        return 403;
    }

7. Set the browser cache time for certain file types

    location ~ .*\.(gif|jpg|jpeg|png|bmp|swf)$ {
        expires 30d;
    }

    location ~ .*\.(js|css)$ {
        expires 1h;
    }

8. Set the expiration time for favicon.ico and robots.txt

Here favicon.ico expires after 99 days and robots.txt after 7 days, and 404 errors for them are not logged.

    location ~ (favicon.ico) {
        log_not_found off;
        expires 99d;
        break;
    }

    location ~ (robots.txt) {
        log_not_found off;
        expires 7d;
        break;
    }

9. Set the expiration time of a specific file; here it is 600 seconds, and no access log is recorded.

    location ^~ /html/scripts/loadhead_1.js {
        access_log off;
        root /opt/lampp/htdocs/web;
        expires 600;
        break;
    }

10. Hotlink protection for files, with an expiration time

The "return 412" here is a custom HTTP status code (the default is 403), which makes hotlink requests easy to find in the logs.

"rewrite ^/ https://cache.yisu.com/upload/information/20210524/347/788800.gif;" shows a hotlink-protection image instead.

"access_log off;" disables access logging to reduce load.

"expires 3d" makes browsers cache all of these files for 3 days.

Note that a rewrite whose replacement starts with https:// returns the redirect immediately, so the return 412 after it is never reached; keep whichever of the two lines you need.

    location ~* ^.+\.(jpg|jpeg|gif|png|swf|rar|zip|css|js)$ {
        valid_referers none blocked *.linuxidc.com *.linuxidc.net localhost 208.97.167.194;
        if ($invalid_referer) {
            rewrite ^/ https://cache.yisu.com/upload/information/20210524/347/788800.gif;
            return 412;
            break;
        }
        access_log off;
        root /opt/lampp/htdocs/web;
        expires 3d;
        break;
    }
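
A rewrite to an absolute URL ends processing with a redirect, so a variant that only returns the custom status code may be easier to reason about. The following is a trimmed sketch under that assumption; the shorter extension list is a choice made for the example, not taken from the article.

```nginx
location ~* \.(jpg|jpeg|gif|png)$ {
    root /opt/lampp/htdocs/web;

    # Requests with no Referer, a stripped Referer, or a Referer from
    # these names are considered valid.
    valid_referers none blocked *.linuxidc.com *.linuxidc.net localhost;

    # Everything else is treated as hotlinking and gets the custom status.
    if ($invalid_referer) {
        return 412;
    }

    access_log off;
    expires 3d;
}
```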

11. Allow only fixed IP addresses to access the site, and require a password

    root /opt/htdocs/www;
    allow 208.97.167.194;
    allow 222.33.1.2;
    allow 231.152.49.4;
    deny all;
    auth_basic "C1G_ADMIN";
    auth_basic_user_file htpasswd;
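
These directives need an enclosing block. A sketch of how they might sit inside a location follows; the /admin/ path is an assumption, and satisfy all is written out only to make the default behaviour explicit (both the address check and the password check must pass).

```nginx
location /admin/ {                 # hypothetical path
    root /opt/htdocs/www;

    # Default behaviour, spelled out: the client must pass the IP check
    # AND basic authentication. "satisfy any" would accept either one.
    satisfy all;

    allow 208.97.167.194;
    allow 222.33.1.2;
    deny all;

    auth_basic "C1G_ADMIN";
    auth_basic_user_file htpasswd;
}
```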

12. Present a file in a multi-level directory as a single file name to improve SEO

/job-123-456-789.html points to /job/123/456/jobshow_789.html

    rewrite ^/job-([0-9]+)-([0-9]+)-([0-9]+)\.html$ /job/$1/$2/jobshow_$3.html last;

13. Pass the request to a backend when the file or directory does not exist:

    if (!-e $request_filename) {
        proxy_pass http://127.0.0.1;
    }

14. Point a folder under the root directory to a second-level directory

For example, /shanghaijob/ points to /area/shanghai/

If you change last to permanent, the browser address bar shows /area/shanghai/

    rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;

One problem with the example above is that a request for /shanghaijob (without the trailing slash) does not match. Adding a second rule covers it:

    rewrite ^/([0-9a-z]+)job$ /area/$1/ last;
    rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;

Now /shanghaijob can be accessed as well, but relative links in the page break.

For example, ./list_1.html, whose real address is /area/shanghai/list_1.html, becomes /list_1.html and cannot be reached.

Adding an automatic trailing-slash redirect does not help either: (-d $request_filename) only holds for a real directory, and the rewritten path is not one, so the following has no effect.

    if (-d $request_filename) {
        rewrite ^/(.*)([^/])$ http://$host/$1$2/ permanent;
    }

Once the cause is known the fix is easy: redirect manually.

    rewrite ^/([0-9a-z]+)job$ /$1job/ permanent;
    rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;

15. Domain name redirect

    server {
        listen 80;
        server_name jump.linuxidc.com;
        index index.html index.htm index.php;
        root /opt/lampp/htdocs/www;
        rewrite ^/ http://www.linuxidc.com/;
        access_log off;
    }

16. Redirecting between multiple domain names

    server_name www.linuxidc.com www.linuxidc.net;
    index index.html index.htm index.php;
    root /opt/lampp/htdocs;
    if ($host ~ "linuxidc\.net") {
        rewrite ^(.*) http://www.linuxidc.com$1 permanent;
    }

VI. Nginx global variables

$arg_PARAMETER # the value of the GET request parameter PARAMETER, if it is present.
$args # the parameters in the request line (GET request), such as foo=123&bar=blahblah
$binary_remote_addr # the client address in binary form.
$body_bytes_sent # the number of body bytes sent in the response; this value is accurate even if the connection is broken.
$content_length # the Content-Length field of the request header.
$content_type # the Content-Type field of the request header.
$cookie_COOKIE # the value of the cookie named COOKIE.
$document_root # the value of the root directive for the current request.
$document_uri # same as $uri.
$host # the Host field of the request header, or the server name if the header is absent.
$hostname # the machine's hostname as returned by gethostname.
$http_HEADER # the value of the request header field HEADER.
$is_args # "?" if the request has arguments, otherwise an empty string.
$http_user_agent # the client's User-Agent information.
$http_cookie # the client's cookie information.
$limit_rate # can be used to limit the connection rate.
$query_string # same as $args.
$request_body_file # the name of the temporary file holding the client request body.
$request_method # the method requested by the client, usually GET or POST.
$remote_addr # the client's IP address.
$remote_port # the client's port.
$remote_user # the user name authenticated by the Auth Basic module.
$request_completion # "OK" if the request has completed; empty if the request has not completed or is not the last one in the request chain.
$request_filename # the file path of the current request, built from the root or alias directive and the request URI.
$request_uri # the original request URI including arguments, without the host name, such as "/foo/bar.php?arg=baz". It cannot be modified.
$scheme # the request scheme (http or https).
$server_protocol # the protocol used by the request, usually HTTP/1.0 or HTTP/1.1.
$server_addr # the server address, which can be determined after a system call completes.
$server_name # the server name.
$server_port # the port on which the request arrived at the server.
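
One quick way to see what these variables contain is a throwaway location that echoes a few of them back. This is a sketch for experimentation only; the endpoint name /debug-vars is invented, and such a block should not be left enabled on a public server.

```nginx
location = /debug-vars {           # hypothetical endpoint name
    default_type text/plain;

    # Each variable is expanded per request; \n inserts a line break.
    return 200 "host=$host\nuri=$uri\nargs=$args\nmethod=$request_method\nclient=$remote_addr:$remote_port\nscheme=$scheme\n";
}
```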

VII. Correspondence between Apache and Nginx rules

Apache's RewriteCond corresponds to Nginx's if

Apache's RewriteRule corresponds to Nginx's rewrite

Apache's [R] corresponds to Nginx's redirect

Apache's [P] corresponds to Nginx's last

Apache's [R,L] corresponds to Nginx's redirect

Apache's [P,L] corresponds to Nginx's last

Apache's [PT,L] corresponds to Nginx's last

For example, allow only the specified domain names to access this site, and redirect all other domain names to www.linuxidc.net.

Apache:

    RewriteCond %{HTTP_HOST} !^(.*?)\.AAA\.com$ [NC]
    RewriteCond %{HTTP_HOST} !^localhost$
    RewriteCond %{HTTP_HOST} !^192\.168\.0\.(.*?)$
    RewriteRule ^/(.*)$ http://www.linuxidc.net [R,L]

Nginx:

    if ($host ~* ^(.*)\.AAA\.com$) {
        set $allowHost '1';
    }

    if ($host ~* ^localhost) {
        set $allowHost '1';
    }

    if ($host ~* ^192\.168\.1\.(.*?)$) {
        set $allowHost '1';
    }

    if ($allowHost !~ '1') {
        rewrite ^/(.*)$ http://www.linuxidc.net redirect;
    }
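
The chain of if blocks can also be expressed with a map, which evaluates the host name once and yields a flag. This is an alternative sketch, not the article's method: the variable name $allow_host and the catch-all server_name are assumptions, and the map block has to live in the http context.

```nginx
# In the http context: 1 for allowed hosts, 0 for everything else.
map $host $allow_host {
    default             0;
    ~*\.aaa\.com$       1;   # any subdomain of aaa.com, case-insensitive
    localhost           1;
    ~^192\.168\.1\.     1;   # addresses in 192.168.1.x
}

server {
    listen 80;
    server_name _;           # catch-all placeholder

    # Hosts that are not on the allow list get a temporary redirect.
    if ($allow_host = 0) {
        return 302 http://www.linuxidc.net;
    }
}
```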

That covers how to implement regular expression matching in nginx. If you run into a similar requirement, the rules and examples above can serve as a reference.
