Server failure handling

Updated: 2025-03-04. From: SLTechnology News&Howtos (shulou)

Shulou(Shulou.com)06/02 Report--

The public-facing outbound bandwidth of the data center has reached its limit, and the core site is slow to respond or fails to load.

1. Increase the bandwidth.

2. Move to another data center: change the access address of the back-end Web cluster, deploy the Nginx configuration of some medium-traffic sites to the servers in data center B, and then update the DNS records.

There is a system that shows the traffic of all domains in real time. It offers a vertical view (traffic per server, current HTTP concurrency) and a horizontal view (how many domains run on each server, where each domain's requests come from), so you can monitor the domains in use on each Nginx host, total traffic per host, concurrency, traffic per domain, and so on.

Note:

Do not touch the core site; its importance is self-evident.

Do not touch low-traffic sites either: you would have to migrate many of them to free up a meaningful amount of bandwidth, which takes noticeably longer.
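
As a rough illustration of the per-domain traffic view, the following sketch sums bytes sent per domain from an access log. The original does not describe how its monitoring system works, so this is only an assumption-laden stand-in: it assumes a hypothetical log format in which the first field is the requested host and the last field is the number of bytes sent. The function name traffic_per_domain is made up for this example.

```shell
# Sum bytes sent per domain from log lines read on stdin.
# ASSUMPTION: field 1 is the host name and the last field is the byte count
# (a hypothetical log format; adapt the field positions to your own logs).
traffic_per_domain() {
    awk '{ bytes[$1] += $NF } END { for (d in bytes) print bytes[d], d }' |
        sort -rn
}

# Usage (the path is only an example):
#   traffic_per_domain < /var/log/nginx/access.log
```

Sorting by total bytes puts the heaviest domains first, which makes it easier to pick medium-traffic candidates for migration.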

When the system fails

Who is logged in? Avoid having several people debugging the same server at once.

# w

# last

What happened before?

# history

Which processes are running now?

# pstree -a

# ps aux

Listening network services

$ netstat -ntlp

$ netstat -nulp

$ netstat -nxlp

I usually run these three commands separately, because I do not want to look at one list of all services at once.

If you want to show all existing connections, netstat will be slow. Use ss first to get an overview.
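
A minimal sketch of getting that overview with ss: the helper below counts TCP connections per state. It assumes the usual `ss -tan` layout, where the first line is a header and the first column of each row is the connection state. The function name count_tcp_states is hypothetical.

```shell
# Count TCP connections per state from `ss -tan` output read on stdin.
# ASSUMPTION: line 1 is the header and column 1 holds the state.
count_tcp_states() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Usage:
#   ss -tan | count_tcp_states
# For a one-screen summary of socket totals, `ss -s` can be used directly.
```

An unusually large count in one state is a hint about where to look next.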

CPU and memory

$ free -m

$ uptime

$ top

$ htop

Is there any CPU headroom left? How many cores does the server have? Are some cores overloaded?

What is causing the most load on the server? What is the load average?
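
To relate the load average to the number of cores, a small helper can divide one by the other. This is only a sketch; the function name load_per_core is invented here, and the usage line assumes Linux (/proc/loadavg) and GNU coreutils (nproc).

```shell
# Print the load average divided by the number of cores.
# Arguments: $1 = load average (e.g. the 1-minute value), $2 = core count.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Usage on Linux (assumes /proc/loadavg and nproc are available):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

A value well above 1.00 suggests more work is queued than the cores can handle. On Linux the load average also counts tasks waiting on I/O, so a high value does not always mean the CPU is the bottleneck.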

IO performance

$ iostat -kx 2

$ vmstat 2 10

$ mpstat 2 10

$ dstat --top-io --top-bio

The dstat options above show which processes are doing the most I/O.

Check disk usage: is the server's hard drive full?

Is the server swapping (the si/so columns in vmstat)?

What is using the CPU: the system (sy), user processes (us), or is time being stolen by the virtualization layer (st)?
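
The disk usage check can be sketched as a filter over `df -P` output that prints only the file systems at or above a threshold. The function name full_filesystems and the default threshold of 90 percent are assumptions chosen for illustration. Mount points containing spaces are not handled.

```shell
# Print mount points whose usage is at or above a threshold (in percent).
# Reads `df -P` output on stdin; $1 = threshold (default 90, an example value).
# ASSUMPTION: POSIX df columns, i.e. field 5 = capacity, field 6 = mount point.
full_filesystems() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)            # strip the percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Usage:
#   df -P | full_filesystems 90
```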

Application fault

Apache & Nginx: find the access and error logs, look directly for 5xx errors, and then check whether there are limit_zone errors.

MySQL: look for error messages in mysql.log, and check whether any tables are corrupted, whether an InnoDB repair process is running, and whether there are disk/index/query problems.

PHP-FPM: if the php-slow log is enabled, look there for error messages (php, mysql, memcache, …). If it is not enabled, enable it right away.

Varnish: check the hit/miss ratio in varnishlog and varnishstat. Are any rules missing from the configuration, letting end users reach your backend directly?

HAProxy: what is the state of the backends? Are the health checks passing? Has the front-end or back-end queue reached its maximum size?
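
The search for 5xx errors in the Apache or Nginx access log, mentioned at the start of this list, can be sketched as follows. It assumes the default "combined" log format, in which the status code is the ninth whitespace-separated field; a custom log format would need a different field number. The function name count_5xx is hypothetical.

```shell
# Count responses per 5xx status code from access-log lines read on stdin.
# ASSUMPTION: "combined" log format, where the status code is field 9.
count_5xx() {
    awk '$9 ~ /^5[0-9][0-9]$/ { n[$9]++ } END { for (c in n) print c, n[c] }' |
        sort
}

# Usage (the path is only an example):
#   count_5xx < /var/log/nginx/access.log
```

A burst of 502 or 504 responses usually points at the backend rather than the web server itself, which helps decide which of the components above to inspect first.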

Never modify the interface of the server or network device through which you are currently connected.

Always prepare a way to roll back before you make a change.

Automatically backing up network device configurations with a tool lets you deploy a replacement within minutes when a switch fails.

Back up each configuration file before making changes (.bak).

Carefully monitor every aspect of the data center, from room temperature to racks to servers. In addition to server process checks, uptime checks, and the like, use trending and graphing tools to monitor bandwidth usage, temperature, disk partition usage, and other important metrics.
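
The advice to back up each configuration file before changing it can be sketched as a small helper. The function name backup_config is invented for this example, and refusing to overwrite an existing backup is a design choice made here, not something the original prescribes.

```shell
# Copy a configuration file to <file>.bak before editing it.
# Refuses to overwrite an existing backup, so the oldest known-good copy survives.
backup_config() {
    src=$1
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    if [ -e "$src.bak" ]; then
        echo "backup already exists: $src.bak" >&2
        return 1
    fi
    cp -p "$src" "$src.bak"    # -p keeps mode and timestamps
}

# Usage (the path is only an example):
#   backup_config /etc/nginx/nginx.conf && vi /etc/nginx/nginx.conf
```

Rolling back is then a matter of copying the .bak file over the edited one and reloading the service.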
