What are the seven steps in the server maintenance list?

2025-01-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

In this issue, the editor walks through the seven steps in a server maintenance checklist. The article covers each step from a professional point of view, and I hope you find it useful.

To maintain servers effectively, the server administrator must perform proactive hardware and software checks. A server maintenance checklist should include dust removal, log review, software patch testing, and so on.

Even with improved server performance and redundancy, increased workload consolidation and higher reliability expectations can take a toll on server hardware.

The server maintenance checklist should cover physical elements as well as the system's software configuration. Keep in mind that thorough maintenance takes time, labor, and testing. Using a checklist helps server administrators define goals and keeps the IT team on track.

1. Develop maintenance procedures

Server administrators often neglect scheduled maintenance windows. Don't wait for a failure to start maintenance; set aside time for routine preventive maintenance.

The frequency of maintenance depends on the age of the server equipment, the data center environment, and the number of servers that need to be maintained. For example, older servers located in equipment closets need to be checked more often than new servers deployed in well-cooled data centers with high-efficiency particulate air (HEPA) filtration.

The organization can base its routine maintenance plan on the procedures of the vendor or a third-party provider. If the vendor's service contract requires a system check every four or six months, follow that schedule.
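As a minimal sketch of turning a contract interval into concrete dates, the Python below lists the next few maintenance windows after the last check. The function names `add_months` and `maintenance_windows` and the sample dates are illustrative assumptions, not part of any vendor tool.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` months after `start`.

    The day is clamped to the last valid day of the target month,
    so 31 August plus six months becomes 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    days_in_month = [31, 29 if leap else 28, 31, 30, 31, 30,
                     31, 31, 30, 31, 30, 31]
    return date(year, month, min(start.day, days_in_month[month - 1]))


def maintenance_windows(last_check: date, interval_months: int,
                        count: int) -> list[date]:
    """List the next `count` maintenance dates after `last_check`."""
    return [add_months(last_check, interval_months * i)
            for i in range(1, count + 1)]


# Hypothetical example: a contract that requires a check every four months.
for window in maintenance_windows(date(2025, 1, 19), 4, 3):
    print(window.isoformat())
```

A real schedule would also need to account for business calendars and change freezes, which this sketch leaves out.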

2. Prepare for downtime

Before you tackle the items on the server maintenance checklist, you need a plan. This includes checking the system logs for errors or events that require more direct attention. If the system log indicates an error with a specific memory module, order a replacement dual in-line memory module (DIMM) so it can be installed during the maintenance window. Similarly, if firmware, operating system, or agent patches or updates are available, test and review them before the planned maintenance window.

Make a clear plan for taking the system offline and bringing it back into service. Before virtualization, a server and the applications it hosted had to go down for the maintenance window, which forced server administrators to perform maintenance at night or on weekends.

Virtual servers support workload migration rather than downtime, so server administrators can migrate applications to other servers, where they remain available while maintenance is performed on the underlying host system. Before servicing, decide where each virtual machine is going, migrate it to the selected system, and verify that each workload is functioning properly before shutting down the server for maintenance.
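Deciding where each virtual machine goes can be sketched as a simple placement problem. The Python below is an illustrative assumption, not a hypervisor API: it places VMs on target hosts by free memory only, largest first, and refuses to produce a plan if any VM does not fit. The host names, VM names, and sizes are hypothetical.

```python
def plan_migration(vms: dict[str, int], hosts: dict[str, int]) -> dict[str, str]:
    """Map each VM to a target host with enough free memory.

    `vms` maps VM name -> required memory in GB.
    `hosts` maps host name -> free memory in GB.
    VMs are placed largest first on the first host that can hold them.
    """
    free = dict(hosts)  # copy so the caller's data is not modified
    plan = {}
    for vm, need in sorted(vms.items(), key=lambda item: item[1], reverse=True):
        target = next((h for h, cap in free.items() if cap >= need), None)
        if target is None:
            raise RuntimeError(f"no host has {need} GB free for {vm}")
        free[target] -= need
        plan[vm] = target
    return plan


# Hypothetical workloads on the host that is about to be serviced.
plan = plan_migration(
    {"web": 16, "db": 64, "cache": 32},
    {"host-b": 64, "host-c": 48},
)
for vm, host in plan.items():
    print(f"{vm} -> {host}")
```

Real placement also depends on CPU, storage, network, and affinity rules, so a plan like this is only a starting point to verify before the host is shut down.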

At this point, the server administrator can shut down the server and remove it from the rack.

3. Check the airflow path

Once the server is powered down, visually inspect its external and internal airflow paths. Remove all dust and debris that may impede cooling air.

Start with the external air inlets and outlets, then open the system chassis and inspect the CPU heat sink and fan assembly, the memory, and all cooling fan blades and ducts. Once the server is out of the rack, clean it in an antistatic work area, using clean, dry compressed air to blow out dust and debris.

Dust removal is not a new process, but it is still necessary. Dust is a thermal insulator, and removing it matters even more now that alternative cooling schemes and recommendations from the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) have raised data center operating temperatures. Dust and other airflow obstructions can cause the server to consume more energy and may even cause component failure.

4. Check the local hard drive

The server relies on internal hard drives for booting, workload startup and storage, and user data. Disk media problems hurt workload performance and stability and can cause the hard drive to fail prematurely. Use a tool such as the Check Disk utility to verify the integrity of the hard drive and attempt to recover any bad sectors.

Hard drives with magnetic media are not perfect. Common problems include bad sectors and fragmentation. RAID goes a long way toward preserving data integrity after a storage fault, but smaller 1U rack servers may not provide enough physical space for a disk array.

File fragmentation on NT file system (NTFS) and file allocation table (FAT) hard drives has not gone away, because these file systems write data to the first available clusters. Fragmentation slows down the server's hard drive and can contribute to failure. In Windows Server 2016, the Optimize-Volume utility handles defragmentation and storage tier processing.
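To see why first-available-cluster allocation leads to fragmentation, the toy model below treats the disk as a list of clusters. It is a simplified illustration in Python under that assumption and does not reflect how NTFS or FAT are actually implemented; the function names and file names are hypothetical.

```python
def allocate(disk: list, name: str, clusters: int) -> None:
    """Write `clusters` clusters of file `name` into the first free slots."""
    for i, owner in enumerate(disk):
        if clusters == 0:
            return
        if owner is None:
            disk[i] = name
            clusters -= 1
    if clusters:
        raise RuntimeError("disk full")


def delete(disk: list, name: str) -> None:
    """Free every cluster owned by file `name`."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def fragments(disk: list, name: str) -> int:
    """Count the contiguous runs of clusters that belong to `name`."""
    runs = 0
    previous = None
    for owner in disk:
        if owner == name and previous != name:
            runs += 1
        previous = owner
    return runs


disk = [None] * 10          # ten empty clusters
allocate(disk, "A", 3)
allocate(disk, "B", 3)
allocate(disk, "C", 2)
delete(disk, "A")           # leaves a three-cluster hole at the start
allocate(disk, "D", 5)      # fills the hole, then spills to the end
print(disk)
print("fragments of D:", fragments(disk, "D"))
```

File D ends up split across the hole left by A and the free space at the end of the disk, which is the pattern a defragmentation pass is meant to undo.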

5. Verify log data and events

The server records a large amount of event information in its logs. No server maintenance checklist is complete without a careful review of the system, malware, and other event logs. Of course, critical system issues should be brought to the attention of managers and technicians immediately, but numerous minor problems can herald long-term trouble.

When checking the logs, the administrator should also check the reporting settings and verify that the correct alerts go to the correct recipients. For example, if a technician leaves the server group, you need to update the server's reporting system.

Check the contact information carefully as well. If an error occurs outside working hours, sending the alert to a technician's work email address will do little good.
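Checking alert recipients against the current team can be sketched as a simple comparison. The Python below is a minimal illustration; the alert names, the addresses, and the `stale_recipients` helper are hypothetical and not tied to any particular monitoring product.

```python
def stale_recipients(alert_recipients: dict[str, list[str]],
                     current_staff: set[str]) -> dict[str, list[str]]:
    """Return, per alert, the recipients who are no longer on the team."""
    stale = {}
    for alert, recipients in alert_recipients.items():
        missing = [r for r in recipients if r not in current_staff]
        if missing:
            stale[alert] = missing
    return stale


# Hypothetical reporting configuration and team roster.
alerts = {
    "memory-error": ["ana@example.com", "raj@example.com"],
    "disk-failure": ["raj@example.com"],
}
team = {"ana@example.com", "li@example.com"}

for alert, missing in stale_recipients(alerts, team).items():
    print(f"{alert}: remove or replace {', '.join(missing)}")
```

A roster check like this catches departed technicians, but it cannot tell whether an address is actually monitored outside working hours; that still needs a manual review.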

When a log review turns up a long-term or recurring problem, a proactive investigation can resolve it before it escalates. A recoverable memory error in the server's log will not trigger a critical alarm. However, if such errors recur and point to a problem with a specific module, the administrator can perform a more detailed analysis to identify an imminent failure.

If the problem is not serious enough to keep the server offline, the administrator can return the server to production until replacement hardware arrives.
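One way to spot a recurring recoverable error is to count events per memory module and flag any module that crosses a threshold. The sketch below assumes the log has already been parsed into `(module, severity)` pairs; the slot names, the `"correctable"` label, and the threshold of three are illustrative assumptions rather than values from any real log format.

```python
from collections import Counter


def recurring_modules(events: list[tuple[str, str]],
                      threshold: int = 3) -> list[str]:
    """Return modules with at least `threshold` correctable errors."""
    counts = Counter(module for module, severity in events
                     if severity == "correctable")
    return sorted(module for module, n in counts.items() if n >= threshold)


# Hypothetical parsed log events: (module, severity).
events = [
    ("DIMM_A1", "correctable"),
    ("DIMM_B2", "correctable"),
    ("DIMM_A1", "correctable"),
    ("DIMM_A1", "correctable"),
    ("DIMM_C1", "uncorrectable"),
]

print(recurring_modules(events))
```

None of the individual correctable events would raise a critical alarm, yet the count singles out one module for closer analysis.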

6. Test patches and updates

The server's software stack (BIOS, operating system, hypervisor, drivers, and applications) must work together. Unfortunately, software code is rarely problem-free, so pieces of this puzzle are frequently patched or updated to fix bugs, improve security, simplify interoperability, and improve performance.

No software should update automatically. The administrator should determine whether a patch or upgrade is required, and then thoroughly evaluate and test the change.

Software developers cannot test every possible combination of hardware and software, so administrators need to choose patches and updates wisely to avoid performance problems or workflow disruptions. For example, a monitoring agent patch can cause problems for important workloads if the new agent consumes more bandwidth than expected.

The shift to DevOps brings smaller, more frequent updates, which increases the likelihood of problems. The organization should still test any patch or update in a sandbox or test setup before deploying it, and should always retain the ability to restore the original software configuration.

7. Record all system changes

During the maintenance window, many things can change on the server, including hardware, software, and system configuration. After completing the server maintenance checklist, the administrator should review those changes carefully and record the new state of the system. For example, changing the network adapter, adding or replacing memory, or updating the operating system changes the configuration of the system.

Organizations that rely on system configuration management tools may need to update or discover any changes and record them in the configuration management database before allowing the system to be put back into use. The server administrator must also update any mandatory or desired state configuration so that the changes are allowed.

Also verify the security posture of the system, such as firewall settings, anti-malware versions and scanning frequency, and intrusion detection settings. The security check ensures that changes to the system software do not inadvertently reopen attack surfaces that were closed in the previous configuration.
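Recording what changed can start with a before-and-after comparison of the system's configuration. The Python below is a minimal sketch that assumes each configuration snapshot is a flat dictionary; the keys and values are hypothetical, and a real configuration management tool would capture far more detail.

```python
def config_changes(before: dict, after: dict) -> dict[str, tuple]:
    """Return {setting: (old, new)} for every setting that differs.

    A setting missing from one snapshot is reported with None on that side.
    """
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots taken before and after the maintenance window.
before = {"nic": "1GbE", "memory_gb": 64, "os_build": "build-A"}
after = {"nic": "10GbE", "memory_gb": 128, "os_build": "build-A"}

for setting, (old, new) in config_changes(before, after).items():
    print(f"{setting}: {old} -> {new}")
```

The resulting list of differences is the kind of record that can be entered into the configuration management database before the server returns to service.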

Don't forget to update any system backup or disaster recovery (DR) content after the server comes back online.

Verify that the server's backup and disaster recovery frequency remains the same, unless the relevant settings must be adjusted to reflect a new use case for the server.

Those are the seven steps in the server maintenance checklist. If you have had similar questions, the analysis above should help, and you are welcome to follow the industry information channel to learn more.

© 2024 shulou.com SLNews company. All rights reserved.
