Common problems and solutions in Linux operation and maintenance

2026-02-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

Many people who are new to Linux operation and maintenance do not know what to do when common problems come up. This article summarizes the causes of these problems and their solutions, and I hope it helps you solve them.

As Linux operators, we all run into problems and failures of one kind or another. Writing down what happened, then summarizing and analyzing the cause of each fault, is a good habit for a Linux operation and maintenance engineer. Every technical breakthrough brings both tedium and satisfaction, but the experience accumulated by persisting is the reward that practice gives us.

The following is a summary of failures I ran into in my projects and how I solved them. I hope they resonate with you and help you.

FAQ highlights

1. Shell script fails to execute

(1) Problem:

One day, a colleague in R&D asked me to take a look at a shell script he had written. The script was very simple and had no obvious errors, yet running it reported a ": bad interpreter: No such file or directory" error.

Seeing this error, I asked him whether he had written the script under Windows and uploaded it to the Linux server. Sure enough, he had.

(2) Cause:

In DOS/Windows, a line in a text file ends with \r\n, while on *nix systems it ends with \n. A text file edited in DOS/Windows therefore shows up on *nix with an extra ^M (carriage return) at the end of every line, and the system looks for an interpreter whose name ends in that character.

(3) Solution:

Rewrite the script under Linux, or

remove the carriage returns in vi with :%s/\r//g or :%s/^M//g (type ^M as Ctrl+v, Ctrl+m).

Attached: sh -x script_file_name traces the script, echoing each command as it is executed, which helps when troubleshooting complex scripts.
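The carriage returns can also be removed outside an editor. The following is a minimal sketch, assuming a throwaway temporary directory and a made-up two-line script. strip_cr is a hypothetical helper name, not a standard command, and it removes every carriage return in the file, not only those at line ends.

```shell
#!/bin/sh
# strip_cr is a hypothetical helper: it prints FILE with every carriage return removed.
strip_cr() {
    tr -d '\r' < "$1"
}

workdir=$(mktemp -d)

# Simulate a script saved on Windows: every line ends in \r\n.
printf '#!/bin/sh\r\necho hello\r\n' > "$workdir/demo_dos.sh"

strip_cr "$workdir/demo_dos.sh" > "$workdir/demo_unix.sh"

# Count the carriage returns left in the converted copy, then run it.
cr_left=$(tr -cd '\r' < "$workdir/demo_unix.sh" | wc -c)
result=$(sh "$workdir/demo_unix.sh")
echo "carriage returns left: $((cr_left)), output: $result"
```

Writing to a second file and checking it before replacing the original keeps the uploaded copy intact in case something else in the script is wrong.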

2. Controlling crontab output

(1) Problem:

The /var/spool/clientmqueue directory occupies more than 100 GB of space.

(2) Cause:

A program run from cron produces output, and cron mails that output to the owner of the crontab. Because sendmail is not running, the messages pile up as files in /var/spool/clientmqueue, which can fill the disk over time.

(3) Solution:

Delete the files manually, from inside /var/spool/clientmqueue: ls | xargs rm -f

Permanent fix: append > /dev/null 2>&1 to the command in each cron entry.
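The order of the two redirections matters. The sketch below uses noisy_job, a hypothetical stand-in for a cron command, to show that > /dev/null 2>&1 silences both streams while the reversed order still lets error output through. The crontab line in the closing comment uses a made-up script path.

```shell
#!/bin/sh
# noisy_job is a hypothetical stand-in for a cron command that writes to both streams.
noisy_job() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Correct order: stdout goes to /dev/null first, then stderr follows it there.
silenced=$(noisy_job > /dev/null 2>&1)

# Reversed order: stderr is pointed at the old stdout before stdout is discarded,
# so the error text still escapes (and cron would still try to mail it).
leaky=$(noisy_job 2>&1 > /dev/null)

echo "silenced='$silenced' leaky='$leaky'"

# The matching crontab entry would look like this (the script path is made up):
# 30 2 * * * /path/to/cleanup.sh > /dev/null 2>&1
```

Discarding all output also hides real failures, so redirecting to a log file instead of /dev/null may be the better choice for jobs that matter.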

3. Telnet is slow / ssh is slow

(1) Problem:

One day, a colleague in R&D reported errors when accessing the memcached service on 10.52 from 10.50 and asked us to check the network, the service, and the system. The system and the service were normal, and ping from 10.50 to 10.52 was normal too, but telnet from 10.50 to 10.52 was very slow. We also found that the nameserver configured on the machine was not working.

(2) Cause:

Because your PC doesn't do a reverse DNS lookup on your IP then...

When you telnet/ftp into your linux box, it'll do a dns lookup on you.

(3) Solution:

Modify /etc/hosts so that the hostname and IP correspond.

Comment out the nameserver in /etc/resolv.conf, or point it at a "live" nameserver.
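Commenting out the nameserver entries can be scripted. The following is a minimal sketch that works on a copy in a temporary directory, not on the real /etc/resolv.conf. The file contents are made up, and the 192.0.2.x addresses come from a range reserved for documentation.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical resolv.conf contents; 192.0.2.x is a documentation-only address range.
cat > "$workdir/resolv.conf" <<'EOF'
search example.com
nameserver 192.0.2.53
nameserver 192.0.2.54
EOF

# Prefix every active nameserver line with '#', leaving other lines alone.
sed 's/^nameserver/#&/' "$workdir/resolv.conf" > "$workdir/resolv.conf.new"

# grep -c prints 0 but exits non-zero when nothing matches, hence the "|| true".
active=$(grep -c '^nameserver' "$workdir/resolv.conf.new" || true)
commented=$(grep -c '^#nameserver' "$workdir/resolv.conf.new" || true)

cat "$workdir/resolv.conf.new"
echo "active nameservers left: $active, commented out: $commented"
```

On a real machine the edited copy would be reviewed and then moved into place as root.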

4. Read-only file system

(1) Problem:

A colleague tried to create a table in MySQL and got the following error:

mysql> create table wosontest (colddname1 char(1));
ERROR 1005 (HY000): Can't create table 'wosontest' (errno: 30)

The mysql user's privileges and the permissions on the related directories checked out fine. perror 30 explains the error number: OS error code 30: Read-only file system

(2) Possible causes:

File system corruption

A failing disk

Incorrect configuration in fstab, such as the wrong file system type (writing ntfs as fat) or a misspelled option

(3) Solution:

Since it was a test machine, we rebooted it and it recovered.

According to reports on the Internet, the problem can also be solved with mount.
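Before rebooting, it helps to confirm which file system is actually mounted read-only. The following is a minimal sketch, assuming the /proc/mounts format found on Linux. list_ro_mounts is a hypothetical helper, and the sample entries are made up so the example runs anywhere.

```shell
#!/bin/sh
# list_ro_mounts is a hypothetical helper: given a file in /proc/mounts format,
# it prints the mount point of every entry whose option list contains "ro".
list_ro_mounts() {
    awk '$4 ~ /(^|,)ro(,|$)/ { print $2 }' "$1"
}

workdir=$(mktemp -d)

# Made-up sample in /proc/mounts format: device, mount point, type, options, dump, pass.
cat > "$workdir/mounts" <<'EOF'
/dev/sda1 / ext3 rw,relatime 0 0
/dev/sdb1 /data ext3 ro,relatime,errors=remount-ro 0 0
EOF

ro_mounts=$(list_ro_mounts "$workdir/mounts")
echo "read-only mounts: $ro_mounts"

# On a live Linux system: list_ro_mounts /proc/mounts
# A read-write remount would then be attempted as root with, for example:
#   mount -o remount,rw /data
```

If the kernel remounted the file system read-only because of errors, a remount alone may not last. Checking the disk and running a file system check would still be needed in that case.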

5. File deleted but disk space not released

(1) Problem:

One day, df -h showed 90 GB of disk space in use on a machine, while du -sh /* showed the used space adding up to only 30 GB.

(2) Cause:

Someone probably used rm to delete a file that was still being written. The file name is gone, but the process keeps the file open, so the disk space is not released.

(3) Solution:

The easiest fix is to restart the system or the related service.

Alternatively, find the process that holds the file and kill it:

/usr/sbin/lsof | grep deleted
ora 25575 data 33u REG 65,65 4294983680 /oradata/DAT

From the lsof output, we can see that the process with pid 25575 holds the file /oradata/DATAPRE/UNDOTBS009.dbf open with file descriptor (fd) 33.

Once the file is found, the space can be freed by ending the process, or by truncating the file through its descriptor:

echo > /proc/25575/fd/33

To clear a file that is being written, cat /dev/null > file is commonly used instead of deleting it.
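Turning lsof output into /proc paths can be automated. The following is a minimal sketch, assuming the default lsof column order (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) and numeric descriptors only. fd_paths_for_deleted is a hypothetical helper, and the sample lines are made up.

```shell
#!/bin/sh
# fd_paths_for_deleted is a hypothetical helper: it reads lsof-style lines and,
# for each one marked "(deleted)" with a numeric descriptor, prints /proc/PID/fd/FD.
fd_paths_for_deleted() {
    awk '/\(deleted\)/ && $4 ~ /^[0-9]/ {
        fd = $4
        sub(/[^0-9]+$/, "", fd)   # drop the access-mode suffix, e.g. "7w" -> "7"
        print "/proc/" $2 "/fd/" fd
    }'
}

# Made-up sample lines in the default lsof column order:
# COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sample='app 1234 appuser 7w REG 8,1 1048576 131 /var/log/app.log (deleted)
app 1234 appuser 3r REG 8,1 2048 140 /etc/app.conf'

targets=$(printf '%s\n' "$sample" | fd_paths_for_deleted)
echo "$targets"

# On a live system one might run:  lsof | fd_paths_for_deleted
# and then empty a file in place with:  : > /proc/1234/fd/7
```

Truncating through /proc suits log files. For a database data file such as the one in this example, stopping the service cleanly is the safer route.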

6. Improving the performance of find

(1) Problem:

The /tmp directory holds a large number of temporary files named picture_*, and files older than one day are cleaned up at 2:30 every night. I used to run the following script from crontab, but it was inefficient: the load soared every time it ran and affected other services.

#!/bin/sh
find /tmp -name "picture_*" -mtime +1 -exec rm -f {} \;

(2) Cause:

The directory contains a very large number of files, and running find over it consumes a lot of resources.

(3) Solution:

#!/bin/sh
cd /tmp
time=`date -d "2 day ago" +"%b %d"`
ls -l | grep "picture" | grep "$time" | awk '{date}' | xa |
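A different approach from the ls and grep pipeline above is to keep find but let it batch the deletions. The sketch below assumes that part of the load came from starting one rm process per file, which is what -exec rm -f {} \; does. The article does not confirm that this was the bottleneck. It runs against a temporary directory with made-up file names.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Two stale files and one fresh file; touch -t sets an old modification time.
touch -t 200001010000 "$workdir/picture_old1" "$workdir/picture_old2"
touch "$workdir/picture_new"

# "-exec ... {} +" passes many file names to each rm instead of starting one rm per file.
find "$workdir" -maxdepth 1 -type f -name 'picture_*' -mtime +1 -exec rm -f {} +

remaining=$(ls "$workdir")
echo "remaining: $remaining"
```

The -maxdepth 1 test keeps find from descending into subdirectories, which also limits how much of the tree it has to scan.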

7. Unable to get the gateway MAC address

(1) Problem:

The network from 2.14 to 3.65 (mapped address 2.141) is unreachable, while other machines on the 3 segment can reach 3.65 without problems.

(2) Cause:

# arp
Address HWtype HWaddress Flags Mask Iface
192.168.3.254 ether (incomplete) CM bond0

On the surface, the machine cannot automatically obtain the gateway's MAC address. The network engineer said the problem was in the network equipment, but the details are unclear.

(3) Solution:

Bind the ARP entry statically: arp -i bond0 -s 192.168.3.254 00:00:5e:00:01:64
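Unresolved entries can be picked out of the ARP table automatically. The following is a minimal sketch, assuming arp output that marks them with the word "incomplete". incomplete_arp_entries is a hypothetical helper, and the sample table and its second MAC address are made up.

```shell
#!/bin/sh
# incomplete_arp_entries is a hypothetical helper: it reads arp-style output
# and prints the address of every entry that has no resolved hardware address.
incomplete_arp_entries() {
    awk '/incomplete/ { print $1 }'
}

# Made-up sample table; 00:00:5e:00:53:01 is a documentation-only MAC address.
sample='Address HWtype HWaddress Flags Mask Iface
192.168.3.254 (incomplete) bond0
192.168.3.10 ether 00:00:5e:00:53:01 C bond0'

unresolved=$(printf '%s\n' "$sample" | incomplete_arp_entries)
echo "unresolved: $unresolved"

# On a live system one might run:  arp -n | incomplete_arp_entries
```

A static binding works around the symptom. If the gateway's MAC address ever changes, the bound entry has to be updated by hand.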

8. An example where the http service cannot be started

(1) Problem:

One day, a colleague in R&D said that httpd in the website's front-end environment would not start, so I logged in to have a look. It reported the following error:

/etc/init.d/httpd start
Starting httpd: [Sat Jan 29 17:49:00 2011] [warn] module antibot_module is already loaded, skipping
Use proxy forward as remote ip: true.
Antibot exclude pattern: .*.[(js|css|jpg|gif|png)]
Antibot seed check pattern: login
(98)Address already in use: make_sock: could not bind to address [::]:7080
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:7080
no listening sockets available, shutting down
Unable to open logs
[FAILED]

(2) Cause:

Port in use: on the surface, port 7080 appears to be occupied, but netstat -npl | grep 7080 shows that nothing is listening on 7080.

The port is declared twice in the configuration: Listen 7080 is written in both of the following files:

/etc/httpd/conf/http.conf
/etc/httpd/conf.d/t.10086.cn.conf

(3) Solution:

Comment out Listen 7080 in /etc/httpd/conf.d/t.10086.cn.conf and restart. OK.
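A duplicated Listen directive is easy to miss when the configuration is spread over several files. The following is a minimal sketch that searches a configuration tree for ports declared more than once. It runs against a made-up miniature tree in a temporary directory, with invented file names.

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir -p "$workdir/conf" "$workdir/conf.d"

# Made-up miniature configuration tree with the same port declared twice.
printf 'ServerRoot "/etc/httpd"\nListen 7080\n' > "$workdir/conf/main.conf"
printf 'Listen 7080\n' > "$workdir/conf.d/site.conf"

# Collect every Listen directive, keep its argument, and print arguments seen more than once.
duplicates=$(grep -rhE '^[[:space:]]*Listen[[:space:]]+' "$workdir" \
    | awk '{ print $2 }' | sort | uniq -d)
echo "declared more than once: $duplicates"

# To see which files contain the directive, one could run:
#   grep -rnE '^[[:space:]]*Listen[[:space:]]+' /etc/httpd
```

The pattern only matches active directives, so lines that are already commented out are ignored.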

9. Too many open files

(1) Problem:

The error "Too many open files" is reported.

(2) Solution:

* Raise the limits permanently:

echo "" >> /etc/security/limits.conf
echo "* soft nproc 65535" >> /etc/security/limits.conf
echo "* hard nproc 65535" >> /etc/security/limits.conf
echo "* soft nofile 65535" >> /etc/security/limits.conf
echo "* hard nofile 65535" >> /etc/security/limits.conf
echo "" >> /root/.bash_profile
echo "ulimit -n 65535" >> /root/.bash_profile
echo "ulimit -u 65535" >> /root/.bash_profile

* Restart the machine, or execute:

ulimit -u 65535 && ulimit -n 65535
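Limits set with ulimit apply to the current shell and the processes it starts, so a service that is already running keeps its old limits until it is restarted. The sketch below shows how to read the current values and shows that a change made in a subshell does not reach the parent. The value 64 is arbitrary and is assumed to be below the hard limit.

```shell
#!/bin/sh
# Show the current soft and hard limits on open files for this shell.
echo "soft nofile: $(ulimit -S -n)"
echo "hard nofile: $(ulimit -H -n)"

# Limits are per process and inherited by children. Command substitution runs in a
# subshell, so lowering the soft limit there (64 is an arbitrary demo value)
# leaves the parent shell untouched.
before=$(ulimit -S -n)
lowered=$(ulimit -S -n 64 && ulimit -S -n)
after=$(ulimit -S -n)
echo "in subshell: $lowered, parent before: $before, after: $after"
```

On Linux, the limits of a running process can be read from /proc/PID/limits to confirm that a restart picked up the new values.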

10. Disk space problems caused by ibdata1 and mysql-bin

(1) Problem:

A disk space alarm fired on 2.51. Checking showed that ibdata1 and the mysql-bin logs were taking up too much space (ibdata1 exceeded 120 GB and mysql-bin exceeded 80 GB).

(2) Cause:

ibdata1 is the InnoDB shared tablespace file. For InnoDB tables, ibdata1 stores the tables' data and indexes, while the table files in the directory named after the database hold only the table structure.

The InnoDB storage engine has two ways to manage tablespaces:

Shared tablespaces (which can be split into multiple smaller tablespace files). This is the method most of our databases use at present.

Independent tablespaces, where each table has its own tablespace (disk file).

Each method has its own advantages and disadvantages:

① Shared tablespaces:

Advantages: the tablespace can be divided into multiple files stored on different disks (the tablespace file size is not limited by table size, and a single table can be spread over different files).

Disadvantages:

All data and indexes are stored together, so the file grows large as the data increases. Although one large file can be divided into several smaller ones, multiple tables and indexes are still mixed together in the tablespace, so after a large number of deletions on a table the tablespace is left with many gaps.

With shared tablespace management, space that has been allocated to the tablespace cannot be reclaimed. When the tablespace is expanded by building a temporary index or creating a temporary table, that space cannot be shrunk again even by dropping the related table.

② Independent tablespaces:

In the configuration file (my.cnf), set:

innodb_file_per_table

Features: each table has its own independent tablespace, and the data and indexes of each table are stored in that tablespace.

Advantages: the disk space belonging to a tablespace can be reclaimed. DROP TABLE reclaims the tablespace automatically, and for a table from which a large amount of data has been deleted, ALTER TABLE tbl_name ENGINE=InnoDB; reclaims the unused space.

Disadvantages:

If a single table grows too large, for example beyond 100 GB, performance is also affected. With shared tablespaces the data can be split across files in this case, but that has its own problem: if a query touches a wide range of data, it has to access multiple files, which is also slow.

With independent tablespaces, partitioned tables can alleviate the problem to some extent. In addition, when independent tablespace mode is enabled, the innodb_open_files parameter needs to be adjusted appropriately.

(3) Solution:

① ibdata1 is too large: the only way is to dump the data, exporting the SQL statements that build the database, and then rebuild the database from the dump.

② The mysql-bin logs are too large:

Delete manually:

Delete the logs before a given log file:

mysql> PURGE MASTER LOGS TO 'mysql-bin.010';

Delete the logs from before a given time:

mysql> PURGE MASTER LOGS BEFORE '2010-12-22 13:00:00';

Or set the binary logs to be kept for only N days in /etc/my.cnf:

expire_logs_days=30    # number of days after which binary logs are deleted automatically
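Before purging, it can be useful to measure how much space the binary logs actually hold. The following is a minimal sketch that adds up the sizes reported by ls -l without reading the files. The directory and file contents are made-up stand-ins, and the real location of the logs depends on the MySQL data directory setting.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Made-up stand-ins for binary logs; real ones live in the MySQL data directory.
printf '12345' > "$workdir/mysql-bin.000001"
printf '1234567' > "$workdir/mysql-bin.000002"
printf 'index' > "$workdir/mysql-bin.index"   # the index file is not a log and is skipped

# Sum the size column of ls -l for the numbered log files only.
total_bytes=$(ls -l "$workdir"/mysql-bin.[0-9]* | awk '{ sum += $5 } END { print sum + 0 }')
log_count=$(ls "$workdir"/mysql-bin.[0-9]* | wc -l)
echo "binary logs: $((log_count)), total bytes: $total_bytes"
```

The log files themselves should still be removed with PURGE and not with rm, so that the index file stays consistent with what is on disk.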

Summary of troubleshooting

That covers the common problems and solutions of Linux operation and maintenance summarized in this article. Thank you for reading.
