2025-01-16 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/01 Report --
It is reported that an engineer surnamed Deng at the Shunfeng Technology data center mistakenly deleted a production database, causing an outage that lasted 590 minutes. Afterward, Shunfeng dismissed Deng and criticized him across the whole company: a real-life performance of "delete the database and run away."
Naturally, this got our blood up. We straightened our collars, puffed out our chests, pulled up a small bench to assert our presence, and now, under the Mid-Autumn moon, let's talk about it.
Back then, a certain Chinese institution was upgrading its equipment; during a hot migration of a production storage volume, a script error rm'd the data. And then we XXX and restored all of it (tens of thousands of words are omitted here, along with dozens of self-congratulatory flourishes). Unfortunately, we must keep the details confidential for our client.
Back in those days, that institution, because of that, and then... oh, forget it, I can't say. It's an old legend anyway.
Since I can't tell the stories, let's talk from a purely technical angle: from deletion, to recovery, to never needing to run away.
I certainly won't talk about using paid or open-source data-recovery software; I couldn't bear the embarrassment.
Windows is out of scope, so we won't discuss it.
We'll consider only Oracle, DB2, MySQL, and Hadoop deleted from Unix or Linux, taking rm -rf as the example.
A database can be stored in several ways, such as ordinary files or raw devices. In most cases the system manages database data files as files (everything is a file). Physically, a database can simply be understood as one or more files; deleting the database means deleting one or more files.
Files live in a file system. Unix and Linux support many file systems, all of which expose the same VFS file-access interface, so users can work with any of them transparently (with small differences, but generally following standards such as POSIX). Under the hood, however, file systems differ greatly in kernel design, so rm -rf behaves differently at the low level on each, which in turn changes how possible and how hard recovery is after rm -rf. Simply put, recovery after deletion is not something any file-system specification promises; the designers never considered it at all.
The general approach to restoring a deleted database at the file-system level is as follows:
Figure 1: approaches to restoring a deleted database
Method A above:
This means recovery with files as the object: restoring files that were deleted from (or lost by) the file system, without regard to their contents, by analyzing only the file system's metadata. Metadata is the file system's management information; it generally does not live inside user files and can only be obtained and analyzed from the raw block-level binary stream.
In a file system, addressing a file follows roughly the process below; within the scope of this article, almost every file system works this way.
Figure 2: file addressing chain
The "node" holds the summary information of a file (or directory) plus a pointer to the next layer of data units; it may point to one or more places, possibly carrying some extra information, but it always indicates the "block index".
A "block index" refers to pointer information that points to a real data area.
The "block" is the data itself.
Unless TRIM is enabled to tell the drive to erase data on media such as SSDs, deletion does not wipe the data area, for efficiency's sake; the blocks are merely marked reusable. This is the principle behind file recovery.
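The addressing chain and the mark-reusable behavior can be sketched as a toy model. Everything here (names, block numbers, structures) is invented for illustration; no real file system lays its data out this way on disk:

```python
# Toy model of the chain: node -> block index -> data blocks.
DISK = {}          # block number -> bytes (the raw "data area")
FREE_MAP = set()   # block numbers marked reusable after deletion

class Node:
    """Summary info for one file, plus its 'block index' pointers."""
    def __init__(self, name, index_blocks):
        self.name = name
        self.index_blocks = index_blocks

def write_file(name, chunks, start_block):
    blocks = []
    for i, chunk in enumerate(chunks):
        DISK[start_block + i] = chunk
        blocks.append(start_block + i)
    return Node(name, blocks)

def delete_file(node):
    # Deletion only marks the blocks reusable; the bytes stay on "disk".
    for b in node.index_blocks:
        FREE_MAP.add(b)

def read_raw(blocks):
    # Recovery tools read the data area directly and ignore the free map.
    return b"".join(DISK[b] for b in blocks)

node = write_file("data.file1", [b"hello ", b"world"], start_block=100)
delete_file(node)
```

After delete_file, the blocks are in FREE_MAP, yet read_raw still returns the full content: exactly the window a recovery tool exploits.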
Point 1: the first method likely to be especially useful for recovering a deleted file is lsof.
After a file is deleted with rm -rf on Linux, its node is only removed from the directory tree; the file's content still sits in the background waiting to be reclaimed. As long as some process still holds the file open, you have a window in which to copy it back out via that process number:
# lsof | grep data.file1
# cp /proc/xxx/xxx/xx? /dir/data.file1
This method is a bit outside my specialty; see Google for the details.
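To see why the lsof trick works, here is a minimal, runnable sketch (Linux only) that "rm"s a file while a descriptor is still open and then copies the content back out through /proc. The paths are illustrative:

```python
import os

# Create a file and keep a descriptor open, as some running process would.
path = "/tmp/data.file1"
with open(path, "w") as f:
    f.write("payload that rm did not really destroy")

fd = os.open(path, os.O_RDONLY)  # a process still holds the file open
os.unlink(path)                  # the "rm": the directory entry is gone...
assert not os.path.exists(path)

# ...but the inode and data survive until the last descriptor closes,
# and remain reachable via /proc/<pid>/fd/<fd>.
with open(f"/proc/{os.getpid()}/fd/{fd}", "rb") as still_open:
    recovered = still_open.read()
os.close(fd)
```

The moment the last descriptor closes, this window slams shut, which is why lsof must be tried before anything else.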
If lsof turns up nothing, you can consider recovery from the file-system perspective.
By recovery method, file systems fall roughly into three classes:
Class I:
UFS (used by Solaris, BSD, etc.), Ext2/3/4 (the most common Linux file systems), JFS (first used on AIX), OCFS1/2, HTFS (SCO).
This class of file system uses fixed-length nodes in a fixed node area. Every file (or directory) has a unique entry in the node table, which serves as the starting point for addressing the file. On deletion, these file systems generally zero the node: node numbers and locations are physically fixed, so they must be reused after a delete, and operations on the node area are concentrated and frequent, which makes zeroing on delete the natural design. Once the node is zeroed, the link from node to block index is broken, and the original one-to-one mapping becomes an N-to-N matching problem, where N is the total number of files.
Therefore, when recovering deleted files on this class of file system, it is often hard to match content back to names and directories: think hospital PACS, OA, mail systems, voice databases, geological survey data, multimedia material, and database files. (Aside: we at the North Asia Data Recovery Center, www.frombyte.com, make our living on exactly this kind of work that nobody else will take.)
A particular question: why can some open-source recovery tools on Linux, such as ext3grep, recover deleted files on Ext3/4?
Because Ext3/4 supports journaling: the system creates a journal file (invisible to users) at format time, typically 32 MB to 128 MB in size. When a file is deleted, its node is first copied into the journal and only then cleared, so that if the operation is interrupted the file system can roll back to the last clean, stable state.
But the drawback is just as obvious: the journal keeps wrapping around, so if too much time has passed, or the file system has been busy, recovery this way is not so easy. Typically, when a large number of files have been deleted, only some of them can be recovered by this method.
Of course, you may offer me another plan. Point 2: after a deletion, if lsof can't help, then if at all possible use dd to image the file system and archive it as soon as possible. Don't keep running ls and find hoping for a miracle; that only makes things worse.
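In practice the imaging is done with dd straight off the block device (e.g. dd if=/dev/sdX of=backup.img bs=1M conv=noerror,sync). The sketch below reproduces the chunked-copy-plus-checksum idea in Python, with a temp file standing in for the device so it runs anywhere; the paths are made up:

```python
import hashlib

CHUNK = 1024 * 1024  # copy the "device" in 1 MB chunks, like dd bs=1M

def image_device(device_path, image_path):
    """Copy the raw bytes chunk by chunk; return a checksum of the image."""
    h = hashlib.sha256()
    with open(device_path, "rb") as dev, open(image_path, "wb") as img:
        while True:
            chunk = dev.read(CHUNK)
            if not chunk:
                break
            img.write(chunk)
            h.update(chunk)
    return h.hexdigest()  # lets you verify the archive copy later

# A temp file stands in for /dev/sdX so the sketch is runnable anywhere.
with open("/tmp/fake_device", "wb") as f:
    f.write(b"\x00" * 4096 + b"surviving filesystem bytes" + b"\xff" * 4096)

digest = image_device("/tmp/fake_device", "/tmp/fake_device.img")
```

The point of the checksum is that every later experiment runs against the image, never the original media, and you can always prove the image is still faithful.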
Class II:
XFS, ReiserFS, JFS2 (AIX), ZFS, NetApp WAFL, EMC Isilon, StorNext, NTFS.
In this class of file system, the node area is variable, and node data is often not erased on deletion.
The likely reasons:
1. With a variable node area, nodes can be made larger, and zeroing them would waste performance.
2. With a variable node area, cache hit rates may not be great; clearing a node then only requires updating the bitmap.
3. With a variable node area, allocation can be redirected to a new region, so the original region need not be touched at all.
If the node is not cleared, the chain of "node -> block index -> data block" is not broken, so recovering the data is easy.
In practice it is not that simple: a deletion must release space at the file-system level, so the whole "node -> block index -> data block" chain may be marked free, or, as with ZFS, there may be many copies that are hard to sort out. Fortunately, different file systems leave different internal correlations behind: the index blocks in JFS2 record the previous and next entries, and ZFS records the hash of the next item in the node. Using these match points, you can find the most appropriate starting point for data recovery. Each file system calls for its own targeted method.
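A toy illustration of those "match points": if freed index blocks carry previous/next links (merely echoing the JFS2-style structure mentioned above; the layout here is invented), the chain can be rebuilt even though the allocator considers the blocks free:

```python
# Freed index blocks found scattered on disk:
# (block_id, prev_id, next_id, payload). Layout invented for illustration.
freed = [
    (12, 7, 19, b"middle "),
    (7, None, 12, b"start "),
    (19, 12, None, b"end"),
]

def rechain(blocks):
    """Rebuild the original order by following next-pointers from the head."""
    by_id = {b[0]: b for b in blocks}
    cur = next(b for b in blocks if b[1] is None)  # head: nothing before it
    out = []
    while cur is not None:
        out.append(cur[3])
        cur = by_id.get(cur[2])
    return b"".join(out)
```

Real tools do the same thing with whatever cross-references the particular file system left behind, which is why each one needs its own targeted method.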
Class III: everything else.
For example, VxFS and HFS+ are structurally like Class II, but because their node areas are concentrated at the front and enjoy a high hit rate, their designs zero the node or rebuild the tree on delete, so data recovery is as hard as for Class I.
For example, ASM, strictly speaking, is not much like a file system at all. Either way, the conclusion is that the file system itself has no algorithm designed to guarantee recovery after deletion (although, depending on the internal structure of the files, recovery can still be highly reliable).
For example, VMFS allocates mostly in large chunks; its recovery methods and procedures differ from those in this article and would take rather more space, so I'll save it for another time.
The above covers the idea of recovering files in full. But as noted, sometimes the files cannot be recovered, or parts of them have been destroyed or overwritten.
If the file contents are still there but the file-system metadata can no longer support reconstructing the files, you can work instead from the internal correlations of the file contents themselves.
For example, Oracle data files usually use an 8 KB page size, and fortunately the header of each page carries a page checksum, a page number, and possibly a file number. Scan all the data pages from the bottom of the disk, verifying each one, then tally up the file numbers and page numbers; with a bit of luck, the files can be stitched back together.
Oracle's control file records the logical and physical relationships between the data files; after analyzing it, the file names and paths are not hard to restore.
The same method applies to SQL Server and MySQL InnoDB, and with a little flexibility can be adapted to Sybase, DB2, and so on.
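The scan-verify-stitch idea can be sketched as follows. The header layout here (a magic value, file number, and page number packed at the front of each 8 KB page) is invented for illustration; real Oracle page headers differ, but the carve-check-bucket-sort logic is the same:

```python
import struct
from collections import defaultdict

PAGE = 8192                    # 8 KB pages, as assumed above for Oracle
MAGIC = 0xDBF1                 # invented page marker; real headers differ
HEADER = struct.Struct("<HHI") # magic, file number, page number (invented)

def make_page(file_no, page_no):
    return HEADER.pack(MAGIC, file_no, page_no).ljust(PAGE, b"\x00")

def carve(image):
    """Scan fixed-size pages, verify headers, bucket by file, sort by page."""
    files = defaultdict(dict)
    for off in range(0, len(image), PAGE):
        page = image[off:off + PAGE]
        magic, file_no, page_no = HEADER.unpack_from(page)
        if magic != MAGIC:
            continue                      # fails the "page check": skip junk
        files[file_no][page_no] = page
    # Stitch each file back together in page-number order.
    return {f: b"".join(p for _, p in sorted(pages.items()))
            for f, pages in files.items()}

# A "disk image": pages from two files, out of order, with garbage mixed in.
image = b"".join([make_page(2, 1), b"\xab" * PAGE,
                  make_page(1, 0), make_page(2, 0), make_page(1, 1)])
stitched = carve(image)
```

With a real format you would also verify the per-page checksum before trusting the header fields, exactly as the article's "page check" step describes.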
If it is MySQL's MyISAM engine, there is still a way. Records are packed into the file one after another, so if a table has a primary key, a distinctive field, or a distinctive structure, you can classify all matching blocks on the disk. MyISAM also has row overflow and row migration, that is, data associations between record A and record B; from these associations you can further match the arrangement of records within blocks and reassemble the data files. For MyISAM, this too is a way to recover tables or records.
That is the idea of reconstructing complete files from their contents. What if the contents are incomplete, or there are too many copies to deduplicate and order? Then recover tables or table records instead.
Based on the differences between tables, it is generally easy to classify all the fragments that might belong to a database, and the most direct classification is by table.
Once a table's fragments have been gathered into one group, they can be ordered and de-duplicated at the record level. If the recovered table units can then be cleaned up with the help of indexes, space-allocation information, and other associated tables, then with luck the data may come out complete.
If the table fragments are gathered but do not match any index, or contain bad records, so that the database cannot be repaired into a startable state, then, following the table structure, the table units can be extracted as records -> inserted into a new database -> cleaned directionally. The result may not be perfect, but in many cases it beats nothing.
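The "extract records -> insert into a new database -> clean directionally" step might look like this sketch, with SQLite standing in for the target database and an invented fixed-width row format:

```python
import sqlite3
import struct

# Invented row format for illustration: 4-byte id + 16-byte name.
ROW = struct.Struct("<I16s")
fragments = [
    ROW.pack(1, b"alice".ljust(16, b"\x00")),
    b"\xff" * ROW.size,                       # a corrupted fragment
    ROW.pack(2, b"bob".ljust(16, b"\x00")),
]

def plausible(row_id, name):
    # "Directional cleaning": keep only rows satisfying basic constraints.
    return 0 < row_id < 10**6 and name.isalpha()

db = sqlite3.connect(":memory:")  # SQLite stands in for the new database
db.execute("CREATE TABLE recovered (id INTEGER PRIMARY KEY, name TEXT)")
for frag in fragments:
    row_id, raw_name = ROW.unpack(frag)
    name = raw_name.rstrip(b"\x00").decode("ascii", errors="replace")
    if plausible(row_id, name):
        db.execute("INSERT INTO recovered VALUES (?, ?)", (row_id, name))

rows = db.execute("SELECT id, name FROM recovered ORDER BY id").fetchall()
```

The corrupted fragment is silently dropped by the plausibility check; in real work those constraints come from the table's schema, indexes, and associated tables, as described above.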
There is also method C2 in Figure 1: recovering records from logs. "Log" here is meant broadly, covering archive logs, application statements, and any data set that may carry traces of the records. When the main data files are damaged, any data set that might contain records is worth analyzing, using the same maximum-effort approach as for database files: by file, by structural block, by record. Combining C1, C2, and C3 with directed data collection and data cleaning, the means of data recovery are exhausted.
I almost forgot Hadoop. When Hadoop or HBase deletes data, it triggers file deletions on each node's local file system. Taking the most common case, Linux, this reduces to the problem of recovering deleted files on Ext3/4. If the files cannot be restored, fall back on Hadoop's hashes, fsimage, and so on to associate the data blocks, along the same lines as the database ideas above.
Obviously, the further down this list of methods you have to go, the more problems the recovered production data will have, and the auditing and correction of data logic will keep far too many people up at night grinding their teeth. So you had better obediently buy everyone coffee, pledge your salary and bonus to your boss, and look suitably disheveled and sad; maybe then they will let you sleep two hours a day.
© 2024 shulou.com SLNews company. All rights reserved.