
Backup scheme of high-capacity file server

2025-04-12 Update, from SLTechnology News & Howtos (Servers)


Shulou (Shulou.com), 06/02 report:

The company has a file server holding the design department's drawings: 10T of space, about 9.2T used, roughly one million files. Because it is a low-end DELL R320 with several SATA hard drives in RAID 5, the data needs to be backed up for safety.

We did some early research on backup options, such as a dedicated backup appliance versus a traditional scheme, and compared them on several points.

Backup appliance:
- Principle: install a backup agent on the design department's file server and back up to the appliance over the network on a schedule.
- Security: uses Linux as the underlying OS, is not easily hit by ransomware, and files are hard to delete; high security.
- Reliability: a dedicated appliance used only to back up this file server; high reliability.
- Backup time: at a backup speed of 50 MB/s (about 400 Mbps of network traffic), a single full backup is expected to take more than 60 hours.
- Price: about 160,000 yuan.
- Pros and cons: high reliability, simple operation, easy to verify; but expensive.

Traditional scheme (DFS + WSB):
- Principle: replicate through DFS, writing a copy of the files to a new server in real time or on a schedule, then back up the files stored on the new server locally on a schedule.
- Security: runs on Windows; backups made with Windows tools are vulnerable to ransomware, so there is a security risk.
- Reliability: uses WSB; files can be deleted by mistake and become unrecoverable, so reliability is lower than the appliance. If DFS errors out and fails to synchronize, data may be lost. The underlying OS is Windows and the backup is just a tool or program, so an OS failure may make restores impossible.
- Backup time: apart from the first synchronization, it does not occupy network traffic for long periods.
- Price: about 55,000 yuan.
- Pros and cons: more error-prone but cheap, and the new server can double as a standby file server; lower reliability and security than the appliance, and no after-sales technical support.

Put simply, the appliance option means using a dedicated appliance to back up just this 10T of content. At a backup speed of 50 MB/s, 10T takes about 58 hours to back up. If an existing shared appliance were used, its backups of other data could conflict for resources during that window, and the 10T backup plus the other jobs on the same machine might take 3-5 days to finish. So a dedicated appliance would be needed just for this 10T of data.
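As a sanity check on the figures above, a quick back-of-the-envelope calculation (a sketch; the 50 MB/s throughput is the value assumed in the text):

```python
# Rough backup-window estimate: how long one full pass over the data takes
# at a sustained throughput. 50 MB/s (~400 Mbps) is the figure from the text.
def full_backup_hours(data_tb: float, speed_mb_s: float) -> float:
    """Hours for one full backup at a sustained speed (binary units)."""
    data_mb = data_tb * 1024 * 1024  # TB -> MB
    return data_mb / speed_mb_s / 3600

print(round(full_backup_hours(10, 50), 1))  # about 58 hours, as stated above
```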

The traditional solution is shadow copies + DFS + WSB. Concretely: on the original server we enabled shadow copies on the 10T data volume; with enough space this can provide up to 64 previous versions. We buy a new server with 50T of usable space (about 55,000 yuan) and split its storage into two areas: 20T for the DFS replicated folder and 30T for WSB backups. DFS, the distributed file system, protects against a single point of failure caused by hardware damage (which a backup does not): if the original 10T server crashes, this machine can still serve users. The actual backups rely on Windows Server Backup. After verification in the test environment, we use WSB's default setting, i.e. a full backup performed as a copy backup. In practice the backup is deduplicated, or effectively incremental: except for the first full backup, each run stores only the changed data rather than the whole 10T. With this scheme we can survive a hardware single point of failure; a small number of files can be restored from shadow copies; and if the server is hit by ransomware, there are still backups to restore from. The precondition, of course, is that the server is kept patched and runs genuine antivirus software.

The two plans were reported to the leader, and the leader decided to use the traditional backup scheme.

Next we prepared to test in a lab environment, mainly because with 10T of data the initial DFS replication takes a very long time. In an experiment described on Microsoft's site, an initial synchronization of 14 million files totaling 10T took 24 days. Microsoft's recommended practice is to reduce the initial-sync problem by preseeding: place the data that exists on the source server onto the target server in advance, export the DFS database, and then import that database on the target server.
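For context, the Microsoft figure implies a very low effective throughput for an unseeded initial sync (a rough estimate based only on the numbers quoted above):

```python
# Effective throughput implied by the quoted experiment: 10T of data
# synchronized in 24 days works out to only a few MB/s.
def effective_mb_s(data_tb: float, days: float) -> float:
    """Average MB/s needed to move data_tb terabytes in the given days."""
    return data_tb * 1024 * 1024 / (days * 86400)

print(round(effective_mb_s(10, 24), 1))  # roughly 5 MB/s
```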

The overall process is as follows:

1. Install the DFS Replication component on both servers. DFS Replication is separate from DFS Namespaces; the namespace role does not need to be installed. Add the source server (10T) to a replication group, create the replicated folder, and make the source the primary member for the initial replication.

2. Export a clone of the DFS Replication database on the source server.

3. Use robocopy to copy the data from the source server (10T) to the target server (50T).

4. Import the DFS database exported in step 2 on the target server.

5. Add the target server to the replication group.

6. Perform the first synchronization.

Points to note:

1. The target server should not join the replication group at the start; wait until the data copy and the DFS database import are both complete.

2. Do not confuse the source and target servers. The source server is the one holding the authoritative data; it should be set as the primary member for the initial replication and is also called the upstream server. The target server is the new, empty 50T server, also called the downstream server.

3. On the downstream server, do not create the replicated folders manually; let robocopy create them, to avoid file hash mismatches that would drag out DFS initialization.

4. During initial replication the upstream server is authoritative and overwrites the downstream server. Data that exists downstream but not upstream is moved into the DfsrPrivate\PreExisting directory on the downstream server. This directory is hidden, inside the replicated folder; show hidden and system files to see it.

5. The staging folder and the Conflict and Deleted folder can be resized as appropriate. It is recommended that the staging folder be at least 4 times the size of the largest file to be synchronized; otherwise errors may be reported.
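The staging-size rule in point 5 can be sketched as a small helper. This follows the text's 4x-largest-file rule; note that Microsoft's own guidance sizes the staging quota from the largest files in the replicated folder, so treat the factor as the author's rule of thumb:

```python
# Sketch of the staging-quota rule from point 5: at least `factor` times
# the size of the largest file in the replicated folder.
import os

def staging_quota_bytes(root: str, factor: int = 4) -> int:
    """Return factor * size of the largest file under root."""
    largest = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            largest = max(largest, os.path.getsize(os.path.join(dirpath, name)))
    return largest * factor
```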

Next, we started the experiment.

Lab environment: two Windows Server 2012 virtual machines. dc201201 (IP 192.168.99.98) acts as the DC and the DFS replication upstream server; dc201202 (IP 192.168.99.99) acts as the downstream server.

Lab goal: there are thousands of files on the upstream server; preseed them onto the downstream server so that the first synchronization completes quickly.

Replicated folder name: Departmental documents

DFS replication

1. Install Active Directory Domain Services on DC201201 and promote it to the DC of the first domain in the new forest contoso.com.

2. After the domain controller is up, install the DFS Replication management components on both machines:

Install-WindowsFeature -Name FS-DFS-Replication -IncludeManagementTools

3. Create partitions E and F on both servers, each about 50G. On dc201201's E drive, create a Departmental documents folder and place 4875 files in it, about 900 MB in total.

4. On the upstream server, create a replication group, create the replicated folder, and specify the folder's physical location on the upstream server.

Verification: wait for DFS Replication event 4112, which indicates that DC201201 is the primary member for the initial replication of this replicated folder.

5. Export the clone of the upstream server's database and the volume-configuration XML file. This also requires that no replicated folder on the volume is still in its initial synchronization stage.

In PowerShell, create an E:\Dfsrclone folder and clone the E volume's DFSR database into it. Note that this clones the metadata, not the actual data files.

While performing this step it is recommended that users not access the contents of E:\Departmental documents, as that reduces cloning efficiency. More importantly, users must not access files on the downstream (target) server until the initial replication is complete.

When the database clone is exported, DFSR prints the commands for copying the data and the database to the downstream server; they can be run as given, and the destination can be expressed as a UNC path.

Verify that the database export and the upstream server's initial replication are ready, indicated by events 2402 and 2002 respectively: 2402 means the cloned database has been exported, and 2002 means initial replication is ready on the primary server.

6. Copy the data in the replicated folder to the downstream server with robocopy. Note that the target path must not be created manually on the downstream server; let robocopy create it automatically.

According to robocopy's output, 4875 files totaling 1002 MB were copied in 21 seconds.

Command executed:

Robocopy.exe "E:\Departmental documents" "\\dc201202\E$\Departmental documents" /E /B /COPYALL /R:6 /W:5 /MT:64 /XD DfsrPrivate /TEE /LOG+:preseed.log

(/E copies subdirectories including empty ones, /B uses backup mode, /COPYALL copies all file information, /R:6 and /W:5 retry six times at five-second intervals, /MT:64 uses 64 copy threads, /XD DfsrPrivate excludes the DfsrPrivate directory, /TEE writes output to both the console and the log, and /LOG+ appends to preseed.log.)

Do not use robocopy's /MIR option on the root of the volume, do not manually create the replicated folders on the downstream server, and do not rerun robocopy over previously copied files (that is, if you have to start over, delete the destination folder and file structure and genuinely start over). Let robocopy create all folders and copy everything to the downstream server with the /E /B /COPYALL options on every run. Otherwise you are likely to end up with hash mismatches.

Next, copy the exported E-volume database and XML file to the downstream server with xcopy. At this point, to simulate the upstream server being in use, several large files are copied onto the upstream server.

7. Verify the file hashes: check whether the hash values of the files on the two servers match. If the hashes match, those files will not need to be transferred during initial replication. It is recommended to compare files in several directories.
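The check in step 7 can be sketched cross-platform. On Windows Server 2012 R2 and later the Get-DfsrFileHash cmdlet reports the hash DFSR actually uses (which also covers metadata); the SHA-256-over-contents comparison below is only an illustrative approximation:

```python
# Illustrative hash comparison between a source and a target tree: any path
# whose digest differs (or that exists on only one side) would have to be
# re-replicated during initial sync.
import hashlib
import os

def tree_hashes(root):
    """Map each relative file path under root to the SHA-256 of its bytes."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                result[os.path.relpath(full, root)] = hashlib.sha256(f.read()).hexdigest()
    return result

def mismatches(src_root, dst_root):
    """Relative paths whose hashes differ or that are missing on one side."""
    src, dst = tree_hashes(src_root), tree_hashes(dst_root)
    return sorted(p for p in src.keys() | dst.keys() if src.get(p) != dst.get(p))
```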

8. Import the database and XML file on the downstream server. Before importing, make sure the downstream server's E:\System Volume Information\DFSR folder does not exist.

You can also make sure nothing is left in the DFSR folder as follows:

1. Create an empty folder e:\empty

2. Robocopy "e:\empty" "e:\System Volume Information\DFSR" /MIR

This article skips this step.

9. Verify that the downstream server's database and XML file imported successfully by watching the DFS Replication log for events 2412, 2416, and 2404. Seeing event 2404 means the import has completed.

10. Join the downstream server dc201202 to the replication group.

To verify that the initial replication succeeded, wait for DFS Replication event 4104; note that event 4102 will not appear here. If upstream data changed after the clone was exported, those files are replicated to the downstream server authoritatively.

At this point, the replication of DFS is complete. Let's go to the downstream server to see if the data has been copied.

Verify synchronization: delete the HDTV01.MKV file (highlighted in red) on the downstream server and watch whether the change synchronizes with the upstream server.

Result: the change was synchronized.

Shadow copy

vssadmin add shadowstorage /for=d: /on=e: /maxsize=2GB

1. Shadow copies keep at most 64 previous versions; beyond 64, the oldest version is deleted automatically.

2. Shadow copies are not suitable for volumes with heavy file read/write activity, which can drive CPU usage high.
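The 64-version retention in point 1 behaves like a fixed-size ring buffer; a minimal sketch:

```python
# Minimal model of shadow-copy retention: once the 64-version limit is
# reached, taking another snapshot silently discards the oldest one.
from collections import deque

MAX_VERSIONS = 64
snapshots = deque(maxlen=MAX_VERSIONS)

for version in range(70):  # take 70 snapshots in a row
    snapshots.append(version)

print(len(snapshots), snapshots[0])  # 64 kept; versions 0-5 were dropped
```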

My plan here is to enable shadow copies on both machines: 1T of shadow-copy space on the upstream server and 2T on the downstream server. In the lab environment, 10% of the volume is used.

When shadow copies are enabled for the first time, one is created automatically. In my production environment, the 10T data volume keeps its shadow copies in a separate 1T of space, retaining 64 copies.

Tip: shadow copies cannot fully replace backups. In particular, if disk space is insufficient, one complete overwrite of the disk can wipe out all shadow copies.

Backup

Install the Windows Server Backup tool and run a backup with the default settings, which means a full backup performed as a copy backup. A non-copy backup could be tried next time.

On the downstream server, install WSB through the wizard.

By default the backup frequency is once (or more) per day; the backup schedule can be adjusted in Task Scheduler.

Make a first backup, add 1G of files, then make a second backup, and check the total space the two backups consume.

Before backing up, drive E has 4.42 GB used and drive F has 0.1 GB used.

After the first backup, drive F has 6.02 GB used.

Copy two files totaling 1G into E:\Departmental documents.

Run the backup again; drive F now has 6.1 GB used.

So in this experiment we made two full backups: the first occupied about 6 GB, while the second added only about 0.1 GB. The space used is not the sum of the two backups' data, which suggests the backup is incremental or deduplicated. (I still need to check the documentation.)
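The space accounting observed above is consistent with a simple incremental model (a sketch; the 6 GB and 0.1 GB figures are taken from the experiment):

```python
# Model of the observation above: after an initial full backup, each later
# run consumes space only for the data that changed, not another full copy.
def backup_space_gb(initial_full_gb, change_sets_gb):
    """Target-disk usage after a full backup plus incremental runs."""
    return initial_full_gb + sum(change_sets_gb)

used = backup_space_gb(6.0, [0.1])
print(round(used, 2))  # 6.1 GB, not the 12 GB two true full backups would need
```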

Data restore:

Open Windows Server Backup and click Recover; the backup catalog can be browsed by date.

Disaster recovery

Suppose one day the 50T server's operating system fails and has to be reinstalled: how can the data on it still be put to use?

1. Shadow copies:

There were four shadow copies before the reinstall.

The data after reinstalling the system:

2. Backed-up data:

Before the system was reinstalled, the data had been backed up several times, as shown below.

Run the WSB recovery wizard on the reinstalled system.

Note the picture above: from November 2 to 6, the dates in bold indicate that a backup exists. So even after the system is reinstalled, the original shadow copies and backup data can still be found. Of course, since the backup platform is Windows, security must be considered: patch promptly, and install antivirus software. One more point: the real-time DFS synchronization set up in our experiment can instead be scheduled, say only from 0:00 to 7:00 on Tuesdays and Saturdays, or even replaced with robocopy. For example, my previous company's file server was replicated with robocopy three times a week; the advantage is that if the original file server is hit by ransomware and encrypted, there is still a buffer, and at most about three days of data are lost.
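The "at most about three days" reasoning above can be made concrete: with three robocopy runs a week, the worst-case data loss equals the longest gap between consecutive runs. A sketch, assuming a hypothetical Monday/Wednesday/Friday schedule (the text does not give the actual days):

```python
# Worst-case recovery point for a thrice-weekly robocopy schedule:
# the longest gap between consecutive runs, wrapping around the week.
run_days = [0, 2, 4]  # hypothetical schedule: Mon=0, Wed=2, Fri=4

gaps = [(run_days[(i + 1) % len(run_days)] - d) % 7
        for i, d in enumerate(run_days)]

print(max(gaps))  # 3 -> at most three days of changes can be lost
```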

The full script is attached. (I am not very familiar with PowerShell either; apart from exporting and importing the database and XML, I believe everything else could be done through the GUI.)

On the upstream DC201201 server:

New-DfsReplicationGroup "VSS" |
New-DfsReplicatedFolder -FolderName "Departmental documents" | Add-DfsrMember -ComputerName dc201201

Set-DfsrMembership -GroupName "VSS" -ComputerName dc201201 -ContentPath "E:\Departmental documents" -PrimaryMember $True -FolderName "Departmental documents"

Update-DfsrConfigurationFromAD

Get-WinEvent "DFS Replication" -MaxEvents 4 | fl

New-Item -Path "E:\Dfsrclone" -Type Directory

Export-DfsrClone -Volume E: -Path "E:\Dfsrclone"

Robocopy.exe "E:\Departmental documents" "\\dc201202\E$\Departmental documents" /E /B /COPYALL /R:6 /W:5 /MT:64 /XD DfsrPrivate /TEE /LOG+:preseed.log

Robocopy.exe E:\Dfsrclone \\dc201202\E$\Dfsrclone

On the downstream DC201202 server (note: you may need to stop the DFSR service to perform the first step; be sure to start it again so that the import can run):

RD "E:\System Volume Information\DFSR" -Force -Recurse

Import-DfsrClone -Volume E: -Path "E:\Dfsrclone"

Get-WinEvent "DFS Replication" -MaxEvents 10 | fl

Add-DfsrMember -GroupName "VSS" -ComputerName "DC201202" | Set-DfsrMembership -FolderName "Departmental documents" -ContentPath "E:\Departmental documents"

Add-DfsrConnection -GroupName "VSS" -SourceComputerName "DC201201" -DestinationComputerName "DC201202"

Update-DfsrConfigurationFromAD DC201201,DC201202

Get-WinEvent "DFS Replication" -MaxEvents 10 | fl
