Basic knowledge and replication principle of Microsoft DFS

2025-07-12 Update. From: SLTechnology News&Howtos (Shulou.com)

Not long ago, Lao Wang had the honor to participate in a bank's DFS consulting project, and he also took the opportunity to get started on DFS. Now he will sort out the knowledge he has learned and share it with you. Corrections are welcome.

DFS (Distributed File System) is a distributed file service on Microsoft Windows Server. With DFS, an enterprise can expose the contents of all its shared folders through a single path. Clients are also referred automatically to the nearest server based on their site, which gives file servers load balancing and fault tolerance.

DFS has two main functions: providing clients with a unified entry point, and replicating folders between different file servers. Each function is implemented by its own component.

DFS Namespaces: can be installed on standalone member servers or on domain controllers. As the name implies, a namespace gives users a logical access path. For example, an enterprise may have many Windows file servers, NAS shares and Linux shares, and it is inconvenient for users to remember each one. A namespace server can provide a single access name instead: publish all the enterprise's shares under that name, and users only need to remember it to browse every shared folder in the enterprise.

DFS namespaces come in two types: standalone namespaces and domain-based namespaces. A standalone namespace is hosted on a single server, and that server's name is used as the DFS access path. The drawback is that if the server goes down, users cannot access the namespace. A standalone namespace can, however, be deployed as a failover cluster role to achieve high availability in active/passive mode. The other type is the domain-based namespace. With this deployment model, users access the namespace through \\domainname\dfsrootname. The advantage is that the access name, the namespace servers, the namespace folders, the folder targets and other metadata are stored in AD, at the following location in the Active Directory database:

CN=,CN=Dfs-Configuration,CN=System,DC=Contoso,DC=Com

With this in place, administrators can add multiple namespace servers. For example, two namespace servers can jointly host the DFS root path \\domainname\dfsrootname. If one server goes down, the DC returns the address of an available server when the client queries, so namespace access keeps working.

The process by which a DFS client accesses a folder in a DFS namespace is as follows.

Standalone namespace

1. The client requests \\08server1\share\docs

2. The DFS client sends a query for the \\08server1\share root target; the request goes to 08server1

3. 08server1 returns the root target address

4. The client asks the root target server for the targets of the docs folder

5. The root target server returns a list of folder targets ordered by the target selection algorithm

6. The client sends its request to the first folder target server in the list

Domain-based namespace

1. The client requests \\contoso.com\share\docs

2. The client asks a DC for the address of the root target server

3. The DC queries the AD database and returns a list of root target server addresses

4. The client selects the first root target and asks that root server for the targets of the docs folder

5. The root target server returns a list of folder targets ordered by the target selection algorithm

6. The client sends its request to the first folder target server in the list

When a DC or a standalone namespace server returns a referral to the client, the client caches it. By default, root referrals are cached for 300 seconds and folder referrals for 1800 seconds. The client does not request the target address again until the cached entry expires.
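The caching behavior described above can be pictured as a small time-to-live cache. The following is only an illustrative sketch, not how the Windows DFS client is implemented: the `ReferralCache` class, its method names and the injected `now` value are all hypothetical, and the 300-second value is simply one of the defaults quoted in the text.

```python
class ReferralCache:
    """Tiny TTL cache keyed by DFS path (hypothetical model of a referral cache).

    `now` is passed in explicitly so the example is deterministic.
    """

    def __init__(self):
        self._entries = {}  # path -> (targets, expiry_time)

    def put(self, path, targets, ttl_seconds, now):
        self._entries[path] = (list(targets), now + ttl_seconds)

    def get(self, path, now):
        entry = self._entries.get(path)
        if entry is None:
            return None
        targets, expires_at = entry
        if now >= expires_at:
            # Expired: drop the entry so the caller asks for a fresh referral.
            del self._entries[path]
            return None
        return targets


cache = ReferralCache()
cache.put(r"\\contoso.com\share", ["dfs01", "dfs02"], ttl_seconds=300, now=0)
print(cache.get(r"\\contoso.com\share", now=299))  # still served from the cache
print(cache.get(r"\\contoso.com\share", now=300))  # expired, a new referral is needed
```

While an entry is valid the client never contacts the DC or namespace server, which is why a failed server can stay in use until the cached referral expires.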

In general, if DFS is heavily used, it is recommended to deploy the DFS namespace servers separately. If there are not many requests, the namespace can share a machine with DFS Replication, so that one server handles both replication and the namespace.

If only one namespace server is deployed and it goes down, clients can no longer reach the shares through the namespace path and have to fall back to accessing individual servers directly.

If the namespace is deployed on domain controllers, clients can end up with inconsistent access to the namespace name. For example, client 1 points to domain controller 1 and client 2 points to domain controller 2. If domain controller 1 hosts a namespace server and domain controller 2 does not, the client pointing to domain controller 2 cannot access the DFS root target name. So either avoid deploying namespace servers on domain controllers, or deploy a namespace server on every domain controller that clients point to.

The default target selection algorithm for DFS is as follows.

1. Targets in the same site as the client are placed at the top of the list, in random order

2. Targets outside the client's site are listed in order of AD site cost, from lowest to highest

3. Referrals with the same cost are grouped together

4. Within each group, targets are listed in random order

The administrator can also change the target selection method through the DFS snap-in, for example to make lowest cost the preferred ordering.

When a DFS target server starts, it detects whether the current AD is a multi-site architecture and, if so, which site it belongs to. When a client sends a request to the DFS namespace server, the namespace server uses the target selection algorithm above to return a sorted list of target servers to the client.

If all targets are in the same site, the target server is effectively chosen at random.
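The four ordering rules above can be sketched in a few lines of Python. This is a simplified model for illustration, not the namespace server's actual code: the function name `order_targets`, the server and site names, and the `site_cost` dictionary (cost of each remote site as seen from the client's site) are all assumptions.

```python
import random
from itertools import groupby


def order_targets(targets, client_site, site_cost, rng=random):
    """Order folder targets following the four rules described in the text.

    targets:   list of (server_name, site_name) tuples
    site_cost: hypothetical dict mapping a remote site to its AD site cost
    rng:       random source; pass random.Random(seed) for repeatable runs
    """
    same_site = [t for t in targets if t[1] == client_site]
    remote = [t for t in targets if t[1] != client_site]

    # Rule 1: same-site targets first, in random order.
    rng.shuffle(same_site)

    # Rules 2-4: remote targets sorted by cost, grouped by equal cost,
    # and shuffled inside each group.
    remote.sort(key=lambda t: site_cost[t[1]])
    ordered_remote = []
    for _, group in groupby(remote, key=lambda t: site_cost[t[1]]):
        group = list(group)
        rng.shuffle(group)
        ordered_remote.extend(group)

    return same_site + ordered_remote


targets = [("fs-bj-1", "Beijing"), ("fs-bj-2", "Beijing"),
           ("fs-sh-1", "Shanghai"), ("fs-gz-1", "Guangzhou"),
           ("fs-gz-2", "Guangzhou")]
costs = {"Shanghai": 100, "Guangzhou": 200}
referral = order_targets(targets, "Beijing", costs, random.Random(0))
print([name for name, _ in referral])
```

For a Beijing client, the two Beijing servers always come first, the Shanghai server third, and the two Guangzhou servers last; only the order inside each group varies between runs.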

When you add roles and features on Windows Server, DFS is split into two role services: DFS Namespaces and DFS Replication.

From the namespace's point of view, there are namespace servers and target servers. Everything that hosts a folder target, other than the namespace server itself, is a target server. When a target is published into the namespace, the namespace server creates a link to it in the namespace.

Replication groups let DFS provide not only convenient access but also automatic file-level replication and fault tolerance. By configuring a replication group, target servers replicate folders with each other. Once replication groups are in use, each target server becomes a replication member. Replication group members must run Microsoft Windows Server; other platforms are not supported. The process for setting up a replication group is as follows.

1. Select the target servers that will take part in replication

2. Select the folder to replicate on the target servers

3. Choose the replication topology: hub and spoke, full mesh, or no topology. Full mesh means every node replicates with every other node. Hub and spoke means the nodes do not replicate with each other directly and all replicate with a central node. No topology means the connections are configured later.

4. Configure the replication bandwidth, replication schedule and file filters

5. Designate the primary member for the initial replication
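The difference between the topology in which every node replicates with every other node and the one in which all nodes replicate through a central node is easiest to see by listing the connections each one needs. The sketch below is only an illustration; it assumes each connection is one-way (sender, receiver), and the function and server names are hypothetical.

```python
from itertools import permutations


def full_mesh(members):
    """Every member replicates directly with every other member."""
    return sorted(permutations(members, 2))


def hub_and_spoke(hub, spokes):
    """Spokes replicate only with the hub, never with each other."""
    connections = []
    for spoke in spokes:
        connections.append((hub, spoke))
        connections.append((spoke, hub))
    return connections


members = ["dfs01", "dfs02", "dfs03", "dfs04"]
print(len(full_mesh(members)))                      # one-way connections in a mesh
print(len(hub_and_spoke(members[0], members[1:])))  # one-way connections via a hub
```

With n members the mesh needs n*(n-1) one-way connections while the hub layout needs 2*(n-1), which is why the central-node layout is usually easier to manage when there are many branch servers.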

Configuring a replication group does not mean that only one target server serves clients. On the contrary, by default every target server in the replication group can serve reads and writes. For example, site A has target server A and site B has target server B; replication is configured between the two, so the folder data is synchronized across both sites. Clients in site A are served by target server A and clients in site B by target server B. If one of the servers goes down, the next target server is taken from the list ordered by the namespace server. Without a replication group the behavior is similar, except that each server holds only its own files and nothing is replicated.

By default, replication group members use a multi-master model, that is, every node can modify the folder data. Windows Server 2008 R2 and later also support read-only replicated folders: a read-only member can serve reads but not writes. This suits branch offices and other scenarios that need to read data but not write it.

On Windows Server 2008 and later, DFS Replication uses the Remote Differential Compression (RDC) algorithm, so each replication transfers only the modified portions of a file. DFS Replication only replicates closed files, such as Office documents and pictures, which users close after saving rather than keeping open. Files such as VHDX or SQL Server MDF files, which are held open permanently, are not suitable for DFS: they will never be replicated. DFS Replication has no version control. If a file is open on two members at the same time, the copy that is closed last wins.

DFS Replication uses port 135 and dynamic RPC ports by default. You can pin the DFS Replication RPC port with the following commands

dfsrdiag staticrpc /port:55555 /mem:dfs01

dfsrdiag staticrpc /port:55555 /mem:dfs02

Next, let's take a look at how DFS replication works.

Components involved

GUID: DFS Replication uses GUIDs as identifiers. Each replication group, replicated folder, replication group member and per-volume DFSR database is assigned a GUID.

USN journal: DFSR monitors file changes through the NTFS USN journal. The USN journal is a circular log that is part of NTFS. Changes to files and folders on an NTFS volume are recorded in it. A record generally includes the file name, change time, change type and a unique update sequence number (USN), but not the actual data, so the journal stays small. Applications can watch the journal to track NTFS file system updates.

The USN data of any file on NTFS can be queried with the following command

fsutil usn readdata c:\usn\123.txt

If we modify the file and look at the USN data again, the USN has changed, while the "file reference number" (the file's ID on NTFS) and the "parent file reference number" (identifying the parent folder) have not.

When DFSR detects a new USN journal record for a file in a replicated folder, it records the file's update in the database managed by DFSR.

The DFSR service maintains one ESE database per volume that hosts a replicated folder. DFSR uses this database to store metadata about every file and folder in the replicated folder.

The DFSR database associates the following information, and you will often see these five items when tracing replication status in the DFSR debug log

- UID

- GVSN

- File name

- NTFS file ID

- UID of the parent folder

DFSR uses two different IDs, the UID (Unique Identifier) and the GVSN (Global Version Sequence Number), to track replication status.

The UID is built from the GUID of the DFSR database (on the volume where the replicated folder resides) and the database version number at the time the file or folder is first recorded. It is a unique ID assigned to every replicated file and folder. Once assigned, a UID does not change until the file or folder is deleted.

The GVSN is built in the same way, from the DFSR database GUID and the current database version number. It is also assigned to every replicated file and folder, but a new GVSN is assigned each time the file or folder is updated.

Both UID and GVSN are written in the following format.

{DB GUID}-version

The actual form is as follows

{0440DC0A-B3D0-49EC-AD01-B5A236AAF788}-v12

The first half, {0440DC0A-B3D0-49EC-AD01-B5A236AAF788}, is the GUID of the DFSR database on the volume where the replicated folder is located. The v12 part is the sequence number of the update as recognized by DFSR. Combining the two yields a unique ID.

UID and GVSN are consistent only when the file or folder is initialized, and once the file or folder changes, the GVSN changes and the UID remains the same.
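Since both IDs share the {DB GUID}-version layout, they are easy to take apart with a script when reading logs. The sketch below is a hypothetical helper, not a DFSR tool: `parse_version_id` and `last_updated_elsewhere` are illustrative names, and the comparison simply applies the rule above that the UID keeps the creating database's GUID while the GVSN carries the GUID of the database that made the latest change.

```python
import re
from typing import NamedTuple


class VersionId(NamedTuple):
    db_guid: str
    version: int


# Matches strings of the form {GUID}-vN, as shown in the text.
_ID_PATTERN = re.compile(r"^\{([0-9A-Fa-f-]{36})\}-v(\d+)$")


def parse_version_id(text: str) -> VersionId:
    """Split a UID or GVSN string into its database GUID and version number."""
    match = _ID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a UID/GVSN string: {text!r}")
    return VersionId(match.group(1).upper(), int(match.group(2)))


def last_updated_elsewhere(uid: str, gvsn: str) -> bool:
    """True when the latest version came from a different database than the creator."""
    return parse_version_id(uid).db_guid != parse_version_id(gvsn).db_guid


example = parse_version_id("{0440DC0A-B3D0-49EC-AD01-B5A236AAF788}-v12")
print(example.db_guid, example.version)
```

A file whose UID and GVSN are the same string has not been modified since it was first recorded; a GVSN with a different GUID means another member made the latest change.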

Experiment: verifying how UID and GVSN change during DFS replication

Environment

One DC and two DFS servers, each hosting the DFS namespace and DFS replication roles

Replication group name: \\oa.com\share\doc

The doc directory on the C: drive of DFS01 and DFS02 is designated as the replicated folder

Create a cc.txt file in the doc directory on DFS01

Use the following command to query the UID and GVSN of the file in the DFS replicated folder

wmic /namespace:\\root\microsoftdfs path DfsrIdRecordInfo where "filename like '%cc.txt%'" get * /format:textvaluelist

{8F3671EF-8AF6-4D15-B59B-B4BF3CB52DD7} is the GUID of the DFSR database on DFS01. You can see that the UID and GVSN are identical at initialization.

You can confirm this with the dfsrdiag guid2name command.

dfsrdiag guid2name /RGName:doc /guid:{8F3671EF-8AF6-4D15-B59B-B4BF3CB52DD7}

In other words, the UID and GVSN each consist of the GUID of the DFSR database on the volume hosting the replicated folder, plus a version number.

Next, edit CC.TXT on DFS02 and check the UID and GVSN on DFS02 again. The UID has not changed, but the GVSN has.

Use the dfsrdiag guid2name command again to check the DB GUID

dfsrdiag guid2name /RGName:doc /guid:{6B8002DE-784B-45AA-B566-9114DC96C959}

The GUID in the current GVSN belongs to DFS02's DFSR database, which shows that CC.txt was last updated on DFS02.

When DFSR sees a change in GVSN, it notifies the other nodes to update and transfers the incremental data through RDC.

If DFS01 updates the content again, the GVSN switches back to the GUID of DFS01's DFSR database, with a higher version number.

From this we can summarize the principle of DFS replication. When a file or folder changes, NTFS records the change in the USN journal and updates the USN. The next time DFSR reads the USN journal, it sees the update and assigns the file a new GVSN built from the GUID of the local DFSR database (on the volume hosting the replicated folder) and an incremented version number. From the GVSN, DFSR knows on which server the file or folder changed. It then notifies the other members, which use RDC to copy the incremental content, keeping the file versions consistent across all members.

It is recommended that folders replicated through DFS contain only finalized result data. For example, suppose a replicated directory holds production data together with a TEMP folder that is constantly modified and deleted during development and testing. The frequent changes to TEMP then trigger frequent replication of the directory. If a program repeatedly writes to a file, or frequently deletes and re-adds files in the directory, the files end up discarded into the Conflict and Deleted folder.
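The UID and GVSN behavior seen in the experiment can be replayed with a toy model. This is a deliberately simplified sketch: it assumes one version counter per DFSR database, ignores the USN journal and RDC entirely, and uses hypothetical class and function names. The starting counters (10 and 12) are chosen only so that the resulting IDs resemble the -v11 and -v13 values seen in this experiment.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """A replication member with its DFSR database GUID and a version counter."""
    name: str
    db_guid: str
    version: int = 0

    def next_id(self) -> str:
        # Each recorded change consumes the next version number of this database.
        self.version += 1
        return f"{{{self.db_guid}}}-v{self.version}"


@dataclass
class FileRecord:
    name: str
    uid: str
    gvsn: str


def create_file(member: Member, name: str) -> FileRecord:
    first_id = member.next_id()
    # At creation time the UID and GVSN are the same string.
    return FileRecord(name, uid=first_id, gvsn=first_id)


def modify_file(member: Member, record: FileRecord) -> None:
    # Only the GVSN changes; the UID stays fixed for the life of the file.
    record.gvsn = member.next_id()


dfs01 = Member("DFS01", "8F3671EF-8AF6-4D15-B59B-B4BF3CB52DD7", version=10)
dfs02 = Member("DFS02", "6B8002DE-784B-45AA-B566-9114DC96C959", version=12)

record = create_file(dfs01, "cc.txt")   # created on DFS01
modify_file(dfs02, record)              # then edited on DFS02
print("uid :", record.uid)
print("gvsn:", record.gvsn)
```

After the edit on DFS02 the UID still carries DFS01's database GUID while the GVSN carries DFS02's, which is the pattern used to tell where a file was last changed.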

What the DfsrPrivate directory in a DFS replicated folder is used for

Staging: the staging folder for DFS Replication. Files to be replicated are placed here first and then sent to the other nodes. It is recommended to make the staging quota as large as possible.

Conflict and Deleted: holds the losing version of a file when a conflict occurs. For example, if node 1 and node 2 modify the same file at the same time and node 2's change is the later one, node 2's version takes effect as the latest version and node 1's version is moved to this folder. Files deleted during replication are also placed here.

Deleted: if the option to move deleted files to the Conflict and Deleted folder is unchecked in the membership properties, the Deleted folder is used instead and deleted files are placed there.

Installing: a file larger than 64 KB is not copied directly to the other node. After RDC processing the file is first placed in the staging folder. During replication it is copied to the Installing path on the other node and then moved to its final path.

PreExisting: during initial replication, for example from DFS01 to DFS02, files that already exist in the replicated folder on DFS02 are moved to the PreExisting path. Content under the PreExisting path does not take part in DFS replication.
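The conflict handling described for the Conflict and Deleted folder amounts to "the last writer wins, the loser is moved aside". The sketch below illustrates only that rule; the `Version` class, its `modified_at` field and the tie-breaking choice are assumptions made for the example, not DFSR's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class Version:
    member: str
    modified_at: float  # hypothetical modification timestamp, in seconds
    content: str


def resolve_conflict(a: Version, b: Version):
    """Return (winner, loser) using last-writer-wins on the modification time.

    On an exact tie the first argument wins; that choice is arbitrary here.
    """
    if a.modified_at >= b.modified_at:
        return a, b
    return b, a


conflict_and_deleted = []  # stand-in for the Conflict and Deleted folder

node1 = Version("node1", modified_at=100.0, content="edit from node 1")
node2 = Version("node2", modified_at=105.0, content="edit from node 2")

winner, loser = resolve_conflict(node1, node2)
conflict_and_deleted.append(loser)
print(winner.member, "wins;", loser.member, "is moved aside")
```

Because the losing edit is kept rather than destroyed, an administrator can still recover it from that folder if the wrong version won.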

DFS monitoring and maintenance

DFS can be managed through CMD, PowerShell, WMI and MMC. Monitoring is mainly done through Event Viewer (Applications and Services Logs > DFS Replication), performance counters and SCOM.

For deeper analysis, DFS also keeps a detailed debug log, similar to the Windows cluster log, located by default in the C:\Windows\debug directory

The debug log settings for DFS are as follows

Settings: debug log severity

Default value: 4

Range: 1-5

WMIC syntax:

wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set debuglogseverity=5

Settings: debug log messa

Default value: 200000

Range: 1000 to 4294967295 (FFFFFFFF)

WMIC syntax:

wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set maxdebuglogmessages=500000

Settings: debugging log fil

Default value: 100

Range: 1 to 10000

WMIC syntax:

wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set maxdebuglogfiles=200

Settings: debug log file path

Default value: %windir%\debug

WMIC syntax:

wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set debuglogfilepath="d:\dfsrlogs"

Note: the path must be created manually. If it does not exist, the default value of %windir%\debug is used when the service restarts.

Settings: enable debug logging (debug logging is enabled by default)

Default value: TRUE

Range: TRUE or FALSE

WMIC syntax:

wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set enabledebuglog=true

The following debug log excerpt shows an update being processed

20180326 09:52:25.365 2612 INCO 4825 InConnection::UpdateProcessed Received Update. UpdatesLeft:0 processed:1 failures:0 sessionId:3 open:0 updateType:0 processStatus:0 connId: {C05077DD-90EF-4059-A695-E5158F8E4DB5} csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348} csName:doc update:

+ present 1

+ nameConflict 0

+ attributes 0x20

+ ghostedHeader 0

+ data 0

+ gvsn {6B8002DE-784B-45AA-B566-9114DC96C959}-v13

+ uid {8F3671EF-8AF6-4D15-B59B-B4BF3CB52DD7}-v11

+ parent {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348}-v1

+ fence Default (3)

+ clockDecrementedInDirtyShutdown 0

+ clock 20180326 01:52:25.258 GMT (0x1d3c4a516a7334d)

+ createTime 20180325 13:40:27.685 GMT

+ csId {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348}

+ hash DB24292A-77575CB4-2B878C24-FC62C351

+ similarity 00000000-00000000-00000000-00000000

+ name CC.txt

20180326 09:52:25.365 2612 INCO 5551 InConnection::CommitSession Connection in sync connId: {C05077DD-90EF-4059-A695-E5158F8E4DB5} csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348} csName:doc commitedSessionsWithUpdateFailures:0

20180326 09:52:25.365 2612 IINC 392 IInConnectionCreditManager::ReturnCredits [CREDIT] Credits have been returned. CreditsToReturn:1 totalConnectionCreditsGranted:0 totalGlobalCreditsGranted:0 csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348} csName:doc connId: {C05077DD-90EF-4059-A695-E5158F8E4DB5} sessionTaskPtr:00000000004BF350

20180326 09:52:25.365 2612 UPMG 427 UpdateWorker::ConsumeUpdates No pending updates. ConnId: {C05077DD-90EF-4059-A695-E5158F8E4DB5} csName:doc csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348}

20180326 09:52:25.365 2140 INCO 8561 InConnection::InConnectionContentSetContext::Hibernate Hibernating: connId: {C05077DD-90EF-4059-A695-E5158F8E4DB5} csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348}

20180326 09:52:25.365 2140 UPMG 580 UpdateManager::FinalizeUpdateManager Finalizing UpdateManager connId: {C05077DD-90EF-4059-A695-E5158F8E4DB5} csName:doc csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348}

20180326 09:52:25.381 2040 OUTC 2885 OutConnectionContentSetContext::TransportRequestVvUp Received request for VvUp csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348} csName:doc connId: {FA4D1251-E628-47E5-8448-13905E9C9ECE} rgName:doc ptr:00000000004BEF20

20180326 09:52:25.381 2040 SRTR 2242 SERVER_RequestVersionVector Sent requested version vector. ConnId: {FA4D1251-E628-47E5-8448-13905E9C9ECE} csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348} seqNumber:6 requestType:REQUEST_NORMAL_SYNC changeType:all

20180326 09:52:25.381 2040 SRTR 2331 SERVER_AsyncPoll Processing AsyncPoll call. ConnId: {FA4D1251-E628-47E5-8448-13905E9C9ECE}

20180326 09:52:25.397 2040 SRTR 2331 SERVER_AsyncPoll Processing AsyncPoll call. ConnId: {FA4D1251-E628-47E5-8448-13905E9C9ECE}
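When a debug log grows to hundreds of thousands of lines, a small script helps pull out the fields of interest. The sketch below is a hypothetical helper, not a Microsoft tool. It assumes, based on the excerpt above, that each entry starts with a date, a time, a thread ID, a four-letter module ID, a source line number and a method name, followed by the message; the continuation lines that begin with "+" are skipped.

```python
import re

# Header layout assumed from the excerpt: date time thread module line method message
HEADER = re.compile(
    r"^(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<thread>\d+) (?P<module>[A-Z]{4})\s+(?P<line>\d+) "
    r"(?P<function>\S+) (?P<message>.*)$"
)
# key:{GUID} pairs such as connId or csId, with or without a space after the colon
GUID_FIELD = re.compile(r"(\w+):\s*(\{[0-9A-Fa-f-]{36}\})")


def parse_log_line(line):
    """Parse one header line of the debug log; return None for anything else."""
    match = HEADER.match(line.strip())
    if match is None:
        return None  # continuation line such as "+ gvsn ..." or unrecognized text
    entry = match.groupdict()
    entry["guids"] = {
        key.lower(): value for key, value in GUID_FIELD.findall(entry["message"])
    }
    return entry


sample = ("20180326 09:52:25.365 2612 UPMG 427 UpdateWorker::ConsumeUpdates "
          "No pending updates. ConnId: {C05077DD-90EF-4059-A695-E5158F8E4DB5} "
          "csName:doc csId: {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348}")
entry = parse_log_line(sample)
print(entry["module"], entry["function"], entry["guids"]["csid"])
```

Grouping the parsed entries by connection GUID or replicated folder GUID is then a straightforward way to follow one replication session through the log.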

Common parameters

| Parameter | Description | Value in this example |
|---|---|---|
| CSID | GUID of the replicated folder | {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348} |
| ConnID | GUID of the replication connection | {C05077DD-90EF-4059-A695-E5158F8E4DB5} |
| Parent | ID of the folder that contains the replicated file | {41BBE4AC-6CE0-421A-AFE9-6E9420EA1348}-v1 |
| UID | Record of the original file | {8F3671EF-8AF6-4D15-B59B-B4BF3CB52DD7}-v11 |
| GVSN | Record of the latest modification | {6B8002DE-784B-45AA-B566-9114DC96C959}-v13 |

Other diagnostic tools for DFS

Dfsdiag: mainly used for DFSN, to help test the connection to AD and the DFS configuration

Dfsrdiag: used for diagnosing and troubleshooting DFSR replication

DFS diagnostic report: lets administrators view replication reports through a graphical interface

Cloning the system of a DFS server is not recommended. Use a standard backup solution to take disk backups or bare-metal backups of DFS servers instead.
