A discussion of WSFC's dependencies on AD and SMB

2025-03-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Although WSFC is only a feature of Windows Server, its internal components work together closely, and it integrates tightly with other Microsoft solutions. Compared with other Microsoft products, Lao Wang thinks the WSFC team's MSDN blog is very good: it has published many posts on how WSFC's internal components operate, and you can get up to speed quickly by reading them and practicing. As for integration with other products, WSFC works with AD and SMB to provide the basic operation of a cluster. At the application layer, Exchange DAG and SQL Server AG build on WSFC to achieve application high availability, and at the management layer there are suites such as SCVMM, Honolulu, OMS, SCOM and SCO. It is fair to say that WSFC is not only the logical high-availability implementation for data center operations, but also a hub that continuously running applications cannot avoid.

In this article, Lao Wang wants to discuss AD and SMB, which WSFC relies on for its basic operation: under what circumstances it relies on them, why it relies on them, and what happens without them.

First, AD. There is not much to explain here; I believe everyone has some understanding of Microsoft AD domains, and it is one of the few Microsoft products that truly stand out. AD's main purpose is centralized authentication: through a single identity database it provides unified authentication for the users, computers and applications in an environment. Beyond that, it also provides centralized management, policy distribution and resource publishing for the working environment.

The part most directly related to WSFC is authentication. WSFC can be deployed in several modes: normal AD integration, RODC, no CNO, workgroup, and multi-domain. The no-CNO, workgroup and multi-domain modes are similar in that none of them creates computer objects for the cluster. The effect is that cluster applications cannot use Kerberos authentication, only NTLM or a separate authentication mechanism such as SQL Server sa authentication. Clusters deployed in these three modes are also limited in the applications they can support: for example, MSMQ is not supported, Kerberos authentication is not supported for file server clusters, and Hyper-V clusters do not support live migration.

The root cause is the missing CNO. One of the main ideas of a highly available cluster is that it presents itself to the outside world as a single computer. The application does not know it is talking to a cluster; it only knows it is talking to a computer, and that it should always be able to reach it. In reality this computer is a logical construct, and behind it the cluster components coordinate the services provided by different nodes. Normally this externally visible computer should have its own NetBIOS name, DNS name and computer object to count as a complete computer that can be accessed and authenticated. With a no-CNO, workgroup or multi-domain deployment, the cluster's logical computer exposes only a DNS name and cannot provide authentication services.

In an AD domain environment, when a user sends a domain logon request, the Local Security Authority (LSA) on the machine first requests a session ticket for the user from the domain controller. If the user's account and password are correct, the domain controller issues the user ticket. The LSA then requests a session ticket for the computer from the domain controller, which verifies whether the computer account password matches AD; if it does, the computer's session ticket is issued. With both tickets in hand, the local LSA builds an access token for the logon, and the user then runs programs with that token. This process is critical: every authenticated logon needs both the user ticket and the computer ticket from AD, and if either one fails, authentication fails and the logon cannot proceed.
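The two-ticket rule described above can be illustrated with a small toy model. This is only a sketch of the logic, not of the real Kerberos exchange: the `Directory` class, the `domain_logon` function and the account names are all hypothetical, and passwords are compared in plain text purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Directory:
    """Toy stand-in for AD (hypothetical): account name -> password."""
    users: dict
    computers: dict

    def issue_ticket(self, table: dict, name: str, password: str) -> Optional[str]:
        # A ticket is issued only when the account exists and the password matches.
        if name in table and table[name] == password:
            return f"ticket:{name}"
        return None


def domain_logon(ad: Directory, user: str, user_pw: str,
                 computer: str, computer_pw: str) -> Optional[dict]:
    """Return an access token only if BOTH the user and computer tickets are issued."""
    user_ticket = ad.issue_ticket(ad.users, user, user_pw)
    if user_ticket is None:
        return None  # wrong user credentials: logon fails
    computer_ticket = ad.issue_ticket(ad.computers, computer, computer_pw)
    if computer_ticket is None:
        return None  # computer object missing or password mismatch: logon fails
    return {"user": user_ticket, "computer": computer_ticket}


ad = Directory(users={"alice": "user-pw"}, computers={"NODE1$": "machine-pw"})
token_ok = domain_logon(ad, "alice", "user-pw", "NODE1$", "machine-pw")

# Simulate the computer object being deleted by mistake.
del ad.computers["NODE1$"]
token_after_delete = domain_logon(ad, "alice", "user-pw", "NODE1$", "machine-pw")
```

In this model a correct user password is not enough: once the computer account is gone, `domain_logon` returns no token, which mirrors the failure mode where a node's computer object has been deleted.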

This matters in a WSFC environment as well. For example, under normal circumstances, if I configure a DTC cluster, external programs that want to access it authenticate directly through the CNO of the DTC cluster. The point is to give users a single logical identity: when a node goes down in the background, users still access the same name, and another node provides the service. Now suppose there is a problem with the computer object of the node that currently hosts DTC, for example it has been deleted by mistake. In the cluster log you can then see that RHS periodically checks the machine's network name and keeps finding that it cannot be reached: the current node cannot contact AD and cannot log on. The cluster does not fail over because of this, but user programs notice that the cluster cannot authenticate. So although we have a logical CNO, we still need to pay attention to the AD computer object of every node, because the actual authentication still involves the node where the application is running. The direct effect here is that the DTC cluster application cannot authenticate, because its node cannot log on to AD and obtain the computer ticket. The fix, once the problem has been identified, is to move DTC to another node that can still obtain tickets from the DC, which restores authentication, and then restore the deleted computer object in AD.

Whether a node's domain computer object is healthy directly affects any cluster application that needs authentication. The service account of the cluster application, the node computer objects and the CNO are therefore equally important, and all of them need close attention in an AD-integrated cluster.

Besides computer tickets, WSFC also relies on AD for Kerberos authentication, for example in SQL Server clustering. When SQL Server uses Windows authentication, it supports Kerberos indirectly through the Windows Security Support Provider Interface (SSPI). By default, SQL Server first negotiates with AD through SSPI: if Kerberos authentication is possible it uses Kerberos, otherwise it falls back to NTLM. This is why many SQL Server administrators run into SPN-related authentication problems. When a SQL Server cluster is configured, it writes SPN values unique to the service account. If that write fails, or the SPNs are later deleted by mistake, the SPN cannot be verified and Kerberos authentication fails. For programs that use Kerberos authentication you therefore also need to watch the SPN values of the service account and of the CNO and VCO objects. It is worth recording them while the cluster is healthy so that they can be restored later.
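As a rough illustration of the "record it when healthy" advice, the sketch below compares a saved SPN baseline with the SPNs currently found on an account and reports what has gone missing. How the two lists are collected is left out. The function name `missing_spns`, the host names and the sample values are hypothetical, and the case-insensitive comparison is an assumption made for this sketch.

```python
def missing_spns(baseline, current):
    """Return the baseline SPNs that are absent from `current`.

    Comparison ignores case and surrounding whitespace (an assumption of
    this sketch); the result keeps the baseline spelling, sorted.
    """
    present = {spn.strip().lower() for spn in current}
    return sorted(spn for spn in baseline if spn.strip().lower() not in present)


# Hypothetical baseline recorded while the cluster was healthy.
baseline = [
    "HOST/SQLVS01",
    "HOST/sqlvs01.contoso.local",
    "MSSQLSvc/sqlvs01.contoso.local:1433",
]
# Hypothetical state read back later, after one SPN was deleted by mistake.
current = ["host/sqlvs01", "HOST/sqlvs01.contoso.local"]

lost = missing_spns(baseline, current)
print(lost)
```

Anything the function returns is a candidate for re-registration on the service account, CNO or VCO it was originally recorded from.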

The CNO and VCO write three attributes to their AD computer objects:

DnsHostName: the FQDN

ServicePrincipalName: the SPN values

HOST/[NetBIOS name of the virtual server]

HOST/[FQDN of the virtual server]

MSClusterVirtualServer/[NetBIOS name of the virtual server]

MSClusterVirtualServer/[FQDN of the virtual server]

MSServerCluster/[NetBIOS name of the virtual server] (this SPN is created only for the default cluster name)

MSServerCluster/[FQDN of the virtual server] (this SPN is created only for the default cluster name)

DisplayName: the NetBIOS name of the network name resource
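To make the SPN pattern in the list above concrete, here is a minimal sketch that builds the expected default SPNs for a network name, which can be handy as a checklist when comparing against what is actually registered in AD. The function name, the `is_cluster_name` flag and the sample names are hypothetical; only the three service classes come from the list above.

```python
def default_network_name_spns(netbios_name, fqdn, is_cluster_name=False):
    """Build the default SPNs listed above for one network name.

    `is_cluster_name` should be True only for the default cluster name (CNO),
    which additionally gets the MSServerCluster SPNs.
    """
    service_classes = ["HOST", "MSClusterVirtualServer"]
    if is_cluster_name:
        service_classes.append("MSServerCluster")

    spns = []
    for service_class in service_classes:
        # Each service class is registered once per name form.
        spns.append(f"{service_class}/{netbios_name}")
        spns.append(f"{service_class}/{fqdn}")
    return spns


# Hypothetical names for illustration only.
vco_spns = default_network_name_spns("DTCVS01", "dtcvs01.contoso.local")
cno_spns = default_network_name_spns("CLUS01", "clus01.contoso.local",
                                     is_cluster_name=True)
```

A VCO yields four entries and the default cluster name yields six, matching the list.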

The SPNs above are registered by default when the cluster is installed. If an application running on top of the cluster needs Kerberos authentication, its SPN is also registered on the CNO or VCO. In the past, some third-party software registered its SPN only against a single node, so after a failover users had to authenticate against a new name. Later, most applications register against the cluster's CNO name directly, so that when one node goes down, authentication switches automatically to another node.

These are the two main dependencies of WSFC on AD. If the account and password your program uses are correct but it still cannot authenticate to the cluster application, check whether the computer object of the node hosting the application is abnormal. If authentication is not using Kerberos as expected, check whether the SPN values of the service account and the VCO are correct; if no VCO is used, check the CNO. Note that a cluster does not fail over because a node's computer object or an SPN value is lost, so when a cluster-based application has authentication problems you need to check the application and AD yourself. The cluster only provides a unified external name, through which authentication can continue when a node goes down.

In addition, the computer account password reset interval set by AD Group Policy can sometimes leave a node computer object or the CNO unable to connect normally. It is therefore recommended to put the cluster computer objects and node objects in a separate OU, set a separate and possibly longer computer password reset interval for that OU, and enable protection against accidental deletion on the cluster OU or the CNO.

The above briefly covers WSFC's dependence on AD: CNO, VCO, logon credential verification, the computer password reset policy, Kerberos authentication and SPNs. If anything is missing, additions are welcome.

SMB is a network file sharing protocol that lets applications and end users access file resources on remote file servers. Unlike FTP, SMB lets files be accessed and transferred directly between machines, without separate upload and download steps.

In the earlier 2000, 2003 and 2008 versions, most people thought of SMB simply as a file sharing protocol, used for sharing and copying files between two machines: client transfers, file server shares, DFS and so on. Since 2012, however, SMB has become more and more important: SMB Multichannel, SMB Direct (RDMA) and the SOFS model were introduced, and performance began to improve greatly. More enterprise scenarios were added from 2012 R2, where it was officially announced that CSV metadata traffic and CSV redirected I/O traffic inside the cluster travel over SMB, and in 2016 it was announced that Storage Replica also uses SMB between nodes.

Let's look first at CSV's dependency on SMB. Since 2012 R2, CSV has used SMB for metadata exchange and redirected I/O traffic between nodes. A brief introduction to CSV, taking the 2012 R2 architecture as an example: to administrators a CSV appears to be mounted on all nodes, and every node can access the CSV volume. In fact, the volume is mounted on only one node at a time, called the coordinator node, and only the coordinator node interacts directly with the underlying NTFS file system. If another node wants to perform a file operation (create, close, rename, change attributes, delete, resize, or any other file system control operation), it must send the metadata for the operation and the target file to the coordinator node, which then carries out the actual operation on the file system. CSV is not a file system. It is an orchestration layer that coordinates the nodes' reads and writes to the NTFS or ReFS file system in order. There is also a second kind of traffic, redirected I/O, and it is costly. When a node loses its connection to the physical disk, all of its reads and writes are forwarded to the coordinator node for execution. This puts a heavy load on the coordinator node, and applications on the node that lost access also run inefficiently. This redirected traffic is carried over SMB as well.
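The routing rules in the paragraph above can be summarized in a tiny decision function. This is a simplified mental model rather than how the CSV driver is implemented: the function name, the operation labels and the returned path names are all hypothetical.

```python
# Operations the text says must be sent to the coordinator node as metadata.
METADATA_OPS = {"create", "close", "rename", "set_attributes", "delete", "resize"}


def route_csv_operation(op, node, coordinator, has_direct_disk_access=True):
    """Return which path an operation takes in this simplified CSV model.

    "local"          : the coordinator node talks to the file system itself
    "smb-metadata"   : metadata operation forwarded to the coordinator over SMB
    "smb-redirected" : read/write forwarded over SMB because the node lost
                       its path to the physical disk (redirected I/O)
    "direct"         : ordinary read/write straight to the disk
    """
    if node == coordinator:
        return "local"
    if op in METADATA_OPS:
        return "smb-metadata"
    if not has_direct_disk_access:
        return "smb-redirected"
    return "direct"


# Hypothetical two-node cluster where NODE1 is the coordinator.
paths = {
    "rename on NODE2": route_csv_operation("rename", "NODE2", "NODE1"),
    "write on NODE2": route_csv_operation("write", "NODE2", "NODE1"),
    "write on NODE2, disk path lost": route_csv_operation(
        "write", "NODE2", "NODE1", has_direct_disk_access=False),
    "write on NODE1": route_csv_operation("write", "NODE1", "NODE1"),
}
```

Two of the four paths in this model go over SMB, which is why CSV cannot work once SMB is taken away.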

For metadata traffic and redirected I/O: SMB introduced Multichannel in 2012. The cluster picks the network with the lowest metric for CSV traffic, and two networks with similar metrics are used together through SMB Multichannel. By default the cluster-communication-only network has the lowest metric. In 2012 the metric value also depends on whether the NIC supports RDMA and RSS, and administrators can set network metrics manually. CSV traffic can take advantage of both SMB Multichannel and SMB Direct (RDMA).
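The selection rule just described (lowest metric wins, similar metrics share the load) might be sketched as follows. The source does not say how close two metrics must be to count as "similar", so the `tolerance` parameter is purely an assumption of this sketch, as are the function name, the network names and the metric values.

```python
def pick_csv_networks(metrics, tolerance=0):
    """Pick the network(s) that would carry CSV traffic in this simplified model.

    metrics   : dict mapping network name -> metric (lower is preferred)
    tolerance : hypothetical threshold; networks whose metric is within this
                distance of the lowest one are used together (Multichannel)
    """
    if not metrics:
        return []
    lowest = min(metrics.values())
    return sorted(name for name, metric in metrics.items()
                  if metric - lowest <= tolerance)


# Hypothetical metric values for three cluster networks.
networks = {"ClusterOnly": 1000, "LiveMigration": 1016, "Public": 10000}

single = pick_csv_networks(networks)                # strict lowest metric
shared = pick_csv_networks(networks, tolerance=50)  # treat close metrics as similar
```

With the default tolerance only the lowest-metric network is chosen; widening it lets the two close networks be used together.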

That is the official position. So if a cluster has no CSV, does not use a file share witness and does not provide file sharing services, can the SMB port be closed? Last year's virus outbreak caused a lot of alarm, and the company security department proposed closing port 445. Can the cluster cope with that? Lao Wang tried it: on a cluster with no CSV, no file share witness and no file sharing services, he disabled SMB outright.

How SMB was disabled

1. Run regedit to open the Registry Editor.

2. Navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\NetBT\Parameters.

3. Right-click an empty area in the right pane, create a new "QWORD (64-bit) value", rename it to "SMBDeviceEnabled", and set its value to 0.

4. Disable the Server service.

Restart the computer. After the restart, SMB port 445 is no longer listening.
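One way to confirm the result of the steps above from another machine is a plain TCP connection test against port 445. The sketch below uses only the Python standard library; the function name and the node name in the usage note are hypothetical. A refused or timed-out connection only shows that nothing reachable is listening, so a firewall in between would look the same.

```python
import socket


def is_tcp_port_open(host, port=445, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `is_tcp_port_open("node1.contoso.local")` would be expected to return False for a node where SMB has been disabled as described, assuming that hypothetical name resolves to the node.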

After this change the cluster can still start, but you can see that it is not working properly. For example, the DTC application cannot be reached, and the log shows that DCOM cannot connect to the DTC VCO.

In fact, Lao Wang had already guessed this might be the result. SMB has been in use for many years and many services depend on it, such as CIFS, RPC, DFS and Netlogon. This way of shutting it down completely also requires stopping the Server service, and every function that depends on that service stops working.

In addition, there are two points the official documentation does not mention. First, when a client logs on to the domain it fetches logon and startup scripts from NETLOGON, and if Group Policy applies, it also fetches the Group Policy templates (GPT) from SYSVOL. Both are shared folders on the domain controller and must be accessed over SMB. If SMB is blocked, new Group Policy settings cannot be applied normally.

Second, Lao Wang noticed something interesting. Running net share on a cluster node shows not only the default share of the C drive but, if the cluster has a disk witness, a share for the witness disk as well. The cluster would not be designed this way for no reason. Although the cluster can still start after SMB is disabled, the witness disk is no longer shared, and Lao Wang suspects this still has some impact on the cluster.

To summarize, if you want to close the SMB port: first, make sure the cluster does not use CSV; second, accept the risk that Group Policy may not apply; third, expect that upper-layer applications may not work properly, and test each cluster application specifically. Applications will only keep working properly if you are sure they depend on neither SMB nor the Server service.

Another option is to change the SMB port used by the cluster, and there are tutorials online on how to do this. However, after such a change every device that needs to communicate with the cluster nodes must use the same SMB port. For example, if a SQL Server cluster changes its SMB port to 555, it still needs to download Group Policy from AD, but the DC listens on port 445 by default, so the DC's SMB port must also be changed to 555, and even the applications that connect to SQL Server must switch to port 555. If a back-end SOFS cluster changes its SMB port, the front-end Hyper-V hosts must be changed too, which is far too much trouble. In a standalone environment without a domain, changing the SMB port while keeping the SMB protocol may be feasible, but in a domain environment one change ripples through the whole system.

Therefore, in Lao Wang's view, WSFC, AD and SMB are very tightly coupled, and closing or changing the SMB port does not look like the best solution at present. It is better to adjust the physical architecture and security policy: for example, place core database applications in a DMZ, strictly control communication with other servers through firewalls, and keep security patches up to date.
