How to Analyze the Consistency Design Principles of NFS File Locks

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article analyzes, in detail, the design principles behind NFS file lock consistency. Interested readers can use it as a reference.

File lock

File locking is one of the most basic features of a file system. With file locks, an application can control concurrent access to a file by other applications. NFS, the standard network file system for UNIX-like systems, gained native file lock support during its evolution (starting with NFSv4). Since its birth in the 1980s, NFS has gone through three major versions: NFSv2, NFSv3 and NFSv4. The biggest change in NFSv4 is that it is stateful: some operations require the server to maintain related state. File locks are one example. When a client requests a file lock, the server must record the state of that lock, otherwise it cannot detect conflicting access from other clients. NFSv3 needs the help of the separate NLM protocol to provide file locking, and errors can occur when the two protocols are not well coordinated. NFSv4 is designed as a stateful protocol that implements file locking itself, so it no longer needs the NLM protocol.

Application interface

Applications manage NFS file locks through the fcntl() or flock() system calls. The call path for acquiring a file lock on a NAS mounted with NFSv4 is as follows:

[Figure: call stack for acquiring an NFSv4 file lock]

As the call stack shows, the NFS file lock implementation largely reuses the design and data structures of the VFS layer. After the file lock is successfully obtained from the server through RPC, locks_lock_inode_wait() is called to hand the lock over to the VFS layer for management. The VFS-layer file lock design is well documented elsewhere and is not described here.
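From the application's point of view, taking a lock on an NFS-mounted file looks the same as on a local file. The following is a minimal sketch, assuming a POSIX system; the helper name set_whole_file_lock() is hypothetical and only wraps the standard fcntl() call with F_SETLK.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: take or release an advisory lock on the whole file
 * without blocking. type is F_RDLCK, F_WRLCK or F_UNLCK.
 * Returns 0 on success and -1 on error (errno is EACCES or EAGAIN when
 * another process holds a conflicting lock). */
int set_whole_file_lock(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "up to the end of the file" */
    return fcntl(fd, F_SETLK, &fl);
}
```

A caller would typically use set_whole_file_lock(fd, F_WRLCK) before modifying the file and set_whole_file_lock(fd, F_UNLCK) afterwards. On an NFSv4 mount, each of these calls results in a LOCK or LOCKU operation sent to the server.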

EOS principle

Taking a file lock is a typical non-idempotent operation. Retries and failover of lock operations can leave the client and the server with inconsistent views of the lock state. NFSv4 uses the seqid mechanism to guarantee that an operation is executed at most once, as follows:

For each open/lock state, the client and the server each maintain a seqid independently. When the client initiates an operation that changes state (open/close/lock/unlock/release_lockowner), it increments its seqid by 1 and sends it to the server as a parameter. Assume the seqid sent by the client is R and the seqid maintained by the server is L. Then:

1) If R == L + 1, the request is legitimate and is processed normally.

2) If R == L, the request is a retry, and the server can return the cached reply.

3) All other cases are illegal requests, and access is refused.

With these rules, the server can determine whether an operation is a normal request, a retry, or an illegal request.
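The three rules above can be sketched as a small classification function. This is an illustration only: the names seqid_verdict and classify_seqid() are hypothetical, and relying on unsigned 32-bit wraparound for L + 1 is an assumption of this sketch rather than something the article specifies.

```c
#include <stdint.h>

/* Hypothetical verdicts for an incoming state-changing request. */
enum seqid_verdict {
    SEQID_NEW,     /* R == L + 1: execute the operation */
    SEQID_REPLAY,  /* R == L: return the cached reply */
    SEQID_BAD      /* anything else: refuse the request */
};

/* r is the seqid sent by the client, l is the seqid kept by the server. */
enum seqid_verdict classify_seqid(uint32_t r, uint32_t l)
{
    if (r == (uint32_t)(l + 1))   /* assumed: seqid wraps as an unsigned 32-bit value */
        return SEQID_NEW;
    if (r == l)
        return SEQID_REPLAY;
    return SEQID_BAD;
}
```

Only the SEQID_NEW case changes server state, which is why a retransmitted LOCK cannot be applied twice.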

This guarantees that each file lock operation is executed at most once on the server, which solves the problem of repeated execution caused by RPC retries. But it is not enough on its own. For example, suppose the calling thread is interrupted by a signal after the LOCK operation has been sent, and the server then receives and executes the LOCK successfully. The server records that the client holds the lock, but the client, because of the interruption, does not record it, so the two sides have inconsistent views of the lock state. The client therefore has to cooperate in handling such exception scenarios to keep the file lock view consistent.

Exception handling

The previous section showed that the client must cooperate in handling exception scenarios to keep the file lock view consistent. So what cooperative design did the client designers come up with? The current client addresses the problem at two levels: SunRPC and the NFS protocol. The following sections describe how the design at each level keeps the file lock state view consistent.

SunRPC design

SunRPC is a network communication protocol that Sun designed for remote procedure calls. This section looks at the implementation-level design of SunRPC from the perspective of keeping file lock views consistent:

1) The client uses a 32-bit xid to identify each remote procedure call issued by the upper layer, and every RPC retry of the same call reuses the same xid. A reply to any one of the retries is therefore enough to tell the upper layer that the call has completed, so the result can still be obtained even when the server takes a long time to execute the call. This differs from frameworks such as netty/mina/brpc, which require each RPC to carry its own xid/packetid.

2) The server keeps a DRC (duplicate request cache) that caches the results of recently executed RPCs. When it receives an RPC, the server first looks up the DRC by xid. A hit means the RPC is a retry, and the cached result is returned directly, which to some extent avoids repeated execution caused by RPC retries. To prevent xid reuse from making the DRC return an unexpected result, the following designs further reduce the probability of such errors:

A) When the client establishes a new connection, the initial xid is a random value.

B) The server DRC also records verification information about the request and checks it on a cache hit.

3) The client is allowed to retry indefinitely until it receives a response from the server, which ensures that the caller obtains a definite execution result from the server. The cost of this strategy is that the caller hangs for as long as there is no response.

4) NFS lets users choose the SunRPC retry policy at mount time through the soft/hard option: soft mode stops retrying after a timeout, while hard mode keeps retrying. With a soft mount, the NFS implementation does not guarantee that the client and the server have a consistent view of state. When a remote procedure call returns a timeout, the application is expected to cooperate in cleaning up and recovering state, for example by closing the files that hit access errors. In practice few applications do so, so NAS users generally mount in hard mode.
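The duplicate request cache from item 2 can be illustrated with a toy, fixed-size cache. This is a sketch under stated assumptions, not the server's actual data structure: the names drc, drc_entry, drc_checksum(), drc_lookup() and drc_store() are hypothetical, the direct-mapped slot layout is chosen for brevity, and the simple multiplicative checksum stands in for whatever verification information a real server records.

```c
#include <stddef.h>
#include <stdint.h>

#define DRC_SLOTS 64

struct drc_entry {
    int      in_use;
    uint32_t xid;        /* xid of the cached request */
    uint32_t checksum;   /* verification info computed over the request */
    int      reply;      /* cached result of the executed request */
};

struct drc {
    struct drc_entry slot[DRC_SLOTS];
};

/* Toy checksum over the request bytes (assumed, for illustration only). */
uint32_t drc_checksum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = sum * 31u + buf[i];
    return sum;
}

/* Returns 1 and fills *reply on a hit; returns 0 on a miss.
 * A matching xid with a different checksum is treated as a miss,
 * which guards against xid reuse by an unrelated request. */
int drc_lookup(const struct drc *c, uint32_t xid, uint32_t checksum, int *reply)
{
    const struct drc_entry *e = &c->slot[xid % DRC_SLOTS];

    if (e->in_use && e->xid == xid && e->checksum == checksum) {
        *reply = e->reply;
        return 1;
    }
    return 0;
}

/* Records the result of a freshly executed request. */
void drc_store(struct drc *c, uint32_t xid, uint32_t checksum, int reply)
{
    struct drc_entry *e = &c->slot[xid % DRC_SLOTS];

    e->in_use = 1;
    e->xid = xid;
    e->checksum = checksum;
    e->reply = reply;
}
```

A server loop would call drc_lookup() first and execute the request, followed by drc_store(), only on a miss.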

In short, one of the core problems SunRPC has to solve is that the execution time of a remote procedure call is unbounded. The protocol designers adopted these measures to avoid, as far as possible, the side effects of retrying non-idempotent RPCs.
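The client-side retry behaviour (same xid on every retry, bounded retries for soft mounts, unbounded retries for hard mounts) can be sketched as follows. Everything here is a simplified model: rpc_call(), send_fn, flaky_server and flaky_send() are hypothetical names, the max_attempts parameter is an assumption standing in for the mount's timeout settings, and the fake transport only simulates lost requests.

```c
#include <stdint.h>

enum mount_mode { MOUNT_SOFT, MOUNT_HARD };

/* Hypothetical transport hook: returns 1 if a reply arrived, 0 on timeout. */
typedef int (*send_fn)(uint32_t xid, void *ctx, int *reply);

/* Issue one remote procedure call. Every attempt reuses the same xid.
 * Soft mode gives up after max_attempts timeouts and returns -1;
 * hard mode keeps retrying until a reply arrives. Returns 0 on success. */
int rpc_call(send_fn send, void *ctx, uint32_t xid,
             enum mount_mode mode, int max_attempts, int *reply)
{
    for (int attempt = 1; ; attempt++) {
        if (send(xid, ctx, reply))
            return 0;
        if (mode == MOUNT_SOFT && attempt >= max_attempts)
            return -1;   /* caller must now clean up possibly stale state */
    }
}

/* Fake server used for demonstration: drops the first few requests. */
struct flaky_server {
    int      drops_left;
    int      sends;      /* total number of transmissions seen */
    uint32_t last_xid;   /* xid of the most recent transmission */
};

int flaky_send(uint32_t xid, void *ctx, int *reply)
{
    struct flaky_server *s = ctx;

    s->sends++;
    s->last_xid = xid;
    if (s->drops_left > 0) {
        s->drops_left--;
        return 0;        /* simulate a timeout */
    }
    *reply = 0;          /* simulate a successful result */
    return 1;
}
```

The soft-mode failure path is the case where the client cannot know whether the server executed the request, which is why the article recommends hard mounts for lock-sensitive workloads.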

Signal interruption

An application may be interrupted by a signal while it waits for the result of a remote procedure call. When that happens, the result of the call is unknown, so the client and the server may end up with inconsistent state: for example, the lock operation has succeeded on the server, but the client is not aware of it. The client therefore has to do extra work to bring its state back in line with the server. The following brief analysis of what happens when a lock acquisition is interrupted by a signal illustrates the consistency design at the implementation level of the NFS protocol.

Tracing the NFSv4 lock acquisition path shows that acquiring a file lock eventually calls _nfs4_do_setlk() to issue the RPC and then calls nfs4_wait_for_completion_rpc_task() to wait for it. Here is the relevant code:

    static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int recovery_type)
    {
        ...
        task = rpc_run_task(&task_setup_data);
        if (IS_ERR(task))
            return PTR_ERR(task);
        /* Wait for the LOCK RPC; a non-zero return means the wait was interrupted */
        ret = nfs4_wait_for_completion_rpc_task(task);
        if (ret == 0) {
            ret = data->rpc_status;
            if (ret)
                nfs4_handle_setlk_error(data->server, data->lsp,
                        data->arg.new_lock_owner, ret);
        } else
            data->cancelled = 1;    /* record the interruption for nfs4_lock_release() */
        ...
    }

The implementation of nfs4_wait_for_completion_rpc_task() shows that ret < 0 means the lock acquisition was interrupted by a signal, which is recorded in the cancelled member of struct nfs4_lockdata. The callback nfs4_lock_release() then runs when the rpc_task is released after completion.

When nfs4_lock_release() detects the signal interruption, it calls nfs4_do_unlck() to try to release the file lock that may already have been acquired. Note that it does not call nfs_free_seqid() to release the held nfs_seqid at this point, in order to:

1) Ensure that the user cannot initiate new concurrent lock or unlock operations while the state is being corrected, which simplifies the implementation.

2) Ensure that, in hard mode, the UNLOCK operation is sent only after the LOCK operation has returned, so that the lock is reliably released.

In this way the client can ensure eventual consistency of the lock state between the client and the server after a signal interruption, at the cost of some availability.
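The recovery flow described above can be modeled in a few lines of user-space C. This is a deliberately simplified sketch, not the kernel code: lock_call, fake_server, lock_release() and unlock_done() are hypothetical stand-ins for struct nfs4_lockdata, the server, nfs4_lock_release() and the completion of the compensating UNLOCK.

```c
/* All names below are hypothetical stand-ins, not kernel identifiers. */
struct fake_server {
    int lock_held;     /* 1 if the server believes the client holds the lock */
};

struct lock_call {
    int cancelled;     /* the waiter was interrupted by a signal */
    int seqid_held;    /* this call still owns the lock seqid */
};

/* Runs when the LOCK RPC task is released.
 * Returns 1 if a compensating UNLOCK is now pending, 0 otherwise. */
int lock_release(struct lock_call *call)
{
    if (call->cancelled)
        return 1;           /* keep the seqid so no other lock or unlock can interleave */
    call->seqid_held = 0;   /* normal path: release the seqid right away */
    return 0;
}

/* Runs when the compensating UNLOCK has completed on the server. */
void unlock_done(struct lock_call *call, struct fake_server *srv)
{
    srv->lock_held = 0;     /* the server no longer records the lock */
    call->seqid_held = 0;   /* only now is the seqid released */
}
```

Holding the seqid across both steps is what serializes the cleanup against new lock requests from the same owner, and it is also where the availability cost comes from: those requests wait until the UNLOCK finishes.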

File locking is a basic feature that file systems support natively. As a shared file system, NAS has to deal with keeping the lock state views of the client and the server consistent, and NFSv4.0 solves this problem to a certain extent. Technology will keep advancing and NFS will keep evolving, so more can be expected from NFS in the future.

That concludes this analysis of the consistency design principles of NFS file locks. I hope it has been helpful. If you found the article useful, feel free to share it so more people can see it.
