Abstract
NFS servers implement the Network Lock Manager (NLM) protocol so that multiple clients can share access to files and directories without risking corruption from competing updates. On a single NFS server, implementing NLM is relatively straightforward. On clustered servers, it becomes more complicated during normal operation, and more so again when the failure of a cluster node necessitates lock recovery. This presentation briefly describes NLM protocol semantics and operation, outlines the complications that clustered NFS servers introduce, presents some typical solutions and their shortcomings, and concludes by describing a novel solution that both outperforms conventional approaches and delivers stronger guarantees of correctness.