Distributed File System (DFS) is a set of client and server services that allow an organization using Microsoft Windows servers to organize many distributed SMB file shares into a distributed file system. DFS provides location transparency and redundancy to improve data availability in the face of failure or heavy load by allowing shares in multiple different locations to be logically grouped under one folder, or DFS root.
DFS has two components, the logical namespace and file replication, and there is no requirement to use them together: the namespace can be used without DFS file replication, and file replication between servers can be used without combining the servers into one namespace.
A DFS root can exist only on a server version of Windows (Windows NT 4.0 and later), on OpenSolaris (in kernel space), or on a computer running Samba (in user space). The Enterprise and Datacenter editions of Windows Server can host multiple DFS roots on the same server. OpenSolaris intends to support multiple DFS roots in “a future project based on Active Directory (AD) domain-based DFS namespaces”.
There are two ways of implementing DFS on a server:
- A standalone DFS namespace allows for a DFS root that exists only on the local computer and thus does not use Active Directory. A standalone DFS can only be accessed on the computer on which it is created. It does not offer any fault tolerance and cannot be linked to any other DFS. This is the only option available on Windows NT 4.0 Server systems. Standalone DFS roots are rarely encountered because of their limited utility.
- A domain-based DFS namespace stores the DFS configuration within Active Directory. The DFS namespace root is accessible at \\domainname\<dfsroot> or \\fq.domain.name\<dfsroot>. The namespace roots do not have to reside on domain controllers; they can reside on member servers. If domain controllers are not used as the namespace root servers, multiple member servers should be used to provide full fault tolerance.
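The location transparency that a namespace provides can be illustrated with a small sketch. This is a toy model, not the DFS referral protocol: the `Namespace` class, the `is_up` callback, and the server and domain names are all hypothetical, and a real client receives an ordered referral list from a namespace server rather than consulting a local dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Toy model of a DFS namespace: folder name -> list of target shares."""
    root: str                                    # hypothetical, e.g. \\example.com\public
    folders: dict = field(default_factory=dict)

    def add_target(self, folder: str, target: str) -> None:
        # Several targets per folder model redundant copies of the same share.
        self.folders.setdefault(folder.lower(), []).append(target)

    def resolve(self, path: str, is_up=lambda target: True) -> str:
        """Map a path under the namespace root to the first reachable target."""
        prefix = self.root.lower() + "\\"
        if not path.lower().startswith(prefix):
            raise ValueError(f"{path!r} is not under {self.root!r}")
        rest = path[len(prefix):]
        folder, _, remainder = rest.partition("\\")
        for target in self.folders.get(folder.lower(), []):
            if is_up(target):
                return target + ("\\" + remainder if remainder else "")
        raise LookupError(f"no reachable target for folder {folder!r}")


ns = Namespace(r"\\example.com\public")
ns.add_target("docs", r"\\server1\docs")
ns.add_target("docs", r"\\server2\docs")

# The client always uses the namespace path; the target behind it can change.
print(ns.resolve(r"\\example.com\public\docs\readme.txt"))
# prints \\server1\docs\readme.txt

# If server1 is unavailable, the same namespace path resolves to server2.
print(ns.resolve(r"\\example.com\public\docs\readme.txt",
                 is_up=lambda t: "server1" not in t))
# prints \\server2\docs\readme.txt
```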
DFS Replication is the successor to the File Replication Service (FRS), which was introduced in the Windows 2000 Server operating systems. It is a state-based, multimaster replication engine that supports replication scheduling and bandwidth throttling. DFS Replication uses a compression algorithm known as remote differential compression (RDC). RDC is a “diff-over-the-wire” client-server protocol that can be used to efficiently update files over a limited-bandwidth network. RDC detects insertions, removals, and rearrangements of data in files, enabling DFS Replication to replicate only the changed file blocks when files are updated.
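The general idea of a “diff-over-the-wire” transfer can be sketched as follows. This is a deliberately simplified illustration and not the actual RDC algorithm: the window size, the divisor, the use of Adler-32 to choose chunk boundaries, and SHA-256 as the chunk signature are all assumptions made for the example. Chunk boundaries are derived from the file content itself, so an insertion disturbs only the chunks near it and the remaining chunks can be reused from the receiver's old copy.

```python
import hashlib
import zlib

WINDOW = 16    # bytes examined to decide on a chunk boundary (illustrative)
DIVISOR = 64   # a boundary is cut when the window checksum is a multiple of this


def chunks(data: bytes) -> list:
    """Split data at content-defined boundaries."""
    out, start = [], 0
    for i in range(WINDOW, len(data) + 1):
        if zlib.adler32(data[i - WINDOW:i]) % DIVISOR == 0:
            out.append(data[start:i])
            start = i
    if start < len(data):
        out.append(data[start:])
    return out


def signature(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def make_delta(new: bytes, receiver_signatures: set) -> list:
    """Sender side: for each chunk of the new file, send only the signature if
    the receiver already holds that chunk, otherwise send the data as well."""
    delta = []
    for c in chunks(new):
        sig = signature(c)
        delta.append((sig, None if sig in receiver_signatures else c))
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Receiver side: rebuild the new file from local chunks plus received data."""
    local = {signature(c): c for c in chunks(old)}
    return b"".join(local[sig] if data is None else data for sig, data in delta)


old = bytes(range(256)) * 8
new = old[:1000] + b"INSERTED" + old[1000:]

receiver_signatures = {signature(c) for c in chunks(old)}
delta = make_delta(new, receiver_signatures)
sent = sum(len(data) for _, data in delta if data is not None)
print(f"file size {len(new)} bytes, chunk data sent {sent} bytes")
assert apply_delta(old, delta) == new
```

How much is saved depends on the data and on where the boundaries fall, so the sketch prints the figure instead of promising a particular ratio.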
DFS Replication uses many sophisticated processes to keep data synchronized on multiple servers. Before you begin using DFS Replication, it is helpful to understand the following concepts.
- DFS Replication is a multimaster replication engine. Any change that occurs on one member is replicated to all other members of the replication group.
- DFS Replication detects changes on the volume by monitoring the update sequence number (USN) journal, and DFS Replication replicates changes only after the file is closed.
- DFS Replication uses a staging folder to stage a file before sending or receiving it. For more information about staging folders, see Staging folders and Conflict and Deleted folders.
- DFS Replication uses a version vector exchange protocol to determine which files need to be synchronized. The protocol sends less than 1 kilobyte (KB) per file across the network to synchronize the metadata associated with changed files on the sending and receiving members.
- When a file is changed, only the changed blocks are replicated, not the entire file. The RDC protocol determines the changed file blocks. Using default settings, RDC works for any type of file larger than 64 KB, transferring only a fraction of the file over the network.
- DFS Replication uses a conflict resolution heuristic of last writer wins for files that are in conflict (that is, a file that is updated at multiple servers simultaneously) and earliest creator wins for name conflicts. Files and folders that lose the conflict resolution are moved to a folder known as the Conflict and Deleted folder. You can also configure the service to move deleted files to the Conflict and Deleted folder so that a deleted file or folder can be retrieved later. For more information, see Staging folders and Conflict and Deleted folders.
- DFS Replication is self-healing and can automatically recover from USN journal wraps, USN journal loss, or loss of the DFS Replication database.
- DFS Replication uses a Windows Management Instrumentation (WMI) provider that provides interfaces to obtain configuration and monitoring information from the DFS Replication service.
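Two of the concepts above, the version vector exchange and the conflict resolution heuristics, can be sketched in a few lines. This is a simplified model under stated assumptions, not the DFS Replication implementation: the `Update` record, its fields, and the function names are hypothetical, and the way ties are broken is arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    origin: str       # member on which the change was made
    version: int      # per-origin counter that increases with each change
    path: str
    modified: float   # last-write time of the file content
    created: float    # creation time of the file


def missing_updates(known_by_sender: list, receiver_vector: dict) -> list:
    """Version vector exchange: the receiver reports the highest version it has
    seen from each origin; the sender returns only the newer updates."""
    return [u for u in known_by_sender
            if u.version > receiver_vector.get(u.origin, 0)]


def resolve_file_conflict(a: Update, b: Update) -> tuple:
    """Last writer wins: returns (winner, loser) for two updates to one file."""
    return (a, b) if a.modified >= b.modified else (b, a)


def resolve_name_conflict(a: Update, b: Update) -> tuple:
    """Earliest creator wins: returns (winner, loser) for two new files that
    were given the same name on different members."""
    return (a, b) if a.created <= b.created else (b, a)


a = Update("A", 3, "report.txt", modified=100.0, created=10.0)
b = Update("B", 1, "report.txt", modified=200.0, created=20.0)

# The receiver has seen everything from A up to version 3 and nothing from B.
print(missing_updates([a, b], {"A": 3}) == [b])   # prints True

winner, loser = resolve_file_conflict(a, b)
print(winner.origin)                              # prints B (later write)

winner, loser = resolve_name_conflict(a, b)
print(winner.origin)                              # prints A (earlier creation)
```

In both conflict cases the losing copy is the one that would be moved to the Conflict and Deleted folder.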