Wednesday, March 10, 2010



DFS Replication: What’s new in Windows Server™ 2008

SYSVOL Replication – Now on DFSR

One of the coolest features in Windows Server™ 2008 is that DFSR can now be used to replicate the SYSVOL share between domain controllers. This feature is, however, limited to domains running at the Windows Server 2008 domain functional level. All domain controllers (including a new Windows Server™ 2008 domain controller) operating below this domain functional level will continue to use NTFRS for SYSVOL replication.

Replication Partners | Replication Engine Used

Windows Server 2003 Domain Functional Level
Windows Server 2003 <-> Windows Server 2003 | NT File Replication Service
Windows Server 2003 <-> Windows Server 2008 | NT File Replication Service
Windows Server 2008 <-> Windows Server 2008 | NT File Replication Service

Windows Server 2008 Domain Functional Level
Windows Server 2008 <-> Windows Server 2008 | Distributed File System Replication Service

Figure 1: SYSVOL Replication on Windows Server.
DFSR also supports Read Only Domain Controllers for SYSVOL replication. On a Read Only Domain Controller, the DFS Replication service will roll back any changes that have been made locally and will not propagate these changes out to other domain controllers.

NOTE: Read only support does not extend to non-SYSVOL replicated folders. Read only domain controllers are supported as leaf nodes only with no outbound connections from the Read only domain controller. Active Directory automatically configures Read Only Domain Controllers in this manner, so no explicit action is required on the part of the end-user/administrator to comply with the above requirement.


DfsrMig.exe – migrate SYSVOL replication from NTFRS to DFSR

Windows Server™ 2008 also ships a command line tool called dfsrmig.exe which can be used by administrators to migrate from NTFRS replication of the SYSVOL folder to using DFSR for replication of SYSVOL. This can be done once the domain functional level has been raised to Windows Server 2008.

The SYSVOL migration process:

a)       Is simple and requires minimal admin intervention.

b)       Is designed to be low risk and requires minimal downtime of the SYSVOL share.

c)       Provides granular control to administrators so that they can monitor the status of migration and commit to DFSR replication of SYSVOL when everything is working smoothly.

d)       Is reversible and allows the migration process to be rolled back prior to the final commit stage, thus allowing administrators to fall back on NTFRS replication of SYSVOL if desired.

e)       Is robust, fault tolerant and is latency resilient, thus making it suitable for migration of domain controllers located in branch offices as well.

f)        Provides built in monitoring mechanisms which can be used to keep track of the status of migration.
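The reversibility described in (d) can be pictured as a small state machine. The sketch below is a hypothetical model, not the dfsrmig.exe implementation: the state names follow the global migration states the tool reports (Start, Prepared, Redirected, Eliminated), while the `SysvolMigration` class and its `set_state` method are invented here purely for illustration.

```python
from enum import IntEnum


class MigrationState(IntEnum):
    START = 0       # SYSVOL is still replicated by NTFRS
    PREPARED = 1    # DFSR replicates a copy of SYSVOL alongside NTFRS
    REDIRECTED = 2  # the SYSVOL share points at the DFSR-replicated copy
    ELIMINATED = 3  # final commit: NTFRS replication of SYSVOL is removed


class SysvolMigration:
    """Hypothetical model of the migration flow (illustration only)."""

    def __init__(self) -> None:
        self.state = MigrationState.START

    def set_state(self, target: MigrationState) -> None:
        # Moving forward or rolling back is allowed until the final commit.
        if self.state is MigrationState.ELIMINATED and target is not MigrationState.ELIMINATED:
            raise ValueError("migration is committed; rollback to NTFRS is no longer possible")
        self.state = target
```

An administrator could move to PREPARED, then REDIRECTED, fall back to START if something looks wrong, and only lose the rollback option after reaching ELIMINATED.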

A more detailed blog post covering the steps involved in SYSVOL migration will follow.


Performance gets a boost

Some of the key enhancements in the DFS Replication service in Windows Server™ 2008 are on the performance front. While working with customers on Windows Server™ 2003 R2 based DFSR deployments, we noticed that there is scope for enhancing performance when replicating workloads as diverse as the ‘hundreds of small files’ and ‘large file’ workloads. The targeted performance work in Windows Server™ 2008 should benefit deployments of DFSR on heavily loaded hub servers.

1.       “Over-the-wire” enhancements: RPC Asynchronous Pipes

DFSR in Windows Server™ 2008 builds on some wonderful work done in the Windows RPC subsystem to support RPC asynchronous pipes. This has enabled DFSR to optimize at the replication protocol level and implement RPC asynchronous pipe support for replication. DFSR uses RPC asynchronous pipes for files larger than 256 KB. Using asynchronous RPC pipes brings several benefits:

·        Multiple outstanding calls from a replication partner: In Windows Server™ 2003 R2, DFSR on a member receiving updates (ex: a hub server) is blocked in a remote procedure call until the call returns. This prevents the thread servicing that request from having multiple outstanding calls, or performing other work while waiting for the RPC call to return with data. On Windows Server™ 2008, the asynchronous RPC pipes based implementation enables the thread servicing requests to continue to service other outstanding requests from the replication partner while waiting for already issued RPC calls to return with data.

·        Slow or delayed partners: If a replication partner is slow to produce data (for instance a remote DFSR server over a slow link) the DFSR thread servicing that partner doesn’t need to block until data is available. Thus slow replication partners do not end up throttling replication performance.

·        Replication of large amounts of data: Transferring large amounts of data between replication partners, especially over slow links, ties up worker threads at both ends for the duration of the transfer. With RPC asynchronous pipes, this data transfer can take place incrementally, and without blocking either end from performing other tasks.

NOTE: This feature requires both ends to be running Windows Server™ 2008. In mixed mode deployments where Windows Server™ 2008 and Windows Server™ 2003 R2 servers co-exist, the DFS Replication service will default to not using RPC asynchronous pipes for replication activity.
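To get a feel for why non-blocking calls help with slow partners, here is a minimal sketch using Python's asyncio as a stand-in for RPC asynchronous pipes. It is not how DFSR is implemented. The partner names, the delays, and the reading of "256K" as 256 * 1024 bytes are assumptions made for the example.

```python
import asyncio

# Assumption: "256K" in the text is read here as 256 * 1024 bytes.
ASYNC_PIPE_THRESHOLD = 256 * 1024


def uses_async_pipe(file_size: int, both_ends_2008: bool = True) -> bool:
    """Async pipes apply only to larger files and only when both ends support them."""
    return both_ends_2008 and file_size > ASYNC_PIPE_THRESHOLD


async def fetch_update(partner: str, delay: float, finished: list) -> None:
    # The sleep stands in for waiting on a partner across a slow link.
    await asyncio.sleep(delay)
    finished.append(partner)


async def service_partners(partners: dict) -> list:
    """Issue one call per partner without blocking on any single one."""
    finished = []
    await asyncio.gather(*(fetch_update(p, d, finished) for p, d in partners.items()))
    return finished  # partners in the order their data arrived
```

Calling `asyncio.run(service_partners({"slow-branch": 0.2, "fast-branch": 0.01}))` returns the fast partner first, even though the slow partner's call was issued earlier: the slow link does not hold up the rest.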

2.       Server-side performance tweaks:

a)       Un-buffered I/O: The DFS Replication service on Windows Server™ 2008 now implements un-buffered local I/Os which increases throughput, since the number of data copy operations that would be effected during the course of regular replication activity is greatly reduced. Not requiring data buffers to be copied multiple times means that much more juice for a heavily loaded hub server which is replicating with multiple replication partners.

b)       Asynchronous Low Priority I/Os: In Windows Server™ 2008, the DFS Replication service is able to leverage a new feature called low priority I/Os. This feature enables background processes such as DFSR which is a Windows service to run with lower-priority access to the hard disk drive than other programs. Applications which are low-priority I/O compatible are able to run on a server without slowing down other high priority processes and thus help improve the responsiveness of a server even though it is dealing with a lot of replication load.

NOTE: As a result of the two above mentioned performance tweaks, the Windows Cache Manager’s buffers are not polluted with replication related data. This drastically reduces the performance impact on the server during heavy replication activity. Thus, running the DFS Replication service on a hub server which is handling large amounts of replication workload will not cause the server to be brought down to a crawl.

c)       16 concurrent file downloads: In Windows Server™ 2008, owing to the usage of asynchronous low priority local I/Os as well as the usage of RPC asynchronous pipes, the number of concurrent file downloads has been bumped up to 16 as against the existing limit of 4 on Windows Server™ 2003 R2. This allows hub servers running Windows Server™ 2008 to download multiple updates from their replication partners concurrently.

Windows Server™ 2003 R2 | Windows Server™ 2008
Multiple RPC calls | RPC Asynchronous Pipes
Synchronous I/Os | Asynchronous I/Os
Buffered I/Os | Un-buffered I/Os
Normal priority I/Os | Low Priority I/Os
4 concurrent downloads | 16 concurrent downloads

Figure 2: The performance dashboard
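The jump from 4 to 16 concurrent downloads can be pictured as a wider gate on the number of transfers in flight. The following sketch uses an asyncio semaphore to simulate that cap. The function name and the simulated download are hypothetical; only the limits of 4 and 16 come from the comparison above.

```python
import asyncio

# Limits taken from the comparison above.
MAX_CONCURRENT_DOWNLOADS = {"2003 R2": 4, "2008": 16}


async def download_all(file_count: int, limit: int) -> int:
    """Run file_count simulated downloads; return the peak number in flight."""
    gate = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def download_one() -> None:
        nonlocal in_flight, peak
        async with gate:               # wait for a free download slot
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the actual transfer
            in_flight -= 1

    await asyncio.gather(*(download_one() for _ in range(file_count)))
    return peak
```

With 40 pending files, a limit of 4 never has more than 4 transfers active at once, while a limit of 16 lets a hub server work through the same backlog in fewer rounds.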
3.       Algorithmic Enhancements:

Based on feedback from some customers who have been using DFSR to replicate data between a central datacenter and multiple remote sites (some of these on slow links), we have enhanced the DFSR service to perform better under these conditions. In experiments conducted to simulate these conditions, it was found that replication member servers on slow links were often throttling the rate at which a datacenter (hub) server was able to collect updates from its replication partners.

Therefore, the algorithm employed to schedule the download of updates has been reworked to distribute the right to send updates more fairly amongst replication partners.
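The post does not describe the reworked algorithm, so the sketch below only illustrates the general idea of fairness with a plain round-robin: each partner with pending updates gets one turn per round, so a partner with a long backlog cannot monopolize the hub. The function and the partner names are assumptions for illustration, not DFSR's actual scheduler.

```python
from collections import deque


def round_robin_schedule(pending: dict) -> list:
    """Interleave pending updates so each partner gets one turn per round."""
    queues = deque(
        (partner, deque(updates)) for partner, updates in pending.items() if updates
    )
    order = []
    while queues:
        partner, updates = queues.popleft()
        order.append((partner, updates.popleft()))
        if updates:  # partner still has work: send it to the back of the line
            queues.append((partner, updates))
    return order
```

For example, `round_robin_schedule({"branch-a": ["a1", "a2", "a3"], "branch-b": ["b1"]})` serves `b1` right after `a1` instead of making branch-b wait for all of branch-a's updates.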



Improved Dirty Shutdown Recovery

In Windows Server™ 2008, recovery from dirty shutdowns is greatly enhanced thus leading to a more resilient DFS Replication service.

The DFS Replication service maintains state information pertaining to the contents of the folders replicated by it in a database on the volume hosting the replicated folder. In this database, it maintains a record of file versions and other metadata, which is what enables it to function as a multi-master file replication engine and to automatically resolve conflicts. The DFS Replication service is essentially a consumer of the NTFS USN (Update Sequence Number) journal – a journal of updates to files/folders maintained by NTFS. Entries in this journal notify the DFS Replication service about changes occurring to the contents of a replicated folder and end up triggering replication activity. Every unique change occurring on the file system (pertaining to a folder replicated by DFSR) will trigger the creation/update of a record in the DFSR database as well.
Sometimes, it is possible that the database and the file system go out of sync. Examples of such scenarios are abrupt power loss on the server or if the DFSR service were terminated abnormally for any reason. Another example is if the volume hosting a DFSR replicated folder loses its power, gets disconnected or is forced to dismount. These eventualities result in ‘Dirty Shutdown’ of DFSR and can cause inconsistencies between the database and the file system. DFSR is designed to automatically recover from these situations and will validate the contents of the replicated folder against entries stored in the database for consistency. This is achieved by implementing USN check-pointing, which is a way of keeping track of which USN changes have been committed to the database.
If a dirty shutdown is detected on Windows Server™ 2003 R2, the DFS Replication service triggers an expensive rebuild of the database which is time consuming. Therefore, the service takes more time to recover from dirty shutdowns. In Windows Server™ 2008, sophisticated validation algorithms have been incorporated which reduce the requirement of expensive database rebuilds to the extent possible. Thus, the recovery process has been enhanced to perform much better and will automatically recover from dirty shutdown conditions without consuming as much time as on Windows Server™ 2003 R2.
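The validation algorithms themselves are not described here, but the general idea of USN check-pointing can be sketched: if the database remembers the highest USN it has committed, recovery only needs to replay journal entries newer than that checkpoint instead of rebuilding everything. The class and function below are hypothetical and assume the journal is a list of (USN, path) pairs in ascending USN order.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaDatabase:
    records: dict = field(default_factory=dict)  # path -> last USN recorded for it
    checkpoint_usn: int = 0                      # highest USN known to be committed

    def commit(self, usn: int, path: str) -> None:
        self.records[path] = usn
        self.checkpoint_usn = usn


def recover(db: ReplicaDatabase, journal: list) -> int:
    """Replay journal entries newer than the checkpoint; return how many were replayed."""
    replayed = 0
    for usn, path in journal:  # assumed to be sorted by ascending USN
        if usn > db.checkpoint_usn:
            db.commit(usn, path)
            replayed += 1
    return replayed
```

If the service had committed USNs 1 and 2 before losing power and the journal holds entries 1 through 4, `recover` touches only the last two entries.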
And … an old ‘formula’ is retired!

In Windows Server™ 2003 R2, there was a scalability limit defined by the below formula. The blog post which explains this in detail can be found here.

On each server, the result of the following formula should be kept to 1024 or fewer:


(number of replicated folders in replication group x * number of simultaneously replicating connections in replication group x)
+ (number of replicated folders in replication group y * number of simultaneously replicating connections in replication group y)
+ ...
+ (number of replicated folders in replication group n * number of simultaneously replicating connections in replication group n)
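As a quick worked example of the formula, the sketch below sums folders times connections over every replication group on a server and compares the result with the limit of 1024. The function names and the sample numbers are made up for illustration.

```python
COUNTER_LIMIT = 1024  # limit stated for Windows Server 2003 R2


def counter_load(groups: list) -> int:
    """Sum (replicated folders * simultaneously replicating connections) per group."""
    return sum(folders * connections for folders, connections in groups)


def within_limit(groups: list) -> bool:
    return counter_load(groups) <= COUNTER_LIMIT
```

A server in two groups with (10 folders, 50 connections) and (20 folders, 25 connections) scores 10 * 50 + 20 * 25 = 1000, which is within the limit. Adding a third group with 5 folders and 5 connections pushes it to 1025, which is over.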

This formula came into existence on R2 because the DFS Replication service made use internally of a library to maintain performance counters. This library had a hard limit on the number of performance counter objects which could be created – yes, 1024. Since DFSR uses performance counters extensively for each replicated folder and for keeping track of state information pertaining to every connection with a replication partner, this limit of 1024 affected scalability.

With Windows Server™ 2008, the DFS Replication service has been upgraded to use a new and improved version of this library. This version doesn’t have the limitation of 1024 performance counter objects (an unlimited number of performance counter objects can be created), and therefore the above formula is no longer applicable on Windows Server™ 2008.

Tuesday, March 9, 2010


Key DFS improvements in Windows Server 2008 R2


By Brien M. Posey, Contributor



Distributed File System (DFS) has been part of Windows Server for many years, and while it has matured with time, certain scalability problems have remained.
For example, while DFS can be scaled to work in large environments, you'll find that configuring, managing and troubleshooting the system becomes exponentially more complicated as the size of the deployment increases.
Fortunately, Microsoft has addressed these scalability and troubleshooting problems with Windows Server 2008 R2.
Scalability enhancements
For the most part, the scalability improvements made to DFS in Windows Server 2008 R2 are best suited for organizations with multiple branch offices.


These types of organizations often have a hub-and-spoke topology, which means their DFS servers are located in a main office – the hub – with the servers' contents replicated to smaller DFS servers in branch offices.
The problem with this type of architecture is that if a DFS server in the main office fails catastrophically, then all of the branch offices could be impacted.
While creating additional replicas in the main office would be one solution, depending on the replication topology's configuration, these additional replicas may not be able to push updates to the branch offices.
Another solution would be a full mesh topology; however, this path is often avoided because of the expense of the extra WAN links and the volume of replication traffic.
Ultimately, the answer for many organizations is to create a replication topology with two hub members. With this in mind, it's obvious why one of the most welcome new features in R2 is DFS support for failover clusters. Basically, clustering the hub servers in the main office can prevent the branch office replicas from becoming cut off by a hub server failure.
Another improvement in DFS with Windows Server 2008 R2 is the ability to create read-only replicated folders.
In the past, if you needed a user at a branch office to access data in a replicated folder -- but you didn't want them to change the data -- you needed to use access control lists (ACLs) to grant the particular user read-only access. This required a lot of administrative effort, especially if the branch offices had a high employee turnover.
A new alternative is to create DFS replicas in the branch offices that contain read-only replicated folders, which has the same effect as granting users read-only access to a traditional replicated folder.
New troubleshooting capabilities
In the updated version of DFS, Microsoft also extended the dfsrdiag.exe command line tool to include new functionalities.
The first of these extensions is the file hash function (DFSRDIAG.EXE FILEHASH). With this function you can compare the authoritative copy of a file against its replicated self by seeing if the file hashes are identical.
Furthermore, the new Replication State function (DFSRDIAG.EXE REPLSTATE) allows you to analyze the current state of the replication service. With this, you can see what files are being updated on replication partners.
The basic idea behind another new function, ID Record (DFSRDIAG.EXE IDRECORD), is that every file and folder within a replicated folder has a corresponding ID record within the server's database, which is linked to valuable data like version and timestamp information.
With this function, you can determine a file or folder's record number and extract data bound to that record. This capability can be extremely helpful if you want to compare files stored on DFS replicas for consistency.
Overall, the changes Microsoft has made to DFS in Windows Server 2008 R2 should improve scalability and make the DFS easier to troubleshoot.
Read-only replicated folders on Windows Server 2008 R2
Why deploy read-only replicated folders?
Consider the following scenario. Contoso Corporation has a replication infrastructure similar to that depicted in the diagram below. Reports are published to the datacenter server and these need to be distributed to Contoso’s branch offices. DFS Replication is configured to replicate a folder containing these published reports between the datacenter server and branch office servers.
The DFS Replication service is a multi-master file replication engine – meaning that changes can be made to replicated data on any of the servers taking part in replication. The service then ensures that these changes are replicated out to all other members in that replication group and that conflicts are resolved using ‘last-writer-wins’ semantics.
[Figure: Accidental deletions]
Now, a Contoso employee working in a branch office accidentally deletes the ‘Specs’ sub-folder from the replicated folder stored on that branch office’s file server. This accidental deletion is replicated by the DFS Replication service, first to the datacenter server and then via that server to the other branch offices.
[Figure: Deletion on hub server]
Soon, the ‘Specs’ folder gets deleted on all of the servers participating in replication. Contoso’s file server administrator now needs to restore the folder from a previously taken backup and ensure that the restored contents of the folder once again replicate to all branch office file servers.
Administrators need to monitor their replication infrastructure very closely in order to prevent such situations from arising or to recover lost data if needed. Strict ACLs are a way of preventing these accidental modifications from happening, but managing ACLs across many branch office servers and for large amounts of replicated data quickly degenerates into an administrative nightmare. In case of accidental deletions, administrators need to scramble to recover data from backups (often up-to-date backups are unavailable) and in the meantime, end-users face outages leading to loss of productivity.
[Figure: Read-only deployment]
This situation can be prevented by configuring read-only replicated folders on branch office file servers. A read-only replicated folder ensures that no local modifications can take place and the replica is kept in sync with a read-write enabled copy by the DFS Replication service. Therefore, read-only replicated folders enable easy-to-deploy and low-administrative-overhead data publication solutions especially for branch office scenarios.
How does all this work?
For a read-only replicated folder, the DFS Replication service intercepts and inspects every file system operation. This is done by virtue of a file system filter driver that layers above every replicated folder that is configured to be read-only. Volumes that do not host read-only replicated folders or volumes hosting only read-write replicated folders are ignored by the filter driver.
  • Only modifications initiated by the service itself are allowed – these modifications are typically caused by the service installing updates from its replication partners. This ensures that the read-only replicated folder is maintained in sync with a read-write enabled replicated folder on another replication partner (presumably located at the datacenter server).
  • All other modification attempts are blocked – this ensures that the contents of the read-only replicated folder cannot be modified locally. As shown in the below figure, end-users are unable to modify the contents of the replicated folder on servers where it has been configured to be read-only. The behavior is similar to that of a read-only SMB share – contents can be read and attributes can be queried for all files, however, modifications are not possible.

[Figure: Deletion blocked]
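The two rules above boil down to a simple decision per file system operation. The sketch below models that decision in Python. It is only an illustration of the rules as stated, not the filter driver's code, and the `Operation` enum and `is_allowed` function are hypothetical names.

```python
from enum import Enum


class Operation(Enum):
    READ = "read"
    QUERY_ATTRIBUTES = "query_attributes"
    CREATE = "create"
    MODIFY = "modify"
    DELETE = "delete"
    SET_ATTRIBUTES = "set_attributes"


# Operations that would change the contents of the replicated folder.
MODIFYING = {Operation.CREATE, Operation.MODIFY, Operation.DELETE, Operation.SET_ATTRIBUTES}


def is_allowed(op: Operation, initiated_by_service: bool, folder_read_only: bool) -> bool:
    """Decide whether an operation on a replicated folder may proceed."""
    if not folder_read_only:
        return True               # read-write folders are not filtered
    if op not in MODIFYING:
        return True               # reads and attribute queries always pass
    return initiated_by_service   # only the replication service may modify
```

So a branch office user deleting a file on a read-only replica is refused, while the same deletion arriving as a replicated update from the datacenter goes through.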
A note on connections
Please note that connections between replication members should continue to be two-way connections. Microsoft does not recommend or support the configuration of one-way connections between replication members.
The DFS Replication service prevents local modifications to replicated data on members hosting read-only replicated folders. Also, the service ensures that absolutely no changes are replicated out from a member hosting a read-only replicated folder to other replication member servers. Therefore, there is no fear of unwanted changes replicating out from a member server configured to be read-only. As a result of these features, we recommend setting up two-way connections even if one of the replication partners hosts a read-only replicated folder. The outbound connection from the member server hosting the read-only replicated folder will only be used for version vector comparison and the service will ensure that no changes are replicated out.
Therefore, the read-only replicated folders feature precludes the need for configuring one-way replication using one-way replication connections between member servers.
Deployment configurations
In a given replication group, a member server hosting a read-only replicated folder must be connected to a replication partner hosting the corresponding read-write enabled replicated folder.
[Figure: Valid configurations]
Therefore, it is not possible to connect two members hosting read-only replicated folders to each other. On Windows Server 2008 R2, the DFS Management console performs appropriate connection topology validation to ensure that this requirement is met.
[Figure: Invalid configurations]
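The kind of check the console performs can be sketched in a few lines. This is not the DFS Management console's code: the function below is a hypothetical validator that treats each connection as a two-way link between two member names and reports any topology that breaks the requirement described above.

```python
def validate_topology(read_only: set, connections: list) -> list:
    """Return a list of problems; an empty list means the topology is valid."""
    errors = []
    # Two members hosting read-only replicated folders must not be connected.
    for a, b in connections:
        if a in read_only and b in read_only:
            errors.append(f"{a} <-> {b}: both members are read-only")
    # Every read-only member needs at least one read-write partner.
    for member in sorted(read_only):
        partners = {b if a == member else a for a, b in connections if member in (a, b)}
        if not (partners - read_only):
            errors.append(f"{member}: no read-write replication partner")
    return errors
```

A hub connected to two read-only branch servers passes, while connecting the two branch servers directly to each other is flagged.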

Core Configurator 2.0 for Windows 2008 R2



Core Configurator 2.0 is now available to download from: http://coreconfig.codeplex.com/
If you are unfamiliar with this tool, it is a graphical tool that allows you to configure a whole bunch of system settings on a Windows Server Core installation.
Core Configuration tasks include:
  • Product Licensing
  • Networking Features
  • DCPromo Tool
  • iSCSI Settings
  • Server Roles and Features
  • User and Group Permissions
  • Share Creation and Deletion
  • Dynamic Firewall settings
  • Display | Screensaver Settings
  • Add & Remove Drivers
  • Proxy settings
  • Windows Updates (Including WSUS)
  • Multipath I/O
  • Hyper-V including virtual machine thumbnails
  • JoinDomain and Computer rename
  • Add/remove programs
  • Services
  • WinRM
  • Complete logging of all commands executed

DFS Improvements in Windows 2008 R2



Windows Server 2008 R2 will feature some major DFS improvements. An explanation of each follows:
1. Support for Windows Failover Clusters
In Windows Server 2008 R2, Windows Failover clusters can be configured to be part of a replication group. Windows Failover clustering technology enables administrators to configure services and applications to be highly available.
Busy hub servers located in the datacenter that replicate with many branch office servers are perfect candidates for clustered DFS Replication. These servers are critical to the replication infrastructure and administrators expect high availability from these servers. A failure (hardware/software) on such crucial servers has the potential to bring all replication activity to a standstill.
2. Read-only Replicated Folders
Often, customers use the DFS Replication service to publish data from a central server out to many branch office servers. A typical characteristic of this data is that it is created/modified at one location (typically the hub/datacenter server) and changes aren’t expected to occur on any of the other member servers. Usually, administrators configure strict ACLs for the replicated data to ensure that changes aren’t made by end-users in branch offices.
Configuring and maintaining strict ACLs to block accidental modifications, or recovering data that has been accidentally deleted, entails high administrative overhead. A new feature in Windows Server 2008 R2 called ‘Read-only replicated folders’ offers an easy-to-manage solution to this problem.
In essence, on a read-only replica:
  • Local modifications are blocked by the DFS Replication service. Changes to files/folders including creation, deletion, modification of attributes/permissions etc. are not possible. Semantically, the read-only replicated folder mimics a read-only share.
  • Changes from members hosting read-write copies are replicated in. The DFS Replication service replicates in changes from other replication partners that host read-write enabled copies of the replicated folder. This ensures that the data remains up to date on the read-only replica.
3. SYSVOL on Read-only Domain Controllers
Windows Server 2008 introduced support in the DFS Replication service for Read-only Domain Controllers (RODC).
Building on this implementation, in Windows Server 2008 R2 the concept of read-only replicated folders has been extended to the SYSVOL replicated folder. This means that on read-only domain controllers running Windows Server 2008 R2 and using the DFS Replication service to synchronize the SYSVOL share, the SYSVOL share is configured as a read-only replicated folder. Therefore, there is a change in the end-user experience for the SYSVOL share exposed by a Windows Server 2008 R2 domain controller.
On Windows Server 2008 based RODC: Changes can be made to the contents of the SYSVOL share on RODCs. However, the DFS Replication service monitors these changes and then asynchronously overwrites these changes with updates from a writable domain controller. Therefore, any changes made on a Windows Server 2008 RODC will be visible for a short duration, until they are reverted by the DFS Replication service.
On Windows Server 2008 R2 based RODC: Changes cannot be made to the contents of the SYSVOL share on RODCs. Any attempts to make such modifications will encounter an ACCESS DENIED error. Therefore, the SYSVOL share on such RODCs looks like a read-only share.
4. Diagnostics Improvements
Windows Server 2008 R2 also features a set of powerful enhancements to the diagnostics capabilities of the DFS Replication service. These are in the form of new command line options to the dfsrdiag.exe command line diagnostics tool.
Dfsrdiag.exe ReplState : This command line switch provides an insight into the current working state of the DFS Replication service. This command initiates a snapshot of the internal state of the service and thereby gathers a list of the updates that are currently being processed (downloaded from or served out to replication partners) by the service.
Using this command line switch, an administrator can retrieve a snapshot of the status of replication activity across all connections on a given DFS Replication member server.
Dfsrdiag.exe IdRecord: The DFS Replication service maintains a record for each file and folder in the replicated folder in its database. These are known as ID records. This command line switch can be used to display the ID record information maintained by the DFS Replication service corresponding to a particular file/folder in its database. The ID record also contains information such as version vectors of the file/folder, timestamps etc.
Using this command line switch, an administrator can dump the ID record corresponding to a file/folder on each individual replication member server. In order to check whether a particular update has replicated to the member servers in the replication group, the version information in the ID records on these members can be compared.
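Once the ID records have been dumped on each member, the comparison itself is straightforward. The sketch below assumes the version information has already been collected into a dictionary keyed by member name. The data shape and the function name are hypothetical, and the sketch does not parse real dfsrdiag.exe output.

```python
def members_missing_update(versions: dict, source: str) -> list:
    """Return the members whose recorded version differs from the source member's."""
    expected = versions[source]
    return sorted(m for m, version in versions.items() if version != expected)
```

For instance, if the hub and one branch both report version 42 for a file while another branch still reports 41, the function returns just the branch that is behind.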
Dfsrdiag.exe FileHash: This command line switch computes and displays the file hash generated by the DFS Replication service for a particular file. The file hash can be used to compare two files and determine whether they are identical.
For instance, while pre-seeding the contents of a newly added replicated folder, it is often required to verify whether the file being pre-seeded is identical (attributes, timestamps, ACLs etc.) to that on the authoritative copy being used as the source for pre-seeding. By comparing the file hash generated by this command line option for two files, it is possible to verify if the files are identical. If the data being pre-seeded is identical to that contained on the source server, the DFS Replication service will complete initial sync much faster. This is because it does not need to download the file data and merely downloads metadata.
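The comparison step can be illustrated with an ordinary content hash. Note the assumption: the hash computed by the DFS Replication service also reflects attributes, timestamps and ACLs, whereas the sketch below hashes file contents only with SHA-256, so it is a stand-in for the idea rather than a replacement for dfsrdiag.exe. The function names are hypothetical.

```python
import hashlib


def content_hash(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preseed_matches(source_path: str, preseeded_path: str) -> bool:
    """True when the pre-seeded copy has the same contents as the source file."""
    return content_hash(source_path) == content_hash(preseeded_path)
```

A mismatch here means the file data would have to be replicated during initial sync. A match on contents alone does not guarantee that the service's own hash will match, since metadata may still differ.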