Real-World Scripting: Data Migration with Robocopy, Part 2

When a script is to perform a major IT undertaking such as data migration, you need to write a script that not only performs the task but also minimizes associated risks and downtime. This two-part series shows you how to write such a script.

Dick Lewis

March 25, 2001


Windows Shell Scripting

When a script performs a major IT undertaking such as a data migration, the script must not only perform the task but also minimize the associated risks and downtime. RobocopyDataMigration.bat is an example of such a script. This script:

  • Copies the entire file and folder structure from the old array to the new array while maintaining complicated NTFS file and folder permissions

  • Creates a detailed log of the migration results for each top-level folder

  • Verifies the results of the copy operation

  • Determines how long the system will need to be offline for the data migration

  • Permits graceful fallback if hardware fails or other unforeseen circumstances arise during the data migration

Last month, I began describing how RobocopyDataMigration.bat works. Specifically, I discussed how the script uses Robocopy 1.96 to perform the copy operation and the Now utility to create a detailed log. (You can find both utilities in the Microsoft Windows 2000 Resource Kit or the Microsoft Windows NT Server 4.0 Resource Kit Supplement 4.) This month, I show you how the script verifies the copy operation results, determines offline time, and permits graceful fallback. I also show you how to adapt the script for your environment.

Verifying Results
The Robocopy utility produces a detailed log of copy operation successes and failures. To further verify these results, RobocopyDataMigration.bat uses the resource kits' Diruse utility to obtain data on directory and file space usage. Listing 1 shows the script's syntax for Diruse. In this syntax, the /b switch tells Diruse to report disk usage in bytes and the /s switch tells it to include subdirectories. Passing Diruse's output through the code Find "TOTAL" isolates the total number of bytes and the total number of files in the specified folder (in this case, the source folder).

To capture the byte and file counts from Diruse's output, the script uses the For command. As Listing 1 shows, the For command's /f switch parses Diruse's output into tokens. The For command then assigns the byte count and the file count to the srcbytes and srcfiles variables, respectively.
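The parsing step can be illustrated outside of batch. The following Python sketch is not the script's code. It assumes that the total line holds the byte count first and the file count second, followed by a TOTAL: label, which is an assumption about Diruse's output layout. The function name, variable names, and sample line are hypothetical.

```python
def parse_total_line(line):
    """Return (byte_count, file_count) from a Diruse-style total line.

    Assumed layout (not confirmed by the article):
        "<bytes>  <files>  TOTAL: <path>"
    """
    tokens = line.split()
    # Like For /f with two tokens, take the first two whitespace-delimited
    # fields. Strip thousands separators in case the output contains them.
    byte_count = int(tokens[0].replace(",", ""))
    file_count = int(tokens[1].replace(",", ""))
    return byte_count, file_count


# Hypothetical output line for a source folder.
sample = "   1048576      42  TOTAL: E:\\test\\Sales"
src_bytes, src_files = parse_total_line(sample)
print(src_bytes, src_files)
```

If the real output orders its columns differently, only the token positions in the function would need to change.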

Next, RobocopyDataMigration.bat captures the destination folder's byte and file counts and assigns them to the destbytes and destfiles variables, respectively. The script then uses a string comparison to compare the source folder's counts with the destination folder's counts. Having an equal number of bytes and files in both the source and destination folders is a reliable indicator of a successful copy operation. Note that if you run RobocopyDataMigration.bat with Robocopy's /l switch in place, the string comparison will fail because no files are copied. (As I described in "Data Migration with Robocopy, Part 1," the /l switch tells Robocopy to only list the files, folders, and permissions it would have copied.)

The script writes the byte and file comparison results to the log. In addition, the script uses the comparison results to rename the log so that you know the results by simply looking at the log's filename.
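As a rough illustration of the compare-then-rename idea, the Python sketch below compares the counts as strings and builds a log filename that carries the result. The article doesn't show the naming scheme that RobocopyDataMigration.bat uses, so the MATCH and MISMATCH labels, the filename pattern, and the function names are all hypothetical.

```python
def counts_match(src_counts, dest_counts):
    """Compare (bytes, files) pairs as strings, mirroring a batch string comparison."""
    return tuple(map(str, src_counts)) == tuple(map(str, dest_counts))


def result_log_name(folder, matched):
    """Build a log filename that shows the comparison result at a glance.

    The MATCH/MISMATCH labels and the pattern are assumptions for illustration.
    """
    status = "MATCH" if matched else "MISMATCH"
    return f"{folder}_{status}.txt"


# Hypothetical counts captured for one top-level folder.
source = ("1048576", "42")
destination = ("1048576", "42")
print(result_log_name("Sales", counts_match(source, destination)))
# A list-only run copies nothing, so the destination counts would differ.
print(result_log_name("Sales", counts_match(source, ("0", "0"))))
```

Putting the result in the filename means a directory listing of the log folder is enough to spot a top-level folder that needs attention.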

Predicting Downtime
The first time that you use the Robocopy utility in a live run, it copies the entire file and folder structure to the destination folder. Depending on the size of the data area and the copy throughput, this operation can take some time. On subsequent runs, Robocopy copies only new and changed files. As a result, these later runs are much faster than runs of copy utilities that copy all the data every time (e.g., Xcopy in its default mode).

Timing the copy operation can help you predict the amount of time the server will be offline for the migration. You can time how long any command or script takes to execute with the resource kits' Timethis utility. RobocopyDataMigration.bat uses Timethis with Robocopy to record how long the copy operation takes (i.e., the elapsed time). Listing 1 contains the syntax for using Timethis with Robocopy. In this syntax, the code >>C:\logfile.txt redirects the output of Timethis and Robocopy to the specified log. RobocopyDataMigration.bat then captures the elapsed time from the log and changes the log's filename to incorporate that data. That way, you can look at the filename to learn how long the copy operation took.
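The same pattern (run a command, append its output to a log, and record the elapsed time) can be sketched in Python. This is an illustration of the idea rather than a substitute for Timethis. The function name, the "Elapsed Time" line format, and the placeholder command that stands in for the copy operation are assumptions.

```python
import os
import subprocess
import sys
import tempfile
import time


def time_command(args, log_path):
    """Run a command, append its output and elapsed time to log_path.

    Returns the elapsed time in seconds.
    """
    start = time.monotonic()
    with open(log_path, "a", encoding="utf-8") as log:
        # Append (like >>) the command's stdout and stderr to the log.
        subprocess.run(args, stdout=log, stderr=subprocess.STDOUT, check=False)
        elapsed = time.monotonic() - start
        # This line format belongs to the sketch, not to Timethis.
        log.write(f"Elapsed Time: {elapsed:.3f} seconds\n")
    return elapsed


# A harmless placeholder stands in for the real copy command.
log_file = os.path.join(tempfile.mkdtemp(), "logfile.txt")
placeholder = [sys.executable, "-c", "print('placeholder for the copy command')"]
elapsed = time_command(placeholder, log_file)
print(f"elapsed: {elapsed:.3f} s")
```

Because the log is opened in append mode, repeated runs accumulate in one file, which makes it easy to compare elapsed times across several nights.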

To predict downtime, you need to run RobocopyDataMigration.bat during off hours several times within a short period (e.g., 3 days). After the initial copy run and a couple of runs to copy subsequent changes, predicting future script run times becomes easy. For example, suppose Max, a systems administrator at the XYZ company, needs to transfer data from an old array to a new array. The new array must be operational by 8:00 a.m. on Monday, April 2. About a week before that date, Max schedules RobocopyDataMigration.bat to run every night at 8:00 p.m. Max observes that after users have made changes over a 24-hour period, the script takes about an hour to complete. On a few nights, Max also schedules the script to run at 9:00 p.m. Max concludes that if he runs the script an hour after a previous run, the script takes about 15 minutes to complete.

Max now knows that the final copy operation will take about 15 minutes. Using the resource kits' Rmtshare utility to recreate the shares and their permissions will take another 15 minutes, so Max determines that the server must be offline for 30 minutes the evening before the new array officially becomes operational. Max intentionally plans the rollover to occur on the weekend to minimize the effect on users. However, sometimes users work on weekends, so Max informs all users that the server will be offline from 10:00 p.m. to 10:30 p.m. on Sunday.
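Max's arithmetic can be written down as a small Python sketch. The durations and the 10:00 p.m. start come from the scenario above; the function name is hypothetical, and the date assumes that the Sunday in question is April 1, 2001, the day before the Monday, April 2 deadline.

```python
from datetime import datetime, timedelta


def offline_window(start, copy_minutes, share_minutes):
    """Return (start, end) of the outage: final copy plus share re-creation."""
    total = timedelta(minutes=copy_minutes + share_minutes)
    return start, start + total


# 15 minutes for the final copy and 15 minutes to recreate the shares.
start, end = offline_window(datetime(2001, 4, 1, 22, 0), 15, 15)
print(start.strftime("%H:%M"), "-", end.strftime("%H:%M"))  # 22:00 - 22:30
```

Plugging in the run times measured during the trial runs gives the window to announce to users.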

Sunday arrives. At 8:00 p.m., Max launches the script to copy any new files and changes in the past 24 hours. This run completes in about 1 hour as projected. However, Max notices that some users are still connected and have files open.

At 9:40 p.m., Max launches another run to update the new array with the files that users have added or changed since the previous run at 8:00 p.m. Although this run might not be necessary, Max wants to make sure the downtime doesn't exceed 30 minutes. Just before 10:00 p.m., this second run completes, and Max uses the Net Send /users command to send a message to any connected users that the server is going offline at 10:00 p.m.

At 10:00 p.m., Max removes the old shares, stops the server service, and disconnects any connected users. He launches the script for the last time. The script completes in 15 minutes as projected. Max then checks the log. The byte and file counts match and the log doesn't contain any problems, so he starts the server service. Finally, he recreates the shares and applies the appropriate share permissions. By 10:30 p.m., the new array is online and operating smoothly.

Permitting Fallback
Hardware occasionally fails, and you might need to fall back to the old system. You can use RobocopyDataMigration.bat to copy files from the new array back to the old array. That way, you can quickly revert to the old drives and still provide users with their current data.

Adapting the Script
You can find RobocopyDataMigration.bat in the Code Library on the Windows Scripting Solutions Web site (http://www.winscriptingsolutions.com). Here are the steps to modify the script for your environment:

  1. Configure the pathname for the log. In the line

    Set LOGLOC=C:\temp

    replace C:\temp with your log's pathname.

  2. Configure the pathname to the resource kit, if necessary. If you've installed the NT resource kit properly, the installation created an environment variable called NTResKit. To determine whether you have this variable, type

    Set

    at the command line. If the variable NTResKit appears in the list of variables, you don't need to set a pathname and can skip this step. If you don't see this variable or you've installed the Win2K resource kit, you need to set a pathname. Add the line

    Set NTRESKIT=C:\ntreskit

    where C:\ntreskit is your resource kit's pathname. (The script specifies where to add this line.) If you've installed the Win2K resource kit, check to see whether the folder name has spaces. If it does, consider reinstalling the resource kit into a folder whose name has no spaces.

  3. Configure the pathname for the source folder. In the line

    Set SOURCE=E:\test\

    change test to your source folder's name. Be sure to retain the last backslash. (The pathname must end with a backslash.)

  4. Configure the pathname for the destination folder. In the line

    Set DESTINATION=F:\test\

    change test to the name of your destination folder. Be sure to retain the last backslash.

  5. Configure the top-level folders. In the code that Listing 2 shows, replace Sales Marketing Managers Accounting "Human Resources" Users with the names of your top-level folders. If a folder name includes a space, enclose that name in double quotes. As a general rule, minimize spaces in top-level folders if possible.

  6. Run the script in a user context that has the necessary permissions to read from the source area and write to the destination area. If you use Task Scheduler to run RobocopyDataMigration.bat, confirm that you're running the script in a user context that has the correct permissions.
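Two of the rules in the steps above are easy to get wrong by hand: the source and destination pathnames must end with a backslash, and folder names that contain spaces must be enclosed in double quotes. The Python sketch below shows one way to check settings before you edit the script. The helper names are hypothetical, and the paths and folder names are the sample values from the steps.

```python
def ensure_trailing_backslash(path):
    """Return the path with exactly one backslash appended if it lacks one."""
    return path if path.endswith("\\") else path + "\\"


def quote_if_spaced(name):
    """Wrap a folder name in double quotes when it contains a space."""
    if " " in name and not (name.startswith('"') and name.endswith('"')):
        return f'"{name}"'
    return name


source = ensure_trailing_backslash("E:\\test")
destination = ensure_trailing_backslash("F:\\test\\")
folders = ["Sales", "Marketing", "Managers", "Accounting", "Human Resources", "Users"]

print(source, destination)
print(" ".join(quote_if_spaced(folder) for folder in folders))
```

The second line of output is the folder list in the form that step 5 describes, with only the name that contains a space enclosed in quotes.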

A Good Example
As RobocopyDataMigration.bat illustrates, sometimes a script has to do more than perform a task to ensure the success of a major IT undertaking. In the case of RobocopyDataMigration.bat, the script had to create detailed logs, verify results, predict downtime, and permit graceful fallback.
