Real-World Scripting: Deleting Files by Their Age
This monthly column offers practical scripts for systems administrators. This installment provides a script that deletes old files.
May 19, 2000
A common task that systems administrators perform is keeping public shares clean to reduce future disk-space requirements and simplify navigation for users. Even with disk quotas in place, having an automated process that deletes old files can be helpful. With management support, a good backup policy, and an automated script, you can safely rid your file servers of files that are beyond a specified threshold date.
For example, suppose you're in charge of disk utilization on several resource domain servers that house numerous files for your 1000-plus user community. You've been experiencing dramatic growth in the size of the shared folders. Because disk-space expansion requires server downtime and additional resources, management has asked you to remove files that are more than a year old on the file shares.
Although you could use the GUI's Find files and folders command to manually find and delete old files, you prefer to automate this task with a script. The script has four requirements:
The script must delete old files on specific drives or shares.
The script must delete old files in all subdirectories.
The threshold date must be based on the file's last-modified date and be easily configurable for different servers.
The script must log all deletions.
In addition to these requirements, you have one item on your wish list: Because management occasionally asks you to delete certain file types from your servers (e.g., .pst files, .exe files), you want to be able to easily modify the script to handle this task.
To meet the requirements, you need to obtain each file's modification age, in days. Table 1 shows some common automated file tasks and the Perl file tests and Windows NT scripting commands that you can use to accomplish those tasks. As Table 1 shows, Perl's -M file test provides a file's modification age in days. If you use NT shell, you must first use the Dir command to obtain the modification date, then calculate the file's modification age from that date. This operation is difficult because NT shell has limited math functions. Thus, you decide to write the script in Perl.
After you obtain each file's modification age, you need to compare that value with the threshold value. As the code in Listing 1 shows, this threshold value is 365 days. In this code, the -f file test determines whether the entry (in this case, C:\test.txt) is a file. The -M file test provides the file's age in days. You then determine whether that age is greater than the specified threshold. If the entry is a file and its age is greater than 365 days, you receive the message C:\test.txt is a file and older than 365 days. It can be deleted. If the entry doesn't meet one or both of these criteria, you receive the message C:\test.txt is a folder or younger than 365 days. It is not subject to deletion.
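The following is a minimal sketch of the test just described, not a copy of Listing 1. The helper name is_old_file is hypothetical, and the sketch uses current Perl idioms (strict, lexical variables). Note that -M measures age relative to the time the script started, so the value is a fractional number of days.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when $path is a plain file whose
# last-modified age exceeds $threshold days, and 0 otherwise.
sub is_old_file {
    my ($path, $threshold) = @_;
    return 0 unless -f $path;                  # directories and missing entries fail here
    return ( -M _ ) > $threshold ? 1 : 0;      # "_" reuses the stat data from the -f test
}

my $object    = 'C:\test.txt';   # example path from the column
my $threshold = 365;             # threshold in days

if ( is_old_file( $object, $threshold ) ) {
    print "$object is a file and older than $threshold days. It can be deleted.\n";
}
else {
    print "$object is a folder or younger than $threshold days. It is not subject to deletion.\n";
}
```

Keeping the test in one small subroutine means the threshold stays a single value that you can change per server.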
You can easily adapt this code. If you want to delete empty directories (i.e., folders), you can use the -d file test instead of the -f file test in the if statement at callout A in Listing 1. To delete files with particular extensions or filenames, you can replace the code at callout A with a new if statement. For example, if you want to delete .pst files, you can use the new if statement
if ( -f $_ && $_ =~ /\.pst$/i ) {
The code \.pst$ specifies you're looking for .pst at the end of the path. The backslash makes the period match a literal period rather than any character, and the /i modifier makes the match case insensitive. This approach prevents the incorrect deletion of files whose filenames happen to include .pst. If you adapt the code to delete files with particular extensions or filenames or to delete directories, you also need to change the messages in the print lines.
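To see how an anchored, escaped pattern behaves, consider this short sketch. The helper has_extension and the sample paths are hypothetical; quotemeta is a built-in Perl function that escapes the period so that you don't have to type the backslash yourself.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when $path ends with $ext
# (case insensitive), and 0 otherwise.
sub has_extension {
    my ($path, $ext) = @_;
    my $escaped = quotemeta $ext;            # ".pst" becomes "\.pst"
    return $path =~ /$escaped$/i ? 1 : 0;    # "$" anchors the match at the end
}

# Sample paths are made up for illustration.
for my $path ( 'C:\mail\archive.pst', 'C:\mail\ARCHIVE.PST',
               'C:\mail\notes.pst.txt', 'C:\mail\mypst' ) {
    printf "%-24s %s\n", $path,
        has_extension( $path, '.pst' ) ? 'match' : 'no match';
}
```

Only the first two paths match. A pattern with an unescaped period would also match the last path, because the period would stand for the letter y.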
The code in Listing 1 locates the old files. Now, you need to add code to delete those files and output the results to a log file. To delete files, you add the code
unlink ($object);
to the if statement. The unlink command deletes only files. This command deletes files even if you've set the read-only attribute. If you adapt the code to delete directories, you need to use the rmdir command. (Like the NT shell's equivalent Rmdir command, rmdir deletes only empty directories.) If you're deleting a directory with rmdir and you've previously set the read-only attribute, you need to change that attribute before rmdir can delete the folder. Use the code
chmod (0666, $object);
before deleting the folder.
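As a rough sketch, you might wrap both deletion commands in one subroutine that reports success or failure. The name delete_object is hypothetical. Both unlink and rmdir return a false value on failure and set $!, so checking the return value lets you log only the deletions that actually happened.

```perl
use strict;
use warnings;

# Hypothetical helper: deletes a file with unlink or an empty
# directory with rmdir. Returns 1 on success and 0 on failure.
sub delete_object {
    my ($object) = @_;
    if ( -d $object ) {
        chmod 0666, $object;              # clear a read-only attribute first
        return rmdir($object) ? 1 : 0;    # fails if the directory isn't empty
    }
    return unlink($object) ? 1 : 0;       # unlink returns the number of files deleted
}

# Example call (hypothetical path):
# delete_object('C:\public\old.txt') or warn "Could not delete: $!\n";
```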
The code in Listing 2 outputs the results to a log file that you've created and put in the desired location. As callout A in Listing 2 shows, you open the log file at the start of the script. Then, you print to that file during the script's execution, as callout B in Listing 2 shows. Finally, you close the file at the script's completion, as callout C in Listing 2 shows. Opening the log file first and pairing the open command with the die command is a sound approach: If the open command fails to open the log file, the die command ends the script and returns an error message explaining the reason for the failure, so the script doesn't delete files that it can't log. A common cause of open command failures is a nonexistent file path.
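The open, print, and close pattern can be sketched as follows. This isn't Listing 2; the subroutine name log_deletions and the tab-separated line format are assumptions. The sketch also uses the three-argument open and a lexical filehandle, which require a newer Perl than the build mentioned in this column.

```perl
use strict;
use warnings;

# Hypothetical helper: appends one timestamped line per deleted path
# to $log_file and returns the number of lines written.
sub log_deletions {
    my ($log_file, @paths) = @_;

    # die stops the script and reports the reason ($!) if open fails,
    # for example because the folder in $log_file doesn't exist.
    open( my $log, '>>', $log_file ) or die "Cannot open $log_file: $!\n";
    for my $path (@paths) {
        print {$log} scalar(localtime), "\tdeleted\t$path\n";
    }
    close($log) or die "Cannot close $log_file: $!\n";
    return scalar @paths;
}

# Example call (hypothetical paths):
# log_deletions( 'C:\logs\log.txt', 'C:\public\old.txt' );
```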
To finish the script, you need to add code that drills down through the targeted directory tree so that the script tests each file in that tree. Instead of writing this code, you can download and use the File::Recurse module. You can configure this module to limit the depth of recursion. The module's HTML Help file includes details about how to limit recursion.
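If you'd rather not install an extra module, Perl's standard File::Find module can perform the same kind of traversal. The following sketch swaps File::Find in for File::Recurse, so it doesn't show how the completed script works; the subroutine name find_old_files is hypothetical.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: returns the full paths of all files under $root
# (including subdirectories) that are older than $threshold days.
sub find_old_files {
    my ($root, $threshold) = @_;
    my @old;
    find(
        sub {
            # File::Find changes into each directory as it goes. $_ holds
            # the entry's name there; $File::Find::name holds its full path.
            if ( -f $_ && ( -M _ ) > $threshold ) {
                push @old, $File::Find::name;
            }
        },
        $root
    );
    return @old;
}

# Example call (hypothetical share):
# print "$_\n" for find_old_files( "\\\\server\\public", 365 );
```

Collecting the paths first and deleting them afterward lets you review or log the list before anything is removed.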
The completed script, FileDeletionByDate.pl, is in the Code Library on the Win32 Scripting Journal Web site (http://www.win32scripting.com). This script includes comments to help you understand the code. You can also find an example log file, log.txt, in the Code Library.
Here are the steps to get FileDeletionByDate.pl working.
Download and install Perl for Win32 on the NT workstation or server on which you'll run the script. Perl for Win32 is part of ActivePerl, which you can download from ActiveState at http://www.activestate.com. ActivePerl installs easily from a self-extracting executable.
Download and install the File::Recurse module. This module is part of File::Tools by Aaron Sherman, which is available from ActiveState. You can use the Perl Package Manager (PPM) to install this and other useful Perl modules. See the perl/html/ppm.html Help file that installs when you install Perl on your machine.
Configure the threshold for the file's modification age.
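Configure the local or remote location of the drive or share from which you want to delete the old files. Use double backslashes (e.g., \\\\server\\share for the share \\server\share), because Perl treats the backslash as an escape character in quoted strings.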
Configure the log file's location.
Thoroughly test the script before using it in a production environment. You must be especially cautious when using the unlink command because of the potential consequences of accidental deletions.
Schedule the script to run as required.
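The three configuration steps above might look something like the following in Perl. The variable names, server name, and paths are all hypothetical; check the comments in FileDeletionByDate.pl for the names that the script actually uses.

```perl
use strict;
use warnings;

# All values below are made-up examples, not the script's real settings.
my $threshold = 365;                         # modification age limit, in days
my $target    = "\\\\fileserver1\\public";   # the share \\fileserver1\public
my $log_file  = "D:\\logs\\log.txt";         # the file D:\logs\log.txt

print "Deleting files older than $threshold days under $target\n";
print "Logging to $log_file\n";
```

Each doubled backslash in the source becomes a single backslash in the string, so the leading \\\\ yields the two backslashes that start a UNC path.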
I tested this script on Windows 2000 Professional (Win2K Pro); Windows 2000 Advanced Server (Win2K AS), Release Candidate 2 (RC2); and NT 4.0 servers and workstations. All these machines were using ActivePerl build 522. The script supports spaces in file and folder names. However, the script might not work on Unicode file and folder names. This situation will likely change with Perl 5.6.0.