Find Files on Local Drives with Whereis.ps1
This PowerShell script supercharges the search capabilities of Get-ChildItem
December 14, 2009
I often find it useful to search for files from a command line. In the past, I've typically used Cmd.exe's dir command with the /b and /s parameters to search for files; combining these parameters provides a list containing the full paths and filenames of matching files. However, dir doesn't have a simple syntax for searching multiple locations. For example, to search drives C and D for all files ending in .doc, you would use this command:
dir /b /s c:\*.doc d:\*.doc
The syntax gets even more complex when searching for multiple wildcard patterns (e.g., all .doc, .xls, and .ppt files) in multiple locations because you have to type each location and each wildcard pattern separately.
Windows PowerShell's Get-ChildItem cmdlet makes this task simpler. For example, to search drives C and D for all .doc, .xls, and .ppt files, you can use this command:
get-childitem c:\*,d:\*
-include *.doc,*.xls,*.ppt -recurse
Get-ChildItem's first parameter is a list of paths to search, and the -Include parameter specifies a list of wildcard patterns that qualify the paths. The -Recurse parameter is analogous to the dir command's /s parameter.
Introducing Whereis.ps1
Although the Get-ChildItem cmdlet is quite powerful, I still found myself wanting additional functionality. For example, I wanted to be able to omit the -Recurse parameter and to automatically search local fixed drives if I didn't type a path. Before long, I began writing a full-featured script that augments the Get-ChildItem cmdlet with several additional features. The result is the Whereis.ps1 script, which you can download by clicking the Download the Code Here button at the top of the page. (Note that the Whereis.ps1 script isn't an equivalent to the whereis command you might find on a UNIX-like OS.)
Whereis.ps1 uses the following syntax:
Whereis.ps1 -Name
[-Path ]
[-LastWriteTimeRange ]
[-SizeRange ] [-OneLevel]
[-Files] [-Dirs] [-Force] [-DefaultFormat]
The -Name parameter specifies a wildcard pattern. This parameter's argument can be an array. Files and directories that match the wildcard patterns are included in the script's output. The -Name parameter is the only required command-line parameter. For information about the wildcard patterns you can use, type
get-help about_wildcard
at a PowerShell prompt. Because -Name is a positional parameter, you can omit the parameter name (-Name) and type only its argument if it's the first parameter on the command line after the script name.
The -Path parameter specifies a path, and its argument can be an array. If you don't specify this parameter, Whereis.ps1 searches all local fixed drives. The -Path parameter is also positional, so you can omit the parameter name (-Path) and type only its argument if it's the second parameter on the command line after the script name.
The -LastWriteTimeRange parameter specifies an inclusive date range, and the argument can be an array. Items that have a LastWriteTime property within the range are included in the script's output. If you specify strings for this parameter's argument, Whereis.ps1 attempts to convert them to DateTime objects. If you specify a single date, Whereis.ps1 interprets the date range as "the specified date or later." If you specify an array, Whereis.ps1 interprets the first element in the array as the earlier date boundary and the second element in the array as the later date boundary. You can specify an "older than" date range by using zero as the first element in the array. Table 1 shows some examples for the -LastWriteTimeRange parameter.
The -SizeRange parameter specifies an inclusive size range, in bytes. This parameter's argument can be an array. Files that have a Length property within the range are included in the script's output. If you specify a single number, Whereis.ps1 interprets the range as "files of at least the specified size." If you specify an array, Whereis.ps1 interprets the first element in the array as the smaller size boundary and the second element in the array as the larger size boundary. You can also use PowerShell's numeric multiplier suffixes (kb, mb, and gb) when specifying the arguments for -SizeRange. Table 2 shows some examples for the -SizeRange parameter. Note that the -SizeRange parameter is ignored if you use only the -Dirs parameter (which I describe later), because directories don't have a Length property.
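As a quick illustration of the multiplier suffixes, the following sketch shows that PowerShell expands them into plain byte counts (multiples of 1,024) when it parses the number, so a script only ever sees ordinary integers. The variable names are illustrative and aren't taken from Whereis.ps1.

```powershell
# PowerShell expands the kb, mb, and gb suffixes into byte counts
# (multiples of 1,024) when it parses a numeric literal.
$atLeast = 10mb           # 10 * 1024 * 1024 = 10485760
$range   = 100kb,2mb      # a two-element array: 102400 and 2097152

$atLeast
$range[0]
$range[1]
```

A value such as 100kb,2mb is therefore the kind of array you could pass to -SizeRange to describe a range from 100KB to 2MB.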
The -OneLevel parameter searches within the specified directories but not their subdirectories. That is, it's the inverse of Get-ChildItem's -Recurse parameter.
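One way to picture the relationship between -OneLevel and -Recurse is the sketch below. Find-Item is a hypothetical wrapper written for this illustration, not a function from Whereis.ps1; it simply inverts its own switch and forwards the result to Get-ChildItem.

```powershell
# Hypothetical wrapper (Find-Item is not part of Whereis.ps1) showing how a
# -OneLevel switch can be inverted and forwarded to Get-ChildItem's -Recurse.
function Find-Item {
    param(
        [string[]] $Path,
        [switch] $OneLevel
    )
    # The -Switch:value syntax passes an explicit Boolean to a switch parameter,
    # so recursion is on unless -OneLevel was specified.
    Get-ChildItem -Path $Path -Recurse:(-not $OneLevel)
}
```

With this approach, recursion becomes the default and the caller opts out of it, which is the reverse of Get-ChildItem's own behavior.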
The -Files parameter causes Whereis.ps1 to include files in its output, and the -Dirs parameter causes Whereis.ps1 to include directories. The default is -Files. If you want to search for both files and directories, use -Files and -Dirs together. Use -Dirs by itself to search only for directories.
The -Force parameter corresponds to Get-ChildItem's -Force parameter. It causes Whereis.ps1 to search for items with hidden or system attributes.
The -DefaultFormat parameter causes Whereis.ps1 to output file-system objects instead of custom formatted strings. Figure 1 shows an example of Whereis.ps1's custom output, which is easier to read than file-system object output, particularly when there are many results. However, you can't use the custom output as input for other scripts or cmdlets that expect file-system objects. Use the -DefaultFormat parameter when you need to pipe the results to such commands.
Inside Whereis.ps1
The param statement at the top of the script defines the script's command-line parameters. I typically use mixed-case variable names for script parameters (and other global variables) in PowerShell scripts, but this is only a convention and isn't required. After the param statement, the script declares the usage, isNumeric, writeItem, and main functions. Whereis.ps1 then calls the main function. Note that in PowerShell scripts, functions must be defined before they're called, which is why Whereis.ps1 doesn't call the main function until the last line in the script.
The usage function outputs a message explaining the script and how to use it, then exits the script. The main function calls the usage function if the -Name parameter is missing from the command line or if the -Help parameter is present.
The main function calls the isNumeric function to ensure that the arguments specified for the -SizeRange parameter are numeric. The isNumeric function works by using the -contains operator to see if its parameter's type is in the list of numeric types (e.g., Decimal, Double).
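The following is a minimal sketch of how a -contains based type check of this kind might look. The exact list of types in Whereis.ps1 may differ; the list and the null check here are assumptions made for the illustration.

```powershell
# A sketch of an isNumeric-style helper. The type list is an assumption;
# the list used in Whereis.ps1 may differ.
function isNumeric($value) {
    # A null value has no type, so treat it as non-numeric.
    if ($null -eq $value) { return $false }
    $numericTypes = [Byte], [SByte], [Int16], [UInt16], [Int32], [UInt32],
                    [Int64], [UInt64], [Single], [Double], [Decimal]
    # -contains returns $TRUE if the array on the left holds the value on the right.
    return $numericTypes -contains $value.GetType()
}
```

With this sketch, isNumeric 10mb returns True because 10mb is parsed as an integer, whereas isNumeric "10mb" returns False because a quoted value is a string.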
The writeItem function controls the format of the script's output. If the -DefaultFormat parameter exists on the command line, the writeItem function simply outputs its argument, which is a file-system object; otherwise, the function outputs a formatted string. It uses the standard .NET string formatting codes and the -f operator to produce the formatted string. For more information about string formatting, see the MSDN article "Formatting Types."
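To show the -f operator at work, here is a sketch of a writeItem-style function. The name writeItemSketch, its second parameter, and the column layout are all assumptions; the real writeItem function reads the script-level -DefaultFormat parameter and may format its columns differently.

```powershell
# Hypothetical stand-in for writeItem; the column layout is an assumption.
function writeItemSketch($item, $defaultFormat) {
    if ($defaultFormat) {
        # Pass the file-system object through unchanged.
        $item
    } else {
        # {0:yyyy-MM-dd HH:mm} applies a custom date format, and
        # {1,12} right-aligns the size in a 12-character column.
        "{0:yyyy-MM-dd HH:mm}  {1,12}  {2}" -f $item.LastWriteTime, $item.Length, $item.FullName
    }
}

# Example: format the first item found in the current directory.
get-childitem | select-object -first 1 | foreach-object { writeItemSketch $_ $FALSE }
```

Because the formatted result is a plain string, it no longer carries the properties that cmdlets such as Remove-Item expect, which is why the pass-through branch exists.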
The main Function
The main function, which Listing 1 shows, contains the bulk of the script's code. The function's first job is to verify that the -Name parameter is present. If the -Name parameter is missing or if the -Help parameter is present, the main function calls the usage function, which outputs a usage message and ends the script.
The main function next converts the $Name variable into an array; the variable remains unchanged if it already contains an array. The function then uses a for loop to iterate over the array. If any array element is the * wildcard by itself, the function replaces the entire array with the $NULL value. This step is necessary to prevent the Get-ChildItem cmdlet, which runs later in the script, from outputting the contents of subdirectories underneath a directory.
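The two idioms in this step can be tried on their own. The sketch below uses illustrative variable names ($single and $patterns are not from the script) and mirrors the listing's -eq comparison, which matches only a bare * and not a pattern such as *.doc.

```powershell
# @() wraps a single value in an array and leaves an existing array as is.
$single = "*.doc"
$single = @($single)
$single.Length                    # one element

# -eq performs a plain string comparison here, not a wildcard match,
# so only an element that is exactly "*" resets the list.
$patterns = @("*.doc", "*")
for ($i = 0; $i -lt $patterns.Length; $i++) {
    if ($patterns[$i] -eq "*") {
        $patterns = $NULL
        break
    }
}
$patterns -eq $NULL               # True: the bare * was found
```

Passing $NULL to -Include is equivalent to not filtering by name at all, which is what a bare * asks for anyway.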
Next, the main function checks to see if the -Path parameter is present. If the -Path parameter is missing, the function uses the Get-WmiObject cmdlet to retrieve a list of local fixed drives, as the code in callout A shows. Therefore, the $Path variable contains either the path or paths specified with the -Path parameter or a list of local fixed drives. The main function then converts the $Path variable into an array; the variable remains unchanged if it already contains an array.
As the code in callout B shows, the function then uses a for loop to iterate the $Path array. For each element in the array, it checks whether the element ends with a backslash (\). If it does, the function adds the * wildcard to the path. Then the function checks whether the element ends with *. If it doesn't, the function appends \*. When the for loop is complete, each element in the path array ends with *. This process lets us specify a path such as C:\Files, and the script interprets the path as C:\Files\*. This script step not only saves typing when entering paths, but it's also required because the main function uses the Get-ChildItem cmdlet's -Include parameter. See the sidebar "Get-ChildItem's -Include Parameter" for more information about how the -Include parameter works.
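The path handling can be tried in isolation with the sketch below. Add-PathWildcard is a hypothetical helper name used only for this illustration; the logic follows the description of callout B, under the assumption that paths use backslash separators.

```powershell
# Hypothetical helper that normalizes each path so it ends with "*".
function Add-PathWildcard([string[]] $Path) {
    for ($i = 0; $i -lt $Path.Length; $i++) {
        # "D:\" becomes "D:\*".
        if ($Path[$i].EndsWith("\")) { $Path[$i] += "*" }
        # "C:\Files" becomes "C:\Files\*"; paths already ending in "*" are left alone.
        if (-not $Path[$i].EndsWith("*")) { $Path[$i] += "\*" }
    }
    $Path
}

Add-PathWildcard "C:\Files","D:\","E:\Logs\*"
```

The three sample paths come back as C:\Files\*, D:\*, and E:\Logs\*, so every form a user is likely to type ends up in the shape that -Include needs.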
The main function next determines if the -LastWriteTimeRange parameter exists. If this parameter doesn't exist, the function creates a two-element array. The function stores the earliest possible date (i.e., 1 January 0001, 00:00:00) in the first element, and it stores the latest possible date (i.e., 31 December 9999, 23:59:59) in the second element. The function gets the earliest and latest possible dates by retrieving the DateTime type's MinValue and MaxValue static properties.
If the -LastWriteTimeRange parameter exists, the main function converts the $LastWriteTimeRange variable into an array; the variable remains unchanged if it already contains an array. If the array contains only one element, the function appends a second element to the array containing the latest possible date. The main function next checks whether the array's first element is zero; if it is, the function uses the earliest possible date as the first element. Then the function attempts to convert both elements of the array into DateTime objects by using the DateTime type's Parse static method, as the code in callout C shows. If the Parse method throws an error, the script block following the trap statement runs, which outputs an error message and halts the script. The function then ensures that the first date is earlier than the second date; if this isn't true, the function throws an error, ending the script.
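The date-range steps can be sketched as a standalone function. ConvertTo-DateRange is a hypothetical name, and this version departs from the listing in two ways that are assumptions of the sketch: it uses try/catch (which requires PowerShell 2.0 or later) in place of the trap statement, and it leaves elements that are already DateTime objects untouched rather than parsing them again.

```powershell
# Hypothetical helper that turns a one- or two-element input into a
# pair of DateTime values (earlier boundary, later boundary).
function ConvertTo-DateRange($Range) {
    $Range = [object[]] @($Range)
    # A single date means "the specified date or later".
    if ($Range.Length -eq 1) { $Range += [DateTime]::MaxValue }
    # Zero as the first element means "from the earliest possible date".
    if (($Range[0] -isnot [DateTime]) -and ($Range[0] -eq 0)) {
        $Range[0] = [DateTime]::MinValue
    }
    for ($i = 0; $i -lt 2; $i++) {
        if ($Range[$i] -isnot [DateTime]) {
            try {
                $Range[$i] = [DateTime]::Parse($Range[$i])
            } catch {
                throw "Error parsing date range. String not recognized as a valid DateTime."
            }
        }
    }
    if ($Range[0] -gt $Range[1]) {
        throw "Invalid date range. The first date is greater than the second."
    }
    $Range[0], $Range[1]
}

ConvertTo-DateRange 0,"2009-12-31"
```

Note that [DateTime]::Parse interprets strings according to the current culture, so an unambiguous format such as 2009-12-31 is the safest choice when typing dates.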
Next, the main function checks whether the -SizeRange parameter exists. If it doesn't exist, the function creates a two-element array, with a zero as the first element and the maximum value for a 64-bit unsigned integer (UInt64) as the second element. If the -SizeRange parameter exists, the function converts the $SizeRange variable into an array; the variable remains unchanged if it already contains an array. If the array contains only a single element, the function appends a second element to the array with the maximum value of the UInt64 type. The code in callout D shows how the main function then checks to see if both elements contain numeric values by calling the isNumeric function I described earlier. If either element contains a value that isn't numeric, the function throws an error, ending the script. The function also throws an error if the first array element is greater than the second element.
The main function then checks whether both the -Files and -Dirs parameters are absent. If neither parameter exists, the function sets $Files to $TRUE. It then sets two counter variables to zero: one to keep track of the number of items found ($count) and the other to accumulate the size of all files ($sizes).
At this stage, the main function has parsed and validated all of the script's parameters, so it executes the Get-ChildItem cmdlet. The function pipes Get-ChildItem's output to the ForEach-Object cmdlet so that it can perform further filtering for each object. If the -Files parameter exists and the object's PsIsContainer property is False (i.e., the object is a file and not a directory), then the main function checks to see if the object's LastWriteTime and Length properties are within the date and size ranges, respectively. If the object's properties are within specified criteria, the main function increments the $count and $sizes variables and calls the writeItem function to output the object. The main function performs similar checks to see if the -Dirs parameter exists and the object's PsIsContainer property is True (i.e., the object is a directory and not a file), except that it doesn't verify the object's size range or increment the $sizes variable because directory objects don't have a Length property.
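The filtering stage can be tried on a small scratch directory, as in the sketch below. Everything outside the ForEach-Object block (the temporary directory, the two sample files, and the $dateRange, $sizeRange, and $found variables) is scaffolding invented for this illustration. The sketch uses [Int64]::MaxValue as its upper size bound to match the type of the Length property.

```powershell
# Build a small scratch directory so the sketch runs anywhere (illustrative only).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("whereis-demo-" + [Guid]::NewGuid())
New-Item -ItemType Directory -Path $root | Out-Null
Set-Content -Path (Join-Path $root "small.txt") -Value "x"
Set-Content -Path (Join-Path $root "big.txt") -Value ("x" * 5000)

# Assumed ranges: any date, and files of at least 1KB.
$dateRange = [DateTime]::MinValue, [DateTime]::MaxValue
$sizeRange = 1kb, [Int64]::MaxValue
$count = $sizes = 0

# ForEach-Object runs its script block in the caller's scope,
# so the counters can be updated directly.
$found = Get-ChildItem -Path $root | ForEach-Object {
    if ((-not $_.PsIsContainer) -and
        ($_.LastWriteTime -ge $dateRange[0]) -and
        ($_.LastWriteTime -le $dateRange[1]) -and
        ($_.Length -ge $sizeRange[0]) -and
        ($_.Length -le $sizeRange[1])) {
        $count++
        $sizes += $_.Length
        $_.Name
    }
}

$found
"Found {0} item(s), {1} byte(s)" -f $count, $sizes
Remove-Item $root -Recurse -Force
```

Only big.txt passes the size test, so the count ends at one; the exact byte total depends on the encoding and line ending that Set-Content uses on your system.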
Sample Commands
Now let's look at some commands that illustrate how to use Whereis.ps1 to perform various tasks. For instance, if you want to search for video and audio files on all local drives, you would use the following command:
whereis.ps1 *.asf,*.avi,*.mov,*.mp3,
*.mp4,*.mpg,*.mpeg,*.qt,*.wav,*.wm,*.wmv
Note that although commands here are shown with line breaks for space, you would enter them all on one line; it's also important that you don't put spaces around the commas.
Next, to search for PowerPoint files that are 10MB or larger in C:\Data and its subdirectories, try this command:
whereis.ps1 *.pp[st]*
C:\Data -sizerange 10mb
To search for files in C:\Data that have been modified within the past 60 days, use
whereis.ps1 * C:\Data
-lastwritetimerange ((get-date)
- (new-timespan -days 60)),(get-date)
-onelevel
If you want to delete all files in C:\Logs that were modified 30 days ago or earlier, you would use this command:
whereis.ps1 * c:\Logs
-lastwritetimerange 0,((get-date)
- (new-timespan -days 30))
-onelevel -defaultformat | remove-item
Get-ChildItem on Steroids
PowerShell's Get-ChildItem cmdlet has powerful native functionality, but Whereis.ps1 adds some useful functionality of its own. Add Whereis.ps1 to your toolkit and find what you're looking for even faster. Furthermore, you can build on the scripting concepts demonstrated here to customize and enhance your use of PowerShell cmdlets to suit your personal workload.
Listing 1: Whereis.ps1's main Function
function main {
  # If -help is present or the -name parameter is missing, output
  # the usage message.
  if (($Help) -or (-not $Name)) {
    usage
  }

  # Convert $Name to an array. If any array element contains *,
  # change the array to $NULL. This is because
  #   get-childitem c:\* -include *
  # recurses to one level even if you don't use -recurse.
  $Name = @($Name)
  for ($i = 0; $i -lt $Name.Length; $i++) {
    if ($Name[$i] -eq "*") {
      $Name = $NULL
      break
    }
  }

  #CALLOUT A
  # If no -path parameter, use WMI to collect a list of fixed drives.
  if (-not $Path) {
    $Path = get-wmiobject Win32_LogicalDisk -filter DriveType=3 |
      foreach-object { $_.DeviceID }
  }
  #END CALLOUT A

  # Convert $Path into an array so we can iterate it.
  $Path = @($Path)

  #CALLOUT B
  # If a path ends with "\", append "*". Then, if it doesn't end with
  # "*", append "\*" so each path in the array ends with "*".
  for ($i = 0; $i -lt $Path.Length; $i++) {
    if ($Path[$i].EndsWith("\")) {
      $Path[$i] += "*"
    }
    if (-not $Path[$i].EndsWith("*")) {
      $Path[$i] += "\*"
    }
  }
  #END CALLOUT B

  # If no -LastWriteTimeRange parameter, assume all dates.
  if (-not $LastWriteTimeRange) {
    $LastWriteTimeRange = @([DateTime]::MinValue, [DateTime]::MaxValue)
  }
  else {
    # Convert $LastWriteTimeRange to an array (if it's not already).
    $LastWriteTimeRange = @($LastWriteTimeRange)
    # If only one element, add max date as second element.
    if ($LastWriteTimeRange.Length -eq 1) {
      $LastWriteTimeRange += [DateTime]::MaxValue
    }
    # Zero for first element means [DateTime]::MinValue.
    if ($LastWriteTimeRange[0] -eq 0) {
      $LastWriteTimeRange[0] = [DateTime]::MinValue
    }

    #CALLOUT C
    # Throw an error if [DateTime]::Parse() fails.
    trap [System.Management.Automation.MethodException] {
      throw "Error parsing date range. String not recognized as a valid DateTime."
    }
    # Parse the first two array elements as DateTimes.
    for ($i = 0; $i -lt 2; $i++) {
      $LastWriteTimeRange[$i] = [DateTime]::Parse($LastWriteTimeRange[$i])
    }
    #END CALLOUT C
  }

  # Throw an error if the date range is invalid.
  if ($LastWriteTimeRange[0] -gt $LastWriteTimeRange[1]) {
    throw "Invalid date range. The first date is greater than the second."
  }

  # If no -sizerange parameter, assume all sizes.
  if (-not $SizeRange) {
    $SizeRange = @(0, [UInt64]::MaxValue)
  }
  else {
    # Convert $SizeRange to an array (if it's not already).
    $SizeRange = @($SizeRange)
    # If no second element, add max value as second element.
    if ($SizeRange.Length -eq 1) {
      $SizeRange += [UInt64]::MaxValue
    }
  }

  #CALLOUT D
  # Ensure the elements in the size range are numeric.
  for ($i = 0; $i -lt 2; $i++) {
    if (-not (isNumeric $SizeRange[$i])) {
      throw "Size range must contain numeric value(s)."
    }
  }
  #END CALLOUT D

  # Throw an error if the size range is invalid.
  if ($SizeRange[0] -gt $SizeRange[1]) {
    throw "Invalid size range. The first size is greater than the second."
  }

  # If both -files and -dirs are missing, assume -files.
  if ((-not $Files) -and (-not $Dirs)) {
    $Files = $TRUE
  }

  # Keep track of the number of files and their sizes.
  $count = $sizes = 0

  # Use the get-childitem cmdlet to search the file system, and use
  # the writeItem function to output matching items. For files, check
  # the date and size ranges. For directories, only the date range is
  # meaningful.
  get-childitem $Path -include $Name -force:$Force -recurse:(-not $OneLevel) |
    foreach-object {
      if ($Files -and (-not $_.PsIsContainer)) {
        if (($_.LastWriteTime -ge $LastWriteTimeRange[0]) -and
            ($_.LastWriteTime -le $LastWriteTimeRange[1]) -and
            ($_.Length -ge $SizeRange[0]) -and
            ($_.Length -le $SizeRange[1])) {
          $count++
          $sizes += $_.Length
          writeItem $_
        }
      }
      if ($Dirs -and ($_.PsIsContainer)) {
        if (($_.LastWriteTime -ge $LastWriteTimeRange[0]) -and
            ($_.LastWriteTime -le $LastWriteTimeRange[1])) {
          $count++
          writeItem $_
        }
      }
    }

  # Output statistics if not using -defaultformat.
  if (-not $DefaultFormat) {
    "Found {0:N0} item(s), {1:N0} byte(s)" -f $count, $sizes
  }
}