Scripting Shortcuts That Contain Unicode
How to work around WshShortcut's Unicode illiteracy
October 11, 2010
If your organization uses Unicode filenames, scripting shortcuts with Windows Script Host's (WSH's) WshShortcut object can be an exercise in frustration. What's particularly irritating is that this object allows you to write Unicode text to any of its properties without any fuss, but when you attempt to save the shortcut, WshShortcut either throws an error (if the shortcut's filename or path contains Unicode) or silently mangles the Unicode into garbage (if any other property contains Unicode).
Adding one final insult, WshShortcut returns the same generic error message ("unable to save file") for any refusal sent back by the Windows file system. In that message, each Unicode character in the file path is replaced with a question mark (?), which makes the error even more confusing.
Fortunately, there are ways to work around these problems, which I'll explain. First, though, I'll shed some light on why WshShortcut exhibits these annoying behaviors.
Why WshShortcut Is Unicode Illiterate
The reason WshShortcut accepts Unicode is easy to explain: It has to. Quoting from Microsoft's own COM interface design rules: "All string parameters in interface methods must be Unicode" [http://msdn.microsoft.com/en-us/library/ms692709(VS.85).aspx]. WSH and its scripting languages always use Unicode internally, so all text destined for a file (or another type of object) is held as Unicode until it's written. This pushes string conversion problems down to the level of the components that produce the output in a particular representation.
So why does WshShortcut throw errors and mangle content even though you're allowed to use Unicode in shortcuts? There's no official explanation for this, but it makes sense if you understand the historical context of WSH.
The WSH helper objects, including WshShell and its member objects such as WshShortcut, were designed when Windows 98 and Windows 95 (both singularly lacking in Unicode file-system support) dominated the desktop. At that time, three simplifications were made:
The designers wanted the WshShortcut object to function on all supported platforms. The easiest way for the same codebase to work identically on all systems is to use non-Unicode APIs, so that's what WshShortcut uses.
Instead of inspecting the text it's given, WshShortcut simply assumes that it's ANSI text represented in Unicode. If you supply ANSI text to the object, there's no problem. When ANSI text is represented in Unicode, the first (high-order) byte in each pair of bytes is empty, so WshShortcut's technique of chopping the first byte of each byte pair does no harm. However, when you supply characters that need both bytes, you're left with only the last half of each character. This is essentially like chopping off the first half of the numbers in a street address. The address 8922 North Main becomes 22 North Main, which is not only wrong, but also might not even correspond to a real address. Thus, when text that needs Unicode representation gets treated in this manner, you get garbage.
WshShortcut simply generates a binary shortcut file as raw bytes that will be written to disk. There are no checks to determine whether the data written is nonsense or will cause problems. In the case of the shortcut's pathname, the file-system APIs reject most nontext characters—and a mangled Unicode filename is highly likely to include some illegal characters.
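The byte-chopping described above can be imitated in a few lines of VBScript. This is only a sketch of the effect, not WshShortcut's actual code; the function names ChopHighByte and HexCodes are made up for this illustration, and the sample text is the four Bopomofo letters built with ChrW so the script doesn't have to be saved as Unicode.

```vbscript
Option Explicit

' Hypothetical helper: keep only the low-order byte of each
' two-byte character, imitating the truncation described above.
Function ChopHighByte(text)
    Dim i, result
    result = ""
    For i = 1 To Len(text)
        result = result & ChrW(AscW(Mid(text, i, 1)) And &HFF)
    Next
    ChopHighByte = result
End Function

' Hypothetical helper: list each character's code as four hex digits.
Function HexCodes(text)
    Dim i, result
    result = ""
    For i = 1 To Len(text)
        result = result & Right("0000" & Hex(AscW(Mid(text, i, 1))), 4) & " "
    Next
    HexCodes = Trim(result)
End Function

Dim bopomofo
bopomofo = ChrW(&H3105) & ChrW(&H3106) & ChrW(&H3107) & ChrW(&H3108)

WScript.Echo HexCodes("Main")                   ' 004D 0061 0069 006E
WScript.Echo HexCodes(ChopHighByte("Main"))     ' 004D 0061 0069 006E
WScript.Echo HexCodes(bopomofo)                 ' 3105 3106 3107 3108
WScript.Echo HexCodes(ChopHighByte(bopomofo))   ' 0005 0006 0007 0008
```

The ANSI text survives untouched, but the Bopomofo letters collapse into character codes 5 through 8. Windows doesn't allow characters with codes 1 through 31 in filenames, which fits the failure to save a shortcut whose pathname has been mangled this way.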
At this point, we're stuck with WshShortcut's shortcomings. Any chance of WshShortcut being updated to support Unicode vanished when WSH was frozen in 2001. However, as I mentioned previously, there are several workarounds for making Unicode-friendly shortcuts.
Create Once, Copy Many
If you create a shortcut by right-clicking and dragging an object (or by right-clicking in a folder and selecting New Shortcut from the context menu), you'll have no problem including Unicode content in the shortcut's pathname or other properties. If you're creating shortcuts for users on a standardized network (i.e., one that has similar machines, OSs, software, and user setups), you can do this once, put the shortcut on the network (preferably with its read-only attribute set to minimize accidents), then copy it to desktops using a script. Batch files, PowerShell scripts, and scripts using WSH's Scripting.FileSystemObject all support Unicode pathnames, so you don't have any significant restrictions if you use this technique.
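As a rough illustration of the copy step, a script along the following lines could push a hand-made shortcut to the current user's desktop with Scripting.FileSystemObject. The network path is a placeholder, not a real location, and the variable names are chosen only for this sketch.

```vbscript
Option Explicit

Dim fso, shell, sourcePath, targetPath

Set fso = CreateObject("Scripting.FileSystemObject")
Set shell = CreateObject("WScript.Shell")

' Hypothetical network location of the manually created shortcut.
sourcePath = "\\server\share\Shortcuts\Example.lnk"
targetPath = fso.BuildPath(shell.SpecialFolders("Desktop"), _
                           fso.GetFileName(sourcePath))

If Not fso.FileExists(sourcePath) Then
    WScript.Echo "Master shortcut not found: " & sourcePath
    WScript.Quit 1
End If

' CopyFile fails when the destination is read-only, even with
' overwrite set to True, so force-delete any earlier copy first.
If fso.FileExists(targetPath) Then
    fso.DeleteFile targetPath, True
End If

fso.CopyFile sourcePath, targetPath, True
```

The forced delete matters because the copy inherits the read-only attribute recommended for the master shortcut, so a second run would otherwise fail. If the shortcut's own filename contains Unicode, the script file has to be saved as Unicode, or the name has to be assembled with ChrW.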
Shell Links
The Shell.Application COM object understands both Unicode and Windows shortcuts, which it calls Shell links. In spite of this, scripters rarely use Shell links because the syntax is a bit cumbersome and you can't create a shortcut using the scriptable Shell.Application API. However, you can create a shortcut with a tool such as WshShortcut and open it with Shell.Application. You can then use Unicode in any of the shortcut's properties.
I'll show you two ways you can use WshShortcut and Shell.Application to automate the creation of shortcuts that include Unicode. The first technique demonstrates how to use them to create a shortcut with a Unicode filename. The second technique demonstrates how to use them to create a shortcut that has Unicode content in several other properties (e.g., description, target path) as well.
Shell Link Technique 1
If you need to create a shortcut that has a Unicode filename, you can create the shortcut with WshShortcut using an ANSI filename, open it with Shell.Application, then rename the shortcut. Here's a simple demonstration.
Suppose you need to create a shortcut whose filename needs to be in CJK script (core symbols common to the Chinese, Japanese, and Korean languages) on users' desktops. You can use a script like UnicodeShortcut1.vbs in Listing 1. Note that this script needs to be viewed in an editor that supports Unicode; the Unicode content can get mangled if the script is opened in an editor that doesn't. I use SAPIEN Technologies' free PrimalPad editor. Notepad also supports Unicode.
UnicodeShortcut1.vbs begins by creating the objects it'll use. The script then determines the path to the desktop folder in line 3. In line 4, the script stores the Unicode text you want to eventually use as the filename and description.
Because using a Unicode filename like the one in line 7 (which is commented out) would fail, you need to use an alternate filename that has standard ANSI characters, such as the name bpmfName in line 8 (b, p, m, and f are the first four letters in the CJK script system, which is commonly called Bopomofo).
After the shortcut is saved in line 17, it becomes a file. All of the FileSystemObject APIs support Unicode, so you can easily rename the file, as shown in lines 19 through 21. Figure 1 shows the result.
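The listing itself isn't reproduced here, so the line numbers cited above don't apply to the following sketch. It only outlines the same create-then-rename sequence under a few assumptions: the Notepad target is a stand-in for whatever the shortcut should launch, the variable names are illustrative, and the Bopomofo name is built with ChrW.

```vbscript
Option Explicit

Dim shell, fso, desktop, unicodeName, safePath, finalPath, link

Set shell = CreateObject("WScript.Shell")
Set fso = CreateObject("Scripting.FileSystemObject")
desktop = shell.SpecialFolders("Desktop")

' The Bopomofo letters b, p, m, and f.
unicodeName = ChrW(&H3105) & ChrW(&H3106) & ChrW(&H3107) & ChrW(&H3108)

' ANSI-only name for WshShortcut, Unicode name for the finished file.
safePath = fso.BuildPath(desktop, "bpmfName.lnk")
finalPath = fso.BuildPath(desktop, unicodeName & ".lnk")

' Create and save the shortcut under the safe name.
Set link = shell.CreateShortcut(safePath)
link.TargetPath = shell.ExpandEnvironmentStrings("%windir%\notepad.exe") ' hypothetical target
link.Save

' MoveFile won't overwrite, so clear out any earlier copy first.
If fso.FileExists(finalPath) Then
    fso.DeleteFile finalPath, True
End If

' Rename the saved file with FileSystemObject, which handles Unicode.
fso.MoveFile safePath, finalPath
```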
The only thing this technique doesn't do is give you a way to include Unicode content in other shortcut properties. For that, you need to use technique 2.
Shell Link Technique 2
UnicodeShortcut2.vbs in Listing 2 demonstrates the technique for creating shortcuts that include Unicode content in several shortcut properties. I'll walk you through the script so you can see how the technique works.
The script begins by setting four variables:
TargetPath, which contains the pathname of the application or file you're creating the shortcut for
Description, which contains the text that pops up when the mouse pointer hovers over the shortcut
WorkingDirectory, which contains the working directory for the application or file when it launches
FinalLinkPath, which contains the location where you want the shortcut to be
Next, the script initializes the objects it'll use (lines 6 through 8) and creates an ANSI path to a location where it can save a shortcut template (lines 12 and 13). This is a template because it has the form of a shortcut, but is completely empty. The sole reason for creating it is that the Shell.Application object we'll be using for the rest of our work can't create a shortcut file itself; Shell.Application can only modify a pre-existing shortcut.
To create the shortcut template, the script calls the WshShell object's CreateShortcut method with the safe path as its argument. It then immediately calls the returned WshShortcut object's Save method to save it. The script doesn't need to write anything to the shortcut file because WshShortcut doesn't validate content. In fact, the script doesn't even assign the WshShortcut object to a variable, since it never needs to refer to the object again.
In line 19, the script uses the Shell.Application reference in the ShApp variable to get a Shell link object. This line of code starts by connecting to a shell folder's namespace. In this case, the shell folder is the user's desktop folder (indicated by the identifier 0), but it can be any shell folder. The code then calls the shell folder's ParseName method with the pathname to the temporary shortcut as its argument, which provides a connection to that file. The code then retrieves the file as a Shell link by way of the GetLink property.
In lines 22 through 25, the script fills in the shortcut's details and saves the Shell link using the pathname that contains Unicode. Finally, it deletes the shortcut template. Although it takes multiple object references, the entire process can be automated.
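Because Listing 2 isn't reproduced here, the line numbers cited above don't match the sketch below. It walks through the same sequence (empty template, Shell link, fill in, save under the Unicode name, delete template) with illustrative values: the Notepad target, the description text, the working directory, and the use of the temp folder for the template are all assumptions. In particular, the temp folder is assumed to have an ANSI-only path, which won't hold if the user's profile name contains Unicode.

```vbscript
Option Explicit

Dim targetPath, description, workingDirectory, finalLinkPath
Dim shell, fso, shApp, templatePath, item, link, unicodeName

Set shell = CreateObject("WScript.Shell")
Set fso = CreateObject("Scripting.FileSystemObject")
Set shApp = CreateObject("Shell.Application")

unicodeName = ChrW(&H3105) & ChrW(&H3106) & ChrW(&H3107) & ChrW(&H3108)

' Illustrative values; replace them with your own.
targetPath = shell.ExpandEnvironmentStrings("%windir%\notepad.exe")
description = "Notepad " & unicodeName
workingDirectory = shell.SpecialFolders("MyDocuments")
finalLinkPath = fso.BuildPath(shell.SpecialFolders("Desktop"), unicodeName & ".lnk")

' Save an empty template shortcut under a safe temporary name.
templatePath = fso.BuildPath(fso.GetSpecialFolder(2).Path, fso.GetTempName() & ".lnk")
shell.CreateShortcut(templatePath).Save

' Reopen the template as a Shell link (0 = the desktop namespace).
Set item = shApp.NameSpace(0).ParseName(templatePath)
If item Is Nothing Then
    WScript.Echo "Could not open the template shortcut."
    WScript.Quit 1
End If
Set link = item.GetLink

' Fill in the properties and save under the Unicode pathname.
link.Path = targetPath
link.Description = description
link.WorkingDirectory = workingDirectory
link.Save finalLinkPath

' The template is no longer needed.
fso.DeleteFile templatePath, True
```

Passing a pathname to the Shell link's Save method writes the link to that file instead of back to the template, which is why the template can be deleted afterward.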
The Best Approach for You
The most efficient approach for handling shortcuts that contain Unicode depends on your situation. If you have a standardized network, manually creating a shortcut, then using a script to copy it is probably going to be the simplest solution. This is particularly true if you have only a couple of shortcuts to deploy.
If you don't have a standardized network and you only need to use Unicode in the shortcut's pathname, the first Shell link technique I showed you is probably best. It requires less scripting and customization than the second Shell link technique.
The second Shell link technique is most useful when you need to use Unicode in other shortcut properties, such as its description or working directory. This technique requires the most scripting know-how.
You can download UnicodeShortcut1.vbs and UnicodeShortcut2.vbs by clicking the Download the Code Here button near the top of the page. As I mentioned previously, you need to use an editor that supports Unicode (e.g., PrimalPad, Notepad) to open these scripts.
If you're using an editor that doesn't support Unicode, I included alternate versions of these scripts in which the Unicode characters are entered in escaped form. AlternateUnicodeShortcut1.vbs and AlternateUnicodeShortcut2.vbs work exactly like their counterparts UnicodeShortcut1.vbs and UnicodeShortcut2.vbs.
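Which escaped form the alternate scripts use isn't shown here. As an illustration only, VBScript offers at least two ways to spell Unicode characters with plain ANSI source text: the ChrW function and the Unescape function.

```vbscript
Option Explicit

Dim viaChrW, viaUnescape

' Two ANSI-only spellings of the Bopomofo letters b and p.
viaChrW = ChrW(&H3105) & ChrW(&H3106)
viaUnescape = Unescape("%u3105%u3106")

If viaChrW = viaUnescape Then
    WScript.Echo "Both forms produce the same string."
Else
    WScript.Echo "The two forms differ."
End If
```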