Hard Links, Soft (Symbolic) Links and Junctions in NTFS: What Are They (For)?
This is an attempt at demystification. In the Windows world, links in the file system are often regarded as obscure, except for the infamous .LNK files, of course. But file system links are neither freaky UNIX/Linux command line stuff, nor are they new: Microsoft’s OS has offered two types of links since Windows 2000 and a third type since Vista/Server 2008. And boy, can they come in handy!
Hard Links
Hard links are everywhere. Every single file is a hard link! Think of it this way: a file really consists of two logically separate parts, the actual data (e.g. the contents of an MP3 file, the actual music) and a directory entry in the master file table (MFT) pointing to the data. The music itself has no file name. That sounds silly, but it is the simple truth. To give a file of MP3 music a name, a separate entity is needed, one that is not related to the music in any way. That “entity” is your hard link.
Looking at it that way, hard links are easy to understand. Data is stored in clusters on your hard disks. The master file table stores links to the data and puts names on the links. That gives us file names. But are we limited to one name per data set (file)? No! NTFS has no problems whatsoever with multiple names all pointing to the same clusters and thus the same data. Having two names for the same MP3 file is perfectly all right with NTFS, like C:\Fun\Boss talking bullshit.mp3 and C:\Work\Important speech of the boss.mp3. Opening either of those files plays the same speech, of course.
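To make the "two names, one set of data" idea concrete, here is a minimal Python sketch. It assumes a file system that supports hard links (NTFS does) and uses os.link, which to my understanding ends up calling CreateHardLink on Windows. The file names fun.mp3 and work.mp3 are made up for the illustration.

```python
import os
import tempfile

def two_names_one_file(base_dir):
    """Create a file, add a second hard link, and show both names share data."""
    first = os.path.join(base_dir, "fun.mp3")    # hypothetical name
    second = os.path.join(base_dir, "work.mp3")  # hypothetical name

    with open(first, "wb") as f:
        f.write(b"speech v1")

    # Add a second name for the same data. No bytes are copied.
    os.link(first, second)

    # Change the data through the second name...
    with open(second, "ab") as f:
        f.write(b" + v2")

    # ...and read the change back through the first name.
    with open(first, "rb") as f:
        content = f.read()

    # samefile() compares the underlying file identity, not the names.
    return content, os.path.samefile(first, second)

with tempfile.TemporaryDirectory() as tmp:
    content, same = two_names_one_file(tmp)
    print(content, same)
```

Whatever is written through one name is visible through the other, because there is only one set of data behind both.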
Since hard links are implemented directly in the MFT, they are limited to one volume. You cannot create a hard link that points to another volume, partition or drive, let alone to a file server on the network. Within a volume, however, hard links are completely transparent to applications, very much in contrast to .LNK files, which are only used and resolved by Explorer. The link is opaque in the sense that applications cannot tell they are accessing some data via a hard link. How could they? Since every file is a hard link, all you can determine is how many hard links exist per file. There is no “first” or “real” one. Every single name is just one hard link among, potentially, many.
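The link count, and the fact that no name is the "real" one, can be observed with a few lines of Python. This is only a sketch: it assumes os.stat reports the NTFS link count in st_nlink (which it does as far as I know), and the names original.txt and alias.txt are invented for the example.

```python
import os
import tempfile

def link_counts(base_dir):
    """Track the hard link count while names are added and removed."""
    original = os.path.join(base_dir, "original.txt")  # hypothetical name
    alias = os.path.join(base_dir, "alias.txt")        # hypothetical name

    with open(original, "w") as f:
        f.write("payload")

    counts = [os.stat(original).st_nlink]   # one name so far

    os.link(original, alias)
    counts.append(os.stat(alias).st_nlink)  # two names, same data

    # Deleting the "first" name does not delete the data:
    # the remaining name is just as real as the one removed.
    os.remove(original)
    counts.append(os.stat(alias).st_nlink)

    with open(alias) as f:
        survived = f.read()
    return counts, survived

with tempfile.TemporaryDirectory() as tmp:
    print(link_counts(tmp))
```

The data only goes away once the last name pointing to it has been deleted.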
The downside of this invisibility is that determining the size of directories becomes difficult. Looking at a folder’s properties in Explorer does not always yield the real size on disk, since data that is reachable through several hard links is counted once per link! I do not know of any solution to this problem. If you have one, please let me know.
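It does not change what Explorer shows, but when measuring a folder yourself, one possible workaround is to remember which files have already been counted. The Python sketch below assumes that the pair (st_dev, st_ino) from os.lstat identifies a file uniquely on a volume; on Windows, Python fills these fields from the volume serial number and the file index as far as I know. The function name tree_size is made up, and the sketch sums logical file sizes, not allocated clusters.

```python
import os
import stat
import tempfile

def tree_size(root):
    """Return (naive_bytes, deduplicated_bytes) for regular files under root."""
    seen = set()          # (device, file id) pairs already counted
    naive = unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symbolic links and other special entries
            naive += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return naive, unique

with tempfile.TemporaryDirectory() as tmp:
    data = os.path.join(tmp, "speech.mp3")
    with open(data, "wb") as f:
        f.write(b"x" * 1000)
    os.mkdir(os.path.join(tmp, "work"))
    os.link(data, os.path.join(tmp, "work", "speech-again.mp3"))
    print(tree_size(tmp))
```

The first number is what a naive per-name sum reports, the second counts each file only once. Data that is also linked from outside the folder is still counted in full, so even the second number is an approximation of what deleting the folder would free.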
Junctions
Junctions are counterparts to hard links in that they work on directories instead of files. Implemented as reparse points stored as metadata in the file system, junctions can point to other directories or volumes on the same computer, but not to folders on other computers. Unlike hard links, junctions point to a fixed path, the target. If the target is moved, deleted or renamed, you get the error “File not found” when attempting to list the contents of a junction. That means junctions can become stale, while hard links cannot.
Soft or Symbolic Links
While hard links and junctions have been present since Windows 2000, symbolic links were only recently added with Vista and Server 2008. They are similar in nature to junctions, but can also point to files and even to remote systems on the network, provided that the target machine runs Vista or later, too. As with junctions, moving, renaming or deleting a link’s target results in a stale link. There is no built-in mechanism that notifies the link about changes to its target.
Some also regard junctions as soft links. Although that is technically correct, I prefer to distinguish between junctions and “real” soft aka symbolic links for practical reasons.
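How a link goes stale can be reproduced with a symbolic link in a few lines of Python. Treat this as a sketch under assumptions: on Windows, os.symlink typically needs an elevated prompt or Developer Mode, and the directory names are invented. Junctions behave similarly, but the Python standard library offers no public call to create them, so the example sticks to symbolic links.

```python
import os
import tempfile

def stale_link_demo(base_dir):
    """Create a directory symlink, rename its target, and inspect the link."""
    target = os.path.join(base_dir, "target_dir")    # hypothetical name
    link = os.path.join(base_dir, "link_to_target")  # hypothetical name

    os.mkdir(target)
    # target_is_directory only matters on Windows; it is ignored elsewhere.
    os.symlink(target, link, target_is_directory=True)
    resolves_before = os.path.exists(link)

    # Rename the target. Nothing tells the link about it.
    os.rename(target, target + "_renamed")

    still_a_link = os.path.islink(link)    # the link itself is still there
    resolves_after = os.path.exists(link)  # but it no longer leads anywhere
    return resolves_before, still_a_link, resolves_after

with tempfile.TemporaryDirectory() as tmp:
    try:
        print(stale_link_demo(tmp))
    except OSError as exc:  # e.g. missing symlink privilege on Windows
        print("could not create symbolic link:", exc)
```

The link stores nothing but a path, so it survives the rename of its target and simply points into the void afterwards.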
Link Creation and Manipulation
From the command line, hard links can be created with fsutil hardlink create (2000 to XP) or mklink /h (Vista and newer). For programmers, the API function CreateHardLink has been available since Windows 2000.
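For illustration, this is roughly what calling CreateHardLink from a script could look like. The sketch uses Python with ctypes on Windows and falls back to os.link elsewhere; the wrapper name create_hard_link is my own invention. Note the argument order of the API: the new name comes first, the existing file second, just as with mklink /h.

```python
import os
import tempfile

def create_hard_link(new_name, existing_name):
    """Give existing_name an additional name, new_name, on the same volume."""
    if os.name == "nt":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.CreateHardLinkW.argtypes = [
            wintypes.LPCWSTR,  # lpFileName: the new link to create
            wintypes.LPCWSTR,  # lpExistingFileName: the file to link to
            wintypes.LPVOID,   # lpSecurityAttributes: reserved, must be NULL
        ]
        kernel32.CreateHardLinkW.restype = wintypes.BOOL
        if not kernel32.CreateHardLinkW(new_name, existing_name, None):
            raise ctypes.WinError(ctypes.get_last_error())
    else:
        # Outside Windows, fall back to the portable call (reversed order).
        os.link(existing_name, new_name)

with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "existing.txt")
    with open(existing, "w") as f:
        f.write("hello")
    create_hard_link(os.path.join(tmp, "new-name.txt"), existing)
    print(os.stat(existing).st_nlink)
```

In everyday Python code, os.link alone is the simpler choice; the ctypes route is shown only to make the API call itself visible.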
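Junctions are best manipulated with the Sysinternals tool of the same name (Junction). On Vista and newer, mklink /j creates junctions as well, while mklink (for files) and mklink /d (for directories) create symbolic links. Programmatic creation of junctions is, to my knowledge, undocumented.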
Notes and References
All this applies to NTFS partitions only, of course.
Examples and practical tips of when to use which kind of link are outside the scope of this article, but a good topic for a future post.