Hard Links, Soft (Symbolic) Links and Junctions in NTFS: What Are They (For)?
This is an attempt at demystification. In the Windows world, file system links are often regarded as obscure, except for the infamous .LNK files, of course. But file system links are neither freaky UNIX/Linux command line stuff, nor are they new: Microsoft’s OS has offered two types of links since Windows 2000 and a third type since Vista/Server 2008. And boy, can they come in handy!
Hard Links
Hard links are everywhere. Every single file is a hard link! Think of it this way: a file really consists of two logically separate parts, namely the actual data (e.g. the contents of an MP3 file, the actual music) and a directory entry in the master file table (MFT) pointing to the data. MP3 music has no file name. That sounds silly, but it is the simple truth. To give a piece of MP3 music a name, a separate entity is needed, which is not related to the music in any way. That “entity” is your hard link.
Looking at it that way, hard links are easy to understand. Data is stored in clusters on your hard disks. The master file table stores links to the data and puts names on the links. That gives us file names. But are we limited to one MFT entry per data set (file)? No! NTFS has no problems whatsoever with multiple entries all pointing to the same clusters and thus the same data. Having two names for the same MP3 file is perfectly all right with NTFS, like C:\Fun\Boss talking bullshit.mp3 and C:\Work\Important speech of the boss.mp3. By clicking on either of those files the same speech is played, of course.
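The two-names-one-file idea can be tried out in a few lines. The following is a minimal sketch in Python, assuming a file system that supports hard links (NTFS does). The function name and the file names are made up for illustration; os.link is the standard library call for creating a hard link.

```python
import os
import tempfile


def demo_hard_link(directory):
    """Create one file with two names and report what both names see."""
    first = os.path.join(directory, "Important speech of the boss.mp3")
    second = os.path.join(directory, "Boss talking.mp3")

    # Write the data once, under the first name.
    with open(first, "wb") as f:
        f.write(b"pretend this is MP3 data")

    # Add a second name (a hard link) for the same data.
    os.link(first, second)

    # Append through the second name; the first name sees the change,
    # because both names refer to the same data.
    with open(second, "ab") as f:
        f.write(b" plus more")

    with open(first, "rb") as f:
        content = f.read()

    return {
        "content_via_first": content,
        "same_file": os.path.samefile(first, second),
        "link_count": os.stat(first).st_nlink,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_hard_link(tmp))
```

Both names live in the same directory here only to keep the sketch short; any two directories on the same volume would do.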
Since hard links are implemented directly in the MFT, they are limited to one volume. You cannot create a hard link that points to another volume, partition, drive or even to a file server on the network. But hard links are completely transparent to applications, very much in contrast to .LNK files, which are only used and resolved by Explorer. Applications do not even know they are accessing some data via a hard link, how could they? Since every file is a hard link, it is only possible to determine how many hard links exist per file. There is no “first” or “real” one. Every single MFT entry is just one hard link among, potentially, many.
The downside of this “transparency” is that counting the size of directories becomes difficult. Determining a folder’s size by looking at its properties in Explorer does not always yield the real size on disk, since data with multiple hard links is counted once per link! I do not know of any solution to this problem. If you have one, please let me know.
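One possible approach, sketched here in Python and not what Explorer does, is to remember which files have already been counted. The sketch assumes that the pair of volume and file identifier reported by os.lstat (st_dev and st_ino) is the same for all hard links to the same data. The function name directory_size is made up for illustration, and the result is the logical size of the data, not the allocated size on disk.

```python
import os
import stat


def directory_size(root):
    """Sum file sizes under root, counting hard-linked data only once."""
    seen = set()  # (volume, file id) pairs that were already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # do not follow symbolic links
            if not stat.S_ISREG(info.st_mode):
                continue  # skip anything that is not a regular file
            if info.st_nlink > 1:
                key = (info.st_dev, info.st_ino)
                if key in seen:
                    continue  # same data reached via another name
                seen.add(key)
            total += info.st_size
    return total
```

Note that data which also has a name outside of root is still counted in full, so the sizes of two directories that share hard-linked files do not simply add up.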
Junctions
Junctions are counterparts to hard links in that they work on directories instead of files. Implemented as reparse points stored as metadata in the file system, junctions can point to other directories or volumes on the same computer, but not to folders on other computers. Unlike hard links, junctions point to a fixed path, the target. If the target is moved, deleted or renamed, you get the error “File not found” when attempting to list the contents of the junction. That means junctions can become stale, while hard links cannot.
Soft or Symbolic Links
While hard links and junctions have been present since Windows 2000, symbolic links were only recently added with Vista and Server 2008. They are similar in nature to junctions, but can also point to files and even to remote systems on the network, provided that the target machine runs Vista or later, too. As with junctions, moving, renaming or deleting a link’s target results in a stale link. There is no built-in mechanism that notifies the link when its target changes.
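How a symbolic link goes stale can be shown with a short sketch in Python. The function name and file names are made up for illustration. One assumption to be aware of: on Windows, os.symlink only succeeds if the account is allowed to create symbolic links (for example an elevated prompt), otherwise it raises OSError.

```python
import os
import tempfile


def demo_stale_link(directory):
    """Create a symbolic link, rename its target, and report the result."""
    target = os.path.join(directory, "target.txt")
    link = os.path.join(directory, "link.txt")
    renamed = os.path.join(directory, "renamed.txt")

    with open(target, "w") as f:
        f.write("hello")

    # The link stores the path of the target, not a reference to the data.
    os.symlink(target, link)
    resolved_before = os.path.exists(link)  # exists() follows the link

    # Renaming the target does not update the link.
    os.rename(target, renamed)

    return {
        "resolved_before": resolved_before,
        "resolved_after": os.path.exists(link),      # target is gone
        "link_still_present": os.path.lexists(link),  # link itself remains
        "is_link": os.path.islink(link),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_stale_link(tmp))
```

A hard link created in place of the symbolic link would keep working after the rename, because it refers to the data and not to a path.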
Some also regard junctions as soft links. Although that is technically correct, I prefer to distinguish between junctions and “real” soft aka symbolic links for practical reasons.
Link Creation and Manipulation
From the command line, hard links can be created with fsutil hardlink create (2000 to XP) or mklink /h (Vista and newer). For programmers, the API function CreateHardLink has been available since Windows 2000.
Junctions are best manipulated with the Sysinternals tool of the same name. Programmatic creation is, to my knowledge, undocumented.
Beginning with Vista, symbolic and other links can be manipulated with the mklink command. For programmers, the function CreateSymbolicLink has been added to the Win32 API.
Notes and References
All this applies to NTFS partitions only, of course.
Examples and practical tips of when to use which kind of link are outside the scope of this article, but a good topic for a future post.
Wikipedia has relatively good articles about hard links, soft/symbolic links and junctions.
MS KB #205524 How to create and manipulate NTFS junction points
MS KB #315688 How to locate and correct disk space problems on NTFS volumes in Windows XP
9 Comments
Thanks for the post – maybe you can clarify something for me. When you have multiple hard links to a file, does backup software see this as one lump of data plus multiple links in the MFT and consequently back up the ‘data lump’ just once, or does the software see it as multiple separate files and back up the data many times? The latter would be an obviously inefficient use of space.
drtg,
whether a backup program backs up data that is pointed to by multiple entries in the MFT (i.e. hard links) once or multiple times depends on how clever the backup program is. Although I do not have evidence to back this up (haha) I suppose most programs are rather dumb when it comes to hard links.
I seem to just have accidentally deleted a comment. Sorry! I got confused by the mass of spam comments…
Hi Helge – perhaps it was me?
I was asking how ACLs are evaluated when an object has multiple hardlinks.
Regards
Lee
Hi Lee,
thanks for posting your question again.
I might be wrong with this, but it should be like this:
NTFS ACLs are stored per MFT entry. A hard link basically is an MFT entry. So if you have two hard links pointing to the same data, you can set different permissions.
Hi Helge
Thanks – so presumably that means that to create a hard link, you would need to have ‘full control’ rights, probably on the current hard link. Though it would mean that if your rights to the original link were later tightened up, you would still have access to the data via your link?
Lee
Lee,
uhh, I have to admit I do not know which permissions you need to create a hard link. But once you have two hard links pointing to the same data, you should be able to set different permissions on each link and thus have different users/groups that are allowed to access the data.
Excellent questions, by the way. It might be a good topic for another post to research and describe permissions in conjunction with hard links. So, thanks for indirectly suggesting the topic ;-)
Many thanks
It’s something that I have been researching for a few days, and I haven’t found the answers yet. The reason I need to know is that I am developing a system that can have entities located via various paths (à la NTFS hard links). We need to implement security and I wanted to make it work the same way as NTFS does.
Lee
Lee,
My earlier assumptions about permissions proved to be wrong. Please see this post for details: http://blogs.sepago.de/helge/2009/05/14/hard-links-and-permissions-acls/