Why Disabling the Creation of 8.3 DOS File Names Will Not Improve Performance. Or Will It?
It is common practice amongst administrators to disable the creation of short file names on NTFS. I freely admit to having recommended this in the past. Was I wrong?
NTFS is relatively relaxed about file names. They can be quite long (up to 255 characters) and may contain “strange” characters (nearly all Unicode characters are allowed). Today this is taken for granted, but when NTFS was conceived, many applications made assumptions about the length of file names and about the characters allowed in them. Those were either old or badly written applications, but the operating system had to support them nonetheless.
For that best of all reasons, backwards compatibility, NTFS has an interesting ability: it can store multiple names per file. In practice, it stores a single name if that name is valid as both a DOS and a Win32 name. If, however, the “real” (Win32) file name is too exotic for the strict 8.3 rule of DOS, a shortened version is stored alongside it. Whenever a file or directory name changes in any way, the short name must be derived from it again and written to disk. It seems only logical to spare the file system this supposedly time-consuming operation. But is it really time-consuming?
Beneath the Covers
With NTFS, everything is a file. Even the master file table (MFT), the highly advanced descendant of MS-DOS’s file allocation table (FAT), is a file. The MFT consists of file records of 1 KB each. File records in turn contain the standard file properties like date/time stamps, attributes, names and so on. Even a file’s data is stored in the file record if it is small enough to fit into what remains of the 1 KB after the standard fields have been written.
Now how does all this relate to the claim made in the article title?
A file record is a tightly packed structure. During the creation of a file, the various fields stored in that structure need to be compiled in memory and written to disk anyway. During a rename, at least the long file name needs to be replaced, which requires at least one disk IO. Whether the amount of data written during that single IO grows by a few dozen bytes (an 8.3 name is at most 12 characters) simply does not matter.
Although the creation of 8.3 names cannot degrade disk performance, some might argue that there is still increased work for the CPU to do. After all, the short file name needs to be derived from the long name. That certainly is an operation performed by the CPU: it needs to shorten the name and convert it to uppercase. That, however, is an operation so simple that its effects on performance are simply not measurable.
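To get a feel for how simple this derivation is, here is a rough sketch in Python. This is not NTFS’s actual algorithm (the real one has more rules and falls back to hash-based names after repeated collisions); the function name and details are made up for illustration. It shows that the CPU work is little more than filtering, truncating and uppercasing:

```python
# Rough sketch of deriving an 8.3-style short name from a long name.
# NOT NTFS's exact algorithm -- purely illustrative.

def make_short_name(long_name: str, index: int = 1) -> str:
    """Derive a DOS-style 8.3 name such as 'MYLONG~1.TXT'."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:                  # no extension at all
        base, ext = long_name, ""
    # Keep only characters that are legal in a DOS name.
    legal = "".join(c for c in base.upper() if c.isalnum() or c in "_-")
    ext = "".join(c for c in ext.upper() if c.isalnum())[:3]
    tilde = f"~{index}"
    short = legal[: 8 - len(tilde)] + tilde
    return f"{short}.{ext}" if ext else short

print(make_short_name("My long document name.txt"))  # → MYLONG~1.TXT
```

A handful of string operations per name, which is negligible next to even a single disk IO.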
My reasoning above seems to prove conclusively that the additional creation of short file names is not harmful to performance at all. But some pieces of the puzzle are still missing.
The file system needs to make sure that the resulting short name is unique within its directory, which means checking it against all other entries. That sounds like an expensive operation, but it is not, at least not as long as each directory entry has only one name. By organizing directory entries in B+ trees, they are kept sorted whilst minimizing disk IOs for name lookups. B+ trees are very efficient, but they are also one-dimensional: to keep both short and long names sorted, two trees would be required, yet NTFS employs only one. This creates additional disk IOs during the derivation of short from long names.
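Conceptually, the uniqueness check is a probing loop: derive a candidate, look it up in the directory index, and bump the ~N suffix on a collision. A minimal sketch, assuming a plain set stands in for the directory index (in real NTFS each probe is a lookup in the B+ tree, and the extension handling is more involved):

```python
# Sketch of the uniqueness check for short names: probe the directory
# for collisions and bump the ~N suffix until a free name is found.
# A set stands in for NTFS's B+ tree directory index.

def unique_short_name(long_name: str, directory: set[str]) -> str:
    base = "".join(c for c in long_name.upper() if c.isalnum())[:6]
    n = 1
    while f"{base}~{n}" in directory:   # collision: try the next suffix
        n += 1
    return f"{base}~{n}"

existing = {"REPORT~1", "REPORT~2"}
print(unique_short_name("Report for March.doc", existing))  # → REPORT~3
```

The loop also hints at the pathological case discussed in the comments below nothing here, but visible in practice: the more existing names share the same six-character prefix, the more probes each new file needs.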
File system searches are a second reason why the additional creation of 8.3 names might impact performance. The Win32 API functions FindFirstFile/FindNextFile are used extensively by applications. Again, the B+ trees can significantly speed up searches, but additional directory queries are required when two different namespaces are in use.
Without the cache manager, the effects of short file names on the number of disk IOs and thus system performance would probably be dramatic. But it is important to keep in mind that Windows uses efficient disk caching that greatly reduces the number of times clusters need to be fetched from a physical disk. The MFT is certainly a file that is worth caching.
The one thing that really affects system performance is fetching data from disk. In my analysis I have found cases in which the creation and maintenance of short file names causes additional hard disk IOs. These effects are mitigated by the operating system’s caching mechanisms, though.
The performance benefits of disabling the creation of 8.3 names are probably small. I do not think they are worth breaking backwards compatibility for. As always, actual numbers may vary greatly depending on workload, hardware and other factors.
This article contains a theoretical analysis. I have not undertaken practical tests in this matter.
Many web sites describe how to disable the creation of short 8.3 file names by setting the following registry value:

HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\NtfsDisable8dot3NameCreation = 1 (REG_DWORD)
KB article 130694 states that maintaining short and long names can adversely affect NTFS performance. But it applies to Windows NT 3.1 and 3.5 only and is kept relatively vague.
Update: included B+ trees in the analysis and rephrased the conclusion.
To disable the creation of 8.3 file names you can use fsutil; you do not need to edit the registry directly:

fsutil behavior set disable8dot3 1
MVP Directory Services
I think you should include some benchmarks in this article so that we can have some numbers to play with.
“That, however, is an operation so simple that its effects on performance are simply not measureable.”
I don’t think this is true. Someone programmed it, so it can be expressed in code, which can be efficient or inefficient. In any case, I doubt that this is not measurable.
The thing that bugs me about disabling 8.3 file names in NTFS is that no one ever explains how to remove the old 8.3 file names that were already created and stored (in the MFT, I presume).
Is it possible to remove them?
8.3 names are stored in the MFT, as you presume.
How to remove stale/old 8.3 names – that is an interesting question. I have no answer to that (yet).
The only way I know is tedious:
1. Disable 8.3 name creation using fsutil. I would suggest a reboot, but don’t know if it is necessary.
2. Copy all files onto another volume. DO NOT USE BACKUP.
3. Erase the files from the original disk.
4. Copy them back.
5. Re-enable 8.3 name creation using fsutil.
Files added after this point will still have 8.3 names, but the old ones will not. This would be a good time for a format if you can think of a reason. Avoid using 16-bit programs on these files from now on.
At my company we reproduced a scenario in which 8.3 generation -severely- impacted filesystem performance. However, to hit this problem you need 20,000 files in a single folder that all begin with the same first 6 characters. When we got to 300,000 files it took almost 10 minutes just to enumerate the files within the folder. Disabling 8.3 and recreating all those files eliminated the problem entirely (1-2sec to enumerate folder).
We hit one other snag though. It seems Windows 2003 still uses 8.3 folder names when generating the TEMP system variable for a user. %userprofile% is usually C:\Documents And Settings\Username, and %temp% is derived from that (%userprofile%\local settings\temp). The problem with this is that if you look at %temp%, it says “C:\docume~1\username\local settings\temp”.. Worse yet, it seems MS has acknowledged this issue but isn’t fixing it. See link below. The suggested fix is Enable 8.3, or manually change the %temp% variable.
I too would like to remove 8.3 file properties from existing files and folders. When you change the way %temp% is generated to a different variable (say, %userprofile2%) which points to a folder that was made after 8.3 was turned off (say, “C:\User temporary location”), it works properly. So TEMP = “%userprofile2%\local settings\temp” will actually turn into “C:\User Temporary Location\local settings\temp”.
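The scenario described above can be reproduced on a small scale with a short script. This is a hypothetical, scaled-down sketch: the file name pattern and count are illustrative only, and the actual slowdown will only appear on a real NTFS volume with 8.3 name generation enabled, since that is where the colliding short-name probes happen:

```python
# Scaled-down reproduction sketch: create many files whose names all
# share the same first six characters ("invoic"), the pattern that
# forces NTFS to probe for a free ~N short name on every creation.
# The timing is only meaningful on NTFS with 8.3 generation enabled.

import os
import tempfile
import time

def create_colliding_files(count: int) -> float:
    """Create `count` files with an identical name prefix; return elapsed seconds."""
    with tempfile.TemporaryDirectory() as d:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(d, f"invoice-{i:06d}.txt"), "w"):
                pass
        return time.perf_counter() - start

elapsed = create_colliding_files(1000)
print(f"created 1000 files in {elapsed:.2f}s")
```

Running this once with 8.3 generation enabled and once with it disabled (and a freshly created directory each time) should make the difference visible at sufficiently large counts.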
thanks for the interesting information!
I recently had to copy mail server files that held each message in a separate file from one drive to another. One of the directories contained over 500,000 files. After it reached 100,000, I had to disable 8.3 because the copy was taking 100% CPU (kernel mode, I noticed) and was only copying 1-2 files per second. After disabling 8.3, it takes 5% CPU and copies over 100 files per second.
So what is the recommended cutoff limit? 20,000? Why didn’t Microsoft just set a threshold? It would create 8.3 file names until it hits xxx number of files, then give up. Now that would be useful. Eh?
Necroposting from 2019, but if someone stumbles upon this…
To delete the existing 8.3 names after turning the feature off, use the following:
fsutil.exe 8dot3name strip /s /v C:
Also, performance increases vastly whenever a folder contains enough files whose names share the same first 6-8 characters, such as when documents are stored by (or were created by) an old ported mainframe system that uses incremental base-26 for its file names.