by: Helge, published: Dec 16, 2009, updated: Oct 24, 2010

The 30 Second File Copy Bug, or: EFS = Bad Performance?

This article describes a bug that dramatically reduces file copy performance on Windows systems. I cannot provide a fix (not having access to the Windows source code), but I have found a workaround.

Situation

I have been using a simple backup system for years: I regularly copy new and changed files with a robocopy-like mechanism from my main computer (a laptop) to my home PC, where I archive my stuff. This worked very well until a couple of months ago.

As it turned out, one detail is vital to this case: I encrypt all data on my laptop with EFS, but I do not use EFS encryption on my home system (side note: never, ever encrypt your backups).

So, what happened?

Some time ago, file copy performance from the laptop to the home PC suddenly became a nightmare. I am not talking about megabits instead of gigabits; I am talking about bits. Copy throughput dropped to around 50 bytes/s on average. Hardly acceptable. So I set out to investigate.

Investigation

I quickly found out the following:

  • Performance is OK when copying from a different laptop to the home PC
  • Hotfix KB973554 for the home PC does not help (the home PC runs Vista, the laptop Windows 7)
  • Disabling SMB 2.0 on the target (Vista) box not only fails to help, it makes SMB communication impossible
    (I disabled SMB2 by setting this registry value: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\SMB2=0 [DWORD])

Finally I made an important discovery: This problem affects only files that are EFS-encrypted on the source computer. Copying unencrypted files worked like a charm. I found this out rather late in the troubleshooting process, unfortunately, because practically all files I deal with are encrypted.
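To tell quickly which files in a folder carry the EFS attribute, something like the following Python sketch could be used. It is not part of my original troubleshooting; the function names are made up for illustration, and it relies on `st_file_attributes`, which Python only provides on Windows.

```python
import os
import stat


def is_efs_encrypted(attributes: int) -> bool:
    """Return True if a Windows file-attribute bitmask has the ENCRYPTED bit set."""
    return bool(attributes & stat.FILE_ATTRIBUTE_ENCRYPTED)


def encrypted_files(root: str):
    """Yield paths below root whose 'encrypted' attribute is set (Windows only)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # st_file_attributes does not exist on non-Windows platforms.
                attrs = os.stat(path).st_file_attributes
            except (OSError, AttributeError):
                continue
            if is_efs_encrypted(attrs):
                yield path
```

Calling `list(encrypted_files(r"C:\some\folder"))` (the path is a placeholder) on the source machine lists the files that would be affected by the slowdown.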

Workaround

After I had identified the cause, a workaround was quickly found: I disabled EFS on the home PC and performance went through the roof:

fsutil behavior set disableencryption 1
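For completeness, fsutil can also report the current setting (`fsutil behavior query disableencryption`) and switch EFS back on (`... set disableencryption 0`). The small Python wrapper below is only a sketch with hypothetical helper names; it assumes it is run on Windows from an elevated prompt.

```python
import subprocess
from typing import List


def fsutil_encryption_command(action: str) -> List[str]:
    """Build the fsutil command line for 'query', 'disable' or 'enable'."""
    base = ["fsutil", "behavior"]
    if action == "query":
        return base + ["query", "disableencryption"]
    if action == "disable":
        # Same command as the workaround above.
        return base + ["set", "disableencryption", "1"]
    if action == "enable":
        # Reverts the workaround.
        return base + ["set", "disableencryption", "0"]
    raise ValueError("action must be 'query', 'disable' or 'enable'")


def run_fsutil(action: str) -> str:
    """Run the command and return fsutil's text output (Windows, elevated)."""
    result = subprocess.run(
        fsutil_encryption_command(action),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```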

Details

When did this start? After all, I have been using EFS on my laptop for years with XP, Vista, and Windows 7 beta, RC and RTM. My Vista home PC has not changed much recently, except for the mandatory OS patches. I can think of two “incidents” that may have started the trouble. Why these? Because they happened at roughly the same time performance went south:

  • The installation of Windows 7 RTM. As I mentioned, I used Windows 7 RC on the laptop before.
  • The domain join of my laptop to my employer’s AD domain shortly thereafter. Before, I had been enjoying the freedom of workgroup membership.

What actually happened? If something is so slow, network timeouts are a likely cause. Here are two screen shots of network traces I collected (ironically only after I had found the workaround):

I issued the copy command (a simple “xcopy /G file.txt \\192.168.0.2\platted\temp\” – /G meaning copy even if the encryption is lost) at around 2 seconds into the trace. As you can see, at around 32 seconds it is still not finished. Why? The copy target is looking for domain controllers in its workgroup! It sends six queries, waiting a second in between. Of course, it gets no answer (there is no DC), so the whole cycle runs five times. Five cycles of about six seconds each amount to 30 seconds, pretty much the length of the copy operation.

What if I copy multiple files? Each single file takes around 30 seconds to copy!
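The arithmetic behind that can be written down as a rough model. The constants are simply the numbers read off the trace (six queries a second apart, five cycles), not documented Windows timeouts, and the 36-byte size is that of the file used for the traces.

```python
# Numbers observed in the network trace; assumptions, not documented timeouts.
QUERIES_PER_CYCLE = 6
WAIT_PER_QUERY_S = 1.0
CYCLES = 5


def delay_per_file_s() -> float:
    """Time spent waiting for a non-existent domain controller, per file."""
    return QUERIES_PER_CYCLE * WAIT_PER_QUERY_S * CYCLES


def total_delay_s(file_count: int) -> float:
    """The delay is paid once per file, so it grows linearly with the file count."""
    return file_count * delay_per_file_s()


def throughput_bytes_per_s(size_bytes: int, seconds: float) -> float:
    """Effective throughput when the transfer itself takes negligible time."""
    return size_bytes / seconds


print(delay_per_file_s())                                   # 30.0
print(total_delay_s(1000))                                  # 30000.0
print(throughput_bytes_per_s(36, delay_per_file_s()))       # about 1.2
```

In other words, a backup run of 1,000 small encrypted files would spend more than eight hours doing nothing but waiting.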

This second screen shot shows that after five tries the source computer gives up and sets the encryption flag to zero (=unencrypted). The copy then happens in a snap, as expected.

Who is the culprit? Windows 7 (the source) setting the encryption flag or Windows Vista (the target) looking for a domain controller in its workgroup? I do not know, but I suspect the latter.

By the way: the file I copied when recording these network traces had a size of 36 bytes.

Next Steps?

If I were Mark Russinovich, I would walk over to my colleagues who programmed the affected OS component and have a nice friendly chat with them. As it is, I can only hope someone reads this who has the means to get a fix developed. In the meantime I can live very well with EFS disabled on my home machine.
