Comparing the CPU Performance of Physical and Virtual PCs (VDI)

When you move users from physical PCs to a VDI environment, you may find that they are not too happy with their new machines’ performance. It happened to me. To quantify the difference, I took a series of measurements comparing the old PCs we migrated away from with both the VDI machines and the new PCs available to some users.

Test Procedure

There is no objective CPU benchmark. Depending on the tool you run, you will get widely differing results. For that reason I used four very different benchmarking tools. I chose these four for no other reason than that they were readily available and different in nature:

  • NovaBench 3.0.4 CPU test: Synthetic benchmark tool.
  • CineBench 11.5 CPU: 3D rendering tool turned into a benchmark.
  • SunSpider 1.0 in IE8: Measures JavaScript performance.
  • Super Pi Mod 1.8 WP 1M: Calculates Pi.

Some tools are multi-threaded and take advantage of more than one CPU core while others do not, just like real applications. Single-thread performance is still extremely important today, which is why Intel’s “Turbo Boost” lets a CPU raise its clock speed above the nominal frequency when only a few cores are busy.

All measurements were run twice. The result shown below is the average.
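The averaging step can be sketched in a few lines of Python. This is only an illustration of the procedure, not the tooling used for these measurements: `run_benchmark` is a hypothetical callable standing in for "start one of the four tools and read its score", and `busy_loop` is a made-up CPU-bound workload.

```python
import statistics
import time


def average_score(run_benchmark, runs=2):
    """Call run_benchmark() `runs` times and return the mean of its scores."""
    scores = [run_benchmark() for _ in range(runs)]
    return statistics.mean(scores)


def time_workload(workload):
    """Return the wall-clock seconds a single call to workload() takes."""
    start = time.perf_counter()
    workload()
    return time.perf_counter() - start


def busy_loop():
    # Hypothetical CPU-bound stand-in for a real benchmark run.
    return sum(i * i for i in range(200_000))


if __name__ == "__main__":
    seconds = average_score(lambda: time_workload(busy_loop), runs=2)
    print(f"average of 2 runs: {seconds:.3f} s")
```

With only two runs the mean says nothing about variance, so it is worth glancing at the individual scores as well.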

HyperThreading was enabled where available.

The virtualization hosts running the VDI machines were oversubscribed in terms of virtual to physical CPU allocation. The system with the Xeon E5-2670 had an oversubscription ratio of 4:1, the system with the Xeon X5690 had 6:1.
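For readers who want to check their own hosts, the ratio is simply the number of vCPUs allocated to all VMs divided by the number of physical cores. The sketch below assumes that physical cores (not HyperThreading logical processors) are counted. The host in the example is hypothetical, since the core and VM counts behind the ratios above are not given here.

```python
from fractions import Fraction


def oversubscription_ratio(allocated_vcpus, physical_cores):
    """Return allocated vCPUs per physical core as an exact fraction."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return Fraction(allocated_vcpus, physical_cores)


def format_ratio(ratio):
    """Format a Fraction such as 6/1 as the string '6:1'."""
    return f"{ratio.numerator}:{ratio.denominator}"


# Hypothetical host: 12 physical cores carrying 36 VMs with 2 vCPUs each.
ratio = oversubscription_ratio(allocated_vcpus=36 * 2, physical_cores=12)
print(format_ratio(ratio))
```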

The virtualization hosts’ power profile was set to maximum performance. At the default setting (“balanced”) CPU performance is noticeably worse.

The virtualization hosts were running VMware ESXi 5.0.

Test Results

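| System             | CPU             | Clock    | Cores          | NovaBench (more is better) | Cinebench (more is better) | SunSpider (less is better) | Super Pi (less is better) |
|--------------------|-----------------|----------|----------------|----------------------------|----------------------------|----------------------------|---------------------------|
| Old PC             | Core2 Duo E4600 | 2.4 GHz  | 2 cores, no HT | 226                        | 1.22                       | 5726                       | 25.9                      |
| New laptop         | Core i7-3520M   | 2.9 GHz  | 2 cores + HT   | 455                        | 3.33                       | 2920                       | 10.9                      |
| VDI – Westmere     | Xeon X5690      | 3.47 GHz | 2 vCPUs        | 283                        | 2.14                       | 3301                       | 12.6                      |
| VDI – Westmere     | Xeon X5690      | 3.47 GHz | 4 vCPUs        | 451                        | 4.28                       | 3256                       | 12.2                      |
| VDI – Sandy Bridge | Xeon E5-2670    | 2.6 GHz  | 2 vCPUs        | 261                        | 2.20                       | 3865                       | 13.2                      |
| VDI – Sandy Bridge | Xeon E5-2670    | 2.6 GHz  | 4 vCPUs        | 417                        | 4.42                       | 3760                       | 13.1                      |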
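Because two of the benchmarks report scores (more is better) and two report times (less is better), the raw figures are awkward to compare directly. One way to put them on a common footing is to express every result as a speedup factor relative to the old PC. The following Python sketch does that with the numbers from the results above. The short system labels and the function name are my own choices for illustration.

```python
# Results copied from the table; list order matches SYSTEMS.
SYSTEMS = [
    "Old PC",
    "New laptop",
    "Westmere 2 vCPUs",
    "Westmere 4 vCPUs",
    "Sandy Bridge 2 vCPUs",
    "Sandy Bridge 4 vCPUs",
]

# Benchmark name -> (results, higher_is_better)
RESULTS = {
    "NovaBench": ([226, 455, 283, 451, 261, 417], True),
    "Cinebench": ([1.22, 3.33, 2.14, 4.28, 2.20, 4.42], True),
    "SunSpider": ([5726, 2920, 3301, 3256, 3865, 3760], False),
    "Super Pi": ([25.9, 10.9, 12.6, 12.2, 13.2, 13.1], False),
}


def speedup_vs_baseline(scores, higher_is_better, baseline_index=0):
    """Return each result as a factor relative to the baseline (>1 is faster)."""
    base = scores[baseline_index]
    if higher_is_better:
        return [score / base for score in scores]
    # For timings, a smaller value is faster, so invert the ratio.
    return [base / score for score in scores]


for name, (scores, higher_is_better) in RESULTS.items():
    factors = speedup_vs_baseline(scores, higher_is_better)
    print(name)
    for system, factor in zip(SYSTEMS, factors):
        print(f"  {system:<22} {factor:.2f}x")
```

A factor of 1.00x is the old PC itself, and anything above 1 is faster regardless of whether the underlying benchmark reports a score or a time.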

Analysis

We can learn several interesting things from these tests. These conclusions are, of course, only applicable to similar scenarios, so things may be different for you:

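VDI vs. old PC: A VDI machine running on Intel’s newest server CPU is not that much faster than a physical PC with a 5.5-year-old CPU. With 2 vCPUs on the Xeon E5-2670 it is between about 1.15 times (NovaBench) and just under 2 times (Super Pi) as fast as the Core2 Duo E4600.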

VDI vs. new PC: Even a laptop CPU outperforms the Xeon in nearly all tests. The only exception is Cinebench on the VDI machines with 4 vCPUs. Given that desktop PC CPUs are faster than laptop CPUs, it is safe to assume that a VDI machine does not stand a chance against a PC.

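Single-thread performance on VDI: As the SunSpider and Super Pi tests show, single-thread performance on a VDI machine is not great. The VDI machines take roughly 10 to 30 percent longer than the new laptop in these two tests, and going from 2 to 4 vCPUs makes practically no difference.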
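How little the single-threaded tools gain from extra vCPUs becomes obvious when you divide the 4-vCPU result by the 2-vCPU result for each benchmark and host. The Python sketch below does this with the measured values. The data layout and the function name are illustrative assumptions.

```python
# Benchmark -> (higher_is_better, {host CPU: (2-vCPU result, 4-vCPU result)})
VDI_RESULTS = {
    "NovaBench": (True, {"Westmere": (283, 451), "Sandy Bridge": (261, 417)}),
    "Cinebench": (True, {"Westmere": (2.14, 4.28), "Sandy Bridge": (2.20, 4.42)}),
    "SunSpider": (False, {"Westmere": (3301, 3256), "Sandy Bridge": (3865, 3760)}),
    "Super Pi": (False, {"Westmere": (12.6, 12.2), "Sandy Bridge": (13.2, 13.1)}),
}


def scaling_factor(two_vcpu, four_vcpu, higher_is_better):
    """Return how many times faster the 4-vCPU VM is than the 2-vCPU VM."""
    if higher_is_better:
        return four_vcpu / two_vcpu
    # For timings, a smaller value is faster, so invert the ratio.
    return two_vcpu / four_vcpu


for benchmark, (higher_is_better, hosts) in VDI_RESULTS.items():
    for host, (two, four) in hosts.items():
        factor = scaling_factor(two, four, higher_is_better)
        print(f"{benchmark:<10} {host:<13} {factor:.2f}x")
```

Cinebench roughly doubles and NovaBench gains about 60 percent, while SunSpider and Super Pi improve by about 3 percent or less.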

Xeon Westmere vs. Sandy Bridge: Probably due to its much higher clock speed, the older Westmere-EP CPU performs better than the newer Sandy Bridge-EP CPU in every test except Cinebench (the fastest Sandy Bridge-EP, the E5-2690, might be on par or even better, though).

Closing Thoughts

When moving power users (aka knowledge workers) from a PC to a VDI machine, CPU performance is a topic that needs as much attention as IO performance. In your project, do not rely on benchmarks alone. They are always synthetic and may or may not match what you see in reality. Instead, have real humans test the applications they use in the way they use them. I have seen “harmless” Excel sheets turn out to be massive CPU hogs running for hours or even days. Differences in performance were noticed immediately by the users.


5 Responses to Comparing the CPU Performance of Physical and Virtual PCs (VDI)

  1. Steve Greenberg July 17, 2013 at 23:04 #

    Helge,

    What an elegant way to analyze this! This has always been a topic of discussion in our consulting group: our experience tells us that we lose some CPU power in virtual environments. This simple set of tests shows nicely what is going on.

    However, also consider that the loss in CPU power, from the end user experience point of view, may be offset by the fact that the server is generally attached to higher-performing storage, sits inside the datacenter, and is attached at higher speeds to file servers, app servers, etc.

    Thanks

    Steve

    • Helge Klein July 17, 2013 at 23:19 #

      Thanks for your kind words, Steve!

      You are right, of course, in that CPU power is only one variable in the equation. It’s just that I write about one thing at a time ;-)

      Helge

  2. Gabrie van Zanten July 18, 2013 at 09:11 #

    Hi
    When reading your results I was a bit surprised, since we usually notice much better performance for the user. Can you clarify how you tested? When running these benchmark tools, you are testing the maximum performance of the VDI desktop, which is an unrealistic use case. Did you run the benchmark in all VMs at the same time on the host? Because that is where you would hit an unrealistic scenario and that is where a 1:4 or 1:6 oversubscription is not viable. Virtualisation is built on the fact that you profit from not all systems using all resources at the same time. Wouldn’t it be better to use a tool like the Login Consultants VSI benchmark to compare performance?

    Gabrie

    • Helge Klein July 18, 2013 at 11:23 #

      I ran the benchmarks on production systems, albeit only in one VM at a time. That means that the host was “normally” loaded with about 30-35 VDI machines (Win7 x64) during each test. I repeated each benchmark several times and also ran it on different VMs and different hosts. The results were pretty much identical; there is not much variation.
      I did not use Login VSI for this because I wanted to compare CPU performance only.

  3. Michael July 22, 2014 at 08:48 #

    Can I suggest you try running Super PI inside the guest with a forced CPU affinity to CPU0?

    I’ve noted some serious performance hits in 7z, for example. Setting affinity to a single core generally increases performance by 33%.
