Four Ways to Increase the Capacity of Your Citrix XenApp Farm

Even with the most meticulous design, the day will come when your farm’s capacity is no longer sufficient. User numbers grow, applications become more resource-hungry and the amount of data to be handled increases steadily. So what do you do? Simply more of the same, i.e. buy more servers and add them to the farm? That is one way of increasing capacity, but it is not the only one and may not be the best.

In this article I am assuming that your farm actually is near 100% capacity (with a safety margin, of course). Only then does the following discussion make sense. It may very well be that a farm appears to be fully loaded, but in reality the terminal servers are having a good time idling around. In such cases some other component slows things down. The list of potential sources of trouble is long and includes the network, infrastructure servers (Citrix data store, domain controllers, file servers, application servers, etc.) and even misconfiguration of the virus scanners. Before going out and spending money on new hardware or software, it might be a good idea to have an external consultant review your farm design to make sure you are spending your company’s money on the right things.

Identify the Capacity Bottleneck

Once you are sure that you actually have a capacity bottleneck, the obvious first step is to identify it. Just knowing “I cannot get more than x users on each of my servers” is not going to be enough.

A XenApp farm is a very complex thing. Finding a bottleneck may at first sound like looking for a needle in a haystack. So let’s simplify. Every terminal server consists of the following main components:

  • Memory
  • CPUs
  • Hard disks
  • Network interfaces

Today’s (dual) 1 Gbit/s network cards are fast enough for typical XenApp servers. Hard disks are not used much, except for storing temporary and user profile data, and large parts of the profiles are redirected to a file server anyway. CPU management in XenApp works well. That leaves us with memory as the culprit. How fast was that for performance analysis?

I must admit I lied to you. Finding a capacity bottleneck may very well be a really tough job. Every farm is different. No two XenApp customers have the same application set. But still, if we are mostly looking at typical office or task workers and taking a typical network and farm design as a basis, it mostly boils down to memory being short.
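For a first pass, it can help to line up the peak utilization of the four components side by side. The following Python sketch does just that. The function name and the sample figures are made up for illustration; real numbers would have to come from your own performance monitoring, and a single percentage per component is of course a strong simplification.

```python
def busiest_resource(utilization):
    """Return (name, value) of the most heavily utilized component.

    `utilization` maps a component name to its peak utilization in
    percent of capacity.
    """
    name = max(utilization, key=utilization.get)
    return name, utilization[name]


# Hypothetical peak-hour figures for one terminal server (assumed values).
sample = {"memory": 93.0, "cpu": 55.0, "disk": 20.0, "network": 10.0}

resource, value = busiest_resource(sample)
print(f"Likely bottleneck: {resource} at {value:.0f}% utilization")
```

A result like this only tells you where to look first. It does not replace a proper analysis of the farm.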

There are two kinds of memory shortage on 32-bit systems. The first is exhaustion of kernel memory (see my earlier article on the subject for details). The second is simply a lack of RAM to satisfy the running applications’ demands.

Depending on whether your servers reach the limit of their kernel or application memory pool, one or more of the following scaling options may apply to your situation.

Option 1: Scale out by Adding Servers

As mentioned above, this may seem like the logical solution if every server has reached its capacity limit. The greatest advantage of this approach is that you need not change the way you install and manage your servers. You use the same OS and the same management tools, just on a larger number of machines. The downside is that each additional server adds to your data center’s power consumption and cooling load. Also, rack space is becoming a scarce resource in many organizations.

Option 2: Scale up by Adding Memory

This also sounds obvious. If more memory is needed, why, put more RAM into the machines! In practice, however, it is not so simple. In order to use more than 4 GB of RAM, you need to move from Windows Server 200x Standard Edition to Enterprise Edition, which is sold at roughly four times the Standard Edition’s price. I am told the list price difference is something around 2,000 dollars. Multiply that by your number of servers, and you may quickly come to the conclusion that other options may be more economical. Additionally, adding RAM only helps if your servers are not running out of kernel memory. Still, going from 4 GB to 8 GB RAM per server might be sensible in certain scenarios (see my earlier article on the subject for details).
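To get a feeling for the licensing side of this option, the multiplication can be sketched in a few lines of Python. The default of 2,000 dollars per server is the rough list price difference quoted above, not a verified figure, and the function name is hypothetical. The cost of the RAM itself is deliberately left out.

```python
def enterprise_upgrade_cost(servers, price_difference=2000):
    """Extra license cost (in dollars) of moving every server
    from Standard Edition to Enterprise Edition.

    `price_difference` is an assumed per-server list price difference.
    """
    return servers * price_difference


# Farm sizes chosen for illustration only.
for servers in (10, 50, 100):
    print(f"{servers:>3} servers: ${enterprise_upgrade_cost(servers):,}")
```

For a farm of 100 servers this comes to 200,000 dollars in additional licenses alone, which is the kind of number that makes the other options worth a second look.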

Option 3: Scale up by Moving to Windows x64

64-bit Windows essentially does away with kernel memory limitations. Install as much RAM as your systems support, and the OS will use it. So is this the perfect solution? In theory, yes. In practice, you need to consider that migrating to Windows x64 is far from trivial (see my article series on the subject for details). And do not forget that you need more RAM. Much more RAM. 20-50% more just to have the system perform as fast as it did with 32-bit Windows. But then it scales much better. Double or triple the amount of RAM, and you should be able to significantly increase the number of users per machine.
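As a rough worked example of the 20-50% rule of thumb, the following Python sketch computes how much RAM an x64 server would need merely to match a given 32-bit machine. The function name is hypothetical, and the overhead range is the estimate from this article, not a measured value.

```python
def x64_ram_for_parity(ram_32bit_gb, overhead_low=0.20, overhead_high=0.50):
    """Return the (low, high) RAM estimate in GB that an x64 server needs
    to perform like a 32-bit server with `ram_32bit_gb` of RAM.

    The overhead factors are assumptions taken from the 20-50% estimate.
    """
    low = ram_32bit_gb * (1 + overhead_low)
    high = ram_32bit_gb * (1 + overhead_high)
    return low, high


low, high = x64_ram_for_parity(4)
print(f"A 4 GB 32-bit server corresponds to roughly {low:.1f}-{high:.1f} GB on x64")
```

In other words, a typical 4 GB machine needs somewhere between about 5 and 6 GB on x64 before any additional user fits on it. Only the RAM beyond that point buys extra capacity.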

Option 4: Scale up and out by Virtualizing XenApp

Today’s common terminal servers with 4 GB of RAM and Windows Server Standard Edition are easy to manage because everything is well-known and OS licenses are inexpensive. Their main downside is server sprawl, the sheer number of machines required to cope with the workload. From an administration point of view that does not really matter: does it make a difference if you are managing 100, 200 or 500 machines? But server sprawl is an ecological and thus economic nightmare.

The answer to server sprawl may be virtualization. By putting multiple virtual XenApp servers on each physical machine, overall power consumption and rack space can be reduced significantly. And you get two things for free: increased flexibility (just think of VMotion/Live Migration/XenMotion) and increased complexity (another layer to manage).
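How much hardware this saves depends entirely on how many virtual XenApp servers fit on one physical host. The following Python sketch shows the arithmetic. The consolidation ratio of 8 virtual machines per host is purely hypothetical, as is the function name; the achievable ratio depends on your hardware, hypervisor and workload.

```python
import math


def physical_hosts_needed(virtual_servers, vms_per_host):
    """Number of physical hosts required to run `virtual_servers`
    virtual machines at `vms_per_host` machines per host (rounded up)."""
    return math.ceil(virtual_servers / vms_per_host)


# 100 XenApp servers at an assumed ratio of 8 VMs per physical host.
hosts = physical_hosts_needed(100, 8)
print(f"100 virtual XenApp servers need {hosts} physical hosts")
```

With these assumed numbers, 100 physical terminal servers shrink to 13 hosts, although each of those hosts needs to be considerably larger than the machines it replaces.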

Now What?

I have presented four options to choose from, not only to increase farm capacity but also to improve the general layout of your XenApp servers. But the toughest part is yours: deciding which one best fits your organization’s needs.


8 Responses to Four Ways to Increase the Capacity of Your Citrix XenApp Farm

  1. Guy Levenshulme June 26, 2009 at 08:40 #

    Rather than add memory, just install AppSense Performance Manager – its physical memory control feature increased our user density by 42% on average across the farm!

    • Helge Klein June 26, 2009 at 08:48 #

      It might be a lot cheaper to just add memory, though ;-)

  2. Guy Levenshulme June 26, 2009 at 10:21 #

    We considered going with more memory, Helge, however we chose not to do so for two reasons:

    1 – we found more memory did help with capacity somewhat, but quality of service suffered and performance response times slowed to unacceptable levels due to intense CPU and Disk thrashing.
    2 – we have many servers, and the time and logistics in taking the servers down to see what DIMMs it had and adding memory (if possible) would have cost much more than just the cost of the HP memory itself, particularly when we already had 4GB in most of our 32 bit servers.

    So, appreciate the suggestion, but memory simply wasn’t a viable option for us. Sure, it may work for others in different cases – but AppSense was a life saver for our XenApp farm and we don’t deploy a server without it now.

    • Helge Klein July 16, 2009 at 11:00 #

      1) I cannot see why adding memory would lead to unacceptable response times due to intense CPU and disk “trashing”.

      2) So, powering down servers to add memory is more expensive than to install (and reboot?!) an extensive software suite on each machine that requires full regression testing?

      All in all, your comments sound a little biased. Adding servers and/or memory to the farm is in my experience often cheaper and easier to manage.

  3. Malcolm October 27, 2009 at 21:42 #

    Your articles are very good, however, I am still trying to figure out our problem here. We have 2 Presentation Server 4.5 machines and we recently upped our memory to 32 GB from 8 GB, with a new accounting package being published. We still get the dreaded “cannot load your profile” errors. We are running out of resources about 3 days after a reboot. Windows 2003 Enterprise, 32 GB RAM. We get the error at around 35 users per server.

    Thanks much!

  4. Malcolm October 30, 2009 at 17:24 #

    Would using /PAE help?
