by: Helge, published: Oct 2, 2013, updated: Oct 29, 2013

The Unofficial Atlantis Ilio FAQ

This is a collection of answers to questions around Atlantis Ilio you are likely to have when using the appliance in production.

How Can I Get Detailed Load Information?

If you want to find out what an Ilio appliance is really doing and the information displayed in Ilio Center is not detailed enough, issue the following on the command line:

dstat -D sdb -cdln --disk-util

This displays detailed load information every second until you cancel with CTRL+C. The output of dstat looks like this:

[Figure: Atlantis Ilio - dstat - system and disk load]

The topmost line (dark blue, hard to read) displays the categories: CPU usage, (data) disk reads and writes, system load averages (1, 5 and 15 minutes), network throughput and data disk utilization in percent.

Things to watch: CPU and data disk utilization are the most important counters. Neither should be near 100% for a prolonged period of time if you want good response times.

The 1, 5 and 15 minute system load averages give you an idea of what has been going on in the recent past. A value of 1.0 per CPU core is considered full load. The system load is also displayed in Ilio Center (see below).

What is the System Load Displayed in Ilio Center?

This value shows the standard Linux system load per core in percent. Everything above 70 is high, and values above 100 mean the system is over capacity.

I have witnessed values as high as 1200, meaning the system is 12 times over capacity. Such high values can be the result of multiple VMs performing full antivirus scans concurrently.
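
As a rough illustration of how such a percentage relates to the load averages shown by dstat, here is a minimal Python sketch. It assumes the value is simply the load average divided by the number of cores, times 100; how Ilio Center actually computes it is not documented here, and the function names are made up for this example.

```python
import os


def load_percent_per_core(load_average: float, cores: int) -> float:
    """Convert a Linux load average into percent of capacity per core."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores * 100


def classify(percent: float) -> str:
    """Apply the thresholds from the text: above 70 is high, above 100 is over capacity."""
    if percent > 100:
        return "over capacity"
    if percent > 70:
        return "high"
    return "ok"


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    one_minute, _, _ = os.getloadavg()
    percent = load_percent_per_core(one_minute, os.cpu_count() or 1)
    print(f"{percent:.0f} % per core: {classify(percent)}")
```

Under this assumption, a load average of 12 on a single core corresponds to the value of 1200 mentioned above.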

What kind of IO reduction can I expect?

In my experience with persistent Windows 7 x64 VMs on Ilio 3.2 the number of IOs is reduced by roughly 50%. That means that if your disk array is capable of handling 1,200 write IOPS (which would be typical for 8 15K drives in a stripe set), Ilio will give you 2,400 IOPS. That is a nice increase, but it is not nearly enough to make your VMs feel as if they were running from SSDs.
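
The arithmetic behind that figure can be written down in a few lines. This is only a sketch of the calculation, assuming the reduction applies uniformly to all IOs; the function name is invented for illustration.

```python
def effective_iops(backend_iops: float, io_reduction: float) -> float:
    """IOPS the VMs can issue when a fraction of them never reaches the disk array.

    io_reduction is the fraction of IOs that is absorbed (0.5 for 50%).
    """
    if not 0 <= io_reduction < 1:
        raise ValueError("io_reduction must be in the range [0, 1)")
    return backend_iops / (1 - io_reduction)


# The example from the text: 1,200 backend write IOPS, 50% reduction.
print(effective_iops(1200, 0.5))  # 2400.0
```

Note how quickly the result grows with the reduction rate: at 50% the gain is a factor of two, whereas a factor of ten would require a 90% reduction.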

How much RAM does an Ilio instance need?

Just like any other server, Ilio cannot have too much RAM. So better be safe than sorry and round up generously. Example: we configured Ilio instances with 800 GB of disk space and 64 GB of RAM. Each Ilio appliance is hosting 35 persistent VMs with a 100 GB virtual disk each. I have never seen an instance use more than 50 GB, but, hey, RAM is cheap and you do not want Ilio to run out of what it uses as cache, do you?
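
To put the numbers from this example into perspective, the following sketch computes how much virtual disk space is provisioned relative to the data disk, and how much RAM headroom remains. It assumes, pessimistically, that every virtual disk could fill up completely; the function names are hypothetical.

```python
def provisioned_gb(vm_count: int, disk_gb_per_vm: int) -> int:
    """Total virtual disk space handed out to the VMs."""
    return vm_count * disk_gb_per_vm


def overcommit_ratio(vm_count: int, disk_gb_per_vm: int, data_disk_gb: int) -> float:
    """How many times the data disk is overcommitted if all virtual disks were full."""
    return provisioned_gb(vm_count, disk_gb_per_vm) / data_disk_gb


# Figures from the example: 35 VMs, 100 GB each, 800 GB data disk.
print(provisioned_gb(35, 100))         # 3500
print(overcommit_ratio(35, 100, 800))  # 4.375

# RAM headroom: 64 GB configured, at most 50 GB observed in use.
print(64 - 50)  # 14
```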

Is There a Comfortable Way to Edit Configuration Files?

The Linux command line is not for everybody. If you prefer a GUI tool for working with Ilio’s file system, use WinSCP to connect remotely from Windows machines. To configure a connection in WinSCP just make sure to select the SCP protocol.

Data disk usage keeps increasing – how can I reclaim disk space?

Ilio works best when large parts of each VM’s data are identical – through deduplication the identical blocks can be reduced to next to nothing. However, in the real world users (and machines) tend to generate a fair amount of individual data that cannot be deduplicated. Such data is dangerous because it makes your overall disk usage grow quickly.

Luckily, at least part of the individual data is deleted after a while, e.g. when a user logs off or when a machine is rebooted. However, to free the deleted data from Ilio’s data store, an additional step is required: the contents of all deleted files on disk must be replaced with zeroes in order for Ilio to be able to reclaim the disk space. An easy way of doing that is to regularly run the Sysinternals sdelete tool, either as a startup script or a scheduled task. Run it with the following parameters:

sdelete -accepteula -s -z c:\

Note that sdelete may run for a long time and generate a considerable amount of disk activity, especially on the first run.
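
Why zeroing helps can be shown with a toy model of block-level deduplication. This is not how Ilio is implemented (the source does not describe its internals); the 4 KB block size and the hashing scheme are assumptions chosen purely for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def unique_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count the distinct fixed-size blocks a simple deduplicator would have to store."""
    digests = {
        hashlib.sha256(data[offset:offset + block_size]).digest()
        for offset in range(0, len(data), block_size)
    }
    return len(digests)


# 64 blocks of leftover content from deleted files: every block is different.
leftover = b"".join(
    number.to_bytes(4, "big") * (BLOCK_SIZE // 4) for number in range(1, 65)
)

# The same region after it has been overwritten with zeroes.
zeroed = bytes(64 * BLOCK_SIZE)

print(unique_blocks(leftover))  # 64
print(unique_blocks(zeroed))    # 1
```

The leftover data of deleted files still occupies one stored block per distinct block, while the zeroed region collapses into a single block.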

Appliance State is “degraded” in Ilio Center Because of a Disk Usage Warning for the OS Disk

Ilio Center issues a health warning for Ilio appliances where the used OS disk space is above 1800 MB (out of a total of 2408 MB). The growth in disk space is caused by log files, notably /var/log/milio.log, which I have seen grow to 400 MB. It would probably have become even larger without intervention.
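
If you want to check the used space yourself, a small script along the following lines could do it. It is a sketch only: it assumes the OS disk is mounted at / and that the warning compares used space against 1800 MB exactly as described above; how Ilio Center measures this internally is not known to me.

```python
import shutil

WARN_USED_MB = 1800  # warning threshold quoted in the text


def used_mb(path: str = "/") -> float:
    """Used space in MB on the file system that contains path."""
    usage = shutil.disk_usage(path)
    return usage.used / (1024 * 1024)


def exceeds_warning(used_megabytes: float, threshold_mb: float = WARN_USED_MB) -> bool:
    """True if the used space is above the warning threshold."""
    return used_megabytes > threshold_mb


if __name__ == "__main__":
    current = used_mb("/")  # assumption: the OS disk is mounted at /
    state = "above" if exceeds_warning(current) else "below"
    print(f"{current:.0f} MB used, {state} the {WARN_USED_MB} MB warning threshold")
```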

There are two things you can do:

  1. Delete the file
  2. Configure log rotation

Obviously the latter is preferable as it solves the problem once and for all. Atlantis saw this, too, as they have enabled log rotation in Ilio 4.1.

To configure log rotation create the file /etc/logrotate.d/milio.log with the following content:

/var/log/milio.log {
size 10M
rotate 10
compress
delaycompress
}

This instructs logrotate to rotate milio.log once it exceeds a size of 10 MB and to keep the last 10 rotated files in compressed form.

Logrotate is run hourly. Wait for an hour, then check whether milio.log has been rotated. Due to the delaycompress parameter a rotated file is compressed one iteration later, when the next file is rotated. This is useful when a program cannot be told to close its log file immediately. It is included here just to be on the safe side.
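
One way to check whether rotation has happened is to list the rotated copies next to the log file. The sketch below assumes logrotate's default naming, where rotated files get a numeric suffix (milio.log.1, milio.log.2.gz and so on); the helper name is made up for this example.

```python
import glob
import os
from typing import List, Tuple


def rotated_files(log_path: str) -> List[Tuple[str, int]]:
    """Return (path, size in bytes) for every rotated copy of log_path.

    Matches names such as milio.log.1 or milio.log.2.gz, but not the live log itself.
    """
    paths = sorted(glob.glob(glob.escape(log_path) + ".*"))
    return [(path, os.path.getsize(path)) for path in paths]


if __name__ == "__main__":
    for path, size in rotated_files("/var/log/milio.log"):
        print(f"{size / 1_000_000:8.1f} MB  {path}")
```

An empty result after more than an hour suggests that the log has not yet exceeded the configured size or that the configuration file is not being picked up.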

Is it Possible to Resize an Ilio Data Disk?

No, resizing disk-backed Ilio is not possible without losing all data. Make sure to size properly prior to putting Ilio in production. If you do need to resize, you have to move all VMs off the machine, which can be a lengthy process when using local storage without vMotion.

Cannot Connect to Ilio Center

Ilio Center may stop responding so that its web interface is not available any more. In that case restart the Tomcat service from the console:

/etc/init.d/tomcat6 restart
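
To verify from another machine that the web interface accepts connections again after the restart, a simple TCP check is often enough. This is a generic sketch, not an Atlantis tool; the host name and port in the usage note are placeholders you need to replace with your own values.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port can be established within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, is_reachable("iliocenter.example.com", 443) returns True once something is listening on that port. It only tests the TCP connection, not whether Ilio Center itself is healthy.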

How Do I Upgrade Ilio to a Newer Version?

This is explained in the admin guides available from Atlantis. For completeness' sake here are the major steps:

  1. Upgrade Ilio Center.
  2. From Ilio Center, upgrade the agents in the Ilio instances (can be done without downtime).
  3. From Ilio Center, upgrade the Ilio instances (requires a reboot -> downtime!).

One error I ran into was that an Ilio Center I had upgraded from 3.x to 4.0 and then to 4.1 failed to update the Ilio instances (step three from the list above). It did not give a specific error, either, only “upgrade failed”. To resolve the issue I deleted the Ilio Center VM and recreated it from the template. Upgrading the Ilio instances worked from that fresh Ilio Center.
