Thursday, July 26, 2007

Linux Server Performance Tuning

For the past two years, I've had two Dell PowerEdge 800 machines delivering unacceptably sub-standard performance. They were used primarily as cross-compile machines for our embedded Linux and RTOS systems.

For the record, they were bought in mid-2005 and were running:
  • 3.2GHz with Hyper-Threading
  • 1GB DDR SDRAM
  • 160GB SATA-1 hard disk
  • Broadcom Gigabit Ethernet
Although 1GB of RAM is no big deal by today's standards, it was quite impressive back in those days.

Anyway, to cut a long story short, I had intended to install RedHat 9 on these machines since this was what the cross compiler suite was developed on, and I knew better than to rock the boat.

Problem #1: The RedHat 9 install CD did not support SATA disks or Gigabit cards (there were no such things when RH9 was released). Simple, you'd say: install onto an IDE drive with a compatible Fast Ethernet card, recompile the kernel, remove the compatible hardware, and get back to business! This leads us to ...

Problem #2: Our MIS did not want to crack open the case; they controlled the inventory and didn't want us pesky R&D personnel messing around with the stuff, lest I make off with the RAM or hard disk or something.

So I had to install using the latest Fedora install CD at the time, which I think was Fedora Core 3. I realized that I did not need to install Fedora Core 3 fully: I booted the CD in rescue mode so that the SATA and Gigabit Ethernet were up and running, created and formatted the filesystem on the hard disk, and then copied, over NFS, the entire root filesystem of an existing RedHat 9 installation I had done earlier on an old IBM machine with an IDE drive and a 'normal' Ethernet card. In the process, I had to update e2fsprogs to handle the newer filesystem format created by Fedora Core 3.
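For the curious, the rescue-mode transplant went roughly like this; the device names and NFS paths are illustrative, not the exact ones I used:

  # booted the Fedora Core 3 CD with 'linux rescue' so SATA + Gigabit worked
  fdisk /dev/sda                       # create root and swap partitions
  mke2fs -j /dev/sda1                  # format the root partition (ext3)
  mkswap /dev/sda2
  mount /dev/sda1 /mnt/newroot
  mount -t nfs oldbox:/ /mnt/oldroot   # RH9 root on the old IBM machine, exported over NFS
  cp -a /mnt/oldroot/. /mnt/newroot/   # copy the entire root filesystem
  # then fix /etc/fstab, reinstall the bootloader, and update e2fsprogs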

The system seemed to run OK, the kernel moving from 2.4.26 to 2.4.33. In our compilation work, we had to deal with large source trees, including a full Linux kernel source tree, tens of applications big and small, and uClibc. To handle this behemoth of a source tree (67,806 files), we used Subversion as the version control system instead of CVS.

We hosted the Subversion repositories on a central server, an older 2.4GHz P4 which used to be the only compilation system until we got the PowerEdge 800s.

Running svn update took an average of 5 minutes. Looking back now, I cannot imagine how we accepted and lived with that kind of performance. But for tens of thousands of files, we thought it was acceptable. Little did I realize how much injustice I was doing to Linux!

Among the many things I did to try to improve performance:
  • Installed a Gigabit Ethernet card on the SVN server.
  • Connected all three machines with a Gigabit switch.
  • Updated the kernel to 2.4.33 (hoping for better disk and network drivers).
  • Installed the latest version of Subversion.
  • Installed the latest versions of SAMBA and SSH, disabling DNS reverse lookups (see the snippet after this list).
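For reference, the SSH tweak is a stock OpenSSH option; shown here as a reminder rather than our exact config:

  # /etc/ssh/sshd_config
  UseDNS no              # skip reverse DNS lookups on incoming connections
  # then restart the daemon: service sshd restart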
The one nagging problem that remained was the very slow performance when running svn update on the source tree. And that was not all: the whole system slowed to a crawl whenever somebody was doing a compilation. Furthermore, accessing shared drives via SAMBA would hang Windows Explorer, or whatever application was accessing the SAMBA files (text editors, source analyzers), for half a minute. Hardly the behavior of a state-of-the-art Linux server whose 3.2GHz CPU is still considered pretty fast even today.

At the end of last year, we bought a new server to take over the role of source code server, in anticipation of the increasing space and computational demands of our firmware development efforts. I installed Fedora Core 6 on the machine and used LVM to manage the filesystems on the hard disks. (Yes, two hard disks!) I did all the needful: copied RH9 into one of the partitions, chrooted into it, and used it to compile firmware. On this machine, performance was surprisingly fast, even though it had a slower CPU (a 2.8GHz Pentium D). In particular, 'svn update' (which became my definitive benchmark) mostly took around 1 second to complete on the same source tree (67k files).
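The chroot setup was along these lines; the LVM device and paths here are illustrative, not our exact layout:

  mount /dev/VolGroup00/rh9root /mnt/rh9    # LVM volume holding the RH9 copy
  mount -t proc none /mnt/rh9/proc          # the toolchain expects /proc
  chroot /mnt/rh9 /bin/bash                 # now effectively "inside" RedHat 9
  # from here, check out and compile firmware as usual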

Running 'strace svn update' showed that the system was hanging on random file accesses: svn was creating, locking, and removing thousands of lock files while performing the update. I copied the FC6 filesystem to one of the PowerEdge 800 machines and booted off that, thinking that perhaps the older kernel was lacking a good SATA driver (2.4.33 had libata 1.2, while 2.6.18 had libata 2.0). To my dismay, nothing changed: same OS, same kernel, different performance. svn update was averaging 40 seconds to complete.
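If you want to reproduce this kind of diagnosis, a timed strace makes the stalls obvious. The flags below are standard strace options; the trace file name is made up:

  strace -tt -T -o svn.trace svn update    # -tt timestamps, -T time spent per syscall
  # crude way to surface the slowest syscalls (the <seconds> field that -T appends):
  awk -F'<' '{print $NF, $0}' svn.trace | sort -rn | head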

Analysis ruled out the effects of differences in CPU speed and RAM size. I discovered that the new machine had SATA-2 while the 'older' machines only had SATA-1, but that difference could not explain the huge discrepancy between the two systems. I suspected two things: (a) the on-board SATA controller was under-performing, or (b) the hard disk was under-performing. Running badblocks on the hard disk showed nothing.
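The surface check was nothing fancy, just a read-only scan; the device name is illustrative:

  badblocks -sv /dev/sda    # -s show progress, -v verbose; read-only by default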

I got another hard disk installed in the older machine (same controller) today; partitioned, formatted, mounted, and checked out the same source tree (the exact commands are sketched after the list). The results:
  1. First svn update: 1m (this is fine, since the partition was freshly mounted)
  2. 2nd svn update: 1s !!!
  3. 3rd svn update: 0.7s !!!
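The test itself was mundane; the device, mount point, and repository URL below are illustrative:

  fdisk /dev/sdb                    # create one big partition (interactive)
  mke2fs -j /dev/sdb1
  mount /dev/sdb1 /mnt/test
  svn checkout svn://svnserver/firmware/trunk /mnt/test/src
  cd /mnt/test/src
  time svn update                   # repeat a few times; the first run warms the caches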

So (a) the controller was alright, which meant (b) had to be at least partly true. Then I remembered that I had partitioned the older machines way back in 2005, and the new one in 2006/2007. Surely something had changed in ext2/ext3, or in the mke2fs defaults, in between. Running dumpe2fs indeed showed an obvious difference, one of those head-slapping moments: the journal size on the old machine was 32MB (on a 150GB partition), while the newer machine had it at 128MB.
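If you want to check your own partitions, dumpe2fs makes the comparison easy. The device name is illustrative, and depending on your e2fsprogs version the journal size is either printed directly or can be derived from the journal inode:

  dumpe2fs -h /dev/sda3 | grep -i journal    # run on both machines and compare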

I reclaimed the original swap partition on the original hard disk, reformatted it with the latest mke2fs (-j option), and re-tested. Speedy Gonzales! Mounting the partition as ext2 also showed no speed issues. I re-created the journal on the compile partition (the 150GB one) with 'tune2fs -O ^has_journal' followed by 'tune2fs -j -J size=128', and voila: excellent performance. While SATA-1's 1.5Gbps is not as fast as SATA-2's 3Gbps, it is good enough for me.
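Pieced together, the journal surgery looked roughly like this; it has to be done on an unmounted filesystem, and the device and mount point here are illustrative:

  umount /compile
  tune2fs -O ^has_journal /dev/sda3    # strip the old 32MB journal
  e2fsck -f /dev/sda3                  # good hygiene before re-adding a journal
  tune2fs -j -J size=128 /dev/sda3     # recreate it at 128MB
  mount /dev/sda3 /compile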

During lunch the same day, I enlarged the journal on the other old machine (still running 2.4.33) and got the same performance boost. My colleagues came by to say that the system felt faster and more responsive.

And I agree!


I later learnt that setting up and using NFS was still not as spiffy as using rsync with ssh. With rsync, you just need to install and run an SSH server; there are no complex configuration files to mess around with, because the SSH configuration works well right out of the box.
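A typical rsync-over-ssh invocation, for anyone who hasn't tried it; the host and paths are made up:

  rsync -avz -e ssh /home/build/src/ builduser@server:/home/build/src/
  # -a archive mode (permissions, times, symlinks), -v verbose, -z compress,
  # -e ssh to tunnel the transfer over SSH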

1 comment:

Anonymous said...

This is some of the most hideous frankenstein amateur sys-admin work I've ever heard of. Congratulations for proving what every professional sysadmin knows: Never give the developers HW or root level access.