OpenVZ forced umount of lustre mount problem

Recently we found an answer to a worrying Lustre problem that had been bugging us for some time. Every now and then, on servers running OpenVZ containers that use the Lustre filesystem, we would see a log entry in /var/log/messages saying:

kernel: Lustre: setting import lustre-server-MDT0000_UUID INACTIVE by administrator request

followed by a number of broken mounts and filesystem errors inside containers running on the server where the log entry appeared. In effect, all the containers making extensive use of the same Lustre server would stop working properly (for example, Apache serving sites from Lustre mounts would keep spawning processes, all of them trying and failing to read data from the mounts).
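A quick way to check whether a host has hit this is to scan its log for the message quoted above. The following is only a minimal sketch: the function name find_inactive_imports is made up for illustration, and the pattern assumes the message always has exactly the wording shown in the log entry above.

```python
import re

# Pattern built from the log entry quoted above; the exact wording is assumed
# to be stable across occurrences.
INACTIVE_RE = re.compile(
    r"Lustre: setting import (\S+) INACTIVE by administrator request"
)


def find_inactive_imports(lines):
    """Return the import names from every matching log line, in order."""
    hits = []
    for line in lines:
        match = INACTIVE_RE.search(line)
        if match:
            hits.append(match.group(1))
    return hits
```

To use it on a real host, pass it an open log file, for example find_inactive_imports(open("/var/log/messages", errors="replace")). An empty list means no such entry was logged in that file.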

Adventures with Lustre

For the last few months we’ve been busy integrating, testing and tuning Lustre for use on our hosting platform. I thought I’d share some notes…

Preparing Procurve Switches for Production

.htaccess revisited

About a year ago Dawid posted about the performance of .htaccess files. We decided to revisit these tests to compare the performance of .htaccess files on local disk and network filesystems. The network filesystem we used was Lustre, chosen partly because we are doing a lot of testing with Lustre at the moment but also because of its known issues with metadata and small file performance.

Getting to know Lustre

Process list for OpenVZ containers (vztop)

The standard Linux task list shows you each process and the resources (e.g. CPU, memory) that it is consuming. However, if you run top on an OpenVZ host server, it doesn’t show you the container ID of each process.
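One way to recover the container ID for a process is to read it from that process's status file on the host. The sketch below is not the vztop tool itself. It assumes the OpenVZ kernel exposes an envID field in /proc/<pid>/status, which the excerpt above does not state, and the helper name container_id_from_status is made up for illustration.

```python
def container_id_from_status(status_text):
    """Return the container ID found in the text of a /proc/<pid>/status file.

    Assumes an OpenVZ kernel that adds an "envID" field to the file.
    Returns None when the field is absent, e.g. on a stock kernel.
    """
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "envID":
            return int(value.strip())
    return None
```

On the host, the text would come from open("/proc/%d/status" % pid).read() for each PID of interest. Keeping the parsing separate from the file access makes it easy to try on a saved sample first.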

Choosing the right switch for storage

Memtest Over Network

One rainy day we had a bad feeling that one of our rackable servers had corrupted memory. The server had some intermittent stability issues and hung from time to time. This prompted us to test the memory on all rackable servers in the rackable rack, especially as they were soon to become part of our hosting infrastructure.

Resolving bind zone transfer issues

Today, a fax machine at the office started complaining that it couldn’t send emails. No useful error messages or anything…

