Recently we found the answer to a rather worrying Lustre problem that had been bugging us for some time. Every now and then, on servers running OpenVZ containers that use a Lustre filesystem, we would see an entry in /var/log/messages saying:
kernel: Lustre: setting import lustre-server-MDT0000_UUID INACTIVE by administrator request
followed by a number of broken mounts and filesystem errors inside the containers running on the server where the log entry appeared. In effect, all the containers making extensive use of the same Lustre server would stop working properly (for example, Apache serving sites from Lustre mounts would keep spawning processes, all of which would unsuccessfully try to read data from the mounts).
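Since the log entry is the first visible sign of trouble, it can be worth watching for it. The following is only a minimal sketch of how that might be done in Python: the function name and the regular expression are our own illustrative assumptions, built from the single log line quoted above, and the message format may differ between Lustre versions.

```python
import re

# Pattern derived from the log line quoted above (an assumption: other
# Lustre versions may word the message differently). The import target,
# e.g. lustre-server-MDT0000_UUID, is captured as "target".
INACTIVE_RE = re.compile(
    r"Lustre: setting import (?P<target>\S+) INACTIVE by administrator request"
)


def find_inactive_imports(lines):
    """Return the import targets reported as INACTIVE, in order of appearance.

    `lines` is any iterable of log lines (a list, an open file, ...).
    This is a hypothetical helper, not part of Lustre or OpenVZ.
    """
    targets = []
    for line in lines:
        match = INACTIVE_RE.search(line)
        if match:
            targets.append(match.group("target"))
    return targets
```

Passing it the lines of /var/log/messages (for example an open file object) would list the affected import targets, which could then be fed into whatever alerting is already in place.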