Last Tuesday my colo provider knocked the power cord out of my machine there whilst doing some power work. Since then, although I didn't make the connection until just now, my nightly afs backups have been taking much, much longer than usual. Since I was also running out of space on the disk at home that holds one copy of the dumps, I just turned off backups until I could look at it tonight.
While investigating tonight, I noticed that the BosConfig file was gibberish, and by looking in backups I could tell it got messed up the day my machine fell over. I had also noticed a bunch of "trans X on volume Y is older than Z seconds" messages in the VolserLog, particularly for the volumes that change the most each day. A tiny voice in my head whispered, "I bet I have a bunch of volumes that need salvaging." Fortunately, at home a complete salvage of a fileserver takes less than a minute (unlike at work, where it takes a couple of hours at minimum), so I ran one. It's only anecdotal evidence, but afterwards making the clone volume to back up my web page volume took a split second, instead of the several seconds it took when I tried it earlier this evening.
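To see which volumes were piling up those VolserLog warnings, something like the following would do. It is only a rough sketch: the regular expression assumes the message looks exactly like the wording quoted above with numeric IDs, and `count_stale_transactions` is a name made up for this example, not part of any afs tooling.

```python
import re
import sys
from collections import Counter

# Assumed message shape, taken from the wording quoted in the post:
#   trans X on volume Y is older than Z seconds
STALE_TRANS = re.compile(r"trans (\d+) on volume (\d+) is older than (\d+) seconds")


def count_stale_transactions(lines):
    """Return a Counter mapping volume ID (as a string) to warning count."""
    counts = Counter()
    for line in lines:
        match = STALE_TRANS.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    # The log path is passed on the command line; its location varies by install.
    with open(sys.argv[1], errors="replace") as log:
        for volume, hits in count_stale_transactions(log).most_common():
            print(volume, hits)
```

Run it with the path to the VolserLog as the only argument; the volumes with the most warnings come out first.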
So, in closing:
- Backups are good, okay. Thanks to my rsync backup system for stuff outside of afs, I could pinpoint the day BosConfig on service-5 changed, and easily restore it.
- Since salvages at home take no time, I should really turn off fast-restart there. I'm hoping this is a runtime flag rather than a compile-only option (I know you have to build with a flag to turn fast-restart on; I just hope that also gives you a runtime flag to turn it off).
- I should really try the demand-attach and demand-salvage stuff in 1.5.
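On the first point, pinpointing the day a file changed across dated backups can be sketched roughly as below. This is not my actual backup setup: it assumes one directory per day under a common root, named so that sorting by name gives chronological order (for example `2008-01-14`), and `first_changed_snapshot` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def first_changed_snapshot(snapshot_root, relative_path):
    """Return the name of the first snapshot whose copy of relative_path
    differs from the previous snapshot's copy, or None if it never changes.

    Assumes snapshot directory names sort chronologically.
    """
    previous = None
    snapshots = sorted(p for p in Path(snapshot_root).iterdir() if p.is_dir())
    for snapshot in snapshots:
        candidate = snapshot / relative_path
        if not candidate.is_file():
            continue  # file missing from this snapshot; skip it
        digest = file_digest(candidate)
        if previous is not None and digest != previous:
            return snapshot.name
        previous = digest
    return None
```

Restoring is then just a matter of copying the file back from the snapshot before the one this returns.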