in reply to log_rotate.pl

Depending on the software writing to the logs, just renaming the files isn't enough. A friend had an apache web server running, rotated the log file, then deleted it... forgetting to restart apache.

Four months later the disk filled up. He couldn't find any large files, just large directories. Because apache had never been restarted, it kept writing to its open file handle, so the deleted log went on taking up disk space that no normal tool could see or reach.

Restarting apache freed a few gigs.
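
What happened to the deleted log can be reproduced in a few lines. The sketch below is only an illustration, assuming a POSIX-style filesystem; the file name `access_log` and the helper `deleted_but_open_demo` are made up for the example, and the open Python file object stands in for the server's log handle.

```python
import os
import tempfile


def deleted_but_open_demo(chunk=b"x" * 4096, writes=256):
    """Keep writing to a log after it has been unlinked.

    Returns (link_count, size_in_bytes, still_listed).
    """
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "access_log")  # hypothetical log name
    log = open(path, "ab")   # stands in for the server's open log handle
    os.unlink(path)          # the file is deleted, but the writer is never told
    for _ in range(writes):
        log.write(chunk)     # writes still succeed through the open handle
    log.flush()
    info = os.fstat(log.fileno())         # ask the handle, not the path
    still_listed = os.path.exists(path)   # the directory no longer lists it
    log.close()              # only now can the filesystem release the space
    os.rmdir(workdir)
    return info.st_nlink, info.st_size, still_listed


nlink, size, listed = deleted_but_open_demo()
print(nlink, size, listed)
```

The size keeps growing even though the name is gone, which is why looking for large files turns up nothing. On systems that have it, `lsof +L1` should list open files whose link count has dropped to zero.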

Re^2: log_rotate.pl
by jhourcle (Prior) on Apr 28, 2005 at 03:52 UTC

    Rotating log files is like making backup tapes -- it's only useful if you verify that it was successful.

    For instance, I looked over MacOSX's log rotation in periodic.weekly, saw that it kept 5 backups for the webserver, and thought that'd be fine for a new webserver. (I had been administering iPlanet on Solaris up until that point, and was pleasantly surprised that log rotation had already been configured.) So, a couple of weeks later, when my boss asked for some metrics, I was rather surprised to find that I only had one rolled file. It turned out that periodic.daily was kind enough to remove any files that hadn't been modified in 7 days.

    I know -- you're wondering how that's similar to backups. Well, in another situation, at a previous job, we were running SIMS (Sun Internet Mail System) on Solaris 7. We had weekly fulls and nightly incrementals of all of our mail stores ... or so we thought. It seems that to make sure SIMS didn't hit Solaris 6's 2GB max file size, it stopped writing backups at 2GB, and didn't give any warning that it had done so. So of course, when we finally needed to restore data for a user, we found that the fulls weren't anywhere near complete. (The user was an assistant dean of one of the colleges, whom the help desk had told to set his 'save mail for ___ days' setting to 0 if he wanted to keep his mail forever, which of course it didn't do.)

    I guess the real moral to the story is -- if something's important, it's worth taking the time to double check things. Automation may make your life easier, but it also has the potential to make things a whole lot worse if you're not careful.

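    Oh -- and in the case of apache, you don't typically need a full restart. After renaming the logs, you can use 'apachectl graceful': it signals the parent, the children finish their current requests, and the replacement children open fresh log files, so you can roll your logs without dropping connections.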
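
The rename, graceful reload, and verify sequence described above might be scripted roughly as follows. This is a sketch under assumptions, not a drop-in tool: `rotate_and_reload` and its parameters are hypothetical names, and the demo at the bottom uses a temporary file and a no-op command in place of a real log and `apachectl graceful` so that it runs without a web server.

```python
import os
import subprocess
import sys
import tempfile
import time


def rotate_and_reload(log_path, reload_cmd, settle_seconds=0):
    """Rename the live log, ask the server to reopen its logs, then verify.

    Returns (rotated_path, rotated_size_in_bytes).
    """
    rotated = "{}.{}".format(log_path, time.strftime("%Y%m%d%H%M%S"))
    os.rename(log_path, rotated)   # the writer keeps its handle on the renamed file
    subprocess.run(list(reload_cmd), check=True)  # raises if the reload command fails
    time.sleep(settle_seconds)     # old children may still be finishing requests
    if not os.path.isfile(rotated):
        raise RuntimeError("rotated log is missing: " + rotated)
    return rotated, os.path.getsize(rotated)


# Demo with stand-ins (assumptions): a temp file plays the live log and a
# no-op Python process plays the reload command.
workdir = tempfile.mkdtemp()
live_log = os.path.join(workdir, "access_log")
with open(live_log, "w") as fh:
    fh.write("GET / 200\n")
noop_reload = [sys.executable, "-c", "pass"]
rotated_path, rotated_size = rotate_and_reload(live_log, noop_reload)
print(rotated_path, rotated_size)
```

Against a real server the call would pass `["apachectl", "graceful"]` as `reload_cmd` and a non-zero `settle_seconds`, since children that are still serving old requests can keep writing to the renamed file for a while. Checking the size and presence of the rotated file afterwards is the "double check" step; compressing or deleting it is best left until after that wait.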