ddrumguy has asked for the wisdom of the Perl Monks concerning the following question:

I need to put together a script that reads through a logfile in which each line begins with a date entry, keeps only the last 30 days of entries, and archives the rest, all without losing any writes to the log, since it is in use at all times.

My date format is: Sat Aug 31 23:13:46 CDT 2002. Can someone give me some pointers on how to accomplish the above? Thanks, Bob

Replies are listed 'Best First'.
Re: cleaning up logs
by fruiture (Curate) on Oct 07, 2002 at 16:59 UTC

    You asked for pointers; here you are:

    • Replace the logfile with a new, empty one before processing, to avoid collisions with programs that write to the file.
    • Open two other files, one for the new entries and one for the old.
    • Use a regular expression for your date (if its format is reliably fixed) and the Time::Local module to convert it to epoch seconds and compare it with time(). Based on that difference you can decide whether an entry is new or old and write it to the file where it belongs.
    • Delete the file you were working on.
    • Lock the active logfile for a short moment, append its contents under the "new entries" in the "new entries" file, and again replace the active logfile with that file.

    You'll need open, close, flock, time, Time::Local, rename and probably File::Copy for that. HTH
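    A minimal sketch of the date test in the third step above, assuming the stamp format is reliably fixed. The filenames (log.recent, log.archive) and the stamp_epoch helper are invented here for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timelocal);

my %mon;
@mon{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 0 .. 11;

# Epoch seconds for a "Sat Aug 31 23:13:46 CDT 2002" stamp, or undef if
# the line doesn't start with one. (The zone name is not used; timelocal
# assumes the local zone.)
sub stamp_epoch {
    my ($line) = @_;
    return undef unless $line =~
        /^\w{3}\s+(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+\w+\s+(\d{4})/;
    return timelocal($5, $4, $3, $2, $mon{$1}, $6);
}

if (@ARGV) {
    my $cutoff = time() - 30 * 24 * 60 * 60;     # 30 days ago
    open my $new, '>',  'log.recent'  or die "log.recent: $!";
    open my $old, '>>', 'log.archive' or die "log.archive: $!";
    while (my $line = <>) {
        my $t = stamp_epoch($line);
        # anything unparsable is treated as recent, to be safe
        print { defined $t && $t < $cutoff ? $old : $new } $line;
    }
}
```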

    --
    http://fruiture.de
Re: cleaning up logs
by gnu@perl (Pilgrim) on Oct 07, 2002 at 17:31 UTC
    You can edit the file in place using this:
    syslog before the following command:

        Oct  7 08:30:01 deneb sendmail[13183]: [ID 801593 mail.info] g97CU1r13183: from=cmjohnso, size=252, class=0, nrcpts=1, msgid=<200210071230.g97CU1r13183@deneb.uslec.net>, relay=cmjohnso@localhost
        Oct  7 08:30:01 deneb sendmail[13185]: [ID 801593 mail.info] g97CU1r13183: to=cmjohnso, ctladdr=cmjohnso (10052/1), delay=00:00:00, xdelay=00:00:00, mailer=local, pri=120252, relay=local, dsn=2.0.0, stat=Sent

        % perl -i -p del.pl /var/log/syslog

    syslog after the above command:

        cmjohnso@deneb$ cat syslog
        Oct
        Oct
    Here is a code example for del.pl:
        #!/usr/bin/perl -w
        use strict;
        my @line = split(" ", $_);
        print "$line[0]\n";
        $_ = '';
    Remember, inside the del.pl program you are only working on one line at a time: the 'perl -i -p <progname> <file>' invocation reads each line of the file, in this case /var/log/syslog, and applies the code in the program to it, writing back whatever you print explicitly as well as the final content of $_. That is why I set $_ to '', so the line was not written back to my file. It would have been just as easy to set $_ to what I wanted printed to the file and not do the print myself.

    You could write a program somewhat similar to this that evaluates each line: if the line needs to be deleted from the file, just set $_ to ''; otherwise leave $_ alone and it will go back to your final file as it was when it entered the loop.
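    A sketch of such a filter, written as a standalone script so the in-place editing is explicit ($^I is the variable behind the -i switch). The date-format regex, script structure, and the keep_line helper are assumptions for illustration, not a drop-in solution:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timelocal);

my %mon;
@mon{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 0 .. 11;
my $cutoff = time() - 30 * 24 * 60 * 60;     # 30 days ago

# True if the line is newer than the cutoff, or has no parsable stamp.
sub keep_line {
    my ($line) = @_;
    if ($line =~ /^\w{3}\s+(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+\w+\s+(\d{4})/) {
        return timelocal($5, $4, $3, $2, $mon{$1}, $6) >= $cutoff;
    }
    return 1;
}

if (@ARGV) {
    $^I = '.bak';                    # in-place edit, like perl -i.bak
    while (<>) {
        print if keep_line($_);      # old lines are simply dropped
    }
}
```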

    Here is the text from the "Perl Black Book" by Coriolis:

    "Perl lets you make changes to files in-place--that is, make changes to the file directly, without having to explicitly read it in and write it out. To edit files in-place, you use the -i switch with Perl; this switch specifies that files processed with the <> construct are to be edited in-place. ... Note also the -p switch; this switch makes Perl use a while (<>) and print loop around your script to print the changed text back to the file."

Re: cleaning up logs
by BrowserUk (Patriarch) on Oct 07, 2002 at 18:55 UTC

    Effectively, this isn't possible without either stopping (or pausing briefly) the program doing the logging, or modifying it so that it creates a new file if you rename the one it's using - assuming that is possible under your OS.

    What you are asking for is to be able to delete lines from the front of the file whilst allowing them to be written to the end. Whilst I vaguely recall seeing this facility on a mainframe OS once, if you're running under Win32 or *nix, I am not aware of any filesystem that allows this type of operation. I'll no doubt be corrected if I got this wrong.

    Even if your OS/FS allows you to rename the file out from under the program, which is unlikely unless the program opens/writes/closes the log file each time, you would still end up with only the new lines in the file rather than 30 days worth + new.

    Although the perl -i switch mentioned above saves you from explicitly having to open the file, it is still opened. In fact, what actually happens is that the file is renamed and then a new file is created with the original name; the renamed file is then read, and any lines you choose to print are written to the new one. This doesn't work if the file is already open.

    It is possible, under Win32 for sure and almost certainly under *nix, to take a copy of an opened file. You could then process this file by archiving the old lines to one file and putting the last 30 days' worth in yet another. Whilst you are doing this, the program doing the logging would continue to append to the original logfile. You then have the problem of copying any new lines from the original file to the end of your newly created 30-day file, and then persuading the first program to start using the new one, which is just the same problem again.

    The usual method of doing this kind of thing is to have the original program alternate between two log files every day, or every week, and then your archiving program would process the currently static file whilst the other is being written. Of course, if the original program doesn't already use this technique, it would require modification. If that is possible, it is your only option that I can think of.
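    A sketch of how the logger side of that alternating scheme might pick its file, assuming a weekly rotation (the mylog.0/mylog.1 names are invented for illustration):

```perl
use strict;
use warnings;

# Alternate between two filenames, mylog.0 and mylog.1, by week of
# year: last week's file is always static and safe to archive.
sub active_logfile {
    my $week = int((localtime)[7] / 7);   # day-of-year -> rough week number
    return 'mylog.' . ($week % 2);
}
```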


    Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
      Thanks for pointing this out. I didn't think to add the fact that the file is still opened and any data that goes to it after that point will not go through the filter.

      The assumption I made was that the file was written to in chronological order, so anything that came in while the script was running should be under the 30-day limit mentioned in the original post. This would make the point that the filter doesn't see this new data moot.

        My point was that using Perl -i on an open file won't work as it will try to rename that file and create a new one with the old name. The rename will fail.


        Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!