erez_ez has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I would like to write a script that, whenever I run it, goes over a text file and deletes the lines that are more than 48 hours old. The text file format is: name1/name2/name3/12-08-2007 10:14:30\n name4/name5/name6/11-08-2007 18:22:19\n etc... The date and time shown in each line refer to the time the line was created. I need your help, thank you.

Replies are listed 'Best First'.
Re: Delete files by clock
by graff (Chancellor) on Aug 16, 2007 at 07:40 UTC
    Have you written any perl scripts yet? Have you tried to write any code for this task (even pseudo-code)?

    The basic plan for what you want to do would probably be easiest if you use the "strftime()" function provided by the POSIX module. (This module is part of the "core" distribution for Perl -- every Perl installation has it.)

    Compute the number of seconds in 48 hours, subtract that value from the current "seconds since the epoch" returned by the "time()" function, and use "strftime()" to convert the result into a date/time string of the form you need:

    use POSIX qw(strftime);
    $two_days_ago = time() - 2 * 24 * 60 * 60;
    $date_string = strftime("%m-%d-%Y %H:%M:%S", localtime($two_days_ago));
    Once you have that, just open the current file for input, open a new file for output, then iterate reading a line at a time from the input, but don't write to the output until you see a line that contains a date equal to or greater than $date_string.

    (update: well, that "equal-to-or-greater-than" part is tricky, given that the string is "MO-DY-YEAR HR:MI:SC" -- you'll probably want to convert that to "YEAR-MO-DY HR:MI:SC" (s/(\d{2})-(\d{2})-(\d{4})/$3-$1-$2/;) so that you can do a simple "ge" or "le" string comparison. There are also a few different Date::* modules that you might find useful.)

    All remaining input lines get written to the output, you close the files, and rename the new one to whatever the old one was called (that deletes the old one).

    (updated to add a couple more links to documentation and a snippet to change the date format of the file data.)

    ... Sorry about all these updates... If the lines in the data file are not in chronological order (your two lines of sample data seem to be out of order), you'd need to check the date on each line from start to finish, to decide whether to write it to the output.
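
    The plan above could be sketched roughly as follows. This is only a sketch under assumptions: the timestamps are taken to be "MO-DY-YEAR HR:MI:SC" at the end of each line, the helper names sortable_stamp and keep_recent are made up for illustration, and lines without a recognizable timestamp are kept. Every line is checked, so the data does not need to be in chronological order.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Rearrange "MO-DY-YEAR HR:MI:SC" into "YEAR-MO-DY HR:MI:SC" so that a
# plain string comparison agrees with chronological order.
# (Hypothetical helper; the field order is an assumption.)
sub sortable_stamp {
    my ($stamp) = @_;
    $stamp =~ s/(\d{2})-(\d{2})-(\d{4})/$3-$1-$2/;
    return $stamp;
}

# Return the lines whose trailing timestamp is at or after $cutoff, where
# $cutoff is a "YEAR-MO-DY HR:MI:SC" string. Lines with no timestamp are kept.
sub keep_recent {
    my ($cutoff, @lines) = @_;
    return grep {
        my ($stamp) = m{/(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\s*$};
        !defined($stamp) || sortable_stamp($stamp) ge $cutoff;
    } @lines;
}

# Driver: rewrite the file named on the command line, if one was given.
if (@ARGV) {
    my $file   = shift @ARGV;
    my $cutoff = strftime("%Y-%m-%d %H:%M:%S",
                          localtime(time() - 2 * 24 * 60 * 60));

    open my $in,  '<', $file       or die "Cannot read $file: $!";
    open my $out, '>', "$file.new" or die "Cannot write $file.new: $!";
    print {$out} keep_recent($cutoff, <$in>);
    close $in;
    close $out or die "Cannot close $file.new: $!";

    # Renaming the new file over the old name replaces the old file.
    rename "$file.new", $file or die "Cannot replace $file: $!";
}
```

    Because the comparison is done on year-first strings, a line written on the 30th and checked on the 1st of the next month is handled the same way as any other line.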

      First of all, thank you for your quick response. Second, yes, I have written several scripts before. I just had a problem with dates close to the end of the month, such as when you write a line to the text file on the 30th of a month and then try to delete it on the 1st of the next month. The same problem comes up at the end of a day, month or year. Anyway, I'll try using your solution. Thanks again.
Re: Delete files by clock
by andreas1234567 (Vicar) on Aug 16, 2007 at 08:18 UTC
    If you allow me to redefine your problem:

    Use Log::Log4perl with Log::Dispatch::FileRotate and make sure you rotate your files every 48 hours. That way you can be sure all lines in a file are written within the wanted 48 hour timespan.

    --
    Andreas
Re: Delete files by clock
by jbert (Priest) on Aug 16, 2007 at 10:24 UTC
    The approach I'd use would be to use a regex or a module (e.g. Date::Parse) to parse your date-time string into year/month/day/hour/min/sec pieces, and then use POSIX::mktime to convert these into a Unix time_t value (the number of seconds since the 'epoch', 1 Jan 1970).

    You then only keep a line if its timestamp is more recent than time() - $NUM_OF_SECOND_IN_TWO_DAYS.

    Doing it this way avoids all end-of-month/-year etc. problems.
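
    A minimal sketch of that idea, using a regex and POSIX::mktime. The field order "MO-DY-YEAR HR:MI:SC" is an assumption (swap $mon and $day if the file stores the day first), the timestamps are assumed to be in local time, and stamp_to_epoch and is_recent are names invented here for illustration. Lines without a parseable timestamp are treated as not recent.

```perl
use strict;
use warnings;
use POSIX qw(mktime);

my $SECONDS_IN_TWO_DAYS = 2 * 24 * 60 * 60;

# Convert the first "MO-DY-YEAR HR:MI:SC" found in the string to epoch
# seconds (local time). Returns undef if nothing matches.
sub stamp_to_epoch {
    my ($stamp) = @_;
    my ($mon, $day, $year, $hour, $min, $sec) =
        $stamp =~ /(\d{2})-(\d{2})-(\d{4}) (\d{2}):(\d{2}):(\d{2})/
        or return;
    # mktime expects a 0-based month and years since 1900;
    # the final -1 asks it to work out daylight saving time itself.
    return mktime($sec, $min, $hour, $day, $mon - 1, $year - 1900, 0, 0, -1);
}

# True if the line carries a timestamp less than two days older than $now.
sub is_recent {
    my ($line, $now) = @_;
    my $epoch = stamp_to_epoch($line);
    return defined($epoch) && $epoch > $now - $SECONDS_IN_TWO_DAYS;
}
```

    With these in place, filtering a file comes down to something like print {$out} grep { is_recent($_, time()) } <$in>; since the comparison is done on plain seconds, month and year boundaries need no special handling.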