MonkPaul has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I am currently working on a web project where a file is created on a Linux server when the user has entered their results.
If you're familiar with the biology BLAST tool, it works in the same format.

What I want to do now, as with BLAST, is to delete the files after one day, so the files don't eventually swamp the server.
I have read the nodes about deleting files and so now have:

#!/usr/bin/perl -w
use strict;
use warnings;

my $dir = "/home/march05/msc0516/public_html/Blast/updated/";
chdir $dir;
opendir(DIR, "$dir") or die "Could not open $dir:$!";
while (my $file = readdir(DIR)) {
    my $age = -M $file;
    unlink $file if int($age) > 1;
}
closedir(DIR);

But as I have read, the -M operator only gives the time since the file was last modified. Can I therefore look for the creation date (understanding that there is no real creation-time record on a Unix server), or am I still OK using this option?

Secondly: I want this deleteFile.pl script to delete the files automatically after one day, so I don't want to have to invoke it myself. Is it possible to call the script each time the first web page is loaded, or to use cron to call it, say, every day? I'm not really sure how that works.

cheers,
MonkPaul

Replies are listed 'Best First'.
Re: Automatically delete files
by holli (Abbot) on Aug 02, 2005 at 14:41 UTC
    1. I think the "last modified" stamp is good enough, because nothing is going to change the files once they are created.
    2. You will have to edit /etc/crontab and insert your script there. Alternatively, you can just let your script run in a loop and let it sleep a day:
    while (1) {
        # do stuff
        sleep(24*60*60);
    }


    holli, /regexed monk/

      Alternatively you can just let your script run in a loop and let it sleep a day.

      The only issue with this is if the server is rebooted or that process dies for whatever reason: you will have to restart it manually, when and if you notice that it is no longer running.

      If you do this (loop with a long sleep), don't forget that you will have to update $^T in order for -M to return relevant values:

      # update basetime:
      $^T = time();

      or else you won't discover any new expired files after the initial run.
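      To make that mechanism concrete, here is a minimal sketch of the sleep-loop with the basetime reset (the actual cleanup logic is elided, marked by the comment):

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;

      while (1) {
          # Reset the basetime so -M/-A ages are measured from "now",
          # not from when the script first started running.
          $^T = time();

          # ... scan the directory and unlink expired files here ...

          sleep 24 * 60 * 60;    # wait a day before the next pass
      }
      ```

      Without the reset, -M reports age relative to the script's original start time, so files created after startup would even show negative ages.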

      Yes, thanks mate. I didn't think of that, but it seems like a good solution.
Re: Automatically delete files
by sparkyichi (Deacon) on Aug 02, 2005 at 14:47 UTC
    This type of thing should go in cron.

    -A gives the file's age in days since it was last accessed (not an epoch time), so I put it into an if statement:

    if (-A $file >= 1) {    # age in days since last access
        unlink $file;
    }


    Sparky
    FMTEYEWTK
      OK, one thing that I forgot to mention, for which I apologise, is that the user has a direct link to the file: they click a hyperlink and it opens in the browser, allowing them to save an updated version.

      With your suggestion, a very valid one, the problem is that if they access the file before the time expires, it will stay on the server for another day, constantly.

      thanks though.

        The same would apply with s/-A/-M/g.

        Sparky
        FMTEYEWTK
Re: Automatically delete files
by bluto (Curate) on Aug 02, 2005 at 15:54 UTC
    A couple of minor nits. You should either skip '.' and '..', or better yet, make sure the entry you are looking at is not a directory. Also, you should probably check to make sure chdir succeeds or you might end up removing files in the current working directory.
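    A minimal sketch folding in both nits (assuming the OP's directory path; the -f test also skips '.' and '..' along with subdirectories):

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $dir = "/home/march05/msc0516/public_html/Blast/updated/";

    # Die if the chdir fails, so we never unlink files in some other directory.
    chdir $dir or die "Could not chdir to $dir: $!";

    opendir my $dh, $dir or die "Could not open $dir: $!";
    while (my $file = readdir $dh) {
        next unless -f $file;            # skip '.', '..', and subdirectories
        unlink $file if -M $file > 1;    # -M: age in days since last modification
    }
    closedir $dh;
    ```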
Re: Automatically delete files
by davidrw (Prior) on Aug 02, 2005 at 15:19 UTC
    As mentioned already, last modified is sufficient (as opposed to actual creation time). You can actually just put a Unix find command in your crontab -- no real need for a Perl script here. Run crontab -e and add a line like this (it will run at 1:30 am every day):
    30 1 * * * find /home/march05/msc0516/public_html/Blast/updated/ -mtime 1 -exec /bin/rm '{}' \;
    Do man 5 crontab for more details, and also man find.
      Thank you,

      I think I will use the cron option, just in case, as Data64 pointed out and sparkyichi was good enough to convey, I may have a problem with the sleep option if the server reboots.

      I have looked at cron through Google and found some info about the * in the time and day fields, but could you just explain the "30 1" and "-exec /bin/rm '{}' \;" parts?

      I get that -exec is trying to execute something, but what? Does this remove the dir, and if so, can I put my Perl script call in here instead, so it reads:

      30 1 * * * -mtime 1 -exec /home/march05/msc0516/public_html/Blast/deleteFiles.pl;
        The basic format of the crontab entry is
        minute hour day_of_month month day_of_week your_command_to_execute
        The '*' for the first five fields just means 'every'. So to run a command every day at 1:30 we use:
        30 1 * * * some_command
        See man 5 crontab for a fuller description of the values you can use in those first five columns.

        So now, the main part is what to put for "some_command". I used this line, which is all part of a single find command:
        find /home/march05/msc0516/public_html/Blast/updated/ -mtime 1 -exec /bin/rm '{}' \;
        This (man find) tells find to A) look in your Blast/updated/ directory, B) take files modified a day ago, and C) execute "/bin/rm FILENAME" for each one.

        If you wanted to simply cron your perl script instead, it would be something like:
        30 1 * * * /home/march05/msc0516/public_html/Blast/deleteFiles.pl
        Note also that the cron daemon will email you any output generated by the commands that are invoked by the crontab ... this is very handy for getting emails of errors.
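        If you want that mail to go somewhere specific, most cron implementations also honour a MAILTO variable at the top of the crontab. A sketch (the address is a placeholder; use your own):

        ```
        MAILTO=you@example.com
        30 1 * * * /home/march05/msc0516/public_html/Blast/deleteFiles.pl
        ```

        Setting MAILTO="" discards the mail entirely.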

        Note that google(cron tutorial) will yield a slew of results.