lisaw has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that allows users to add information to a flat-file database. Each entry is kept in the file for 14 days and is removed once that time has passed. However, the removal only happens when new data is added to the file. Is there a way to remove the expired data when the script is run, as well as when data is added? Here's the code that removes expired data when new data is added:
open(DATAFILEIN, "$catagory.dat") || print "Your listing is the first in this category!<br>";
flock (DATAFILEIN, 2);
while(<DATAFILEIN>) {
    @line_pair = split(/=/,$_);
    $time2 = $line_pair[6];
    $currenttime = time ();
    $difference = $currenttime - $time2;
    #computing the number of seconds before it expires
    $expires = $expire_after_days * 86400;
    if($difference < $expires){
        push (@temp,$_);
    }
}
flock (DATAFILEIN, 8);
if (open(DATAFILEOUT, ">$catagory.dat") ) {
    flock (DATAFILEOUT, 2);
    print DATAFILEOUT "$time=$company_name=$email=$member1=$member1phone=$data=$expiretime=$pictureurl=$password=$website=$member2=$member2phone=$address=$citystatezip=$fax=$catlisting\n";
    print DATAFILEOUT @temp;
    userlog();
And here is the db viewing code that I'm trying to adjust to have expired information removed. I've attempted using the above script but only manage to remove all entries.
open(DATAFILEIN,"$catagory.dat") || print "This section is currently empty...Please check back often!";
print "<table border=0 cellpadding=0 border=0>";
while (<DATAFILEIN>) {
    chomp $_;
    @line_pair = split(/=/,$_);
    $company_name = $line_pair[1];
    $time = $line_pair[0];
    $email= $line_pair[2];
    $member1= $line_pair[3];
    $member1phone= $line_pair[4];
    $data= $line_pair[5];
    $time2 = $line_pair[6];
    $data = &stripBadHtml($data);
    $password= $line_pair[8];
    $pictureurl= $line_pair[7];
    $website= $line_pair[9];
    $member2= $line_pair[10];
    $member2phone= $line_pair[11];
    $address= $line_pair[12];
    $citystatezip= $line_pair[13];
    $fax= $line_pair[14];
    $catlisting= $line_pair[15];
Anyone have any suggestions? Thanks!! Lis

Re: Flat Database: Outdated Info Removal
by no_slogan (Deacon) on Nov 14, 2002 at 22:18 UTC
    flock (DATAFILEIN, 8);
    if (open(DATAFILEOUT, ">$catagory.dat") ) {
        flock (DATAFILEOUT, 2);

    Here you unlock the file, then open and clobber it, then lock it again. How do you know some other process didn't add new data to the file in between? Two better solutions would be:

    • Open the file read-write, with "+<$category.dat". Do all your reading, seek back to the beginning, truncate the file, and then write the new data back out.
    • Use a separate lock file, ">$category.lck". Don't unlock that file until your update is finished. With this solution, you can write the data back to a temporary file, $category.tmp, then rename it to $category.dat. That saves memory, because you don't need to store everything in @temp, but maybe your application doesn't have enough data for that to matter much.
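A minimal sketch of the first option, assuming (as in the original script) that field 6 of each "="-separated record is the epoch timestamp used by the expiry check. The sub name expire_in_place and its parameters are made up for illustration. The point is that the file stays open and locked from the first read to the last write, so no other process can slip in between.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical helper: drop expired records from $file in place.
# Assumption: field 6 of each record is an epoch timestamp.
sub expire_in_place {
    my ($file, $expire_after_days, $now) = @_;
    my $expires = $expire_after_days * 86400;

    open(my $fh, '+<', $file) or return 0;    # missing file: nothing to expire
    flock($fh, LOCK_EX) or die "Cannot lock $file: $!";

    # Read everything, keeping only the records that are still fresh.
    my @keep;
    while (my $line = <$fh>) {
        my $stamp = (split /=/, $line)[6];
        next unless defined $stamp;           # malformed record: drop it
        push @keep, $line if $now - $stamp < $expires;
    }

    # Rewind, empty the file, and write the survivors back.
    seek($fh, 0, SEEK_SET) or die "Cannot rewind $file: $!";
    truncate($fh, 0)       or die "Cannot truncate $file: $!";
    print {$fh} @keep;
    close($fh) or die "Cannot close $file: $!";   # close flushes and unlocks
    return scalar @keep;
}
```

It could be called as expire_in_place("$catagory.dat", $expire_after_days, time) before either adding or displaying records.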

    You shouldn't have any trouble getting the viewer to expire old data. You might want to read in all the data and do the necessary expirations first (the code for this part could be reused), and then check to see if there's anything left to display.
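One way to make the expiry logic reusable is to pull it into a small filter that both the "add" script and the viewer call. This is only a sketch: the name live_records is invented here, and it assumes field 6 holds the epoch timestamp, as in the posted code.

```perl
use strict;
use warnings;

# Hypothetical shared helper: return only the records that have not expired.
# Assumption: field 6 of each "="-separated record is an epoch timestamp.
sub live_records {
    my ($now, $expire_after_days, @records) = @_;
    my $expires = $expire_after_days * 86400;
    return grep {
        my $stamp = (split /=/, $_)[6];
        defined $stamp && $now - $stamp < $expires;
    } @records;
}

# Example: expire first, then decide whether there is anything left to show.
my $now     = time;
my @records = (
    join('=', 'old', ('-') x 5, $now - 20 * 86400, 'rest') . "\n",
    join('=', 'new', ('-') x 5, $now -  1 * 86400, 'rest') . "\n",
);
my @live = live_records($now, 14, @records);
print @live ? "Showing " . @live . " listing(s)\n"
            : "This section is currently empty\n";
```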

    Update: It would be a good idea to have the viewer get a read-lock on the file, too.
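A hedged sketch of what a read-locked viewer could look like. The sub name read_records is hypothetical. Note that a shared lock on the data file only helps if the writers lock that same file; with the separate lock file approach, the viewer would need to lock the .lck file instead.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: read every record while holding a shared (read) lock.
sub read_records {
    my ($file) = @_;
    open(my $fh, '<', $file) or return ();    # missing file: nothing to show
    flock($fh, LOCK_SH) or die "Cannot lock $file: $!";
    my @records = <$fh>;
    close($fh);                                # closing also releases the lock
    chomp @records;
    return @records;
}
```

Several viewers can hold LOCK_SH at once, while a writer asking for LOCK_EX waits until they are done.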

Re: Flat Database: Outdated Info Removal
by lisaw (Beadle) on Nov 14, 2002 at 22:22 UTC
    I figured it out on my own, using the following code: REVISED CODE
    open(DATAFILEIN,"$catagory.dat") || print "This section is currently empty...Please check back often!";
    print "<table border=0 cellpadding=0 border=0>";
    flock (DATAFILEIN, 2);
    while(<DATAFILEIN>) {
        chomp $_;
        @line_pair = split(/=/,$_);
        $company_name = $line_pair[1];
        $time = $line_pair[0];
        $email= $line_pair[2];
        $member1= $line_pair[3];
        $member1phone= $line_pair[4];
        $data= $line_pair[5];
        $time2 = $line_pair[6];
        $data = &stripBadHtml($data);
        $password= $line_pair[8];
        $pictureurl= $line_pair[7];
        $website= $line_pair[9];
        $member2= $line_pair[10];
        $member2phone= $line_pair[11];
        $address= $line_pair[12];
        $citystatezip= $line_pair[13];
        $fax= $line_pair[14];
        $catlisting= $line_pair[15];
        $currenttime = time ();
        $difference = $currenttime - $time2;
        #computing the number of seconds before it expires
        $expires = $expire_after_days * 86400;
        if($difference < $expires){
            push (@temp,$_);
        }
    }
    flock (DATAFILEIN, 8);
    if (open(DATAFILEOUT, ">$catagory.dat") ) {
        flock (DATAFILEOUT, 2);
        print DATAFILEOUT @temp;
        flock (DATAFILEOUT, 8);
    }
    open(DATAFILEIN,"$catagory.dat") || print "This section is currently empty...Please check back often!";
    print "<table border=0 cellpadding=0 border=0>";
    while (<DATAFILEIN>) {
        chomp $_;
        @line_pair = split(/=/,$_);
        $company_name = $line_pair[1];
        $time = $line_pair[0];
        $email= $line_pair[2];
        $member1= $line_pair[3];
        $member1phone= $line_pair[4];
        $data= $line_pair[5];
        $time2 = $line_pair[6];
        $data = &stripBadHtml($data);
        $password= $line_pair[8];
        $pictureurl= $line_pair[7];
        $website= $line_pair[9];
        $member2= $line_pair[10];
        $member2phone= $line_pair[11];
        $address= $line_pair[12];
        $citystatezip= $line_pair[13];
        $fax= $line_pair[14];
        $catlisting= $line_pair[15];
    This works great...but if someone else has a better solution please let me know.
      lisaw, if you're not going to follow no_slogan's excellent advice, then I strongly urge you to replace your flock(8) calls with close. Unlocking doesn't close, but closing unlocks. By leaving the file open but unlocked, you put the file at risk.

      jdporter
      ...porque es dificil estar guapo y blanco.

      In the quest for a better solution, I would just offer to bang out a few dents to make things look better.

      Your code that assigns field values to individual variables is very hard on the eyes and could be difficult to debug if you get one of those indexes wrong. This is compounded by the list being out of order (1 before 0) and by the subroutine call hidden in the batch.

      If you keep this code, tab it out so that all the "=" and "$line_pair[]" are aligned. Either do the sub call right after the variable assignment and put some blank lines around it, or do it after the whole transfer block.

      On the other hand if you don't need to do the transfer at all, all the better. If all that you do is a straight display of the vars in the same order they are in the file then you don't need to transfer them.

      If you need to manipulate a few of them, then create named indexes like this (put them near the top of your file):

      # indexes to $line_pair
      $i_data = 5;
      #add more here if needed.
      and then call the sub this way
      $line_pair[$i_data] = &stripBadHtml($line_pair[$i_data]);
      Although this line is more complex, there is only one of them, not 15. Also, without the transfer (it may still be needed if you are getting the vars off an HTML form), the output line could go from:
      print DATAFILEOUT "$time=$company_name=$email=$member1=$member1phone=$data=$expiretime=$pictureurl=$password=$website=$member2=$member2phone=$address=$citystatezip=$fax=$catlisting\n"
      to
      print DATAFILEOUT join("=",@line_pair),"\n";
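Put together, the named-index idea might look like the sketch below. The constant name I_DATA and the sample record are invented for illustration, and the stripBadHtml here is only a stand-in, since the real sub is not shown in the thread.

```perl
use strict;
use warnings;

# Named index into the record; add more here only for fields that get touched.
use constant I_DATA => 5;

# Stand-in for the original stripBadHtml(), whose body is not shown;
# this version simply removes anything that looks like a tag.
sub stripBadHtml {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;
    return $text;
}

# Made-up sample record with the description in field 5.
my $line = "1037312280=Acme=info\@example.com=Ann=555-0100=<b>Hello</b>=1037312280=pic.gif\n";
chomp $line;

my @line_pair = split /=/, $line, -1;    # -1 keeps trailing empty fields
$line_pair[I_DATA] = stripBadHtml($line_pair[I_DATA]);

# Writing the record back is a single join, whatever the number of fields.
my $out = join('=', @line_pair) . "\n";
print $out;
```

Passing -1 as the limit to split matters when the last fields of a record are empty: without it they are silently dropped and the rewritten line ends up with fewer fields.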
      In general, code should not be hard to look at or look like it was hard to type in. Someone will have to look at the code again sometime, and it is good practice to make it clear for the next programmer who reads it.

      Remember that the next programmer may be you.

Re: Flat Database: Outdated Info Removal
by Three (Pilgrim) on Nov 15, 2002 at 16:44 UTC

    Why not put your deletion code into a separate program? Then use the power of cron to run it at midnight every night.

    This would alleviate the overhead on insert and display of data. It would also make sure that data is deleted in a timely manner.

    If you are running on a Windows NT machine, use the AT scheduling service.

    If you don't have either, then check out this cron module for Perl.
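A rough sketch of such a standalone cleaner, using the lock file and temp file approach suggested earlier in the thread. Everything named here is an assumption for illustration: the script name expire.pl, the paths in the crontab line, the 14-day limit, and the sub name expire_file. It also assumes field 6 of each record is an epoch timestamp, and that the scripts that add and view data take the same .lck lock.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical crontab entry to run this at midnight (paths are placeholders):
#   0 0 * * * /usr/bin/perl /path/to/expire.pl /path/to/data/*.dat
my $expire_after_days = 14;

sub expire_file {
    my ($file, $now) = @_;
    (my $base = $file) =~ s/\.dat$//;

    # Hold the separate lock file for the whole update.
    open(my $lock, '>', "$base.lck") or die "Cannot open $base.lck: $!";
    flock($lock, LOCK_EX)            or die "Cannot lock $base.lck: $!";

    # Copy the fresh records to a temp file, one line at a time.
    open(my $in,  '<', $file)        or die "Cannot read $file: $!";
    open(my $out, '>', "$base.tmp")  or die "Cannot write $base.tmp: $!";
    while (my $line = <$in>) {
        my $stamp = (split /=/, $line)[6];
        next unless defined $stamp;              # malformed record: drop it
        print {$out} $line if $now - $stamp < $expire_after_days * 86400;
    }
    close($in);
    close($out)                      or die "Cannot close $base.tmp: $!";

    # Swap the new file into place, then release the lock.
    rename("$base.tmp", $file)       or die "Cannot rename $base.tmp: $!";
    close($lock);
}

expire_file($_, time) for @ARGV;
```

Because records are streamed through the temp file, nothing has to be held in memory, which is the saving mentioned for the lock file approach.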