Extracting Line from Text file over 45 days old

chris654 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Extracting Line from Text file over 45 days old by BrowserUk (Patriarch) on Jan 12, 2008 at 06:23 UTC
Since your date field is alpha-sortable, there is no point in parsing, splitting and converting all the dates in your file. It is far simpler and much faster to convert the target date (45 days ago) to the same format:

    my @bits = split ' ', localtime( time() - ( 45 * 24 * 60 * 60 ) );
    my $n = 1;
    my %months = map{ $_, $n++ } qw[Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec];
    ## Zero-pad month and day, otherwise the string compare breaks for e.g. 2008-1-5
    my $targetDate = sprintf '%04d-%02d-%02d', $bits[ 4 ], $months{ $bits[ 1 ] }, $bits[ 2 ];
    print $targetDate;    ## 2007-11-28

You can now just use a string compare to select the records and avoid the splits and conversions. This pseudocode assumes that the records in your DB file are ordered by date, oldest first. It also assumes that the last field of each record is always 4 digits:

    open OLDDB,   '<',  $dbname   or die "Cannot open '$dbname': $!";
    open NEWDB,   '>',  $tempfile or die "Cannot open '$tempfile': $!";
    open ARCHIVE, '>>', $archive  or die "Cannot open '$archive': $!";

    ## Copy the leading records that are older than the target date to the archive.
    ## The -17 offset depends on the bytes that follow the date: for a record
    ## ending '...DATE|NNNN' plus a bare newline it would be -16.
    print ARCHIVE
        while defined( $_ = <OLDDB> )
          and substr( $_, -17, 10 ) lt $targetDate;

    print NEWDB if defined $_;    ## Output first 'failing' record to newdb
    print NEWDB while <OLDDB>;    ## Copy the rest unchanged

    close $_ for \*OLDDB, \*NEWDB, \*ARCHIVE;
    unlink $dbname;
    rename $tempfile, $dbname;

Even if the above assumptions are incorrect, it will still be quicker to convert the target date to a string once than to convert every record date to an integer.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."
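A minimal sketch of the string-comparison idea from the post above, for readers who want something that runs on its own. It assumes the `HH:MM:SS YYYY-MM-DD` stamp is the second-to-last pipe-delimited field, as in the sample records posted elsewhere in this thread, and it swaps the month-name lookup for `strftime` from the core POSIX module. The helper names `cutoff_string` and `is_older` are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: zero-padded YYYY-MM-DD string for $days before $now.
# $now is epoch seconds and defaults to the current time.
sub cutoff_string {
    my ( $days, $now ) = @_;
    $now = time unless defined $now;
    return strftime( '%Y-%m-%d', localtime( $now - $days * 24 * 60 * 60 ) );
}

# Hypothetical helper: 1 if the record's date sorts before $cutoff, else 0.
# Assumption: the stamp is the second-to-last '|' field of the record.
sub is_older {
    my ( $record, $cutoff ) = @_;
    my $stamp = ( split /\|/, $record )[-2];
    return 0 unless defined $stamp;
    my ($date) = $stamp =~ /(\d{4}-\d{2}-\d{2})/ or return 0;
    return $date lt $cutoff ? 1 : 0;    # plain string compare, no date math
}
```

With the cutoff `2007-11-28` from the post, a record stamped `16:01:08 2007-01-10` counts as older and one stamped `16:28:03 2008-01-10` does not. Extracting the field with `split` is slower than a fixed `substr` offset, but it does not depend on the line ending or on the width of the last field.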
Re: Extracting Line from Text file over 45 days old by McDarren (Abbot) on Jan 12, 2008 at 05:07 UTC
Howdy :)

This is quite simple. Because your data is regular, you can use split to extract the date/time stamp from each record. Then you can use something like Date::Parse to convert the date/time stamp into a unix timestamp. After that, it's just a bit of simple arithmetic.

Here is some example code to demonstrate:

    #!/usr/bin/perl -l
    use strict;
    use warnings;
    use Date::Parse;

    my $cutoff_date = time - (45 * 86400);

    while (my $line = <DATA>) {
        chomp($line);
        my $date = (split /\|/, $line)[15];
        my $unixdate = str2time($date) or next;
        print "I would ", $unixdate < $cutoff_date ? "archive" : "not archive", " $date";
    }

    __DATA__
    type|address|city|state|size|rent|term|company|contact|phone|email|website|ID|REMOTE_ADDR|HTTP_USER_AGENT|DATE|SORTORDER
    MFG|10 Oakmead Pkwy|San Jose|CA |10,000|$1.25|Net|Cushman Wakefield|Tom Cushman|408 555-8777|tom@cushman.com|http://www.google.com|1000000|76.102.98.35|Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727)|16:01:08 2007-01-10|1003
    LAND|10 Oak|San Jose|CA|14,000|$1.25|Net|Ritchie Commercial|Don Ritchie|408 555-8777|tom@cushman.com|http://www.yahoo.com|1000001|76.102.98.35|Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727)|16:28:03 2008-01-10|1002

Which prints:

    I would archive 16:01:08 2007-01-10
    I would not archive 16:28:03 2008-01-10

Hope this helps,
Darren :)
Re^2: Extracting Line from Text file over 45 days old by chris654 (Initiate) on Jan 12, 2008 at 06:12 UTC
I replaced <DATA> with database.txt, but when I ran it I got the following error:

    Can't locate Date/Parse.pm in @INC (@INC contains: /usr/lib/perl5/5.8.0/i386-linux /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl .) at ./over45.sh line 4.
    BEGIN failed--compilation aborted at ./over45.sh line 4.

Is Date::Parse not supported on my virtual server, or is something else wrong?

Thanks,
Chris
Re^3: Extracting Line from Text file over 45 days old by McDarren (Abbot) on Jan 12, 2008 at 17:02 UTC
That error most likely means that the Date::Parse module is not installed. Installing it should be as simple as: `perl -MCPAN -e "install Date::Parse"`

Cheers,
Darren :)
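If installing modules on the virtual server is not an option, the same arithmetic can probably be done with the core Time::Local module instead of Date::Parse. The sketch below is not from the thread: it assumes every stamp has the exact `HH:MM:SS YYYY-MM-DD` layout of the sample data, and the helper names `stamp_to_epoch` and `older_than_days` are hypothetical.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Hypothetical helper: convert 'HH:MM:SS YYYY-MM-DD' (local time) to epoch
# seconds. Returns undef if the string has another layout or an impossible date.
sub stamp_to_epoch {
    my ($stamp) = @_;
    return unless defined $stamp;
    my ( $h, $m, $s, $Y, $M, $D ) =
        $stamp =~ /^(\d\d):(\d\d):(\d\d) (\d{4})-(\d\d)-(\d\d)$/
        or return;
    # timelocal dies on out-of-range values, so trap that and return undef.
    my $epoch = eval { timelocal( $s, $m, $h, $D, $M - 1, $Y ) };
    return $epoch;
}

# Hypothetical helper: 1 if $stamp is more than $days before $now, 0 if not,
# undef if the stamp cannot be parsed. $now defaults to the current time.
sub older_than_days {
    my ( $stamp, $days, $now ) = @_;
    $now = time unless defined $now;
    my $epoch = stamp_to_epoch($stamp);
    return unless defined $epoch;
    return $epoch < $now - $days * 86400 ? 1 : 0;
}

# Usage: classify the two sample stamps against today's date.
for my $stamp ( '16:01:08 2007-01-10', '16:28:03 2008-01-10' ) {
    my $old = older_than_days( $stamp, 45 );
    next unless defined $old;
    print "I would ", ( $old ? "archive" : "not archive" ), " $stamp\n";
}
```

The strict regex is a deliberate trade-off: unlike `str2time`, this accepts only one date layout, so a header row or a malformed stamp yields undef and can be skipped.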
Re^4: Extracting Line from Text file over 45 days old by twotone (Beadle) on Jan 13, 2008 at 07:43 UTC
Re: Extracting Line from Text file over 45 days old by dragonchild (Archbishop) on Jan 12, 2008 at 04:58 UTC
You want to take a look at the split function, the DateTime module, and you will probably want to use the following as a way of looping through the file:

    open my $fh, '<', $filename
        or die "Cannot open file '$filename' for reading: $!\n";

    while ( my $line = <$fh> ) {
        # Do stuff here.
    }

    close $fh;

My criteria for good software: Does it work? Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
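One way the "Do stuff here" part of that loop might look, as a rough sketch: read the header row, find the column by name with split, and pull that field from every record. Looking the column up by name instead of hard-coding an index is an assumption of this sketch, not something the thread prescribes. The helper name `column_values` is hypothetical, and the sample data is abbreviated to three columns.

```perl
use strict;
use warnings;

# Hypothetical helper: read pipe-delimited records from an open filehandle
# and return the value of the named column for each data row.
sub column_values {
    my ( $fh, $name ) = @_;

    # The first line is assumed to be the header row with the field names.
    my $header = <$fh>;
    return () unless defined $header;
    chomp $header;
    my @names = split /\|/, $header;
    my ($idx) = grep { $names[$_] eq $name } 0 .. $#names;
    die "No column named '$name'\n" unless defined $idx;

    my @values;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;                  # skip blank lines
        my @fields = split /\|/, $line, -1;        # -1 keeps trailing empty fields
        push @values, $fields[$idx];
    }
    return @values;
}

# Usage with an in-memory file so the sketch runs on its own.
my $sample = "type|DATE|SORTORDER\n"
           . "MFG|16:01:08 2007-01-10|1003\n"
           . "LAND|16:28:03 2008-01-10|1002\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!\n";
print "$_\n" for column_values( $fh, 'DATE' );
close $fh;
```

Run as is, this prints the two date/time stamps, one per line. Each value could then be handed to whichever date comparison you prefer.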
Re^2: Extracting Line from Text file over 45 days old by shmem (Chancellor) on Jan 13, 2008 at 01:26 UTC
"You want to take a look at the split function, the DateTime module"

For someone seeking a way to tackle the problem, what exactly would be the OP's benefit of installing 17 modules?

--shmem _($_=" "x(1<<5)."?\n".q·/)Oo. G°\ / /\_¯/(q / ---------------------------- \__(m.====·.(_("always off the crowd"))."· ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re^3: Extracting Line from Text file over 45 days old by dragonchild (Archbishop) on Jan 13, 2008 at 03:04 UTC
That there are 17 modules to be installed is the problem of the cpan script, not the OP. We have computers do repetitive things because they're repetitive. I am completely baffled by this "There's too many modules involved!" concern. Do you pay for storage by the kilobyte/hour? I have yet to have a problem and I generally have between 10 and 30 Perl installations on any given machine. All of those will generally take up 2-3G.
Re^4: Extracting Line from Text file over 45 days old by BrowserUk (Patriarch) on Jan 14, 2008 at 21:19 UTC