pinpe has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a problem filtering a log file from my Perl script.

This is the content of the file pinpe.csv:
2009-06-16
2009-01-29
2009-06-02
2008-03-05
2007-08-05

Here is my perl code.
use Net::Telnet;
@DATE = $telnet->cmd("date '+%Y-%m-%d'");
chomp($DATE[0]);
@Discon = $telnet->cmd("cat /data/pinpe.csv | nawk '{print \$4}' | perl -ne 'print if \$_ lt $DATE[0]'");
chomp($Discon[0]);
print "@Discon[0]\n";
and this is the output error:
Illegal octal digit '8' at -e line 1, at end of line
Now my dilemma is that I want to filter this output by my preferred dates. I want it to output the dates from yesterday and backwards; in other words, I want to filter out all the dates less than today's date (i.e. 2007-07-30, 2007-07-29, 2007-07-28, etc. and below). Tnx in advance. :) Is it that Perl thought I was dealing with an octal number, and when it came across the 8 it stopped making sense, so Perl quite rightly complained? Pls help. Tnx.

Br, Pete

Replies are listed 'Best First'.
Re: extract a log file to filter previous dates
by GrandFather (Saint) on Aug 02, 2007 at 22:22 UTC

    The key to your problem is 'interpolation'. The code that the Perl interpreter sees is:

    print 2007-08-05;

    which is some arithmetic (2007 - 08 - 05) containing two octal constants and a decimal constant. You need to quote the interpolated date so that the interpreter sees a string instead:

    @Discon=$telnet->cmd("cat /data/pinpe.csv | nawk '{print \$4}' | perl -ne 'print if \$_ lt \'$DATE[0]\''");
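    A minimal illustration of the difference. Unquoted, a month like 08 dies with your exact error because 8 is not a valid octal digit; a month like 07 would even compile and silently do arithmetic. Quoted, ISO dates compare correctly as strings:

    ```perl
    # After interpolation, the unquoted version is arithmetic on numbers:
    #   print if $_ lt 2007-08-05
    # "08" is read as an octal literal, and 8 is not a valid octal digit,
    # hence "Illegal octal digit '8'". With month 07 it would compile and
    # quietly compute 2007 - 7 - 5:
    my $oops = 2007 - 07 - 05;   # octal 07 == 7, octal 05 == 5
    print "$oops\n";             # prints 1995, not a date at all

    # Quoted, it is a string comparison, and ISO-8601 dates sort correctly:
    print "earlier\n" if '2007-07-30' lt '2007-08-05';
    ```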

    DWIM is Perl's answer to Gödel
      Hi GrandFather,

      Tnx so much indeed for the hand. My script is now working.

      Now my problem is that the extraction from the file is too slow. How can I optimize the extraction from a 1.2 GB .txt file with 8M rows and 38 columns? I actually created this script to filter out only the information that I need from this huge data set. Any modules that can help speed up the extraction? Tnx in advance.

      Cheers! :)

      Br, Pete

        You might get better advice if you show us what your data looks like and tell us what you want to extract from it. If you are dealing with a CSV file you may find Text::CSV or DBD::AnyData useful. AnyData will also handle fixed-width field data and a variety of other common file formats.
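
        For instance, a streaming filter with Text::CSV might look like the sketch below. It assumes the date sits in the fourth column (as your nawk '{print $4}' suggests) and that the file is comma-separated; adjust sep_char and the column index to match your real data:

        ```perl
        use strict;
        use warnings;
        use Text::CSV;
        use POSIX qw(strftime);

        my $today = strftime('%Y-%m-%d', localtime);
        my $csv   = Text::CSV->new({ binary => 1, sep_char => ',' })
            or die Text::CSV->error_diag;

        # Read line by line instead of slurping 1.2 GB into memory.
        open my $fh, '<', '/data/pinpe.csv' or die "open: $!";
        while (my $row = $csv->getline($fh)) {
            my $date = $row->[3];                 # fourth column
            print "$date\n" if $date lt $today;   # ISO dates compare as strings
        }
        close $fh;
        ```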


        DWIM is Perl's answer to Gödel
Re: extract a log file to filter previous dates
by Fletch (Bishop) on Aug 03, 2007 at 00:50 UTC

    Interpolation problems aside, you'd be much better off writing this as a single call to perl on the other side.

    my @result = $telnet->cmd( qq{perl -MPOSIX=strftime -lane 'BEGIN{\$date = strftime( "%Y-%m-%d", localtime );} print \$F[3] if \$F[3] lt \$date' /data/pinpe.csv} );

    (Possibly seasoning with the appropriate -F argument given the file extension; see perlrun for more details on autosplitting.)
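
    For reference, -lane bundles several switches (see perlrun): -l handles line endings, -a autosplits each input line into @F, -n wraps the code in a while (<>) loop, and -e supplies the program. A rough long-hand equivalent of the one-liner above:

    ```perl
    use POSIX qw(strftime);

    my $date = strftime('%Y-%m-%d', localtime);   # today, ISO format

    while (my $line = <>) {                       # -n: implicit input loop
        chomp $line;                              # -l: strip the newline
        my @F = split ' ', $line;                 # -a: autosplit on whitespace
                                                  #     (override with -F',' for CSV)
        print "$F[3]\n" if defined $F[3] && $F[3] lt $date;
    }
    ```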

      $F[3] --> What is this for?

      qq{perl -MPOSIX=strftime -lane --> I'm not familiar with this but I'll give it a try. Tnx.

      Br, Pete