tuakilan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I put together the following code with help from fellow monks (thanks a lot!).

The idea of the task is to :

  • read a log file, in this case log.yyyy-mm-dd
  • comb through the log file for records that contain the string 'two' and where the duration between two such records is more than 3600 seconds
  • export the result into a report file, report.yyyy-mm-dd.txt
  • I have got this far, but when I run the code it creates a zero-byte file.

    Any clue what went wrong with the script ?

    #!/usr/local/bin/perl -w
    use strict;
    use warnings;
    use DateTime::Format::Strptime;

    my $Strp = new DateTime::Format::Strptime(pattern => '%Y-%b-%d %T',);
    my $infile = 'log.2008-01-11';
    my $outfile = 'report.2008-01-11.txt';
    my($fh_out, $fh);
    my %lookup;
    my $channel = 'two';
    my $time_delta = 3600; # seconds = 1 hour

    open($fh_out, '>', $outfile) or die "Could not open outfile: $!";
    open($fh, '<', $infile) or die "Could not open logfile: $!";

    while (<$fh>) {
        next unless /$channel/;
        my ($dt, $timestamp, $refs);
        if (m/^(.*) UTC.*ref .*? = (\d+)$/) {
            my $t = $1;
            $refs = $2;
            $dt = $Strp->parse_datetime($t);
            $timestamp = $dt->epoch();
            warn "found: $. $timestamp\t$refs\n";
        } else {
            warn "No match: $_ \n";
            next;
        }
        if ( defined($lookup{$refs}) && $lookup{$refs} + $time_delta <= $timestamp ) {
            print $fh_out "REFS $refs: occurrences at " . $lookup{$refs} . "and $timestamp \n";
            print "REFS $refs: occurrences at " . $lookup{$refs} . " and $timestamp \n";
        }
        $lookup{$refs} = $timestamp;
    }
    close $fh_out;
    #close $fh;

    content of log.2008-01-11

    2007-Jan-11 00:00:00 UTC (GMT +0000) - Poll: channel = one, ref = com, id = 595143009
    2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761
    2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 595131400
    2007-Jan-11 00:00:02 UTC (GMT +0000) - Poll: channel = three, ref = com, id = 660868931
    2007-Jan-11 00:00:02 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 595191883
    2007-Jan-11 00:00:03 UTC (GMT +0000) - Poll: channel = one, ref = com, id = 098533326
    2007-Jan-11 00:00:03 UTC (GMT +0000) - Poll: channel = three, ref = com, id = 659718092
    2007-Jan-11 00:00:04 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 006456768
    2007-Jan-11 00:00:05 UTC (GMT +0000) - Poll: channel = three, ref = com, id = 133714761
    2007-Jan-11 01:07:06 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761

    Desired contents of report.2008-01-11.txt

    2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761 0
    2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 595131400 0
    2007-Jan-11 00:00:02 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 595191883 0
    2007-Jan-11 00:00:04 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 006456768 0
    2007-Jan-11 01:07:06 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761 1

    Replies are listed 'Best First'.
    Re: failure to generate report file
    by oko1 (Deacon) on Mar 05, 2008 at 14:08 UTC

      According to what you've asked, your desired output is

      2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761 0
      

      Your 'print' statement says:

      print $fh_out "REFS $refs: occurrences at " . $lookup{$refs} . "and $timestamp \n";
      

      You're never going to get the former from the latter. :)

      From what I can see, you want to capture the lines that contain 'channel = two' and you want to add a number to the end of those. Assuming that you want that number to show how many times that ID has been seen, the following would do it:

      #!/usr/bin/perl -w
      open In, "log.2008-01-11" or die "log: $!\n";
      open Out, ">report.2008-01-11.txt" or die "report: $!\n";
      while (<In>){
          chomp;
          next unless /channel = two,.* (\d+)$/;
          print Out "$_ ", $seen{$1} || 0, "\n";
      }
      close In;
      close Out;

      On the other hand, I've just woken up and may be missing something. :) Hope this helps.

        Close (and I do understand needing caffeine before coding), but your reply (Re: failure to generate report file) doesn't deal with the OP's request to capture only those records with a 3600-second separation. Others, however, have dealt with that.

        But (and as OP replied) $seen is used only once in your code. Minor mods correct that issue (and ignore the count):

        #!/usr/bin/perl -w
        open In, "log.2008-01-11" or die "log: $!\n";
        open Out, ">report.2008-01-11.txt" or die "report: $!\n";
        while (<In>){
            chomp;
            next unless ($_ =~/channel = two.* (\d+)$/);
            print Out "$_ \n";
        }
        close In;
        close Out;

        produces

        2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761
        2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 595131400
        2007-Jan-11 00:00:02 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 595191883
        2007-Jan-11 00:00:04 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 006456768
        2007-Jan-11 01:07:06 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761
        I pasted the code in and tried to run it with perl -wc, but got the error below:
        Name "main::seen" used only once: possible typo at line 8
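The warning appears because %seen is read in the print statement but never assigned to. What follows is only a minimal sketch, assuming the trailing number is meant to be how many times that id has already appeared on a 'channel = two' line (an assumption read off the desired report in the question, not something the OP confirmed). The helper name tag_lines and the inline sample data are hypothetical and stand in for the real log file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep only 'channel = two' lines and append the number
# of times the same id was seen on an earlier kept line (0 on first sight).
sub tag_lines {
    my @lines = @_;
    my (%seen, @out);
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /channel = two,.* (\d+)$/;
        # Post-increment returns the old count (0 for a new key), then bumps it.
        my $count = $seen{$1}++;
        push @out, "$line $count";
    }
    return @out;
}

# Three records taken from the sample log in the question.
my @sample = (
    '2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761',
    '2007-Jan-11 00:00:02 UTC (GMT +0000) - Poll: channel = three, ref = com, id = 660868931',
    '2007-Jan-11 01:07:06 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761',
);
print "$_\n" for tag_lines(@sample);
```

Declaring %seen with my and running under use strict also turns a typo in the hash name into a compile-time error instead of a "used only once" warning. Note that this sketch counts repeats only and does not check the 3600-second gap.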
    Re: failure to generate report file
    by codeacrobat (Chaplain) on Mar 05, 2008 at 08:02 UTC
      Remove the $ sign at the end of the regex. On the line after ref = com there is the id = <id> assignment, which the regex fails to match. if (m/^(.*) UTC.*ref .*? = (\d+)/) should fix it.
      I recommend tracking down such bugs in the Perl debugger (see perldebtut). Step through the script with the 's' command and print conditions and captured variables with the 'p' command. If the match is not as expected, try to narrow the bug down with simpler regexes or conditions.
      Removed misleading advice. The .*? should jump the regex right to the id assignment.
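One way to check that claim outside the full script is to run the pattern against a single record. This is a small sketch, assuming each record sits on one line, as it appears to in the run shown further down the thread; the variable names are illustrative only:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One record from the sample log, with its trailing newline left in place,
# as it would arrive from <$fh> without chomp.
my $line = "2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761\n";

# The OP's pattern: '.*?' expands lazily until ' = ' is followed by digits
# that run to the end of the line ('$' also matches just before a final newline).
my ($time, $id) = $line =~ m/^(.*) UTC.*ref .*? = (\d+)$/;

if (defined $id) {
    print "time=[$time] id=[$id]\n";   # prints: time=[2007-Jan-11 00:00:01] id=[133714761]
} else {
    print "no match\n";
}
```

If a line from the real log prints "no match" here, that line is the one to inspect for stray characters.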

      print+qq(\L@{[ref\&@]}@{['@'x7^'!#2/"!4']});
    Re: failure to generate report file
    by quester (Vicar) on Mar 05, 2008 at 07:22 UTC
      Actually, I'm not sure. When I run your script the report file contains this line:
      REFS 133714761: occurrences at 1168473601and 1168477626
      (You might want to throw in a space before the "and".)

      Just possibly, when your script runs it might not have the correct permissions to create a new file in the current directory.

        I ran the script as root :(
          Strange... when you run it, does it print that line to standard output? When I run your script, it looks like this...
          $ perl -w temp18.pl
          found: 2 1168473601 133714761
          found: 3 1168473601 595131400
          found: 5 1168473602 595191883
          found: 8 1168473604 006456768
          found: 10 1168477626 133714761
          REFS 133714761: occurrences at 1168473601 and 1168477626
          $ cat report.2008-01-11.txt
          REFS 133714761: occurrences at 1168473601and 1168477626
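For the report layout requested in the original question (each 'channel = two' line followed by 0 or 1), the following is a rough sketch rather than a drop-in fix. It assumes the flag should be 1 when the same id was last seen at least 3600 seconds earlier (the OP's script also uses "at least", via <=) and 0 otherwise. The helpers to_epoch and flag_lines are hypothetical names, and the core module Time::Local is swapped in for DateTime::Format::Strptime purely to keep the example self-contained:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month abbreviations mapped to the 0-based month numbers timegm() expects.
my %MONTH;
@MONTH{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Hypothetical helper: '2007-Jan-11 00:00:01' (UTC) -> epoch seconds, or undef.
sub to_epoch {
    my ($stamp) = @_;
    my ($y, $mon, $d, $h, $mi, $s) =
        $stamp =~ /^(\d{4})-(\w{3})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/
        or return;
    return unless exists $MONTH{$mon};
    return timegm($s, $mi, $h, $d, $MONTH{$mon}, $y);
}

# Hypothetical helper: keep lines for one channel and append 1 when the same
# id was last seen at least $delta seconds earlier, otherwise 0.
sub flag_lines {
    my ($channel, $delta, @lines) = @_;
    my (%last, @out);
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /^(\S+ \S+) UTC.*channel = \Q$channel\E, .*id = (\d+)$/;
        my ($stamp, $id) = ($1, $2);
        my $epoch = to_epoch($stamp);
        next unless defined $epoch;
        my $flag = (defined $last{$id} && $epoch - $last{$id} >= $delta) ? 1 : 0;
        push @out, "$line $flag";
        $last{$id} = $epoch;    # remember the most recent sighting of this id
    }
    return @out;
}

# Records taken from the sample log in the question.
my @sample = (
    '2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761',
    '2007-Jan-11 00:00:01 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 595131400',
    '2007-Jan-11 00:00:05 UTC (GMT +0000) - Poll: channel = three, ref = com, id = 133714761',
    '2007-Jan-11 01:07:06 UTC (GMT +0000) - Poll: channel = two, ref = com, id = 133714761',
);
print "$_\n" for flag_lines('two', 3600, @sample);
```

To run it against the real file, the sample list could be replaced with lines read from log.2008-01-11 and the print redirected to the report filehandle. Whether the comparison should be "at least" or strictly "more than" 3600 seconds is for the OP to decide.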