in reply to Not able to capture information

To append the orphaned lines I have set up an array that can be manipulated during the while loop and is then printed after processing.

This appends the orphaned lines, as in the case provided.

#!/usr/bin/perl -w
use strict;

my @linearr;
while (<DATA>) {
    chomp;
    # A record line looks like: [date time] [error number] ERRORMSG message
    if ($_ =~ m{\[(\d{4}/\d{2}/\d{2}\s+\d{2}\:\d{2}\:\d{2})\]\s+\[(\d{1,3})\]\s+ERRORMSG\s+(.*)}) {
        my $date    = $1;
        my $err_no  = $2;
        my $err_msg = $3;
        push @linearr, $date.' === '.$err_no.' === '.$err_msg."\n";
    }
    else {
        # Orphaned line: append it to the end of the previous record
        $linearr[@linearr-1] =~ s/\n$/\ $_\n/;
    }
}
print @linearr;

prints

__DATA__
[2012/02/16 00:08:34] [29] ERRORMSG unknown error Can't insert into price table
Please check Valueprice.pm line 52.
[2012/02/16 00:08:34] [39] ERRORMSG Invalid User
[2012/02/16 00:14:52] [105] ERRORMSG missing conversion rate
[2012/02/16 00:14:52] [29] ERRORMSG Can't use an undefined value as a HASH reference at Value.pm line 77.

Coyote

Re^2: Not able to capture information
by Marshall (Canon) on Feb 17, 2012 at 18:40 UTC
    Yes, yet another road to Rome!

    I would have written the code very slightly differently.
    (1) Rather than using $1,$2,$3, I would use list assignment of the variables. The match "worked" if the last one is "defined".
    (2) A complex regex of the date/time is not needed.
    (3) In the substitution, I would use "|" as the separator to reduce the number of "leaning toothpicks", although some folks figure that this is a bad idea. Mileage varies.

    #!/usr/bin/perl -w
    use strict;

    my @lines;
    while (<DATA>)
    {
        chomp;
        next if /^\s*$/;
        my ($date, $err_no, $err_msg) = m{\[(.*)\]\s+\[(.*)\]\s+ERRORMSG\s+(.*)};
        if (defined $err_msg)   # the match "worked"!
        {
            push @lines, $date.' === '.$err_no.' === '.$err_msg."\n";
        }
        else
        {
            $lines[@lines-1] =~ s|\n$| $_\n|;
        }
    }
    print @lines;

    =prints
    2012/02/16 00:08:34 === 29 === unknown error Can't insert into price table Please check Valueprice.pm line 52.
    2012/02/16 00:08:34 === 39 === Invalid User
    2012/02/16 00:14:52 === 105 === missing conversion rate
    2012/02/16 00:14:52 === 29 === Can't use an undefined value as a HASH reference at Value.pm line 77.
    =cut

    __DATA__
    [2012/02/16 00:08:34] [29] ERRORMSG unknown error Can't insert into price table
    Please check Valueprice.pm line 52.
    [2012/02/16 00:08:34] [39] ERRORMSG Invalid User
    [2012/02/16 00:14:52] [105] ERRORMSG missing conversion rate
    [2012/02/16 00:14:52] [29] ERRORMSG Can't use an undefined value as a HASH reference at Value.pm line 77.

      List assignment is another good modification that I overlooked here. The difference is that with list assignment the scalars are created, possibly unnecessarily and left undefined, before each regexp test, whereas with assignment from $1, $2, $3 the match has been tested before the scalars are created. No biggie, but how would we go about measuring such details? I would like to think on it.
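
      To make the two forms concrete, here is a minimal sketch of both side by side. The sub names parse_list and parse_captures are made up for illustration, and the pattern is the simplified one from the reply above; treat it as a sketch, not as anyone's posted code.

```perl
use strict;
use warnings;

# Simplified pattern from the reply above.
my $re = qr{\[(.*)\]\s+\[(.*)\]\s+ERRORMSG\s+(.*)};

# List assignment: the three lexicals are declared on every call
# and are left undefined when the match fails.
# (parse_list is a hypothetical name.)
sub parse_list {
    my ($line) = @_;
    my ($date, $err_no, $err_msg) = $line =~ $re;
    return defined $err_msg ? [$date, $err_no, $err_msg] : undef;
}

# Capture variables: the match is tested first, and the lexicals
# are only created inside the block when it succeeds.
# (parse_captures is a hypothetical name.)
sub parse_captures {
    my ($line) = @_;
    if ($line =~ $re) {
        my ($date, $err_no, $err_msg) = ($1, $2, $3);
        return [$date, $err_no, $err_msg];
    }
    return;
}
```

      Both subs return an array reference for a record line and undef for an orphaned line, so either can drive the push/append logic.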

      I did consider tidying the toothpicks in the original regexp, but I was short of time and the regexp had already been dealt with in the first response. I did not mind them in my substitution, as it was a very short substitution.

      Pipe is syntactically correct, but due to its general usage I would probably pick a different symbol. Each to their own here.
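
      For what it is worth, here is a small sketch of the same substitution written with three different delimiters. The variable names $orphan and @variants are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical orphaned line to append to the previous record.
my $orphan = 'Please check Valueprice.pm line 52.';

# Three copies of the same record tail, one per delimiter style.
my @variants = ("unknown error\n") x 3;

$variants[0] =~ s/\n$/ $orphan\n/;     # slashes
$variants[1] =~ s|\n$| $orphan\n|;     # pipes
$variants[2] =~ s{\n$}{ $orphan\n};    # braces

print @variants;
```

      All three do the same job here. One thing to weigh with the pipe: an unescaped | inside the pattern would end it, so alternation is no longer directly available, which is one reason braces are a common alternative.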

        For timing comparisons, the Benchmark module is excellent.

        I tend to write the most straightforward code first and then tweak when necessary. I recently had a project where I changed just a single regex line and it cut 2 minutes off a 10 minute run time! If you do anything, no matter how simple, enough millions of times, it adds up. Considerable experimentation can be required. In this particular case, I don't know without measuring what the performance difference would be.
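
        A rough sketch of how such a measurement might be set up with Benchmark's cmpthese follows. The sub names by_list and by_captures and the two-line @sample are assumptions for illustration only; real log data would give more meaningful numbers, and results will vary by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Tiny made-up sample: one record line and one orphaned line.
my @sample = (
    '[2012/02/16 00:08:34] [39] ERRORMSG Invalid User',
    'Please check Valueprice.pm line 52.',
);
my $re = qr{\[(.*)\]\s+\[(.*)\]\s+ERRORMSG\s+(.*)};

# List assignment straight from the match (hypothetical name).
sub by_list {
    my $hits = 0;
    for my $line (@sample) {
        my ($date, $err_no, $err_msg) = $line =~ $re;
        $hits++ if defined $err_msg;
    }
    return $hits;
}

# Test the match first, then copy $1, $2, $3 (hypothetical name).
sub by_captures {
    my $hits = 0;
    for my $line (@sample) {
        if ($line =~ $re) {
            my ($date, $err_no, $err_msg) = ($1, $2, $3);
            $hits++;
        }
    }
    return $hits;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds, then print a comparison table.
cmpthese(-1, {
    list_assign => \&by_list,
    captures    => \&by_captures,
});
```

        The table that cmpthese prints shows the rate of each sub and the percentage difference between them.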

        Anyway, I think the OP has a number of fine examples of different approaches.