lddzjwwy has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I want to read a logfile and print all lines between one <xxx; and the next <xxx;. The problem is that between the <xxx; lines there are also lines containing only < followed by an empty line, so my code probably prints only the lines between <xxx; and this bare <. How can I fix this? Here is part of my code:

    while (<>) {
        # remove trailing whitespaces:
        s/\s+$//;
        # Remove CR and LF
        s/[\r\n]//g;
        if ((m/^<(.*)/) || (m/^\S+ <(.*)/)) {
            # store the previous command:
            if ($command ne "") {
                print "Adding: $command\n";
                $response{$command} = $resp;
                $ordered{$command} = $result if $ordered_received;
                $resp = "";
                $result = "";
                $ordered_received = 0;
                $command = "";
            }
            # a new command line:
            $command = $1;
            if ($command =~ /;/) {
                $command = uc($command) unless $case_sensitive;
            }
        }
        elsif ($ordered_received) {
            # collect the result lines
            $result .= "$_\r\n";
        }
        else {
            # collect the response lines
            $resp .= "$_\r\n";
        }
    }

crossposted to http://www.perl-community.de/bat/poard/thread/18345. Thanks a lot in advance. BR Wei

Re: Read in the logfile
by kcott (Archbishop) on May 27, 2013 at 09:00 UTC

    G'day lddzjwwy,

    The problem with your code is that ".*" matches zero or more characters; thus, "/^<(.*)/" matches "<" and "<xxx;".
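
To make the difference concrete, here is a minimal sketch that runs both patterns over three sample lines. The helper name command_name and the variables $loose and $strict are made up for illustration; only the two regexes come from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the captured
# command text, or undef when the line does not match the pattern.
sub command_name {
    my ($line, $re) = @_;
    return $line =~ $re ? $1 : undef;
}

my $loose  = qr{^<(.*)};    # pattern from the question: the capture may be empty
my $strict = qr{^<(.+);};   # needs at least one character and then a ';'

for my $line ('<syrip:survey;', '<', 'ORDERED') {
    my $l = command_name($line, $loose);
    my $s = command_name($line, $strict);
    printf "%-18s loose: %-18s strict: %s\n",
        "'$line'",
        defined $l ? "'$l'" : 'no match',
        defined $s ? "'$s'" : 'no match';
}
```

The bare "<" line matches the loose pattern with an empty capture, which is why the original loop treats it as the start of a new command. The stricter pattern rejects it.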

    Here's a barebones script that captures the data you want. I'll leave you to format the output however you want it.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $re = qr{^<(.+);};

    while (<DATA>) {
        if (/$re/) {
            print_log_entry_header($1);
        }
        else {
            print;
        }
    }

    sub print_log_entry_header {
        print "*** Log Entry: @_ ***\n";
    }

    __DATA__
    <syrip:survey;
    ORDERED
    <
    EX-A 9UCIE6CG0CL235_____D331 AD-12 TIME 130429 1428 PAGE 1
    SOFTWARE RECOVERY SURVEY
    EVENT TYPE EXPLANATION EVENTCNT FRDE +L
    8 FORLOPP MANUALLY INITIATED FORLOPP RELEASE +1
    7 LARGE OTHER-EEDWL
    6 LARGE OTHER-STARTUP
    5 LARGE INIT START/RESTART AFTER INIT START
    EVENT CODE INF1 INF2 INF3 INF4 SIDE STATE DATE TIME AC +T
    8 H'310C H'0000 H'0F98 H'0001 H'0000 A SINGLE 130116 153704 NO
    7 H'9003 H'000A H'0000 H'0000 H'0000 B SINGLE 120912 113918 NO
    6 H'9003 H'000A H'0000 H'0000 H'0000 B SINGLE 120906 090851 NO
    5 H'9004 H'0000 H'0000 H'0000 H'0000 B SINGLE 120905 171355 NO
    END
    EX-A 9UCIE6CG0CL235_____D331 AD-12 TIME 130429 1428 PAGE 1
    <sastp;
    ......

    Output:

    $ pm_custom_file_split.pl
    *** Log Entry: syrip:survey ***
    ORDERED
    <
    EX-A 9UCIE6CG0CL235_____D331 AD-12 TIME 130429 1428 PAGE 1
    SOFTWARE RECOVERY SURVEY
    EVENT TYPE EXPLANATION EVENTCNT FRDE +L
    8 FORLOPP MANUALLY INITIATED FORLOPP RELEASE +1
    7 LARGE OTHER-EEDWL
    6 LARGE OTHER-STARTUP
    5 LARGE INIT START/RESTART AFTER INIT START
    EVENT CODE INF1 INF2 INF3 INF4 SIDE STATE DATE TIME AC +T
    8 H'310C H'0000 H'0F98 H'0001 H'0000 A SINGLE 130116 153704 NO
    7 H'9003 H'000A H'0000 H'0000 H'0000 B SINGLE 120912 113918 NO
    6 H'9003 H'000A H'0000 H'0000 H'0000 B SINGLE 120906 090851 NO
    5 H'9004 H'0000 H'0000 H'0000 H'0000 B SINGLE 120905 171355 NO
    END
    EX-A 9UCIE6CG0CL235_____D331 AD-12 TIME 130429 1428 PAGE 1
    *** Log Entry: sastp ***
    ......

    -- Ken
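
If the goal is to store each command's lines rather than print them (as the %response hash in the question suggests), one possible approach is sketched below. The sub split_log, its return shape, and the short in-memory sample are assumptions made for illustration and are not part of the scripts in this thread. The sketch only recognises command lines that start with "<"; the question's second pattern for a prefixed prompt is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the thread): groups log
# lines under the command line that precedes them. Returns a list of
# [command, arrayref-of-lines] pairs in file order.
sub split_log {
    my ($fh) = @_;
    my (@entries, $current);
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;               # strip trailing whitespace, CR and LF
        if ($line =~ /^<(\S+);/) {        # command line such as "<syrip:survey;"
            $current = [$1, []];
            push @entries, $current;
        }
        elsif ($current) {                # any other line, including a bare "<"
            push @{ $current->[1] }, $line;
        }
    }
    return @entries;
}

# Made-up sample, far shorter than the real log.
my $sample = "<syrip:survey;\nORDERED\n<\nEND\n<sastp;\nEXECUTED\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
for my $entry (split_log($fh)) {
    my ($command, $lines) = @$entry;
    print "$command: ", scalar(@$lines), " line(s)\n";
}
close $fh;
```

Returning pairs instead of a hash keeps the file order and does not overwrite an entry when the same command appears twice. Converting the pairs to a hash afterwards is straightforward if order does not matter.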

Re: Read in the logfile
by hdb (Monsignor) on May 27, 2013 at 08:26 UTC

    Would it be possible to show some sample input? Otherwise, in my understanding you are looking for something like this:

    use strict;
    use warnings;

    while(<DATA>) {
        last if /^<xxx;/;
    }

    while(<DATA>) {
        last if /^<xxx;/;
        print;
    }

    __DATA__
    line1
    line2
    <xxx;
    line3
    line4
    <xxx;
    line5

      Hi, the input looks something like this:
      <syrip:survey;
      ORDERED
      <
      EX-A 9UCIE6CG0CL235_____D331 AD-12 TIME 130429 1428 PAGE 1
      SOFTWARE RECOVERY SURVEY
      EVENT TYPE EXPLANATION EVENTCNT FRDE +L
      8 FORLOPP MANUALLY INITIATED FORLOPP RELEASE +1
      7 LARGE OTHER-EEDWL
      6 LARGE OTHER-STARTUP
      5 LARGE INIT START/RESTART AFTER INIT START
      EVENT CODE INF1 INF2 INF3 INF4 SIDE STATE DATE TIME AC +T
      8 H'310C H'0000 H'0F98 H'0001 H'0000 A SINGLE 130116 153704 NO
      7 H'9003 H'000A H'0000 H'0000 H'0000 B SINGLE 120912 113918 NO
      6 H'9003 H'000A H'0000 H'0000 H'0000 B SINGLE 120906 090851 NO
      5 H'9004 H'0000 H'0000 H'0000 H'0000 B SINGLE 120905 171355 NO
      END
      EX-A 9UCIE6CG0CL235_____D331 AD-12 TIME 130429 1428 PAGE 1
      <sastp;
      ......
      Thanks a lot

        Just modify the regex to require at least one non-space character after <:

        use strict;
        use warnings;

        while(<DATA>) {
            last if /^<\S+;/;
        }

        while(<DATA>) {
            last if /^<\S+;/;
            print; # process your data here
        }

Re: Read in the logfile (data)
by Anonymous Monk on May 27, 2013 at 08:27 UTC
      Sorry, I am really a newbie... I will add the link to the thread. Thanks for the reminder.