jeiku has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am generating CSV event log files, and some records are split across several lines (see the sample under __DATA__ in the code below).

I want to read every line that matches KB[0-9] and append it to the end of the previous line, and keep doing that until I reach a line that does not match KB[0-9].

My code, as usual, is failing miserably. (In the example below I tried to read the KB lines into a temporary variable.)

while (<__DATA__>) {
    # I don't know how to calculate the length $# of __DATA__
    # so I just put 10 in there instead.
    for($i = 0; $i < 10; $i++) {
        if($data[$i] =~ m/KB[0-9]/) {
            do {
                push(@temp, $data[$i]);
                $i++;
            }while ($data[$i] =~ m/KB[0-9]/);
        }
    }
    # For some reason the loop is loading the whole file
    # into @temp
    print "@temp\n";
}
__DATA__
"Information","17","4/13/2006 12:54:25 PM","Windows Update Agent","ISA","Installation","N/A","Installation Ready: The following updates are downloaded and ready for installation. To install the updates, an administrator should log on to this computer and Windows will prompt with further instructions:
- Security Update for Windows Server 2003 (KB911562)
- Windows Malicious Software Removal Tool - April 2006 (KB890830)
- Cumulative Security Update for Internet Explorer for Windows Server 2003 (KB912812)
- Security Update for Windows Server 2003 (KB908531)
- Cumulative Security Update for Outlook Express for Windows Server 2003 (KB911567)"
"Information","17","4/13/2006 12:54:19 PM","Windows Update Agent","ISA","Installation","N/A","Installation Ready: The following updates are downloaded and ready for installation. To install the updates, an administrator should log on to this computer and Windows will prompt with further instructions:
- Security Update for Windows Server 2003 (KB911562)
- Windows Malicious Software Removal Tool - April 2006 (KB890830)
- Cumulative Security Update for Internet Explorer for Windows Server 2003 (KB912812)
- Security Update for Windows Server 2003 (KB908531)"

Please help :(

Replies are listed 'Best First'.
Re: Problem with matching multiple lines
by perlsen (Chaplain) on Apr 19, 2006 at 08:34 UTC
    Hi, I think this could help you:
    while (<DATA>){
        if($_ =~ m/KB([0-9]+)/) {
            push(@temp, $_);
        }
    }
    print "@temp\n";
    __DATA__
    "Information","17","4/13/2006 12:54:25 PM","Windows Update Agent","ISA","Installation","N/A","Installation Ready: The following updates are downloaded and ready for installation. To install the updates, an administrator should log on to this computer and Windows will prompt with further instructions:
    - Security Update for Windows Server 2003 (KB911562)
    - Windows Malicious Software Removal Tool - April 2006 (KB890830)
    - Cumulative Security Update for Internet Explorer for Windows Server 2003 (KB912812)
    - Security Update for Windows Server 2003 (KB908531)
    - Cumulative Security Update for Outlook Express for Windows Server 2003 (KB911567)"
    "Information","17","4/13/2006 12:54:19 PM","Windows Update Agent","ISA","Installation","N/A","Installation Ready: The following updates are downloaded and ready for installation. To install the updates, an administrator should log on to this computer and Windows will prompt with further instructions:
    - Security Update for Windows Server 2003 (KB911562)
    - Windows Malicious Software Removal Tool - April 2006 (KB890830)
    - Cumulative Security Update for Internet Explorer for Windows Server 2003 (KB912812)
    - Security Update for Windows Server 2003 (KB908531)"

    Regards
    perlsen
      Hi, I found this:

      http://rath.ca/Misc/Perl_CSV/index.shtml
      http://rath.ca/Misc/Perl_CSV/multiline-csv.tgz

      Which does exactly what I need... Thanks anyway :)
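
The linked parser is one option. For comparison, here is a minimal core-Perl sketch of a different, simpler heuristic: keep appending physical lines until the buffer holds an even number of double quotes, which means every quoted field has been closed. It is not the code from the links above. The name join_records and the shortened sample lines are hypothetical, made up for illustration, and the sketch assumes embedded quotes are doubled ("") as is conventional in CSV.

```perl
use strict;
use warnings;

# Join physical lines into logical CSV records (hypothetical helper).
# A record is treated as complete once it contains an even number of
# double quotes, i.e. every quoted field has been closed.
# Assumption: embedded quotes are doubled (""), as usual in CSV.
sub join_records {
    my @lines = @_;
    my @records;
    my $buffer = '';
    for my $line (@lines) {
        chomp $line;
        # Skip blank lines that sit between records.
        next if $buffer eq '' && $line eq '';
        # Separate continuation lines with a single space.
        $buffer = $buffer eq '' ? $line : "$buffer $line";
        # tr/// with an empty replacement just counts the quotes.
        my $quotes = ($buffer =~ tr/"//);
        if ($quotes % 2 == 0) {
            push @records, $buffer;
            $buffer = '';
        }
    }
    # Keep an unterminated trailing record rather than dropping it.
    push @records, $buffer if $buffer ne '';
    return @records;
}

# Shortened, made-up sample in the shape of the log above.
my @sample = (
    qq{"Information","17","Installation Ready:\n},
    qq{- Security Update for Windows Server 2003 (KB911562)\n},
    qq{- Security Update for Windows Server 2003 (KB908531)"\n},
    qq{"Information","18","Single line"\n},
);
print "$_\n" for join_records(@sample);
```

Unlike a test for KB[0-9], this does not depend on what the continuation lines contain, so it should also cope with multi-line messages that list no updates.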
Re: Problem with matching multiple lines
by johngg (Canon) on Apr 19, 2006 at 08:52 UTC
    This script concatenates multi-line messages, printing them as it goes.

    use strict;
    use warnings;

    my $buffer = q();
    while(<DATA>) {
        chomp;
        if(/KB[0-9]/) {
            $buffer .= $_;
        }
        else {
            print "$buffer\n" if $buffer;
            $buffer = $_;
        }
    }
    print "$buffer\n";
    __DATA__
    "Information","17","4/13/2006 12:54:25 PM","Windows Update Agent","ISA","Installation","N/A","Installation Ready: The following updates are downloaded and ready for installation. To install the updates, an administrator should log on to this computer and Windows will prompt with further instructions:
    - Security Update for Windows Server 2003 (KB911562)
    - Windows Malicious Software Removal Tool - April 2006 (KB890830)
    - Cumulative Security Update for Internet Explorer for Windows Server 2003 (KB912812)
    - Security Update for Windows Server 2003 (KB908531)
    - Cumulative Security Update for Outlook Express for Windows Server 2003 (KB911567)"
    "Information","17","4/13/2006 12:54:19 PM","Windows Update Agent","ISA","Installation","N/A","Installation Ready: The following updates are downloaded and ready for installation. To install the updates, an administrator should log on to this computer and Windows will prompt with further instructions:
    - Security Update for Windows Server 2003 (KB911562)
    - Windows Malicious Software Removal Tool - April 2006 (KB890830)
    - Cumulative Security Update for Internet Explorer for Windows Server 2003 (KB912812)
    - Security Update for Windows Server 2003 (KB908531)"

    I hope it is of use.

    Cheers,

    JohnGG

    Update:

    I looked at this again and wondered how it was printing the last log message when I had forgotten to do a final print statement after the while loop. It was working because there was a blank line at the end of the __DATA__ file (I nearly said Data Division thus revealing a murky COBOL past).

    Code now corrected.

Re: Problem with matching multiple lines
by izut (Chaplain) on Apr 19, 2006 at 09:33 UTC

    You can search for the string "(KB\d+)" at the end of line.

    while (<DATA>) {
        if (m/\((KB\d+)\)\s*$/) {
            print "$1\n";
        }
    }

    Update: Sorry, I made a mistake here (0600 here, what do you think :)). You can use this approach:

    # Change default separator to \n"
    $/ = "\n\"";
    while (<DATA>) {
        # Put " in the beginning of line.
        $_ = qq/"$_/ unless m/^"/;
        # Removes \n
        s/\n/ /g;
    }

    Igor 'izut' Sutton
    your code, your rules.
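
The record-separator idea above can be taken one step further so that it also strips the separator and prints the result. The following is a sketch under assumptions, not izut's final code: read_records and the in-memory sample log are hypothetical, and it assumes that no continuation line inside a field begins with a double quote.

```perl
use strict;
use warnings;

# Read multi-line records by treating newline-plus-quote as the record
# separator, since that is where one record ends and the next begins.
# (read_records is a hypothetical helper name.)
sub read_records {
    my ($fh) = @_;
    local $/ = qq{\n"};          # restored automatically on return
    my @records;
    while (defined(my $record = <$fh>)) {
        chomp $record;           # removes a trailing \n" separator
        $record =~ s/\s+\z//;    # the last record ends in a plain newline
        next if $record eq '';
        # Every record after the first lost its opening quote to the
        # separator, so put it back.
        $record = qq{"$record} if $. > 1;
        $record =~ s/\n/ /g;     # fold continuation lines into one line
        push @records, $record;
    }
    return @records;
}

# Shortened, made-up sample in the shape of the log above.
my $log = <<'END_LOG';
"Information","17","Installation Ready:
- Security Update for Windows Server 2003 (KB911562)
- Security Update for Windows Server 2003 (KB908531)"
"Information","18","Single line"
END_LOG

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
print "$_\n" for read_records($fh);
close $fh;
```

Using local keeps the changed separator from leaking into other code that reads files, and checking $. instead of testing for a leading quote avoids a false match when the first field of a record is empty.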