periapt has asked for the wisdom of the Perl Monks concerning the following question:

OK, this is killing me. I have spent the last two days seemingly overlooking the obvious. I'm betting this is an obvious one as well but ... I don't get it. I have an error message like this
[Microsoft][ODBC SQL Server Driver][SQL Server][0122]USAGE: InvokeStoredProcedure [param1], [param2], [param3], [param3]
I need to strip off the first three bracketed expressions and retain everything from [0122] on. I have tried many variations on this theme, but I seem to be missing something:
$errmsg =~ /^      # start at beginning of string
    (              # capture first group
      \[           # an opening square bracket
      \w+          # followed by one or more word chars
      ]            # followed by a closing square bracket
    )              # close first capture
    {1}            # capture should be 1 time
    (              # begin second capture
      .+           # match anything else 1 or more times
    )              # close second capture
/x;
# which gives $1 = [Microsoft], $2 = everything else
However, when I change the {1} to {2}, the match fails. I am expecting $1 = [Microsoft][ODBC SQL Server Driver], $2 = everything else.

Any suggestions? What am I missing?

PJ
use strict; use warnings; use diagnostics;

Replies are listed 'Best First'.
Re: Problem parsing an error msg with regex
by gothic_mallard (Pilgrim) on Oct 07, 2004 at 14:16 UTC

    Try this:

    $errmsg =~ /^((?:\[.+?\]){3})(.*)/;

    The first three brackets are in $1 and the rest of the string in $2.
    If all you want is the last part (after the 3 brackets) just do:

    $errmsg =~ /^(?:\[.+?\]){3}(.*)/;

    As a quick breakdown of the first version (the second is the same, but without the capturing parentheses around (?:\[.+?\]){3}):

    $errmsg =~ /^          # Start of string
        (                  # Start match $1
          (?:              # Start group (don't save backreference)
            \[.+?\]        # Match set of []'s
          ){3}             # End group and repeat 3 times
        )                  # End match $1
        (.*)               # Slurp up everything that's left
    /x;                    # Done
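A minimal usage sketch of the pattern above, assuming the message is held in a scalar and that a string without three leading bracket groups should be reported instead of silently reusing stale $1/$2 values. The sub name split_prefix is hypothetical and not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (prefix, rest) on success, or an empty
# list when the string does not start with three bracketed groups.
sub split_prefix {
    my ($msg) = @_;
    return $msg =~ /^((?:\[.+?\]){3})(.*)/;
}

my $errmsg = '[Microsoft][ODBC SQL Server Driver][SQL Server][0122]USAGE: '
           . 'InvokeStoredProcedure [param1], [param2], [param3], [param3]';

# List assignment in boolean context is true only if the match succeeded.
if (my ($prefix, $rest) = split_prefix($errmsg)) {
    print "prefix: $prefix\n";
    print "rest:   $rest\n";
}
else {
    print "no match, message left as is: $errmsg\n";
}
```

Checking the result of the match before using the captures matters because $1 and $2 keep their values from the last successful match when a later match fails.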

    --- Jay

    All code is untested unless otherwise stated.

      Great, thanks. This one works in a more general sense since I was restricting the stuff between the brackets a bit.

      PJ
      use strict; use warnings; use diagnostics;
Re: Problem parsing an error msg with regex
by ikegami (Patriarch) on Oct 07, 2004 at 14:07 UTC

    \w+ doesn't match spaces. Use [^\]]+ instead:
    $errmsg =~ /^((?:\[[^\]]+\]){3})(.*)/; or
    $errmsg =~ s/^(?:\[[^\]]+\]){3}//;

    Update: The directions were correct, but the code was missing a set of parens.
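To make the difference between the two character classes concrete, here is a small sketch run against the message from the question. The sub names are hypothetical; the patterns are the ones discussed above, with the repeat count set to 2 to reproduce the reported failure.

```perl
use strict;
use warnings;

# \w+ cannot cross the spaces in "ODBC SQL Server Driver",
# so two consecutive groups cannot be matched.
sub two_groups_word_only {
    my ($msg) = @_;
    return $msg =~ /^(?:\[\w+\]){2}/ ? 1 : 0;
}

# [^\]]+ accepts any character except a closing bracket.
sub two_groups_negated_class {
    my ($msg) = @_;
    return $msg =~ /^(?:\[[^\]]+\]){2}/ ? 1 : 0;
}

# Strip the three leading groups from a copy of the message.
sub strip_prefix {
    my ($msg) = @_;
    $msg =~ s/^(?:\[[^\]]+\]){3}//;
    return $msg;
}

my $errmsg = '[Microsoft][ODBC SQL Server Driver][SQL Server][0122]USAGE: '
           . 'InvokeStoredProcedure [param1], [param2], [param3], [param3]';

print "word-only class matches two groups: ", two_groups_word_only($errmsg), "\n";
print "negated class matches two groups:   ", two_groups_negated_class($errmsg), "\n";
print strip_prefix($errmsg), "\n";
```

The negated class is also stricter than a lazy .+? between the brackets: it can never run past a closing bracket, so each repetition covers exactly one bracketed group.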

      Thanks for the idea. I'm afraid your regex didn't work but the problem was, indeed, the spaces. I rewrote the expression as
      $errmsg =~ /^      # start at beginning of string
          (?:            # begin non-capturing group
            \[           # an opening square bracket
            [\w ]+       # followed by one or more word chars or spaces
            ]            # followed by a closing square bracket
          )              # end non-capturing group
          {3}            # repeat the group 3 times
          (              # begin capture
            .+           # match anything else 1 or more times
          )              # close capture
      /x;
      and it works as needed. Thanks again

      PJ
      use strict; use warnings; use diagnostics;
Re: Problem parsing an error msg with regex
by Roger (Parson) on Oct 07, 2004 at 14:19 UTC
    Try the following more generic version, which strips the first three outermost bracketed groups and matches nested square brackets recursively:

    #!/usr/bin/perl -w
    use strict;

    my $re;
    $re = qr/
        \[                # Opening bracket
        (?:(?:            # Capture the content and then forget it
            [^\]\[]+
          |
            (??{$re})     # Or recurse
        )+)               # and allow repeats internally
        \]                # Closing bracket
    /x;

    while (my $line = <DATA>) {
        $line =~ s/($re){3}//g;
        print $line;
    }

    __DATA__
    [Microsoft][ODBC SQL Server Driver][SQL Server][0122]USAGE: InvokeStoredProcedure [param1], [param2], [param3], [param3]
    [[B1] [B2]][[B3]][B4][B5]Stuff....


    And the output is as expected:
    [0122]USAGE: InvokeStoredProcedure [param1], [param2], [param3], [param3]
    [B5]Stuff....
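On Perl 5.10 or later, the same idea can probably be expressed without the (??{...}) deferred-code construct by using a named group and the (?&name) recursion syntax. The following is a sketch under that assumption, not the code from this reply: the group name bracket and the sub name strip_nested_prefix are made up for illustration, the pattern is anchored at the start of the string, and unlike the version above it also accepts an empty [] group.

```perl
use 5.010;    # named captures and (?&name) recursion need Perl 5.10+
use strict;
use warnings;

# Three balanced bracket groups at the start of the string.
# (?&bracket) re-enters the named group to handle nesting.
my $leading_three = qr/
    ^
    (?:
        (?<bracket>
            \[                                # opening bracket
            (?: [^\[\]]++ | (?&bracket) )*    # plain text or a nested group
            \]                                # closing bracket
        )
    ){3}
/x;

# Hypothetical helper: returns the line with the prefix removed,
# or the line unchanged if the prefix is not present.
sub strip_nested_prefix {
    my ($line) = @_;
    $line =~ s/$leading_three//;
    return $line;
}

print strip_nested_prefix($_), "\n" for (
    '[Microsoft][ODBC SQL Server Driver][SQL Server][0122]USAGE: InvokeStoredProcedure [param1]',
    '[[B1] [B2]][[B3]][B4][B5]Stuff....',
);
```

The possessive ++ on the character class is a precaution against heavy backtracking on unbalanced input.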

Re: Problem parsing an error msg with regex
by Roy Johnson (Monsignor) on Oct 07, 2004 at 15:08 UTC
    Sometimes less is more. You don't have to worry about balancing the brackets, so you don't really care about the opening brackets. Just find the third closing bracket and split there:
    $_ = '[Microsoft][ODBC SQL Server Driver][SQL Server][0122]USAGE: InvokeStoredProcedure [param1], [param2], [param3], [param3]';
    my ($one, $two) = /((?:.*?\]){3})(.*)/;
    print "<$one>\n<$two>\n";
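The same "find the third closing bracket" idea can also be sketched without a regex, using the built-in index and substr functions. The sub name after_nth_close is hypothetical, and returning the input unchanged when there are too few closing brackets is an assumption about the desired behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: returns everything after the $n-th ']' in $msg,
# or $msg unchanged if it contains fewer than $n closing brackets.
sub after_nth_close {
    my ($msg, $n) = @_;
    my $pos = -1;
    for (1 .. $n) {
        $pos = index($msg, ']', $pos + 1);    # next ']' after the previous one
        return $msg if $pos < 0;              # ran out of closing brackets
    }
    return substr($msg, $pos + 1);
}

my $errmsg = '[Microsoft][ODBC SQL Server Driver][SQL Server][0122]USAGE: '
           . 'InvokeStoredProcedure [param1], [param2], [param3], [param3]';

print after_nth_close($errmsg, 3), "\n";
```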

    Caution: Contents may have been coded under pressure.