
You are (semi)hosed, as far as extract_bracketed() is concerned. The problem with backslashes is that they're used to escape things. That lets you represent non-printable characters, such as \n or \t, in a printable manner, and it also lets you say things like:

    my $s = "This string \" has an escaped quote";

If you then print $s, you get:

    This string " has an escaped quote

If you are trying to match balanced quotes on that string, you need to skip over the escaped quote inside. This is one of the reasons responders to your original post suggested that parsing balanced thingies is difficult to do with regular expressions.

Looking at the source code in Text::Balanced, there is, indeed, a line that always eats the next character following a backslash:

    next if $$textref =~ m/\G\\./gcs;
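To see what that line does in isolation, here is a minimal sketch of the same idiom in a hand-rolled scanner. The helper name count_unescaped_quotes is made up for illustration and is not part of Text::Balanced; it only shows how `\G\\.` consumes a backslash together with whatever character follows it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Balanced): count the double
# quotes in a string that are NOT preceded by a backslash.
sub count_unescaped_quotes {
    my ($text) = @_;
    my $count = 0;
    pos($text) = 0;
    while (pos($text) < length $text) {
        # Same idiom as the line quoted above: a backslash plus the
        # character after it are consumed as a pair and skipped.
        next if $text =~ m/\G\\./gcs;
        if ($text =~ m/\G"/gc) { $count++; next }
        $text =~ m/\G./gcs;    # any other character: step over it
    }
    return $count;
}

# Single quotes keep the backslash literal, so this string holds: a \" b
my $s = 'say "a \" b" now';
print count_unescaped_quotes($s), "\n";
```

The escaped quote in the middle is never counted, because the backslash rule fires first. That is exactly why a backslash right before a closing parenthesis makes the parenthesis invisible to the bracket matcher.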

Your sample suggests that the input uses backslashes as some form of quoting operator rather than as an escape character. If that assumption holds, you might try normalizing your input by changing the backslashes into something else, then changing them back once you're done parsing. For example:

    $string = 'Param1(TYPE,\abc\),Param2(TYPE,\abc\)';
    $string =~ tr{\\}{:};
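A fuller sketch of the round trip might look like the following. The helper name first_group is hypothetical, and the code assumes that ':' never occurs in your real input; if it can, pick a different stand-in character. The third argument to extract_bracketed() is a prefix pattern, used here to skip everything before the first opening parenthesis.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: return the first parenthesised group of $input,
# treating backslashes as ordinary characters rather than escapes.
# Assumption: ':' never appears in the real input, so it is a safe stand-in.
sub first_group {
    my ($input) = @_;
    (my $work = $input) =~ tr{\\}{:};    # hide the backslashes

    # '[^(]*' is the prefix pattern: skip everything before the first '('.
    my ($group, $rest, $prefix) = extract_bracketed($work, '()', '[^(]*');
    return unless defined $group && length $group;

    $group =~ tr{:}{\\};                 # put the backslashes back
    return $group;
}

my $string = 'Param1(TYPE,\abc\),Param2(TYPE,\abc\)';
print first_group($string), "\n";
```

Working on a copy keeps the original string untouched, so only the extracted piece needs translating back.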

Cheers,

Re^4: Text::Balanced question
by embirath (Novice) on Oct 21, 2006 at 04:58 UTC
    Thanks a million! Got it to work by replacing the backslashes with "::". :-)

    Emma