Hi Aleena,
The extract_* functions are meant to operate on the start of a string, not from an arbitrary point. As mentioned in the Text::Balanced description, you may skip a prefix before the start of the balanced text, but by default this will only skip whitespace.
So if you were to change $text to:
my $text = ' <bold>, I<italic>, and B<I<bold and italic>> text.';
Your output would be:
$VAR1 = [ '<bold>', ', I<italic>, and B<I<bold and italic>> text.', ' ' ];
The return value is a triple: the bracketed text, the remaining string, and the prefix that was skipped before the bracketed text was found.
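To see the default prefix at work, here is a minimal sketch that runs the same call on two strings. The second string is only an assumption about the sort of input that trips this up (something other than whitespace before the first <), and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Only whitespace before the first <, so the default prefix can skip it.
my $ok_text = ' <bold>, I<italic>, and B<I<bold and italic>> text.';
my @ok = extract_bracketed($ok_text, '<>');

# Assumed failing input: ordinary text comes before the first <.
my $bad_text = 'A line with B<bold>, I<italic>, and B<I<bold and italic>> text.';
my @bad = extract_bracketed($bad_text, '<>');

for my $result (\@ok, \@bad) {
    my ($extracted, $remainder, $prefix) = @$result;
    if (defined($extracted) && length($extracted)) {
        print "matched '$extracted' after prefix '$prefix'\n";
    }
    else {
        # On failure the remainder should simply be the untouched input.
        print "no match; remainder is '$remainder'\n";
    }
}
```

Checking the first element before using the other two is the safe habit here, since a failed extraction leaves you with nothing useful in the prefix slot.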
If you leave your $text input as it was in your example but change the function call to consider everything preceding a < as a prefix:
my @line = extract_bracketed($text, '<>', qr(.*?(?=<)));
You'll get:
$VAR1 = [ '<bold>', ', I<italic>, and B<I<bold and italic>> text.', 'A line with B' ];
Here the prefix is again everything before the <, but it now includes the B (bold) code at the end, which you'd have to deal with appropriately.
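One possible way to deal with that trailing code, as a rough sketch rather than a finished parser: call extract_bracketed in a loop on the remainder, and peel the last character off each prefix when it looks like a formatting code. The "single capital letter directly before <" rule, the @pieces structure, and the variable names are all my assumptions here, and nested codes are not recursed into.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $text = 'A line with B<bold>, I<italic>, and B<I<bold and italic>> text.';

my @pieces;          # each entry: [ plain text, code letter, bracket contents ]
my $rest = $text;

while (length $rest) {
    my ($extracted, $remainder, $prefix) =
        extract_bracketed($rest, '<>', qr(.*?(?=<)));

    # Stop when nothing more can be extracted (e.g. no < left in $rest).
    last unless defined($extracted) && length($extracted);

    # Assumption: a single capital letter right before < is the formatting code.
    my ($plain, $code) = $prefix =~ /\A(.*?)([A-Z]?)\z/s;

    # Drop the outer < and >; any nested codes are left for a later pass.
    my $inner = substr($extracted, 1, -1);

    push @pieces, [ $plain, $code, $inner ];
    $rest = $remainder;
}

print "text=[$_->[0]] code=[$_->[1]] inner=[$_->[2]]\n" for @pieces;
print "trailing text=[$rest]\n";
```

Whatever is left in $rest after the loop is the plain text following the last bracketed chunk. To handle something like B<I<...>> fully, you could run the same loop again on each inner string.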
HTH
In reply to Re^3: POD style regex for inline HTML elements by Loops, in thread POD style regex for inline HTML elements by Lady_Aleena