That was surprising. It appears that Text::Balanced alters some of the magic innards of the variable. I changed your code a bit:
sub eteb {
    my $data = shift;
    my $orig = $data;
    my @array = extract_multiple(
        $data,
        [ sub{extract_tagged($_[0], 'a', 'b', undef,)}, ],
        undef,
        1
    );
    print "data='$data'\n";
    display('eteb1', @array);
    @array = extract_multiple(
        $orig,
        [ sub{extract_bracketed($_[0], '()')}, ],
        undef,
        1
    );
    display('eteb2', @array);
}
And got the desired results. The funny thing is, I was expecting $data to be empty (or otherwise visibly altered) after the first call, but was surprised to see that its value looked unchanged. I then changed the second extract_multiple call to:
    @array = extract_multiple(
        $data."",
        [ sub{extract_bracketed($_[0], '()')}, ],
        undef,
        1
    );
and it worked as you expected. I haven't read the Text::Balanced docs to see whether this is expected behaviour. If it isn't, you may want to file a bug report on it.
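A guess at why the `$data.""` form behaves differently, sketched with a plain regex match rather than Text::Balanced (so this is an analogy, not a trace of what the module does internally): a string scalar can carry a saved match position, and concatenation produces a new value that has none. The variable names here are just for illustration.

```perl
use strict;
use warnings;

my $data = '(a)(b)';

# A scalar-context //g match records where it stopped in pos($data).
my $matched    = $data =~ /\(a\)/g;
my $pos_before = pos($data);        # offset just past "(a)"

# Concatenation yields a brand-new value, so the copy has no pos yet.
my $copy     = $data . "";
my $copy_pos = pos($copy);

# Making the copy does not disturb the original's saved position.
my $pos_still = pos($data);

printf "data pos=%s, copy pos=%s\n",
    $pos_still, defined $copy_pos ? $copy_pos : 'undef';
```

If the same thing is going on here, any function that resumes from the saved position would start at the beginning again when handed the copy.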
Update: I remember a module (Devel::Peek) that lets you look at the magic goo inside of variables, so I changed your program to look at the $data variable before and after the call:
sub eteb {
    my $data = shift;
    my $orig = $data;
    Dump($data);
    my @array = extract_multiple(
        $data,
        [ sub{extract_tagged($_[0], 'a', 'b', undef,)}, ],
        undef,
        1
    );
    Dump($data);
    $data = $data."";
    Dump($data);
    print "data='$data'\n";
    display('eteb1', @array);
    @array = extract_multiple(
        $data,
        [ sub{extract_bracketed($_[0], '()')}, ],
        undef,
        1
    );
    display('eteb2', @array);
}
And sure enough, some stuff inside changed:
$ perl 1005372.pl
et:a)(b
eb:(a)(b)
SV = PV(0x8458478) at 0x84fe160
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0x83c47b8 "(a)(b)"\0
CUR = 6
LEN = 8
SV = PVMG(0x83ef3a8) at 0x84fe160
REFCNT = 7
FLAGS = (PADMY,SMG,POK,pPOK)
IV = 0
NV = 0
PV = 0x83c47b8 "(a)(b)"\0
CUR = 6
LEN = 8
MAGIC = 0x83c2ec0
MG_VIRTUAL = &PL_vtbl_mglob
MG_TYPE = PERL_MAGIC_regex_global(g)
MG_LEN = 5
SV = PVMG(0x83ef3a8) at 0x84fe160
REFCNT = 7
FLAGS = (PADMY,SMG,POK,pPOK)
IV = 0
NV = 0
PV = 0x83c47b8 "(a)(b)"\0
CUR = 6
LEN = 8
MAGIC = 0x83c2ec0
MG_VIRTUAL = &PL_vtbl_mglob
MG_TYPE = PERL_MAGIC_regex_global(g)
MG_LEN = -1
data='(a)(b)'
eteb1:a)(b
eteb2:(a)(b)
After seeing this, I reviewed the docs for Text::Balanced, and noticed this:
Note that in a list context, the contents of the original input text (the first argument) are not modified in any way.
However, if the input text was passed in a variable, that variable's pos value is updated to point at the first character after the extracted text. That means that in a list context the various subroutines can be used much like regular expressions. For example:
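In short, it's supposed to do that. That way it's ready to pull out the *next* bits of balanced text for you. The saved pos is what shows up as MG_LEN = 5 in the second dump above. Appending an empty string ($data."") doesn't really add anything; it builds a fresh string value, and assigning that back to $data resets pos (MG_LEN = -1 in the third dump), while passing it directly hands the function a temporary that never had a pos.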
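Here's a minimal sketch of using that behaviour on purpose, assuming the list-context semantics quoted from the docs. The loop and the variable names are mine, not taken from the Text::Balanced documentation, and the end-of-loop test is deliberately defensive because I haven't checked exactly what the first element is on failure.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $text = '(a)(b)';
my @groups;

# In list context the string itself is left alone; only pos($text) moves,
# so each call resumes where the previous one stopped.
while (1) {
    my ($next) = extract_bracketed($text, '()');
    last unless defined $next && length $next;   # nothing more to extract
    push @groups, $next;
}

print "groups: @groups\n";
print "text is still '$text'\n";

# Explicit alternative to the $data."" trick: clear the saved position.
pos($text) = undef;
```

Resetting with `pos($text) = undef` says what is meant a bit more directly than appending an empty string, though both should leave the next call starting from the beginning.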
Sigh! Had I read the docs before playing with the code, I'd've saved myself a little time. Ah, well...
...roboticus
When your only tool is a hammer, all problems look like your thumb.