NERDVANA has asked for the wisdom of the Perl Monks concerning the following question:
So, in that context, is there any way to use "pregexec" to apply the perl regex engine to my buffer while preventing it from making copies into perl-owned buffers? pregexec seems a bit under-documented... It says "described in perlreguts", but while that has pages upon pages on the inner workings of the regex engine, it doesn't even give the meaning of pregexec's return value or explain exactly what the final "nosave" parameter means. Ideally it would be a flag that does exactly what I want and avoids copying any buffers into any global variables, but that doesn't seem to be the case from looking at the C code (which I admit I haven't taken the time to fully understand yet).
I'd also be OK if it made copies, as long as someone could tell me a reliable way to zero out the buffers of those SVs afterward, so that all the captures appear to be full of NUL characters.
Basically I'd like it to behave like the standard C library's regexec, which just records the positions of the capture groups in an array. I'm also debating whether I should just use libc's regexes and declare that limitation on the SecretBuffer API: you have to restrict yourself to POSIX extended regex notation.
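For reference, here is a minimal sketch of the libc behavior I mean. The helper name match_offsets is made up for illustration and is not part of any SecretBuffer API. It assumes a NUL-terminated buffer, since plain POSIX regexec has no length argument, and it says nothing about what a given libc does internally with the subject string.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: match a POSIX extended regex against a
 * NUL-terminated buffer and record capture offsets in `caps`.
 * Returns 0 on a match, REG_NOMATCH when nothing matched, or the
 * nonzero error code from regcomp if the pattern did not compile. */
int match_offsets(const char *pattern, const char *buf,
                  regmatch_t *caps, size_t ncaps)
{
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;

    /* On success, caps[0] holds the byte offsets of the whole match and
     * caps[1..] hold the offsets of each capture group. Only offsets are
     * reported; no capture text is handed back to the caller. */
    rc = regexec(&re, buf, ncaps, caps, 0);

    regfree(&re);
    return rc;
}
```

Called with the pattern "([a-z])([0-9])" on "abc123", the caller gets back only rm_so/rm_eo pairs for the match and the two groups, and can slice its own buffer from those offsets.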
Update:
So actually the "nosave" parameter does appear to do some of what I want. Setting that flag prevents any of the magic variables from getting updated.
perl -e 'use v5.40;
use Inline C => q{
  int call_pregexec(SV *regex, SV *sv) {
    REGEXP *rx= SvRX(regex);
    STRLEN len;
    char *buf= SvPV(sv, len);
    return pregexec(rx, buf, buf+len, buf, 0, sv, 1);
  }
};
say "098mnb" =~ /([0-9])([a-z])/;
say call_pregexec(qr/([a-z])([0-9])/, "abc123");
say $&; say $1; say $2; say $+[0]; say $+[1];'
But then the question becomes how to find out *where* the regex matched, since it didn't update any of the output variables.
Re: Can I (with XS) invoke the regex engine without making copies of the buffer?
by dave_the_m (Monsignor) on Nov 10, 2025 at 13:03 UTC