comment on

This is related to my module Crypt::SecretBuffer, where the goal of the module is to prevent leaking secrets into the freed heap memory of the perl process where something could later come along and scan the address space looking for leftover secrets. The SecretBuffer tries to isolate data from being looked at by any normal Perl ops, so that it can be exclusively be viewed by XS/C functions and wiped clean when it goes out of scope.

So, in that context, is there any way to use "pregexec" to apply the perl regex engine to my buffer but prevent it from making copies into perl-owned buffers? pregexec seems a bit under-documented... It says "described in perlreguts" but while that has pages upon pages of the inner workings of the regex engine, it doesn't even tell the meaning of the return value of pregexec or explain exactly what the final "nosave" parameter means. Ideally it would be a flag that does exactly what I want and avoids copying any buffers into any global variables, but that doesn't seem to be the case from looking at the C code. (which I admit I haven't taken the time to fully understand yet)

I'd also be OK if it made copies, but someone could tell me a reliable way to go zero out the buffers of those SVs so that all the captures magically appear to be full of NUL characters afterward.

Basically I'd like it to behave like standard C library regexec that just records positions of the capture groups in an array. I'm also debating if I should just use libc's regexes and declare that limitation on the SecretBuffer API, that you have to restrict yourself to Posix extended regex notation.

Update:

So actually the "nosave" parameter does appear to do some of what I want. Setting that flag prevents any of the magic variables from getting updated.

perl -e 'use v5.40;
         use Inline C => q{
           int call_pregexec(SV *regex, SV *sv) {
             REGEXP *rx= SvRX(regex);
             STRLEN len;
             char *buf= SvPV(sv, len);
             return pregexec(rx, buf, buf+len, buf, 0, sv, 1);
           }
         };
         say "098mnb" =~ /([0-9])([a-z])/;
         say call_pregexec(qr/([a-z])([0-9])/, "abc123");
         say $&;
         say $1; say $2;
         say $+[0]; say $+[1];'
[download]

but then, the question becomes how to find out *where* the regex matched, since it didn't update any of the output variables.

In reply to Can I (with XS) invoke the regex engine without making copies of the buffer? by NERDVANA

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.