Re: regex, pos, \G, and substr

This seems somewhat simpler, though you might want to strengthen the regex to validate the input more.

#! perl -slw
use strict;
my $data_stg
    = 'junk text update 8923 mark complete update 8324 mark '
    . 'complete more junk update 5438 and 5843 and 1522 mark '
    . 'complete update 8435 and 9323 mark complete true junk'
;

$data_stg =~ s[update (.+?) mark complete]{
    join ' ', map{ "update $_ mark complete"} split '\s+and\s+', $1
}ge;

print $data_stg;

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."


Replies are listed 'Best First'.
Re^2: regex, pos, \G, and substr by ff (Hermit) on Jun 03, 2007 at 03:02 UTC
I think it's perceptive to split the guts of the phrases on `[ ]and[ ]`, but it's really important in my case that the leftovers are only digits. While I could throw `grep { /^\d+$/ }` in front of the `split`, I'd lose visibility to any non-digit stuff that was (mistakenly) there in the process of following through with the replace side of the `(s)ubstitute` operator. In other words, I'd rather leave everything alone if there's anything "non-digit" besides the `and` splitters in there. BTW, I like the single-quotes for delimiting the split regex.
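To make that concern concrete, here is a minimal sketch of the `grep` variant described above. The helper name `expand_with_grep` and the sample string are assumptions made for illustration, not code from the thread. It shows the non-digit item vanishing from the result without any warning.

```perl
use strict;
use warnings;

# Hypothetical helper: split on "and", then keep only all-digit pieces.
# Anything that is not purely digits is silently discarded.
sub expand_with_grep {
    my ($text) = @_;
    $text =~ s[update (.+?) mark complete]{
        my $list = $1;    # copy the capture before running other regexes
        join ' ',
            map  { "update $_ mark complete" }
            grep { /^\d+$/ }
            split /\s+and\s+/, $list;
    }ge;
    return $text;
}

print expand_with_grep('update junk and 5843 and 1522 mark complete'), "\n";
```

With that input the sketch prints `update 5843 mark complete update 1522 mark complete`: the malformed `junk` entry is gone and nothing reports it, which is exactly the loss of visibility being objected to.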
Re^3: regex, pos, \G, and substr by BrowserUk (Patriarch) on Jun 03, 2007 at 03:34 UTC
That's what I meant by strengthening the regex. Note that the non-conformant additional third line is left untouched:

#! perl -slw
use strict;
my $data_stg
    = 'junk text update 8923 mark complete update 8324 mark '
    . 'complete more junk update 5438 and 5843 and 1522 mark '
    . 'complete more junk update junk and 5843 and 1522 mark '
    . 'complete update 8435 and 9323 mark complete true junk'
;

$data_stg =~ s[update ((?:\d+|\s|and)+) mark complete]{
    join ' ', map{ "update $_ mark complete"} split '\s+and\s+', $1
}ge;

print $data_stg;
__END__
## Output wrapped to match input for easier verification.
junk text update 8923 mark complete update 8324 mark
complete more junk update 5438 mark complete update 5843 mark complete update 1522 mark
complete more junk update junk and 5843 and 1522 mark
complete update 8435 mark complete update 9323 mark complete true junk

Alternatively, verify that the split values are numeric, produce a warning and put the original back if not:

#! perl -slw
use strict;
my $data_stg
    = 'junk text update 8923 mark complete update 8324 mark '
    . 'complete more junk update 5438 and 5843 and 1522 mark '
    . 'complete more junk update junk and 5843 and 1522 mark '
    . 'complete update 8435 and 9323 mark complete true junk'
;

$data_stg =~ s[(update (.+?) mark complete)]{
    my @numbers = split '\s+and\s+', $2;
    if( grep{ !/^\d+$/ } @numbers ) {
        warn "Malformed request: '$1'\n";
        $1;
    }
    else{
        join ' ', map{ "update $_ mark complete"} @numbers;
    }
}ge;

print $data_stg;
__END__
## Output wrapped to match input for easier verification.
Malformed request: 'update junk and 5843 and 1522 mark complete'
junk text update 8923 mark complete update 8324 mark
complete more junk update 5438 mark complete update 5843 mark complete update 1522 mark
complete more junk update junk and 5843 and 1522 mark
complete update 8435 mark complete update 9323 mark complete true junk

BTW, I like the single-quotes for delimiting the split regex.

Most don't. They consider it a bad habit of mine.
Re^3: regex, pos, \G, and substr by ysth (Canon) on Jun 04, 2007 at 01:02 UTC
I'd rather leave everything alone if there's anything "non-digit" besides the and splitters in there.

Then leave that part the same as in your original looping regex:

`s[update (\d+(?: and \d+)+) mark complete]{...}ge;`
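For anyone wanting to try this, here is a rough sketch that pairs the digits-only pattern above with the join/map/split replacement used earlier in the thread. The pairing is an assumption, since the post elides the replacement as `{...}`, and the sub name `expand_strict` and the shortened sample data are made up for the example.

```perl
use strict;
use warnings;

# Sketch only: the replacement body is assumed, not taken from the post.
# The pattern accepts nothing but digit groups separated by " and ",
# so any request containing other text never matches and is left alone.
sub expand_strict {
    my ($text) = @_;
    $text =~ s[update (\d+(?: and \d+)+) mark complete]{
        my $list = $1;    # copy the capture before split runs its own regex
        join ' ', map { "update $_ mark complete" } split / and /, $list;
    }ge;
    return $text;
}

my $data_stg
    = 'junk text update 8923 mark complete '
    . 'update junk and 5843 and 1522 mark complete '
    . 'update 8435 and 9323 mark complete true junk';

print expand_strict($data_stg), "\n";
```

Because the pattern requires at least one ` and \d+` group, single-number requests such as `update 8923 mark complete` do not match either, which is harmless since they need no rewriting. Only the `8435 and 9323` request is expanded here; the `junk and 5843 and 1522` request comes through unchanged.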