You don't say how these sections of text are demarcated, but assuming the rules for matching the start and end of each section are reasonable, you can do what you need much more simply. E.g., the text of your post contains five sets of parens. If you treat those as the untouchable text and everything else as fair game for mangling, then this achieves that without all the messing about with placeholders. There are no restrictions on what mangling you do within the replacement side of the s///, including safely using other regexes or calling a subroutine:
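#! perl -slw
use strict;

# Slurp the whole text in one go
my $data = do{ local $/; <DATA> };

# Delimiters of the sections that must be left alone
my $start = '(';
my $stop  = ')';

# Match each run of text that begins at the start of the string or at a
# stop delimiter, and ends just before the next start delimiter or the end
$data =~ s[((?:^|\Q$stop\E).+?(?=\Q$start\E|$))]{
    my $toModify = $1;
    $toModify = uc $toModify;   # any mangling you like goes here
    $toModify;
}seg;

print $data;

__DATA__
The text from your OP goes here.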
Produces:

C:\test>666970
HI, ALL: I'VE GOT A RATHER LONG SCRIPT - IT'S ONE OF THOSE THAT GREW BY AGGLOMERATION AND, WELL, IT'LL GET REWRITTEN SOMEDAY. REALLY. [WRY LOOK]. ANYWAY... IT DOES A HUGE AMOUNT OF TEXT MANGLING - ESSENTIALLY PROCESSING EMAILS AND SETTING UP THE CONTENT TO BE DISPLAYED ON THE WEB - AND WORKS WELL, BUT THERE'S BEEN ONE THING THAT I'VE WANTED IT TO DO FOR A LONG WHILE NOW, AND JUST GOT AROUND TO IMPLEMENTING: I WANT IT TO LEAVE SPECIFIC, DEMARCATED CHUNKS OF TEXT ALONE, NO PROCESSING TO BE DONE AT ALL. WHAT I'VE DONE IS TO FIND THESE CHUNKS, EXTRACT THEM, AND PUSH THEM ONTO AN ARRAY, THEN REPLACE THEM WITH NUMBERED ANCHORS (e.g., "XXX_REINSERT{12}_XXX" - '12' is the index within that array). I THEN DO THE PROCESSING, AND - OBVIOUSLY - REPLACE THE ANCHORS WITH THE "HELD BACK" BITS. THE CODE IS REASONABLY OBVIOUS - ALTHOUGH I ENDED UP USING A BUNCH OF "SUBSTR"S INSTEAD OF 'S///' FOR SEVERAL REASONS - AND I DON'T THINK IT'S WORTH POSTING HERE (unless someone wants to see it) - BECAUSE MY QUESTION IS OF A MORE GENERAL NATURE. HERE IT IS: GIVEN THAT THE LENGTH OF THE OVERALL STRING (the email body) IS GOING TO BE CHANGED ARBITRARILY, AND THAT THE WHOLE TEXT-MANGLING ROUTINE IS BIG ENOUGH THAT I WANT TO MINIMIZE THE NUMBER OF PASSES (i.e., I don't want to run it on the multiple "interleaved" chunks between the 'raw' bits), IS THERE A BETTER PROGRAMMATIC APPROACH THAN ANCHORS OF THIS SORT? THIS APPROACH SEEMS RATHER CRUDE, AND HAS AN OBVIOUS, ALTHOUGH RATHER EASILY AVOIDABLE FAILURE MODE (what if there's a line in the text that actually says 'XXX_REINSERT_"-whatever?), AND I'D LIKE TO SEE IF MY FELLOW MONKS HAVE SOME WISDOM TO SHARE ON THIS ISSUE. THANKS IN ADVANCE!
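To make the point about calling a subroutine from the replacement side concrete, here is a minimal sketch. The names `mangle` and `mangle_outside` are made up for illustration; `mangle` just stands in for whatever your real text-mangling routine does, and it deliberately uses a regex of its own to show that this doesn't disturb the outer s///.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real (big) text-mangling routine.
sub mangle {
    my( $text ) = @_;        # copy $1 before running any other regex
    $text =~ s[\s+][ ]g;     # a nested regex: collapse runs of whitespace
    return uc $text;         # then uppercase, as in the example above
}

# Apply mangle() to everything outside the $start ... $stop sections.
sub mangle_outside {
    my( $data, $start, $stop ) = @_;
    $data =~ s[((?:^|\Q$stop\E).+?(?=\Q$start\E|$))]{ mangle( $1 ) }seg;
    return $data;
}

print mangle_outside( "some   text (leave   this) more   text", '(', ')' ), "\n";
# prints: SOME TEXT (leave   this) MORE TEXT
```

One thing worth checking against your own data: because the pattern uses `.+?`, it needs at least one character of mangle-able text after the start of the string and after each stop delimiter. As far as I can tell, text that begins with a protected section, or has two protected sections directly adjacent, would get that section mangled too, so you may need to adjust the pattern if that can happen.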
In reply to Re: Anchors, bleh :( by BrowserUk, in thread Anchors, bleh :( by oko1