in reply to Anchors, bleh :(
MIME uses an idea that is kinda neat and could help here. You pick your placeholder, say "XXX", then search to see if that clashes (because it actually appears in the text already). If it does, then you look at the character after the first occurrence of it in your text, append any other character to your placeholder, and search again from that point. Repeat until you reach the end of your text. You will have traversed the text only once, and when you are done you'll have a placeholder that does not appear anywhere in your text. Then you can append your sequence numbers (plus a non-digit terminator) to get your set of conflict-free placeholders.
But you might have to worry about your manipulations creating a conflict with this placeholder.
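A minimal sketch of that single-pass search, as I read the description above. The name free_placeholder and the choice of 'X'/'Y' as filler characters are mine, purely for illustration; the seed is assumed to be non-empty.

```perl
use strict;
use warnings;

# Return a string that starts with $seed and occurs nowhere in $text.
# (Hypothetical helper; assumes $seed is not empty.)
sub free_placeholder {
    my ($text, $seed) = @_;
    my $ph  = $seed;
    my $pos = 0;
    while ( ( my $hit = index($text, $ph, $pos) ) >= 0 ) {
        # Character right after this occurrence ('' at the end of the text).
        my $next = substr($text, $hit + length($ph), 1);
        # Append any character that differs from it, so this occurrence
        # can no longer match the longer placeholder.
        $ph .= $next eq 'X' ? 'Y' : 'X';
        # Nothing before $hit matched the shorter placeholder, so nothing
        # there can match the longer one: resume the search from here.
        $pos = $hit;
    }
    return $ph;
}

my $sample = "aXXXb XXXXc XXXXYd";
my $base   = free_placeholder($sample, 'XXX');
# The ';' terminator keeps "${base}1;" from being a prefix of "${base}10;".
my @placeholders = map { "$base$_;" } 1 .. 3;
print "$_\n" for @placeholders;
```

The search position never moves backwards, which is why the text is only traversed once.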
Another route would be to "escape" any occurrences of your placeholder both in the original text and in any substitutions that get applied to the text. Then unescape those after you replace the placeholders. For example:
    $text =~ s/%/%%/g;
    # replace first block with "(%1%)"
    # replace second block with "(%2%)"
    # ...
    my %subs= (
        replaceThis => "withThis",
        # ...
    );
    for( @subs{ keys %subs } ) {
        s/%/%%/g;
    }
    $text =~ s/$_/$subs{$_}/g for keys %subs;
    # replace (%1%) with original first block
    # ...
    $text =~ s/%%/%/g;
Then you only have to worry about your manipulations accidentally changing a placeholder (which can often be easy to avoid in practice -- which it probably is in your case since you didn't appear worried about it).
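To make the outline above concrete, here is one way the whole round trip might look. The `<raw>...</raw>` delimiters, the @saved array and the sample text are assumptions for the sake of a runnable example, and I quote the keys with \Q...\E on the assumption that they are literal strings rather than patterns.

```perl
use strict;
use warnings;

# Assumed markup: anything inside <raw>...</raw> must come through untouched.
my $text = "50% off <raw>replaceThis 100%</raw> and replaceThis";
my @saved;

# 1. Double every '%'. All runs of '%' now have even length, so the
#    sequence "(%N%)" cannot occur unless we insert it ourselves.
$text =~ s/%/%%/g;

# 2. Cut out each raw block, leaving a numbered placeholder behind.
$text =~ s{<raw>(.*?)</raw>}{ push @saved, $1; '(%' . $#saved . '%)' }ges;

# 3. Escape '%' in the replacement strings too, then run the manipulations.
my %subs = ( replaceThis => "withThis 10%" );
s/%/%%/g for values %subs;
$text =~ s/\Q$_\E/$subs{$_}/g for keys %subs;

# 4. Put the raw blocks back, then undo the escaping.
$text =~ s/\(%(\d+)%\)/$saved[$1]/g;
$text =~ s/%%/%/g;

print "$text\n";
```

The saved blocks are captured after step 1, so they still hold doubled '%' characters when they are put back; that is why the final unescape has to come last. In this sketch the `<raw>` tags themselves are dropped when the blocks are restored.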
the whole text-mangling routine is big enough that I want to minimize the number of passes (i.e., I don't want to run it on the multiple "interleaved" chunks between the 'raw' bits)
Your concern there appears to be one of speed of execution. You might reconsider this concern (or at least test it), as running the long mangling process several times on short strings could certainly end up not being much slower than running it once on the much longer full string.
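For what it's worth, the per-chunk version can be quite short, which makes it cheap to try. This is only a sketch: mangle() is a hypothetical stand-in for the real routine, and the `<raw>...</raw>` delimiters are again an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real (much longer) text-mangling routine.
sub mangle {
    my ($chunk) = @_;
    $chunk =~ s/replaceThis/withThis/g;
    return $chunk;
}

# Run mangle() only on the stretches of text between the raw blocks.
sub mangle_outside_raw {
    my ($text) = @_;
    # The capturing group makes split return the raw blocks as well, so the
    # list alternates: ordinary chunk, raw block, ordinary chunk, ...
    my @parts = split m{(<raw>.*?</raw>)}s, $text;
    for my $i ( grep { $_ % 2 == 0 } 0 .. $#parts ) {
        $parts[$i] = mangle( $parts[$i] );
    }
    return join '', @parts;
}

print mangle_outside_raw("replaceThis <raw>replaceThis</raw> replaceThis"), "\n";
```

Timing this against the single-pass version on realistic input (the core Benchmark module's cmpthese is handy for that) would show whether the extra calls actually cost anything noticeable.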
It is certainly possible to just remove chunks from the string, note the resulting offsets to those spots, and keep running totals of how much these offsets were shifted by each substitution. But that is complex enough that it is quite easy to get it wrong, so I don't think I'd recommend that approach. And I can't think of any alternatives that are better than the above ones.
- tye
Re^2: Anchors, bleh :( (escape)
by oko1 (Deacon) on Feb 10, 2008 at 15:16 UTC |