Hi, all:

I've got a rather long script - it's one of those that grew by agglomeration and, well, it'll get rewritten someday. Really. [wry look].

Anyway... it does a huge amount of text mangling - essentially processing emails and setting up the content to be displayed on the Web - and works well, but there's been one thing that I've wanted it to do for a long while now, and just got around to implementing: I want it to leave specific, demarcated chunks of text alone, no processing to be done at all.

What I've done is to find these chunks, extract them, and push them onto an array, then replace them with numbered anchors (e.g., "XXX_REINSERT{12}_XXX" - '12' is the index within that array). I then do the processing, and - obviously - replace the anchors with the "held back" bits.
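For readers who want something concrete, the scheme described above might be sketched roughly as follows. This is not the original code (which uses "substr" and was not posted). The `<raw>`/`</raw>` delimiters, the `protect`/`restore` names, and the `mangle` stand-in are all assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical delimiters: the post does not say how the chunks are marked.
my ($OPEN, $CLOSE) = ('<raw>', '</raw>');

# Pull each demarcated chunk out, store it, and leave a numbered anchor behind.
sub protect {
    my ($text) = @_;
    my @held;
    $text =~ s{\Q$OPEN\E(.*?)\Q$CLOSE\E}{
        push @held, $1;                        # keep the chunk untouched
        'XXX_REINSERT{' . $#held . '}_XXX';    # anchor carries the array index
    }gse;
    return ($text, \@held);
}

# Swap every anchor back for the chunk stored at that index.
sub restore {
    my ($text, $held) = @_;
    $text =~ s/XXX_REINSERT\{(\d+)\}_XXX/$held->[$1]/g;
    return $text;
}

# Stand-in for the real text-mangling routine (hypothetical).
sub mangle { return uc shift }

my $body = "hello <raw>Keep *this*</raw> world <raw>and this</raw>!";
my ($masked, $held) = protect($body);
my $out = restore(mangle($masked), $held);
print "$out\n";
```

The mangling runs exactly once over the whole body, and the anchors survive only as long as the mangler leaves that token unchanged.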

The code is reasonably obvious - although I ended up using a bunch of "substr"s instead of 's///' for several reasons - and I don't think it's worth posting here (unless someone wants to see it) - because my question is of a more general nature. Here it is:

Given that the length of the overall string (the email body) is going to be changed arbitrarily, and that the whole text-mangling routine is big enough that I want to minimize the number of passes (i.e., I don't want to run it on the multiple "interleaved" chunks between the 'raw' bits), is there a better programmatic approach than anchors of this sort? This approach seems rather crude, and has an obvious, although rather easily avoidable, failure mode (what if there's a line in the text that actually says "XXX_REINSERT"-whatever?), and I'd like to see if my fellow Monks have some wisdom to share on this issue.
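One possible way to sidestep the collision case, sketched under the assumption that the anchor prefix may vary from message to message: pick a prefix that is checked not to occur in the body before any anchors are inserted. The `unique_tag` helper is hypothetical, not part of the script in question.

```perl
use strict;
use warnings;

# Return an anchor prefix that does not occur anywhere in $text.
# Starts from the usual prefix and appends a counter until it is unique.
sub unique_tag {
    my ($text) = @_;
    my $n   = 0;
    my $tag = 'XXX_REINSERT';
    $tag = 'XXX_REINSERT' . ++$n while index($text, $tag) >= 0;
    return $tag;
}

my $body = "a line that literally says XXX_REINSERT{3}_XXX";
my $tag  = unique_tag($body);
print "$tag\n";   # prints XXX_REINSERT1
```

The anchors would then be built from `$tag`, and matched with `\Q$tag\E` when restoring. This only guards against text already present in the input; it does nothing if the mangling routine itself happens to produce the same token.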

Thanks in advance!


In reply to Anchors, bleh :( by oko1
