Here's another way to do it, with distinct patterns instead of a hash map:

    while( m/ \G .*? (?= \\[FSTRE]\\ ) /gx ) {
        my $pos = pos;
        s{ \G \\F\\ }{ "|" }egx
        or s{ \G \\S\\ }{ "^" }egx
        or s{ \G \\T\\ }{ "&" }egx
        or s{ \G \\R\\ }{ "~" }egx
        or s{ \G \\E\\ }{ "\\" }egx;
        pos() = $pos + 1;
    }

In this code, m//g does the actual work of finding the control sequences in the string. The trick is to anchor all patterns with \G, which makes sure each pattern starts off where the last successful pattern stopped matching. That way you go through the string left-to-right.
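To see what \G does in isolation, here is a minimal sketch on a made-up sample string (the string and the variable names are purely illustrative, not part of the original problem). It compares the same pattern with and without the anchor:

```perl
use strict;
use warnings;

my $str = 'aaabaa';    # made-up sample input

# With \G, each match must begin exactly where the previous one ended.
my @anchored;
while ( $str =~ m/ \G a /gx ) {
    push @anchored, $-[0];    # $-[0] is the offset where this match began
}

# Without \G, the engine is free to skip ahead past the "b".
my @unanchored;
while ( $str =~ m/ a /gx ) {
    push @unanchored, $-[0];
}

print "anchored:   @anchored\n";      # 0 1 2
print "unanchored: @unanchored\n";    # 0 1 2 4 5
```

The anchored loop stops at the "b" because nothing is allowed to match there, while the unanchored loop simply jumps over it.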

In the case of s///g, this means each substitution will usually replace exactly one occurrence, despite the /g. \G \\S\\ will only match multiple times if the control sequence appears multiple times back-to-back, as in the string \S\\S\\S\.
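A small sketch of that back-to-back case, assuming a made-up input and a manually set pos to stand in for the position the outer m//g would have left behind:

```perl
use strict;
use warnings;

my $seq = '\\S\\';                            # the three characters \S\
my $str = 'a' . ( $seq x 3 ) . 'b' . $seq;    # a\S\\S\\S\b\S\

# Stand-in for the outer m//g: park the position just before the first \S\.
pos($str) = 1;

# \G ties the first match to pos, and every further /g iteration to the
# end of the previous match, so only the adjacent run is replaced.
my $count = ( $str =~ s{ \G \\S\\ }{^}gx );

print "$count replacements, result: $str\n";
```

The sequence after the "b" is left alone, because it does not start where the previous match ended.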

Unfortunately, if an s///g matches at least once, it also resets the end-of-last-successful-match position when it eventually fails. Therefore, the next match would normally start over at the beginning of the string, leading to recursive replacement problems, with \E\S\ getting doubly translated to ^ instead of becoming just \S\ as per the spec. That explains the manual bookkeeping with pos, which queries and sets that position. With m//, this is elegantly avoidable by use of the /c modifier, which means "do not reset the end-of-last-match position on failure".
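The effect of /c on m//g can be seen in a few lines. This is only a sketch on a made-up string, with illustrative variable names:

```perl
use strict;
use warnings;

my $str = 'abc123';    # made-up sample input

# Succeeds: consumes "abc" and leaves pos at 3.
my $ok1         = $str =~ m/ \G [a-z]+ /gx;
my $after_match = pos($str);

# Fails at offset 3, but /c keeps pos where it was.
my $ok2             = $str =~ m/ \G [a-z]+ /gcx;
my $after_failed_gc = pos($str);

# Fails again, this time without /c: pos is reset to undef.
my $ok3            = $str =~ m/ \G [a-z]+ /gx;
my $after_failed_g = pos($str);

printf "after match: %s, after failed /gc: %s, after failed /g: %s\n",
    $after_match, $after_failed_gc,
    defined $after_failed_g ? $after_failed_g : 'undef';
```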

All that said and done, for this problem, this solution is both much less efficient and harder to understand than the hash-map-based ones given by others. I post this merely as a trivial demonstration of \G, which really shines when the patterns you want to match in a coordinated fashion are very non-uniform, unlike the ones in this case. m//gc is how you build true parsers in Perl.
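As a rough illustration of the m//gc style, here is a sketch of the same translation written as a tiny one-pass lexer. It is not the code from this post: the sub name unescape_sketch is hypothetical, and it builds a new string instead of editing in place, so no pos bookkeeping is needed.

```perl
use strict;
use warnings;

# Hypothetical helper; the mapping is the one used by the patterns above.
sub unescape_sketch {
    my ($in) = @_;
    my $out = '';
    pos($in) = 0;

    while ( pos($in) < length $in ) {
        # Each branch is tried at the current position only (\G), and a
        # failed attempt leaves that position untouched (/c).
        if    ( $in =~ m/ \G \\F\\ /gcx ) { $out .= '|' }
        elsif ( $in =~ m/ \G \\S\\ /gcx ) { $out .= '^' }
        elsif ( $in =~ m/ \G \\T\\ /gcx ) { $out .= '&' }
        elsif ( $in =~ m/ \G \\R\\ /gcx ) { $out .= '~' }
        elsif ( $in =~ m/ \G \\E\\ /gcx ) { $out .= '\\' }
        # Fallback: a run of ordinary characters, or one lone backslash.
        elsif ( $in =~ m/ \G ( [^\\]+ | \\ ) /gcx ) { $out .= $1 }
    }
    return $out;
}

print unescape_sketch('a\\F\\b\\S\\\\S\\c'), "\n";    # input is a\F\b\S\\S\c
print unescape_sketch('\\E\\S\\'), "\n";              # input is \E\S\
```

Because already-translated output never re-enters the input, \E\S\ comes out as \S\ rather than being translated a second time.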

Makeshifts last the longest.


In reply to Re: Pattern Matching, left-to-right by Aristotle
in thread Pattern Matching, left-to-right by TrekNoid
