You say you are quite satisfied with your current solution, but perhaps this may be of interest for future reference.

... no regex expressions, only pure textual substitutions ... substitutions of varying lengths requiring the longer ones to come first. ... no change, but the substitution itself will get padded so that it will not match subsequently ...

This raises a point I had overlooked before. It's possible to add longest-first discrimination when building an alternation. I've also made an attempt to add some acceptance of variable whitespace to the solution. There's also a feature to skip over certain phrases. This avoids the substitution of a substring with itself just to step over it in a possibly expensive no-op. (The weird capitalization is just to emphasize the substituted bits.)

c:\@Work\Perl>perl -le "use 5.010; ;; use warnings; use strict; ;; my @skip_over = ( 'every one of them', 'all in good time', ); ;; my ($skip) = map qr{ \b (?: $_ ) \b (*SKIP) (*FAIL) }xms, join q{ | }, map qr{ \Q$_\E }xms, sort { length($b) <=> length($a) } @skip_over ; print qq{\$skip: $skip \n}; ;; my %direct_substitution = ( 'every one' => 'EVERYONE', 'sore athirst' => 'Very Thirsty', 'athirst' => 'THIRSTY', 'all in' => 'GONZO', ); ;; my ($capture) = map qr{ \b (?: $_ ) \b }xms, join q{ | }, map qr{ \Q$_\E }xms, sort { length($b) <=> length($a) } keys %direct_substitution ; print qq{\$capture: $capture \n}; ;; my $line = qq{every one wang chung every one \n} . qq{of them are sore athirst if not well athirst. \n} . qq{all in good time we will enjoy all in wrestling. \n} ; print qq{[[$line]] \n}; ;; $line =~ s{ \s+ }' 'xmsg; print qq{(($line)) \n}; ;; my $count = 0; $count += $line =~ s{ ($skip | $capture) } {$direct_substitution{$1}}xmsg; ;; print qq{substitutions: $count}; print qq{<<$line>> \n}; " $skip: (?^msx: \b (?: (?^msx: every\ one\ of\ them ) | (?^msx: all\ in +\ good\ time ) ) \b (*SKIP) (* FAIL) ) $capture: (?^msx: \b (?: (?^msx: sore\ athirst ) | (?^msx: every\ one +) | (?^msx: athirst ) | (?^msx : all\ in ) ) \b ) [[every one wang chung every one of them are sore athirst if not well athirst. all in good time we will enjoy all in wrestling. ]] ((every one wang chung every one of them are sore athirst if not well +athirst. all in good time we will enjoy all in wrestling. )) substitutions: 4 <<EVERYONE wang chung every one of them are Very Thirsty if not well T +HIRSTY. all in good time we will enjoy GONZO wrestling. >>
(Some long output lines have been arbitrarily wrapped when composing this post.) Of course, I intend each  $line to be an element in an array over which you're looping.

Update: Of course, it's possible to get rid of the
    $line =~ s{ \s+ }' 'xmsg;
whitespace collapsing step and make the code even more whitespace agnostic.


Give a man a fish:  <%-{-{-{-<


In reply to Re^3: How to count substitutions on an array by AnomalousMonk
in thread How to count substitutions on an array by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.