http://qs1969.pair.com?node_id=11129290

hennesse has asked for the wisdom of the Perl Monks concerning the following question:

I sometimes have to do this:
$string =~ s/zero/0/ig;
$string =~ s/one/1/ig;
$string =~ s/two/2/ig;
$string =~ s/three/3/ig;

This is cool if there are just a few, but not if a lot. Is there a more compact way to do this?

I.e., I need a "trans-word-ilation" operator, similar to the "transliteration" operator.

Thanks - Dave

Replies are listed 'Best First'.
Re: Combining multiple =~ s/
by salva (Canon) on Mar 07, 2021 at 19:05 UTC
    Let Perl generate the regular expression from a hash containing the list of words and their replacements:
    my %trans = (zero => 0, one => 1, two => 2, three => 3);
    my $words_re = join "|", map quotemeta, keys %trans;
    $string =~ s/($words_re)/$trans{lc $1}/ig;

      You want to sort the words in the regular expression by descending length :) Or alternatively, use \b to match only whole words:

      my %trans = (zero => 0, one => 1, ones => 11, two => 2, three => 3);
      my $words_re = join "|", map quotemeta,
          sort { length($b) <=> length($a) } keys %trans;
      $string =~ s/\b($words_re)\b/$trans{lc $1}/ig;

      Otherwise, it could be that one matches before ones.
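
To see why the order matters, here is a small sketch. The sample text and variable names are my own, not from the thread. Perl's alternation takes the first alternative that matches at a given position, so a pattern that lists one before ones never gets to try the longer word. The \b is left out on purpose to isolate the ordering effect.

```perl
use strict;
use warnings;

my %trans = (one => 1, ones => 11);

# Deliberately list the shorter word first to show the problem.
my $unsorted_re = join "|", map quotemeta, qw(one ones);

# Longest first, so "ones" is tried before "one".
my $sorted_re = join "|", map quotemeta,
    sort { length($b) <=> length($a) } keys %trans;

my $text = "ones and one";    # made-up sample input

(my $unsorted = $text) =~ s/($unsorted_re)/$trans{lc $1}/ig;
(my $sorted   = $text) =~ s/($sorted_re)/$trans{lc $1}/ig;

print "$unsorted\n";    # 1s and 1
print "$sorted\n";      # 11 and 1
```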

      Update: choroba spotted that the length($b) and length($a) were missing

        It might be worth pointing out the CPAN module Data::Munge which has a function list2re to make a list-matching regex. This line is basically what you need:

        my $re = join '|', map quotemeta, sort { length $b <=> length $a || $a cmp $b } @_;

        There is also this, which is what I use.

      Nitpick! :)

      This is semantically not the same, even if sorted.

      Though it's probably what the OP wants, if it's really about numbers.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

Re: Combining multiple =~ s/
by haukex (Archbishop) on Mar 07, 2021 at 20:24 UTC
Re: Combining multiple =~ s/
by kcott (Archbishop) on Mar 08, 2021 at 06:45 UTC

    G'day Dave,

    "This is cool if there are just a few, but not if a lot. Is there a more compact way to do this?"

    If there are "just a few", you can still make this more compact by chaining non-destructive substitution operations. The /r modifier that makes this possible was introduced in Perl 5.14; see "perl5140delta: Non-destructive substitution".

    $string = $string =~ s/zero/0/igr =~ s/one/1/igr =~ s/two/2/igr =~ s/three/3/igr;

    If "a lot", then the earlier advice (a pattern with alternation and a replacement using a lookup table) would be my choice.

    In either case, use a boundary assertion as already described; avoid s/bones/b1s/ and similar mistakes.
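
A minimal sketch of the difference the boundary assertion makes. The sample sentence and variable names are invented for illustration, and the /r modifier assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

my %trans = (zero => 0, one => 1, two => 2, three => 3);
my $words_re = join "|", map quotemeta,
    sort { length($b) <=> length($a) } keys %trans;

my $text = "One dog buried two bones";    # made-up sample input

# Without \b the "one" inside "bones" is replaced as well.
my $loose = $text =~ s/($words_re)/$trans{lc $1}/igr;

# With \b only whole words are replaced.
my $whole = $text =~ s/\b($words_re)\b/$trans{lc $1}/igr;

print "$loose\n";    # 1 dog buried 2 b1s
print "$whole\n";    # 1 dog buried 2 bones
```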

    — Ken

Re: Combining multiple =~ s/
by LanX (Saint) on Mar 07, 2021 at 20:50 UTC
    The exact translation is a loop:

      DB<1> @a = qw/zero one two three four/
      DB<2> $pi = 'three . one four one'
      DB<3> $pi =~ s/$a[$_]/$_/ig for 0..4
      DB<4> p $pi
    3 . 1 4 1

    Please note that replacing all the "one" before replacing "two" might change the input in unexpected ways.

    It really depends if you are really replacing numbers ...
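
A contrived sketch of how the loop and the single-pass alternation can disagree. The input "twone" is made up so that "two" and "one" overlap; it is not from the thread. The loop lets every word scan the whole string in turn, while the alternation scans once and takes the leftmost match.

```perl
use strict;
use warnings;

my @words = qw(zero one two three);
my %trans = map { $words[$_] => $_ } 0 .. $#words;

my $text = "twone";    # contrived input where "two" and "one" overlap

# Sequential: each word gets its own pass over the whole string.
my $sequential = $text;
for my $i (0 .. $#words) {
    my $word = $words[$i];
    $sequential =~ s/\Q$word\E/$i/ig;
}

# Single pass: one scan over the string, leftmost match wins.
my $words_re = join "|", map quotemeta, @words;
my $single = $text =~ s/($words_re)/$trans{lc $1}/igr;

print "$sequential\n";    # tw1
print "$single\n";        # 2ne
```

The loop replaces "one" first and so destroys the "two"; the single pass finds "two" at the start of the string and never sees a complete "one" afterwards.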

    Cheers Rolf

Re: Combining multiple =~ s/
by rsFalse (Chaplain) on Mar 11, 2021 at 08:28 UTC
    One more way.
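    #!/usr/bin/perl
    use warnings;
    use strict;

    my %huge_trans = (
        zero    => 0,
        one     => 1,
        ones    => 11,
        two     => 2,
        three   => 3,
        # ...
        million => 1_000_000,
    );

    my $rx_word = qr/\b\w+\b/i;

    # Look each word up in the hash; leave it unchanged if it is not a key.
    # Note: <> is in scalar context here, so this reads one line of input.
    print <> =~ s!($rx_word)! $huge_trans{lc $1} // $1 !ger;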
    Imagine you have a huge dictionary for translation. Then this approach should be faster.
    The previously suggested approach with alternation should also be fast, because the trie optimization should kick in.

    Upd. BTW, this is not about "combining regexes"; it only uses a hash. All words, i.e. the hash keys, must be simple words, not regexes.
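
A small sketch of that limitation, using a made-up key that is not a single \w+ token. The per-word lookup never sees "twenty-one" as one unit, whereas the alternation built from the keys does. The key, sample text and variable names are my own assumptions for illustration.

```perl
use strict;
use warnings;

# "twenty-one" contains a hyphen, so \w+ cannot match it as one word.
my %trans = (one => 1, 'twenty-one' => 21);

my $text = "twenty-one and one";    # made-up sample input

# Per-word hash lookup: "twenty" and "one" are looked up separately.
my $by_word = $text =~ s!(\b\w+\b)! $trans{lc $1} // $1 !ger;

# Alternation built from the keys, longest first.
my $words_re = join "|", map quotemeta,
    sort { length($b) <=> length($a) } keys %trans;
my $by_alternation = $text =~ s/\b($words_re)\b/$trans{lc $1}/igr;

print "$by_word\n";           # twenty-1 and 1
print "$by_alternation\n";    # 21 and 1
```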