Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I went to Rosetta Code looking for a solution to some problem in some language (neither matters here), then checked the Perl solution there. Then, out of idle curiosity, the way one wastes time browsing a dictionary, I clicked through to one more task, and, I swear (I don't know what in its title attracted me), at exactly the 3rd task I found this Perl solution:

https://rosettacode.org/wiki/Change_e_letters_to_i_in_words#Perl

Of ALL the contestants, only the Perl entry misreads the task to include 5-letter words, and even then the algorithm is broken: its output lacks e.g. the "crises crisis" pair. So sad :-(. Our Perl is the village idiot, no less. No, I don't have editing rights there, nor could I find in the page's history how or when the solution was added.

However, I was surprised to find that matching that solution's speed and memory use was hard and not obvious (to me). Presumably, once corrected, its numbers would stay the same or close. My initial attempts with e.g. grep/map/split/regexes and whatnot were all worse.

Finally, here's my solution (under Strawberry):

use strict;
use warnings;

sub memory { qx( tasklist /nh /fi "PID eq $$" ) =~ m[(\S+ K)$] }    # "Mem Usage" column of Windows tasklist
use Time::HiRes 'time';
my $t = time;

my ( @a, %h );
open my $fh, '<:raw', 'unixdict.txt' or die;
while ( <$fh> ) {
    next unless length > 6;    # length still counts the line ending; chomp comes next
    chomp;
    ( -1 != index $_, 'e' )
        ? ( push @a, $_ )          # words with an 'e' are candidates
        : ( -1 != index $_, 'i' )
            ? ( $h{ $_ } = 1 )     # words with 'i' but no 'e' are possible targets
            : 1
}
close $fh;

my ( $s, $i ) = '';
exists $h{ $i = tr/e/i/r }
    and $s .= sprintf "%30s %s\n", $_, $i
        for @a;

print time - $t, "\n";
print memory, "\n";
print $s;

On average it takes "0.018 s, 10,100 K" vs. "0.033 s, 11,100 K" for the similarly modified (but still unfixed) original. Not that "performance" matters for this task, however ridiculous the achievement is.

Re: Perl at Rosetta Code, with one particular example
by tybalt89 (Monsignor) on Jun 12, 2025 at 01:27 UTC

    Two things (so far).
    Slurping a file and regexing out what you want is faster than reading the file line by line in a while loop (in my experience).
    Hash slices are faster than separate lookups (ditto).

    My general rule for speed is to try to do as much as you can in the perl interpreter, as opposed to the perl language. See Re: converting binary to decimal as an example. Try to do operations on groups of things, not the individual elements.
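The two tips above (slurp and regex, then hash slices) can be sketched roughly as follows. This is only an illustration, not tybalt89's code: the sub name `e_to_i_pairs` and the in-memory sample list are hypothetical stand-ins, and the sample replaces the slurped contents of unixdict.txt.

```perl
use 5.014;    # for the non-destructive tr///r (also enables say)
use strict;
use warnings;

# Return "e-word i-word" strings for a slurped, newline-separated word list.
sub e_to_i_pairs {
    my ($text) = @_;

    # One regex pass over the whole buffer: lines of 6+ characters containing an 'e'.
    my @e_words = $text =~ /^(?=.{6}).*e.*$/gm;

    # Build the lookup table with a single hash-slice assignment.
    my @words = split ' ', $text;
    my %seen;
    @seen{@words} = (1) x @words;

    # Translate all candidates, then look them all up with one hash slice.
    my @i_words = map { tr/e/i/r } @e_words;
    my @hit     = @seen{@i_words};    # undef where the i-word is absent

    return map { "$e_words[$_] $i_words[$_]" } grep { $hit[$_] } 0 .. $#e_words;
}

# Hypothetical sample standing in for the contents of unixdict.txt.
my $sample = join "\n", qw(crises crisis expect merit analyses analysis), '';
say for e_to_i_pairs($sample);
```

Whether this beats a line-by-line loop depends on the data and the perl build, as the benchmark in the reply below suggests, so it is worth measuring rather than assuming.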

    Recheck the RosettaCode page :)

      Thanks for fixing the RC entry. It's slower than mine, though, and its RAM use is now ~13.5 MB vs. ~10 MB for mine, according to my crude measurements.

      use strict;
      use warnings;
      use Benchmark 'cmpthese';

      cmpthese -3, {
          am => sub {
              my ( @a, %h );
              open my $fh, '<:raw', 'unixdict.txt' or die;
              while ( <$fh> ) {
                  next unless length > 6;
                  chomp;
                  ( -1 != index $_, 'e' )
                      ? ( push @a, $_ )
                      : ( -1 != index $_, 'i' )
                          ? ( $h{ $_ } = 1 )
                          : 1
              }
              close $fh;
              my ( @ret, $i );
              exists $h{ $i = tr/e/i/r }
                  and push @ret, sprintf "%30s %s\n", $_, $i
                      for @a;
              @ret
          },
          rc => sub {
              my $file = do { local (@ARGV, $/) = 'unixdict.txt'; <> };
              my %i = map { tr/e/i/r => sprintf "%30s %s\n", $_, tr/e/i/r }
                  $file =~ /^(?=.{6}).*e.*$/gm;
              @i{ split ' ', $file };
          }
      }
      __END__
           Rate   rc   am
      rc 25.2/s   -- -65%
      am 71.1/s 182%   --
Re: Perl at Rosetta Code, with one particular example
by ysth (Canon) on Jun 13, 2025 at 02:50 UTC
    I read the problem as replacing one or more e's, not all e's.

    That gets these additions:

    becker -> bicker
    complement -> compliment
    complementary -> complimentary
    empress -> impress
    endorse -> indorse
    enfield -> infield
    enquire -> inquire
    enviable -> inviable
    freedman -> friedman
    freeze -> frieze
    kenney -> kinney
    kettle -> kittle
    meddle -> middle
    peddle -> piddle
    redden -> ridden
    semper -> simper
    whether -> whither
    FWIW, my first pass at this was
    perl -E'chomp(my @w=grep /[ie]/ && length > 5, readline); for my $w (reverse @w) { say "$w -> $_" for grep defined, @w{glob $w =~ s/e/{e,i}/r}; $w{$w}=$w }' unixdict.txt
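The "one or more e's" reading can also be sketched without glob, by enumerating every non-empty subset of the e positions with a bitmask. This is a different technique from the one-liner above, offered only as an illustration; the sub names `e_variants` and `partial_pairs` and the sample words are hypothetical.

```perl
use strict;
use warnings;

# Every spelling of $word in which one or more 'e' characters become 'i'.
sub e_variants {
    my ($word) = @_;

    # Record the offset of each 'e'.
    my @pos;
    push @pos, pos($word) - 1 while $word =~ /e/g;

    # Each non-zero bitmask selects a non-empty subset of those offsets.
    my @variants;
    for my $mask ( 1 .. 2**@pos - 1 ) {
        my $variant = $word;
        for my $bit ( 0 .. $#pos ) {
            substr( $variant, $pos[$bit], 1 ) = 'i' if $mask & ( 1 << $bit );
        }
        push @variants, $variant;
    }
    return @variants;
}

# "word -> variant" for each variant that is itself in the (6+ letter) list.
sub partial_pairs {
    my @words = grep { length > 5 } @_;
    my %seen;
    @seen{@words} = ();
    return map {
        my $w = $_;
        map { "$w -> $_" } grep { exists $seen{$_} } e_variants($w);
    } @words;
}

print "$_\n" for partial_pairs(qw(becker bicker freeze frieze merit));
```

The number of variants doubles with each extra 'e', which is acceptable for dictionary words but is the main cost of this reading compared with a single tr/e/i/.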

      There are a few more pairs once the (missing) "g" modifier is added; also s/5/6/. Either way, it takes more than 1 s for me. OTOH (definitely not a one-liner, though):

      #          Rate am_new
      # am_new 40.9/s     --

      am_new => sub {
          my ( @a, %h );
          open my $fh, '<:raw', 'unixdict.txt' or die;
          while ( <$fh> ) {
              next unless length > 6;
              chomp;
              push @a, $_ if -1 != index $_, 'e';
              next unless -1 != index $_, 'i';
              my $key = tr/e/i/r;
              $h{ $key } = $_
                  unless exists $h{ $key } and $_ lt $h{ $key }
          }
          close $fh;
          my ( @ret, $key );
          exists $h{ $key = tr/e/i/r }
              and $h{ $key } gt $_
              and push @ret, sprintf "%30s %s\n", $_, $h{ $key }
                  for @a;
          @ret
      },

      And if e.g. (some or all of) the "e"s go to "a" instead of "i", then "gt" and "lt" should be swapped.

        Oops, yes, missing g. But did they change the problem? I see "The length of any word shown should have a length > 5."

        $h{ $key = tr/e/i/r } is wrong, since $_ is not set there. Perhaps the = should be a =~?

        (Nevermind)

Re: Perl at Rosetta Code, with one particular example
by LanX (Saint) on Jun 13, 2025 at 16:30 UTC
    The task

    > The length of any word shown should have a length > 5

    Your code line 11

    >  next unless length > 6;

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery
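One detail about the quoted line may be worth checking: in the original script the length test runs before chomp, so the count still includes the line terminator. A minimal sketch of that behaviour, assuming LF-terminated lines (an assumption about the local copy of unixdict.txt) and using a hypothetical sub `long_words` that reads from an in-memory filehandle instead of the real file:

```perl
use strict;
use warnings;

# Keep words longer than 5 letters, testing length BEFORE chomp as the thread's code does.
sub long_words {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @kept;
    while ( <$fh> ) {
        next unless length > 6;    # the count includes the trailing "\n"
        chomp;
        push @kept, $_;
    }
    close $fh;
    return @kept;
}

print join( ' ', long_words("merit\ncrises\ncrisis\n") ), "\n";
```

Under that assumption, `length > 6` before chomp and `length > 5` after chomp select the same words; with CRLF line endings or a final line lacking a newline the two would differ.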