in reply to More efficient way to truncate long strings of the same character

It's my understanding that back references (\1) are a bit slow. What follows are solutions that don't use back references. You'll have to benchmark them to see if they're faster.

#my @chars = grep !$seen{$_}++, $text =~ /./g;
my @chars = 'a'..'z';
my ($re) =
   map qr/$_/,
   join '|',
   map "(?<=${_}{3})$_+",
   map quotemeta,
   @chars;
$text =~ s/$re//g;

#my @chars = grep !$seen{$_}++, $text =~ /./g;
my @chars = 'a'..'z';
my ($re) =
   map qr/$_/,
   join '|',
   map "${_}{4,}",
   map quotemeta,
   @chars;
$text =~ s/($re)/substr($1,0,3)/eg;

#my @chars = grep !$seen{$_}++, $text =~ /./g;
my @chars = 'a'..'z';
my ($re) =
   map qr/$_/,
   join '|',
   map "${_}{4,}",
   map quotemeta,
   @chars;
$text =~ s/$re/substr($text,$-[0],3)/eg;

Update: Fixed bug identified in reply.
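
For anyone who wants to try the benchmarking suggested above, here is a minimal sketch using the core Benchmark module. The sample text, the sub names and the back-reference baseline (s/(.)\1{3,}/$1$1$1/g) are my own assumptions, not taken from the original question. The patterns are assembled by plain concatenation purely so the sketch does not depend on how {3} interpolates. Timings will vary with your data and Perl version, so treat the output as a rough guide only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @chars = ('a' .. 'z');

# Hypothetical sample data: 1000 runs of 1 to 7 identical letters.
my $text = join '', map { $chars[$_ % 26] x (1 + $_ % 7) } 0 .. 999;

# Patterns built by concatenation (a choice made for this sketch only).
my $lookbehind = join '|',
    map { my $c = quotemeta($_); '(?<=' . $c . '{3})' . $c . '+' } @chars;
my $long_runs = join '|',
    map { my $c = quotemeta($_); $c . '{4,}' } @chars;
my $lookbehind_re = qr/$lookbehind/;
my $long_runs_re  = qr/$long_runs/;

# Assumed baseline: a back-reference version, for comparison.
sub truncate_backref {
    my ($s) = @_;
    $s =~ s/(.)\1{3,}/$1$1$1/g;
    return $s;
}

# Delete everything after the first three characters of a run.
sub truncate_lookbehind {
    my ($s) = @_;
    $s =~ s/$lookbehind_re//g;
    return $s;
}

# Replace each run of four or more with its first three characters.
sub truncate_substr {
    my ($s) = @_;
    $s =~ s/($long_runs_re)/substr($1, 0, 3)/eg;
    return $s;
}

# Make sure the candidates agree before timing them.
my $expected = truncate_backref($text);
for my $candidate (\&truncate_lookbehind, \&truncate_substr) {
    die "results differ\n" unless $candidate->($text) eq $expected;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    backref    => sub { truncate_backref($text) },
    lookbehind => sub { truncate_lookbehind($text) },
    substr     => sub { truncate_substr($text) },
});
```

Note that the alternation-based versions only handle the characters listed in @chars, whereas the baseline handles any character except a newline, so the comparison is only fair on text made up of those characters.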

Re^2: More efficient way to truncate long strings of the same character
by GrandFather (Saint) on Oct 30, 2008 at 20:36 UTC

    map "(?<=$_{3})$_+", should be map "(?<=${_}{3})$_+",.


    Perl reduces RSI - it saves typing

      It's not necessary.

      >perl -e"$_='a'; print qr/$_{3}/"
      (?-xism:a{3})

      And beyond being unnecessary, it will never help either. If you were to write ${_}{...} to prevent $_{...} from being treated as a hash element, it still wouldn't do what you want. See Re: Of scalars, hashes, quantifiers, and regexen.

        So when I run:

        use strict;
        use warnings;

        my ($ikegamiRe1) = map qr/$_/, join '|', map "(?<=$_{3})$_+", map quotemeta, 1 .. 9;

        I should ignore:

        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.
        Use of uninitialized value in concatenation (.) or string at noname.pl line 4.

        Update: Oh, and here's an interesting result:

        use strict;
        #use warnings;

        my $str = join '|', map "(?<=${_}{3})$_+", 1 .. 2;
        print $str, "\n";
        $str = join '|', map "(?<=$_{3})$_+", 1 .. 2;
        print $str;

        Prints:

        (?<=1{3})1+|(?<=2{3})2+
        (?<=)1+|(?<=)2+

        Perl reduces RSI - it saves typing
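
The two results in this exchange appear to come from different contexts: the qr// one-liner interpolates inside a regex literal, while the map expression interpolates inside an ordinary double-quoted string. The sketch below puts the three cases side by side. The sub names are hypothetical, and the expected-output comments simply restate what the outputs quoted in this thread show, so behaviour on other Perl versions is an assumption.

```perl
use strict;
use warnings;

# "$_{3}" in an ordinary double-quoted string: element 3 of the hash %_.
sub in_plain_string {
    local $_ = shift;
    no warnings 'uninitialized';    # %_ is empty, so the element is undef
    return "(?<=$_{3})";
}

# "${_}{3}" in a string: the value of $_ followed by the literal text {3}.
sub in_braced_string {
    local $_ = shift;
    return "(?<=${_}{3})";
}

# $_{3} written directly in a regex literal: $_ followed by the quantifier {3}.
sub in_regex_literal {
    local $_ = shift;
    return qr/$_{3}/;
}

print in_plain_string('a'),  "\n";    # (?<=)
print in_braced_string('a'), "\n";    # (?<=a{3})

my $re = in_regex_literal('a');
print 'aaa' =~ /\A$re\z/ ? "match\n" : "no match\n";    # match
```

Since the original code builds its pattern as a string before handing it to qr//, the string rule is the one that applies there.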