in reply to Re^2: How to replace greedy alternation?
in thread How to replace greedy alternation?

Indeed, getting rid of the clutter first, and then remove all but the first instance of each letter:

sub reduc { my ($s) = @_ ; $s =~ tr/abcd//cd ; 1 while $s =~ s/(.).*?\K\1+//g ; return $s ; } ; print reduc('1b22a7b3d3ccaacccbcccacccddcceqcc11oabepd'), "\n" ; # badc print reduc('1b22w7b3d3ccsqcccbccc2cccddcceqcc11o9bepd'), "\n" ; # bdc
I imagine a more cunning monk can improve the reduction step -- though the amount of work it does obviously depends on how many repeated letters need to be removed.

Replies are listed 'Best First'.
Re^4: How to replace greedy alternation?
by Anonymous Monk on Jan 29, 2009 at 06:37 UTC
    This may or may not be an improvement, but it only takes one pass through the regex:
    #!/usr/bin/perl use strict; use warnings; # If all four chars are present, returns them in first-seen order. # If any of the four are not present, returns an empty string. sub four_or_nothing { my $str = shift; my @in_order = ($str =~ m{ ([abcd]) .*? (?!\1)([abcd]) .*? (?!\1|\2)([abcd]) .*? (?!\1|\2|\3)([abcd]) }x); return join("", @in_order); } foreach my $case ( '1b22a7b3d3ccaacbcaccddeqcc11oabepd', '1b22w7b3d3ccsqcbc2ccddeqcc11o9bepd', ) { printf qq{%s: "%s"\n}, $case, four_or_nothing($case); } __DATA__ 1b22a7b3d3ccaacbcaccddeqcc11oabepd: "badc" 1b22w7b3d3ccsqcbc2ccddeqcc11o9bepd: ""

      How wonderful ! And another regex trick learned :-)

      Though it goes bloomin' berserk backtrackin' when all four of 'abcd' are not present ! The following tweak improves:

      my @in_order = ($str =~ m{ ^(?> .*? ([abcd]) ) (?> .*? (?!\1) ([abcd]) ) (?> .*? (?!\1|\2) ([abcd]) ) (?> .*? (?!\1|\2|\3) ([abcd]) ) }x);
      And, of course, YAW:
      sub four_or_less { my ($s) = @_ ; $s =~ tr/abcd//cd ; my $c = '' ; $c .= $1 while ($s =~ m/([^ $c])/g) ; return $c ; }