nate_ has asked for the wisdom of the Perl Monks concerning the following question:

Hey everyone,

I'm just learning regexes. My script works, but it's pretty nasty and I know there is a more efficient way of doing it. Here's the example:

$input = "AABBCC";
$input=~s/A/1000 /g;
$input=~s/B/2000 /g;

I'm searching through the string and replacing each letter with a decimal value followed by a space; later in my script I use the split function on that space. Is there a way to write one regex that will search for each different letter and replace it with a different value? I've searched around but can't find anything that good.

Thanks in advance.

Replies are listed 'Best First'.
Re: regex question
by jwkrahn (Abbot) on Sep 02, 2009 at 23:18 UTC

    You say you use split later, so I'll assume you want an array as a result:

    my %convert = ( A => 1000, B => 2000, C => 3000, );
    my @data = map $convert{ $_ }, $input =~ /[ABC]/g;
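    For anyone who wants to run the snippet above on its own, here is a minimal sketch. It assumes the input string from the question and only adds the declarations and a print; letters that are not in the character class are simply skipped.

```perl
use strict;
use warnings;

# Lookup table from the reply above; the input string is the one from the question.
my %convert = ( A => 1000, B => 2000, C => 3000 );
my $input   = "AABBCC";

# In list context, /[ABC]/g returns every matching letter;
# map then swaps each letter for its value.
my @data = map { $convert{$_} } $input =~ /[ABC]/g;

print "@data\n";    # 1000 1000 2000 2000 3000 3000
```

    Because the result is already a list, no intermediate string and no later split are needed.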
Re: regex question
by Marshall (Canon) on Sep 03, 2009 at 00:00 UTC
    If you split on "nothing" (//), you get each individual character of the string. A simple way to do the replacement is a look-up table that tells you whether something should be substituted for that character. Some slight efficiency could be gained with "if..else", but the code below demonstrates the main point. Adapt as you will. The key thing is the split to get each character in the line, then a hash to look up the substitute number.

    #!/usr/bin/perl -w
    use strict;

    my %xlate = ('A' => 1000, 'B' => 2000, 'X' => 9999, );

    while (<DATA>)
    {
        chomp;
        # split on the empty pattern yields one character per element
        foreach my $ltr (split(//,$_))
        {
            print "$xlate{$ltr} " if $xlate{$ltr};    # substitute found
            print "$ltr " if !$xlate{$ltr};           # no substitute: keep the letter
        }
        print "\n";
    }

    # Prints:
    # 1000 1000 2000 2000 C C
    # 1000 C 2000 9999 1000

    __DATA__
    AABBCC
    ACBXA
Re: regex question
by ikegami (Patriarch) on Sep 02, 2009 at 23:25 UTC

    [ Ignore this post, I missed how you just want a list of the numbers at the end ]

    my %value_lkup = ( A => '1000', B => '2000', );
    my $pat = '[' . join('', keys %value_lkup) . ']';
    s/($pat)/$value_lkup{$1}/g;

    or

    s/([AB])/ ( ord($1)-ord('A')+1 )*1000 /eg;
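    A runnable sketch of both variants, for comparison. The variable names $by_hash and $by_ord and the trailing space in each replacement are additions for illustration (the space mirrors the original two s///g statements so a later split still works); the rest follows the snippets above.

```perl
use strict;
use warnings;

my %value_lkup = ( A => '1000', B => '2000' );

# Character class built from the single-character keys. The sort only makes
# the pattern reproducible; the keys are plain letters, so no escaping is done.
my $pat = '[' . join('', sort keys %value_lkup) . ']';

# Variant 1: hash lookup in the replacement.
my $by_hash = "AABBCC";
$by_hash =~ s/($pat)/$value_lkup{$1} /g;

# Variant 2: /e evaluates the replacement as Perl code.
# ord('A') is 65, so A gives (65-65+1)*1000 = 1000 and B gives 2000.
my $by_ord = "AABBCC";
$by_ord =~ s/([AB])/ ( ord($1) - ord('A') + 1 ) * 1000 . ' ' /eg;

print "$by_hash\n";    # 1000 1000 2000 2000 CC
print "$by_ord\n";     # 1000 1000 2000 2000 CC

# Splitting on whitespace afterwards gives one field per value.
my @fields = split ' ', $by_hash;
```

    Both variants leave letters without a mapping (here C) untouched, which is why the final field is still CC.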

      That's actually a really nice way to use a hash to look through a string and replace keys with values.

      You could even generalize it to multi-character keys:

      my %value_lkup = ( A => '1000', B => '2000', CDE => '3000', );
      my $pat = '(?:' . join('|', keys %value_lkup) . ')';
      s/($pat)/$value_lkup{$1}/g;

        To make it a general solution, you need to add quotemeta, and you gotta be careful about ordering if one key can be the start of another key. And if you're going to add (?:), you might as well use qr//.

        my ($pat) =
           map qr/$_/,
           join '|',
           map quotemeta,
           sort { length($b) <=> length($a) }
           keys %value_lkup;

        my @data = map $value_lkup{$_}, /$pat/g;
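        To see why the ordering and quotemeta matter, here is a small sketch with a made-up table. The keys 'AB' and 'C+' and the input string are hypothetical, chosen so that one key is the start of another and one contains a regex metacharacter.

```perl
use strict;
use warnings;

# Hypothetical table: 'AB' starts with 'A', and 'C+' contains a metacharacter.
my %value_lkup = ( 'A' => 1000, 'AB' => 2000, 'C+' => 3000 );

# Longest keys first, so 'AB' is tried before 'A'; quotemeta makes 'C+' literal.
# Ties in length are broken alphabetically to keep the pattern reproducible.
my $alt = join '|',
          map  { quotemeta($_) }
          sort { length($b) <=> length($a) || $a cmp $b }
          keys %value_lkup;
my $pat = qr/$alt/;

my $input = "ABAC+C";
my @data  = map { $value_lkup{$_} } $input =~ /$pat/g;

print "@data\n";    # 2000 1000 3000
```

        Without the length sort, 'A' could be tried first and 'AB' would never match; without quotemeta, 'C+' would mean "one or more C" and the looked-up key would not exist in the hash.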
Re: regex question
by nate_ (Initiate) on Sep 03, 2009 at 16:34 UTC
    Hey Everyone,

    Thanks for your input, I really appreciate it. I'm using a hash as suggested and it's working great. Hopefully it's running fast now :)