in reply to Regex overlap in MAC address

0:13:46:b:4:31
^^               First pass
    ^^^^         Second pass
         ^^^     Third pass

You're expecting the regexp to match the same : twice. Use a zero-width lookahead to check for the trailing : or end of string.

$mac =~ s/(^|:)([0-9a-fA-F])(?=:|$)/${1}0$2/g;
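To see the difference on the sample address, here is a minimal sketch. The "consuming" pattern is a hypothetical stand-in (it is not necessarily the regex from the original question), and the sub names `pad_consuming` and `pad_lookahead` are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical "consuming" pattern: the trailing ':' is part of the match,
# so it is no longer available as the leading ':' of the next field.
sub pad_consuming {
    my ($mac) = @_;
    $mac =~ s/(^|:)([0-9a-fA-F])(:|$)/${1}0$2$3/g;
    return $mac;
}

# Lookahead version from the post: the trailing ':' is checked, not consumed.
sub pad_lookahead {
    my ($mac) = @_;
    $mac =~ s/(^|:)([0-9a-fA-F])(?=:|$)/${1}0$2/g;
    return $mac;
}

my $mac = '0:13:46:b:4:31';
print pad_consuming($mac), "\n";   # 00:13:46:0b:4:31  (the 4 is skipped)
print pad_lookahead($mac), "\n";   # 00:13:46:0b:04:31
```

The `4` is missed in the first case because the `:` in front of it was already used up as the tail of the `:b:` match.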

Better yet, eliminate needless copying.

$mac =~ s/(^|:)(?=[0-9a-fA-F](?::|$))/${1}0/g;

Update: Tested. Fixed bug where : would get removed.

Update: If you really need speed, the following is much faster, but it sacrifices readability:

$mac =~ s/(?<=:)(?=[0-9a-fA-F](?::|$))/0/g;
$mac =~ s/^(?=[0-9a-fA-F](?::|$))/0/;

        Rate orig fast
orig 36251/s   -- -42%
fast 62533/s  72%   --
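Before trusting a faster variant, it may be worth checking that it produces the same output as the readable one. The following is only a sketch: the sub names `pad_orig` and `pad_fast` and the extra test addresses are assumptions, not part of the original post.

```perl
use strict;
use warnings;

# Single-pass version: capture the separator and re-emit it.
sub pad_orig {
    my ($mac) = @_;
    $mac =~ s/(^|:)(?=[0-9a-fA-F](?::|$))/${1}0/g;
    return $mac;
}

# Two-pass version: nothing is captured; a zero is inserted at a
# zero-width position (after each ':' first, then at the start).
sub pad_fast {
    my ($mac) = @_;
    $mac =~ s/(?<=:)(?=[0-9a-fA-F](?::|$))/0/g;
    $mac =~ s/^(?=[0-9a-fA-F](?::|$))/0/;
    return $mac;
}

for my $in ('0:13:46:b:4:31', 'a:b:c:d:e:f', '00:13:46:0b:04:31') {
    my ($o, $f) = (pad_orig($in), pad_fast($in));
    printf "%-18s -> %s (%s)\n", $in, $o, $o eq $f ? 'same' : "differs: $f";
}
```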

Re^2: Regex overlap in MAC address
by Skeeve (Parson) on Sep 11, 2006 at 20:06 UTC
    Hi ikegami, could you please rate this too?
    $mac= join ":",map substr("00$_",-2),split/:/,$mac;
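For readers who find the one-liner dense, here is the same split/pad/join idea spelled out as a sub. The name `pad_split` is hypothetical; the expression inside is the one from the post with braces added around the `map` block.

```perl
use strict;
use warnings;

sub pad_split {
    my ($mac) = @_;
    # Prepend zeros to each field, then keep only the last two characters.
    return join ':', map { substr("00$_", -2) } split /:/, $mac;
}

print pad_split('0:13:46:b:4:31'), "\n";   # 00:13:46:0b:04:31
```

One difference from the regex versions: a field longer than two characters is cut down to its last two, so this approach assumes well-formed input.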

    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

      Sure thing

      use Benchmark qw( cmpthese );

      our $data = '0:13:46:b:4:31';

      cmpthese(-3, {
          ike_orig => 'my $mac = $data; $mac =~ s/(^|:)(?=[0-9a-fA-F](?::|$))/${1}0/g; $mac;',
          ike_fast => 'my $mac = $data; $mac =~ s/(?<=:)(?=[0-9a-fA-F](?::|$))/0/g; $mac =~ s/^(?=[0-9a-fA-F](?::|$))/0/; $mac;',
          skeeve   => 'my $mac = $data; $mac = join ":", map substr("0$_",-2), split/:/, $mac; $mac;',
          jwkrahn1 => 'my $mac = $data; $mac =~ s/([^:]+)/ sprintf "%02s", $1 /eg; $mac;',
          jwkrahn2 => 'my $mac = $data; $mac = join ":", map sprintf( "%02s", $_ ), split /:/, $mac; $mac;',
      });

                  Rate ike_orig jwkrahn1 skeeve jwkrahn2 ike_fast
      ike_orig 37908/s       --      -3%   -29%     -31%     -43%
      jwkrahn1 38957/s       3%       --   -27%     -29%     -41%
      skeeve   53318/s      41%      37%     --      -3%     -20%
      jwkrahn2 54879/s      45%      41%     3%       --     -18%
      ike_fast 66551/s      76%      71%    25%      21%       --

        Just a side note about benchmarking... It depends on the platform! See the result of your code on my Mac:
                    Rate jwkrahn1  skeeve2 jwkrahn2   skeeve ike_orig ike_fast
        jwkrahn1 49277/s       --     -11%     -15%     -17%     -20%     -50%
        skeeve2  55160/s      12%       --      -5%      -7%     -10%     -44%
        jwkrahn2 58151/s      18%       5%       --      -2%      -5%     -41%
        skeeve   59291/s      20%       7%       2%       --      -3%     -40%
        ike_orig 61250/s      24%      11%       5%       3%       --     -38%
        ike_fast 98301/s      99%      78%      69%      66%      60%       --

        How about trying these? I'm just playing with your fastest version: adding sentinel ':'s before processing and removing them afterwards.
        zen3 => 'my $mac = ":".$data.":"; $mac =~ s/(?<=:)(?=[0-9a-fA-F](?::))/0/g; $mac =~ s/^:|:$//; $mac;',
        zen4 => 'my $mac = ":$data:"; $mac =~ s/(?<=:)(?=[0-9a-fA-F](?::))/0/g; $mac =~ s/^:|:$//; $mac;',
        zen5 => 'my $mac = ":$data:"; $mac =~ s/(?<=:)(?=[0-9a-fA-F](?::))/0/g; $mac = substr $mac, 1,-1; $mac;',
        My results varied from run to run:
                     Rate zen2 zen3 ike_fast zen4 zen5
        zen2     152990/s   --  -1%      -1%  -4%  -6%
        zen3     154915/s   1%   --      -0%  -3%  -5%
        ike_fast 155051/s   1%   0%       --  -3%  -5%
        zen4     159115/s   4%   3%       3%   --  -2%
        zen5     162584/s   6%   5%       5%   2%   --

                     Rate zen2 ike_fast zen4 zen3 zen5
        zen2     147082/s   --      -3%  -3%  -3%  -5%
        ike_fast 150866/s   3%       --  -0%  -1%  -3%
        zen4     151345/s   3%       0%   --  -1%  -3%
        zen3     152312/s   4%       1%   1%   --  -2%
        zen5     155404/s   6%       3%   3%   2%   --

                     Rate zen2 ike_fast zen4 zen5 zen3
        zen2     132209/s   --      -1%  -4% -10% -13%
        ike_fast 134054/s   1%       --  -3%  -9% -11%
        zen4     137554/s   4%       3%   --  -6%  -9%
        zen5     147053/s  11%      10%   7%   --  -3%
        zen3     151234/s  14%      13%  10%   3%   --

                     Rate zen2 ike_fast zen3 zen4 zen5
        zen2     148516/s   --      -2%  -3%  -4%  -7%
        ike_fast 151827/s   2%       --  -1%  -1%  -5%
        zen3     153467/s   3%       1%   --  -0%  -4%
        zen4     153946/s   4%       1%   0%   --  -4%
        zen5     159757/s   8%       5%   4%   4%   --
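The sentinel idea can be sketched as a standalone sub. With a ':' on both sides of every field, the lookahead no longer needs the `$` alternative. The sub name `pad_sentinel` is hypothetical. One thing to watch when stripping the sentinels with a substitution: without `/g`, `s/^:|:$//` replaces only the first match, so the trailing ':' stays. The second half of the sketch shows this, using `substr` or `/g` to avoid it.

```perl
use strict;
use warnings;

sub pad_sentinel {
    my ($data) = @_;
    my $mac = ":$data:";                     # add sentinel colons
    $mac =~ s/(?<=:)(?=[0-9a-fA-F]:)/0/g;    # every field now has ':' on both sides
    return substr $mac, 1, -1;               # strip the sentinels
}

print pad_sentinel('0:13:46:b:4:31'), "\n";  # 00:13:46:0b:04:31

# Stripping with a substitution needs /g to remove both sentinels.
(my $both  = ':0b:') =~ s/^:|:$//g;   # "0b"
(my $first = ':0b:') =~ s/^:|:$//;    # "0b:" (only the first match is replaced)
print "$both $first\n";
```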