in reply to How do i do a direct comparison of an array of strings?

If the strings/strands being compared are all the same length, bitwise-xoring the strings effectively compares them. The tr operator can then be used to flag and count positions of equality. IIRC, GrandFather had a note on this a few months back.
>perl -wMstrict -le "my ($x, $y) = qw{abXcdYef pqXrsYtu}; my $diff = $x ^ $y; $diff =~ tr{\x00-\xff}{=!}; print $diff; print 'number of differences: ', $diff =~ tr{!}{!}; my @i_diffs = do { my $i = 0; map $_->[1], grep $_->[0] eq '!', map [ $_, $i++ ], split '', $diff; }; print qq{differing positions: @i_diffs}; " !!=!!=!! number of differences: 6 differing positions: 0 1 3 4 6 7
Updates:
  1. Found GrandFather's reply: Re: match and mismatch.
  2. Changed example to actually count and report positions of differing characters per OPer's request.

Replies are listed 'Best First'.
Re^2: How do i do a direct comparison of an array of strings?
by AnomalousMonk (Archbishop) on Feb 19, 2009 at 20:27 UTC
    Actually, you don't even have to bother with the  tr stuff if you only want the differing positions:
    >perl -wMstrict -le "my ($x, $y) = qw{abXcdYef pqXrsYtu}; my @i_diffs = differing_positions($x, $y); print 'number of differences: ', scalar @i_diffs; print qq{differing positions: @i_diffs}; sub differing_positions { my $i = 0; map $_->[1], grep $_->[0] ne qq{\x00}, map [ $_, $i++ ], split '', $_[0] ^ $_[1] ; } " number of differences: 6 differing positions: 0 1 3 4 6 7

      Or even:

      sub differing_positions { my @c = unpack('C*', $_[0] ^ $_[1]) ; return grep $c[$_], (0..$#c) ; }
Re^2: How do i do a direct comparison of an array of strings?
by AnomalousMonk (Archbishop) on Feb 20, 2009 at 01:47 UTC
    A regex solution is significantly faster than a ST one.

    Some benchmarks:

    use warnings; use strict; use Benchmark qw(cmpthese); my $multiplier = 10_000; my $s__ = "abcdefghij" x $multiplier; # base string my $s10 = "Abcdefghij" x $multiplier; # 10% differ my $s50 = "ABCDEfghij" x $multiplier; # 50% differ my $s90 = "ABCDEFGHIj" x $multiplier; # 90% differ sub differ_at_st { my $i = 0; return [ map $_->[1], grep $_->[0] ne qq{\x00}, map [ $_, $i++ ], split '', $_[0] ^ $_[1] ]; } sub differ_at_rx { my $diff = $_[0] ^ $_[1]; my @diffs; push @diffs, $-[1] while $diff =~ m{ ([^\000]) }xmsg; return \@diffs; } print "multiplier == $multiplier \n"; cmpthese(-10, { st_10 => sub { differ_at_st($s__, $s10) }, st_50 => sub { differ_at_st($s__, $s50) }, st_90 => sub { differ_at_st($s__, $s90) }, rx_10 => sub { differ_at_rx($s__, $s10) }, rx_50 => sub { differ_at_rx($s__, $s50) }, rx_90 => sub { differ_at_rx($s__, $s90) }, });
    Output:
    C:\@Work\Perl\monks\745091>perl 745091_1.pl multiplier == 1000 Rate st_90 st_50 st_10 rx_90 rx_50 rx_10 st_90 10.7/s -- -0% -12% -49% -73% -94% st_50 10.7/s 0% -- -11% -49% -73% -94% st_10 12.1/s 13% 13% -- -42% -70% -93% rx_90 20.9/s 95% 95% 73% -- -48% -88% rx_50 40.0/s 274% 273% 230% 91% -- -77% rx_10 177/s 1562% 1555% 1367% 750% 344% -- C:\@Work\Perl\monks\745091>perl 745091_1.pl multiplier == 10000 Rate st_90 st_50 st_10 rx_90 rx_50 rx_10 st_90 1.03/s -- -2% -11% -50% -73% -94% st_50 1.05/s 2% -- -9% -49% -72% -94% st_10 1.15/s 13% 10% -- -44% -69% -93% rx_90 2.05/s 100% 96% 78% -- -45% -88% rx_50 3.76/s 267% 260% 226% 83% -- -77% rx_10 16.6/s 1513% 1483% 1334% 707% 340% --