in reply to Re: eliding number ranges
in thread eliding number ranges

Can you please explain
print "$f-", substr ($s => length +(("$f" ^ "$s") =~ /^(\x00*)/) [0]), "\n";

Re: Re: Re: eliding number ranges
by antirice (Priest) on Nov 13, 2003 at 17:14 UTC

    Earlier in the subroutine, the cases where $f and $s differ in length have already been dealt with, and so has the case where $f and $s are the same value. Then comes the line:

    print "$f-", substr ($s => length +(("$f" ^ "$s") =~ /^(\x00*)/) [0]), "\n";

    What's happening here is that a bitwise exclusive or (^) is being applied to the two strings, character by character. As you might remember from your basic discrete math class, exclusive or returns 1 if the two bits are different and 0 if they are the same. So every position where $f and $s hold the same character comes out as a null character (\x00, which in binary looks like 00000000), and every position where they differ comes out as something non-null. This is why the first and second conditionals are important: we can now guarantee that we won't get 28-976 when $f = 28 and $s = 28976, and we won't get 28- when $f = $s = 28.

    Now comes the magic part. length (...)[0] is the same as length(...)[0], that is, the parentheses are taken as the argument list of length. Since length only returns a scalar, length(...)[0] makes no sense. Hence the plus (+) in front of the parentheses: it gives the compiler a hint as to what we actually mean. The parentheses give the match list context, and the [0] indicates that we want the first captured element of the match (note that if we didn't capture, a successful match would only return 1, or boolean true). The parentheses around "$f" ^ "$s" are there only because =~ has higher precedence than ^.

    The regex matches the run of null characters at the start of the xor result. Because the capture returns the actual match, and since the match is greedy, we get back a string of \x00 with the same length as the common prefix of $f and $s. Then length takes over and returns that length. We then have substr($s, length), which returns the part of $s from that position to the end. Thus we end up printing $f-(the non-matching portion of $s).
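
    For anyone who finds the one-liner easier to follow in pieces, here is a rough sketch of the same steps with named intermediate values. The sub name elide_range and the variable names are made up for illustration, and it assumes (as above) that $f and $s have the same length and are not equal. It is not Abigail's code, just one way of unrolling it.

```perl
use strict;
use warnings;

# Hypothetical helper: assumes $f and $s have equal length and differ.
sub elide_range {
    my ($f, $s) = @_;

    # String xor: "\0" at every position where the characters agree.
    my $xor = "$f" ^ "$s";

    # List assignment gives the match list context, so we get the capture
    # (the leading run of nulls) rather than a true/false value.
    my ($nulls) = $xor =~ /^(\x00*)/;

    # Number of leading characters that $f and $s have in common.
    my $common = length $nulls;

    # $f, a dash, then whatever is left of $s after the common prefix.
    return "$f-" . substr($s, $common);
}

print elide_range(324, 329), "\n";   # 324-9
print elide_range(325, 349), "\n";   # 325-49
print elide_range(340, 509), "\n";   # 340-509
```

    The list assignment my ($nulls) = ... plays the same role as the +(...)[0] slice in the original line: both put the match in list context and pick out the first capture.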

    Sorry if I messed up anywhere. Hope this helps.

    Thanks to PodMaster for pointing out that I put exclusive or as 1 if they are different and 1 if they are the same. Doh!

    antirice    
    The first rule of Perl club is - use Perl
    The ith rule of Perl club is - follow rule i - 1 for i > 1

Re:* eliding number ranges
by Roy Johnson (Monsignor) on Nov 13, 2003 at 17:08 UTC
    Taking the xor (^) of two strings will yield nulls (\x00) in every character position where the strings have the same character.

    The regex matches the leading nulls, and length counts them. substr skips that many chars in $s and prints the rest.
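
    One detail worth spelling out is why the operands are quoted. A small sketch, using literal values picked for illustration: ^ on two numbers xors the integers, while ^ on two strings xors them character by character. As far as I know, a variable that has already been used in a numeric comparison can be treated as a number by ^, so quoting "$f" and "$s" looks like the way to force the string form.

```perl
use strict;
use warnings;

# Both operands are numbers: the integers themselves are xor-ed.
my $numeric = 324 ^ 329;

# Both operands are strings: the xor is done character by character,
# giving "\0" wherever the two characters are identical.
my $stringy = "324" ^ "329";

printf "numeric: %d\n", $numeric;                    # numeric: 13
printf "string:  %s\n", unpack("H*", $stringy);      # string:  00000d
printf "length:  %d\n", length $stringy;             # length:  3
```

    The hex dump shows two null bytes followed by 0d, so the first two characters agree and only the last one differs.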

Re: Re: Re: eliding number ranges
by EdwardG (Vicar) on Nov 13, 2003 at 18:24 UTC

    I wrote this to help me understand Abigail's code. I'd never seen the xor-string trick before.

    #!perl
    use strict;
    use warnings;

    while (my $line = <DATA>) {
        chomp $line;
        my ($f, $s) = split /\s+/, $line;

        # Different lengths: nothing to elide.
        unless (length($f) == length($s)) {
            print "\"$line\" -> $f-$s\n";
            next;
        }
        # Same value: not a range at all.
        if ($f == $s) {
            print "\"$line\" -> $f\n";
            next;
        }

        # Show every character of $f, $s and their xor as 8 bits.
        # /s lets . match "\n" too, which the xor can produce (e.g. '3' ^ '9').
        my $first = join ' ', map { sprintf("%08b", ord) } ("$f" =~ /./gs);
        print "$f\t= $first\n";
        my $second = join ' ', map { sprintf("%08b", ord) } ("$s" =~ /./gs);
        print "$s\t= $second\n";
        my $xor = join ' ', map { sprintf("%08b", ord) } (("$f" ^ "$s") =~ /./gs);
        print "XOR\t= $xor\n";

        # Count the leading all-zero bytes only; \G stops at the first
        # non-zero byte instead of also counting zero bytes further along.
        my $index_of_first_diff = 0;
        $index_of_first_diff++ while ($xor =~ /\G00000000 /g);
        print "Index\t= ", ' ' x 9 x $index_of_first_diff, "$index_of_first_diff\n";
        print "Result\t= $f-", substr($s, $index_of_first_diff), "\n\n";
    }
    __DATA__
    324 329
    325 349
    340 509

    Output -

    d:\>test.pl
    324     = 00110011 00110010 00110100
    329     = 00110011 00110010 00111001
    XOR     = 00000000 00000000 00001101
    Index   =                   2
    Result  = 324-9

    325     = 00110011 00110010 00110101
    349     = 00110011 00110100 00111001
    XOR     = 00000000 00000110 00001100
    Index   =          1
    Result  = 325-49

    340     = 00110011 00110100 00110000
    509     = 00110101 00110000 00111001
    XOR     = 00000110 00000100 00001001
    Index   = 0
    Result  = 340-509