in reply to eliding number ranges

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    chomp;
    my ($f, $s) = split;
    unless (length ($f) == length ($s)) {
        print "$f-$s\n";
        next;
    }
    if ($f == $s) {
        print $f, "\n";
        next;
    }
    print "$f-", substr ($s => length +(("$f" ^ "$s") =~ /^(\x00*)/) [0]), "\n";
}

__DATA__
1 32
4 19
28 39
34 123
321 321
324 329
325 349
340 509

Abigail

Replies are listed 'Best First'.
Golf: eliding number ranges
by Roy Johnson (Monsignor) on Nov 14, 2003 at 16:28 UTC
    print"$f-",substr($s,length+(("$f"^"$s")=~/(\0*)/)[0]),"\n";
    can be reduced to
    print"$f-",substr($s,/\0*/g&&pos),"\n"for("$f"^"$s");
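A quick way to check that the two forms agree is to wrap each one in a sub and run both on a few pairs from the thread. This is only a sketch: elide_capture and elide_pos are names made up for this illustration, and the guards for unequal lengths and equal values are left out.

```perl
use strict;
use warnings;

# Long form: capture the leading NULs of the xor and take their length.
sub elide_capture {
    my ($f, $s) = @_;
    return "$f-" . substr($s, length +(("$f" ^ "$s") =~ /(\0*)/)[0]);
}

# Golfed form: alias $_ to the xor, match the NUL run with /g in scalar
# context, and let pos() report where that run ended.
sub elide_pos {
    my ($f, $s) = @_;
    my $out;
    $out = "$f-" . substr($s, /\0*/g && pos) for ("$f" ^ "$s");
    return $out;
}

for my $pair ([324, 329], [325, 349], [340, 509]) {
    my ($f, $s) = @$pair;
    printf "%-8s %-8s\n", elide_capture($f, $s), elide_pos($f, $s);
}
```

Both columns should be identical on every line. The golfed form works because a scalar /g match leaves pos just past the matched NULs, which is the same number the capture's length gives.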
Re: Re: eliding number ranges
by Anonymous Monk on Nov 13, 2003 at 15:41 UTC
    Can you please explain
    print "$f-", substr ($s => length +(("$f" ^ "$s") =~ /^(\x00*)/) [0]), "\n";

      Earlier in the loop, pairs where $f and $s differ in length have already been printed in full, and pairs where $f and $s are equal have been printed as a single number. Only pairs of equal length and different value reach this line:

      print "$f-", substr ($s => length +(("$f" ^ "$s") =~ /^(\x00*)/) [0]), "\n";

      What's happening here is that an exclusive or is being applied to the two strings, character by character. (The quotes around $f and $s make sure ^ sees strings rather than numbers.) As you might remember from your basic discrete math class, exclusive or returns 1 if the two bits are different and 0 if they are the same. This is why the first and second conditionals are important: they guarantee that you won't get 28-976 when $f = 28 and $s = 28976, and that you won't get 28- when $f = $s = 28.

      Now comes the magic part. Without the plus, length (...) [0] would parse as length(...)[0], and since length only returns a scalar, subscripting it makes no sense. Thus the plus (+) in front of the parentheses, which gives the compiler a hint as to what we actually mean: (...)[0] is a list slice, and length applies to its result. The slice gives the match list context, with the [0] indicating we want the first captured element of the match (note, if we didn't capture, the regex would only return 1, i.e. boolean true). The parentheses around "$f" ^ "$s" are there only because =~ has higher precedence than ^.

      The regex matches the null character (\x00), which in binary looks like 00000000. Since our exclusive or tells us where the strings match with a 0 bit and where they don't with a 1 bit, a null character means the two strings have the same character at that position. Because the capture returns the actual match, and since the match is anchored at the start and greedy, we get back a string of \x00 whose length equals the number of leading characters that $f and $s share. Then length takes over to return that count. We then have substr($s, count), which returns the part of $s from that position to the end. Thus we end up printing $f-(the non-matching portion of $s).
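The walk-through above can be unrolled into named steps. This is a sketch rather than Abigail's code: elide_steps and its hash keys are hypothetical names chosen for the illustration, and the two guards are deliberately left out so that the last two calls show the outputs those guards prevent.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each intermediate value of the one-liner.
sub elide_steps {
    my ($f, $s) = @_;

    my $xor     = "$f" ^ "$s";          # string xor: "\0" where chars agree
    my ($nulls) = $xor =~ /^(\x00*)/;   # list context hands back the capture
    my $skip    = length $nulls;        # count of leading chars in common
    my $tail    = substr $s, $skip;     # the part of $s that differs

    # The same count in a single expression. The unary plus keeps Perl from
    # reading the parenthesis as length's argument list, so [0] slices the
    # list returned by the match.
    my $compact = length +(("$f" ^ "$s") =~ /^(\x00*)/)[0];

    return {
        skip    => $skip,
        compact => $compact,
        tail    => $tail,
        range   => "$f-$tail",
    };
}

for my $pair (['324', '329'], ['340', '509'], ['28', '28976'], ['28', '28']) {
    my $r = elide_steps(@$pair);
    print "$pair->[0] $pair->[1]: skip=$r->{skip} range=$r->{range}\n";
}
```

The pair 28 and 28976 comes out as 28-976 because string xor treats the shorter operand as if it were padded with zero bits, and the pair 28 and 28 comes out as 28- because the whole xor is NULs. Those are the two cases the length and equality checks catch before this line is reached.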

      Sorry if I messed up anywhere. Hope this helps.

      Thanks to PodMaster for pointing out that I put exclusive or as 1 if they are different and 1 if they are the same. Doh!

      antirice    
      The first rule of Perl club is - use Perl
      The ith rule of Perl club is - follow rule i - 1 for i > 1

      Taking the xor (^) of two strings will yield nulls (\x00) in every character position where the strings have the same character.

      The regex matches the leading nulls, and length counts them. substr skips that many chars in $s and prints the rest.

      I wrote this to help me understand Abigail's code. I'd never seen the xor-string trick before.

      #!perl
      use strict;
      use warnings;

      while (my $line=<DATA>) {
          chomp $line;
          my ($f, $s) = split /\s+/,$line;
          unless (length ($f) == length ($s)) {
              print "\"$line\" -> $f-$s\n";
              next;
          }
          if ($f == $s) {
              print "\"$line\" -> $f\n";
              next;
          }
          # Show each string, and their xor, as one 8-bit group per character.
          my $first = join ' ',map {sprintf("%08b",ord)} ("$f" =~ /./g);
          print "$f\t= $first\n";
          my $second = join ' ',map {sprintf("%08b",ord)} ("$s" =~ /./g);
          print "$s\t= $second\n";
          my $xor = join ' ',map {sprintf("%08b",ord)} (("$f" ^ "$s") =~ /./g);
          print "XOR\t= $xor\n";
          my $index_of_first_diff = 0;
          # \G anchors each match where the previous one ended, so only the
          # leading all-zero bytes are counted, not ones later in the string.
          $index_of_first_diff++ while($xor =~ /\G00000000 /g);
          print "Index\t= " , ' ' x 9 x $index_of_first_diff, "$index_of_first_diff\n";
          print "Result\t= $f-", substr ($s, $index_of_first_diff), "\n\n";
      }
      __DATA__
      324 329
      325 349
      340 509

      Output -

      d:\>test.pl
      324     = 00110011 00110010 00110100
      329     = 00110011 00110010 00111001
      XOR     = 00000000 00000000 00001101
      Index   =                   2
      Result  = 324-9

      325     = 00110011 00110010 00110101
      349     = 00110011 00110100 00111001
      XOR     = 00000000 00000110 00001100
      Index   =          1
      Result  = 325-49

      340     = 00110011 00110100 00110000
      509     = 00110101 00110000 00111001
      XOR     = 00000110 00000100 00001001
      Index   = 0
      Result  = 340-509