albert has asked for the wisdom of the Perl Monks concerning the following question:

I have pairs of same-lengthed strings which I am counting the orientation of the paired characters via bitwise operators. (Based on a great suggestion way back in time from BrowserUk: Re: Speeding permutation counting).

The previous solution works great in my application, but I'd now like to make the solution more flexible. I'm tring to do this with eval for a tr expression, and setting counted terms from variable. However, I'm doing something wrong, as I'm not able to return the value from the eval. Here is a sample of my code.

my @num = ( "0000", "0101", "1010", "1111" ); my ( $n1, $n2 ) = ( "0", "1" ); my $ndiff = abs( $n2 - $n1 ); for ( my $i = 0 ; $i < @num ; $i++ ) { for ( my $j = $i + 1 ; $j < @num ; $j++ ) { # Count oriented matches and mismatches along string print join( "\t", $num[$i], $num[$j], ( ( $num[$i] | $num[$j] ) =~ tr[0][0] ), #count 00 ( ( ~$num[$i] & $num[$j] ) =~ tr[\1][\1] ), #count 01 ( ( $num[$i] & ~$num[$j] ) =~ tr[\1][\1] ), #count 10 ( ( $num[$i] & $num[$j] ) =~ tr[1][1] ) #count 11 ), "\n"; # Attempt at flexible version via eval my ( $string1, $string2 ) = ( $num[$i], $num[$j] ); print join( "\t", $string1, $string2, ( eval "( $string1 | $string2 ) =~ tr[$n1][$n1]" ), ( eval "( ~$string1 & $string2 ) =~ tr[\$ndiff][\$ndiff]" +), ( eval "( $string1 & ~$string2 ) =~ tr[\$ndiff][\$ndiff]" +), ( eval "( $string1 & $string2 ) =~ tr[$n2][$n2]" ) ), "\n\n"; } } __END__ # Output from example # In each pair of lines, the first is the desired. 0000 0101 2 2 0 0 0000 0101 0 0 0 0 0000 1010 2 2 0 0 0000 1010 2 0 0 0 0000 1111 0 4 0 0 0000 1111 0 0 0 0 0101 1010 0 2 2 0 0101 1010 1 0 0 0 0101 1111 0 2 0 2 0101 1111 0 0 0 0 1010 1111 0 2 0 2 1010 1111 1 0 0 0
I'm fairly sure that there may be a trivial solution to this, but I'm not versed enough with eval to figure out how to get the flexible version working. Any help in improving this is greatly appreciated.

-albert

Replies are listed 'Best First'.
Re: Returning transliteration from eval
by BrowserUk (Patriarch) on Jan 28, 2011 at 17:52 UTC

    I think your question has been answered, but it is worth pointing out that by using eval you are throwing away all of the performance that was gained by the original move to using tr///.

    In almost all cases when the charsets involved in a tr/// are determined at runtime, it is better to eval a subroutine into existsance that perform the required transliteration and then reuse that, rather than re-evaling the tr/// each time you use it.

    To emphasis this point, the following benchmark counts the number of 1s in all the integers from 1 to 1 million.

    The first method evals a sub into existance to do the counting then calls it 1e6 times. The second uses your method of evaling the entire counting expression 1 million times.

    The results show that your method takes 65x longer.

    #! perl -slw use strict; use Benchmark qw[ cmpthese ]; cmpthese -1, { one_eval => q[ my $t = 1; eval qq[ sub count { \$_[0] =~ tr[$t][$t] } ]; my $c = 0; $c += count( $_ ) for 1 .. 1e6; print $c; ], many_evals => q[ my $t = 1; my $c = 0; $c += eval qq[ \$_ =~ tr[$t][$t] ] for 1 .. 1e6; print $c; ], }; __END__ C:\test>junk30 600001 600001 (warning: too few iterations for a reliable count) 600001 Subroutine count redefined at (eval 2000016) line 1. 600001 Subroutine count redefined at (eval 2000017) line 1. 600001 Subroutine count redefined at (eval 2000020) line 1. 600001 (warning: too few iterations for a reliable count) s/iter many_evals one_eval many_evals 27.3 -- -98% one_eval 0.416 6451% --

    When you use optimisations, it is very important that you understand how they work, otherwise you end up pessimising your code relative to more traditional techniques:

    #! perl -slw use strict; use Benchmark qw[ cmpthese ]; cmpthese -1, { one_eval => q[ my $t = 1; eval qq[ sub count { \$_[0] =~ tr[$t][$t] } ]; my $c = 0; $c += count( $_ ) for 1 .. 1e6; print $c; ], many_evals => q[ my $t = 1; my $c = 0; $c += eval qq[ \$_ =~ tr[$t][$t] ] for 1 .. 1e6; print $c; ], loop => q[ my $t = 1; my $c = 0; for my $n ( 1 .. 1e6 ) { ++$c while $n =~ m[$t]g; } print $c; ], }; __END__ C:\test>junk30 600001 600001 600001 (warning: too few iterations for a reliable count) 600001 600001 (warning: too few iterations for a reliable count) 600001 Subroutine count redefined at (eval 2000022) line 1. 600001 Subroutine count redefined at (eval 2000023) line 1. 600001 Subroutine count redefined at (eval 2000026) line 1. 600001 (warning: too few iterations for a reliable count) Rate many_evals loop one_eval many_evals 3.76e-002/s -- -97% -98% loop 1.17/s 3002% -- -52% one_eval 2.41/s 6303% 106% --

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Nit:

      ++$c while $n =~ m[$t]g;

      is a little faster as

      $c += () = $n =~ m[$t]g;

      But it's not fast enough to overtake one_eval.

        Maybe a 64/32-bit difference? But that works out marginally slower on my machine:

        #! perl -slw use strict; use Benchmark qw[ cmpthese ]; cmpthese -1, { one_eval => q[ my $t = 1; eval qq[ sub count { \$_[0] =~ tr[$t][$t] } ]; my $c = 0; $c += count( $_ ) for 1 .. 1e6; print $c; ], many_evals => q[ my $t = 1; my $c = 0; $c += eval qq[ \$_ =~ tr[$t][$t] ] for 1 .. 1e6; print $c; ], loop => q[ my $t = 1; my $c = 0; for my $n ( 1 .. 1e6 ) { ++$c while $n =~ m[$t]g; } print $c; ], loop2 => q[ my $t = 1; my $c = 0; for my $n ( 1 .. 1e6 ) { $c += () = $n =~ m[$t]g; } print $c; ], }; __END__ C:\test>junk30 600001 600001 600001 (warning: too few iterations for a reliable count) 600001 600001 600001 (warning: too few iterations for a reliable count) 600001 600001 (warning: too few iterations for a reliable count) 600001 Subroutine count redefined at (eval 2000028) line 1. 600001 Subroutine count redefined at (eval 2000029) line 1. 600001 Subroutine count redefined at (eval 2000032) line 1. 600001 (warning: too few iterations for a reliable count) Rate many_evals loop2 loop one_eval many_evals 3.68e-002/s -- -97% -97% -98% loop2 1.06/s 2777% -- -7% -57% loop 1.14/s 3006% 8% -- -53% one_eval 2.44/s 6515% 130% 113% --

        Another reason for not using it, is that it gets exponentially slower as the number of hits increases, due to the need to allocate large lists which then get discarded:

        perl -MTime::HiRes=time -E"$m=1e5;$x='x'x$m; $t=time; $n=()=$x=~m[x]g; say +(time()-$t)/$m" 1.44867897033691e-006 perl -MTime::HiRes=time -E"$m=1e6;$x='x'x$m; $t=time; $n=()=$x=~m[x]g; say +(time()-$t)/$m" 1.46089999675751e-005

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Returning transliteration from eval
by Anonyrnous Monk (Hermit) on Jan 28, 2011 at 15:30 UTC
    ( eval "( $string1 | $string2 ) =~ tr[$n1][$n1]" ), ( eval "( ~$string1 & $string2 ) =~ tr[\$ndiff][\$ndiff]" +), ( eval "( $string1 & ~$string2 ) =~ tr[\$ndiff][\$ndiff]" +), ( eval "( $string1 & $string2 ) =~ tr[$n2][$n2]" ) ),

    You need more backslashes :)

    ( eval "( \$string1 | \$string2 ) =~ tr[$n1][$n1]" ), ( eval "( ~\$string1 & \$string2 ) =~ tr[\\$ndiff][\\$ndif +f]" ), ( eval "( \$string1 & ~\$string2 ) =~ tr[\\$ndiff][\\$ndif +f]" ), ( eval "( \$string1 & \$string2 ) =~ tr[$n2][$n2]" ) ),

    In other words, you don't want $string1 and $string2 to be interpolated, but you do want $ndiff interpolated, and prefixed with a backslash, which itself must be escaped.

    As you have it, you're eval'ing code like

    ~0000 & 0101 ) =~ tr[$ndiff][$ndiff]

    but you'd want

    ~$string1 & $string2 ) =~ tr[\1][\1]

    In case of doubt, when string eval doesn't do what you expect, for debugging purposes always print out the string to be eval'ed...

      Thanks. I guessed it was a backslash issue, but hadn't stumbled to that one.

      You've made a light bulb go off, and I'm much clearer on eval now. Point on printing out the eval is well-taken, and will use it in the future.

        Habitual backslasher ikegami might have something to say about it but single (non-interpolating/literal) quotes can be a bit easier to read.

        ( eval '( $string1 | $string2 ) =~ tr[$n1][$n1]' ), ( eval '( ~$string1 & $string2 ) =~ tr[$ndiff][$ndiff]' ), ( eval '( $string1 & ~$string2 ) =~ tr[$ndiff][$ndiff]' ), ( eval '( $string1 & $string2 ) =~ tr[$n2][$n2]' ) ),

        Update: my eyes are getting old. I missed the interpolation in the trs.

Re: Returning transliteration from eval
by ambrus (Abbot) on Jan 29, 2011 at 18:40 UTC