I was looking at this node before Christmas and decided to benchmark the various solutions. However, I struggled to get pat_mc's algorithm to behave in a subroutine: it produces a result when first invoked but throws a "Can't use an undefined value as an ARRAY reference at ..." error when invoked a second time. I can't see an array reference anywhere in the code.

I had changed the code slightly so that the routine works on a copy of the benchmark data, preserving the original rather than consuming it. I used a package array rather than a lexical one to avoid "Variable ... will not stay shared ..." warnings and, if it ran, null results for the second and subsequent invocations. Here is a cut-down version of the benchmark code that demonstrates the problem, using pat_mc's and almut's routines to show the results:

use strict;
use warnings;

my @words = qw{
    cooling rooting hooting looking
    doormat cooking cookies noodles
};

print qq{pat_mc : @{ [ pat_mc() ] }\n};
print qq{almut : @{ [ almut() ] }\n};
print qq{pat_mc : @{ [ pat_mc() ] }\n};
print qq{almut : @{ [ almut() ] }\n};

sub almut
{
    my $w1  = $words[0];
    my $and = "\xff" x length($w1);
    my $or  = "\0" x length($w1);

    for my $w (@words)
    {
        $and &= $w;
        $or  |= $w;
    }

    my $xor = $and ^ $or;
    $xor =~ tr/\0/\xff/c;
    my $mask = ~$xor;

    my $common = $w1 & $mask;
    $common =~ tr/\0/-/;
    return $common;
}

sub pat_mc
{
    our @common_letters;
    our @wordsCopy = @words;
    my $reference = shift @wordsCopy;

    () = $reference =~ /(.)(?{
        my $letter   = $1;
        my $position = $-[0];
        my $bolean   = 1;
        for ( @wordsCopy )
        {
            if ( substr( $_, $position, 1 ) ne $letter )
            {
                $bolean = 0;
                last
            }
        }
        $common_letters[ $position ] = $letter if ( $bolean );
    })/gx;

    return join '',
        map { $common_letters[ $_ ] || '-' }
        0 .. length( $reference ) - 1;
}

The output.

pat_mc : -oo----
almut : -oo----
Can't use an undefined value as an ARRAY reference at ./spw731537C line 48.

I returned to the problem today and added a couple of print statements to the errant routine to try to confirm where it was going wrong. Bizarrely, the routine then started working as expected. It seems that adding a statement between the two package array declarations stops the error occurring.

...
sub pat_mc
{
    our @common_letters;
    my $dummy = 0;
    our @wordsCopy = @words;
    my $reference = shift @wordsCopy;

    () = $reference =~ /(.)(?{
        my $letter   = $1;
        my $position = $-[0];
        my $bolean   = 1;
        for ( @wordsCopy )
        {
            if ( substr( $_, $position, 1 ) ne $letter )
            {
                $bolean = 0;
                last
            }
        }
        $common_letters[ $position ] = $letter if ( $bolean );
    })/gx;

    return join '',
        map { $common_letters[ $_ ] || '-' }
        0 .. length( $reference ) - 1;
}

The output again.

pat_mc : -oo----
almut : -oo----
pat_mc : -oo----
almut : -oo----

I can't see what could be causing this. The output of perl -MO=Deparse -e '...' looks identical other than the my $dummy = 0; statement being where you'd expect. I'm hoping that someone will be able to throw some light on what is going on.

Cheers,

JohnGG

P.S. Here are the benchmark results, if anyone's interested. almut's method was the fastest by a considerable margin. oko1 and I shared the wooden spoon :-(

almut : -oo----
johngg : -oo----
oko1 : -oo----
pat_mc : -oo----
         Rate   oko1 johngg pat_mc  almut
oko1    223/s     --   -13%   -91%   -98%
johngg  256/s    15%     --   -89%   -97%
pat_mc 2416/s   984%   843%     --   -75%
almut  9772/s  4284%  3714%   304%     --

P.P.S. Just as I was about to post this I wondered whether initialising the @common_letters array, rather than just declaring it, would make a difference. Sure enough,

...
sub pat_mc
{
    our @common_letters = ();
    our @wordsCopy = @words;
    my $reference = shift @wordsCopy;

    () = $reference =~ /(.)(?{
        my $letter   = $1;
        my $position = $-[0];
        my $bolean   = 1;
        for ( @wordsCopy )
        {
            if ( substr( $_, $position, 1 ) ne $letter )
            {
                $bolean = 0;
                last
            }
        }
        $common_letters[ $position ] = $letter if ( $bolean );
    })/gx;

    return join '',
        map { $common_letters[ $_ ] || '-' }
        0 .. length( $reference ) - 1;
}

produces

pat_mc : -oo----
almut : -oo----
pat_mc : -oo----
almut : -oo----

Why would that be?


