in reply to suffix arrays

You can get away with using the Lvalues from substr in the array as suggested by Zaxo above, but you have to jump through a few hoops to prevent the problem I alluded to.

Basically, there seems to be a single slot allocated for all the Lvalues generated by any given statement that includes a substr. That's a clumsy way to put it, but I can't think of a better way to say it.

To work around the limitation, it is necessary to use a 'different' substr each time, hence the need for the eval in this code. The downside is that the eval markly reduces the efficiency. The algorithm works fine though and this implementation shoudl be a bit quicker than yours.

#! perl -slw use strict; sub ATOa{ map{ eval '\substr( $_[0], $_ )' } 0 .. length( $_[0] ); } my @Aoa = sort{ $$a cmp $$b } ATOa $ARGV[0]; #print $$_ for @Aoa; my $max = ''; for my $i ( 0 .. ($#Aoa-1) ) { my $p = 0; ++$p while substr( ${$Aoa[ $i ]}, 0, $p ) eq substr( ${$Aoa[ $i+1 ]}, 0, $p ); $max = substr( ${$Aoa[ $i ]}, 0, $p-1 ) if length $max < $p; } print "The longest common substring in\n'$ARGV[0]'\n is '$max'"; __DATA__ P:\test>test "the infernal invention was the source or eternal facinat +ion for the qualified professional and enthusiastic amateur alike" The longest common substring in 'the infernal invention was the source of eternal facination for the q +ualified professional and enthusiastic amateur alike' is 'ernal ' P:\test>test "Four and twenty virgins came down from Inverness. And wh +en they went back again there were four and twenty less" The longest common substring in 'Four and twenty virgins came down from Inverness. And when they went +back again there were four and twenty less' is 'our and twenty '

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Replies are listed 'Best First'.
Re: Re: suffix arrays
by sauoq (Abbot) on Jul 19, 2003 at 00:15 UTC
    The downside is that the eval markly reduces the efficiency.

    So, just do one eval() . . .

    sub ATOa { my $code = '(' . join( ',', map { "\\substr(\$_[0], $_)" } 0 .. length( +$_[0] ) ) . ')'; return eval $code; }
    :-)

    -sauoq
    "My two cents aren't worth a dime.";
    
Re: Re: suffix arrays
by cees (Curate) on Jul 19, 2003 at 06:20 UTC

    You can improve the performance slightly by only looking for matching strings larger than the current maximum.

    my $p = length($max);

    - Cees