Re: Finding string for patterns

use strict;
use warnings;

my $SQUARE;
my $MIN_NUMBER = 4;
my $MAX_NUMBER = 1000;
my $string = q(59738219024971503664);
my $total_length;

my %seen;
for my $start (0..length $string) {
    next if substr($string, $start, 1) eq '0';
    for my $length (1..(length($string)-$start)) {
        my $test = substr($string, $start, $length);
        my $sqrt = sqrt($test);
        if (!$seen{$test}++ and $sqrt == int $sqrt
            and $sqrt >= $MIN_NUMBER and $sqrt <= $MAX_NUMBER) {
            $total_length += $length;
            print "$test\t$length\t$total_length\n";
        }
    }
}
[download]

Caution: Contents may have been coded under pressure.

Comment on Re: Finding string for patterns Download Code

Replies are listed 'Best First'.
Re^2: Finding string for patterns by cazz (Pilgrim) on Apr 05, 2005 at 21:58 UTC
That may be fast for short strings, but given longer strings, it is much faster to precompute and then compare. use Benchmark qw/:all/; use strict; my $SQUARE; my $MIN_NUMBER = 4; my $MAX_NUMBER = 1000; foreach my $number ($MIN_NUMBER..$MAX_NUMBER){ my $square = $number * $number; $SQUARE->{$square} = $number; } my $string = q(5973821902497150366459738219024971503664); sub m_re { my $total_length; foreach my $square (keys %{$SQUARE}){ if($string =~ /$square/) { my $length = length($square); $total_length += $length; } } return ($total_length); } sub m_index { my $total_length; foreach my $square (keys %{$SQUARE}){ if(index($string, $square) > 0) { my $length = length($square); $total_length += $length; } } return ($total_length); } sub m_sqrt { my $total_length; my %seen; for my $start (0..length $string) { next if substr($string, $start, 1) eq '0'; for my $length (1..(length($string)-$start)) { my $test = substr($string, $start, $length); my $sqrt = sqrt($test); if (!$seen{$test}++ and $sqrt == int $sqrt and $sqrt >= $MIN_NUMBER and $sqrt <= $MAX_NUMBER) + { $total_length += $length; } } } return ($total_length); } cmpthese(1000, { "Regexp", \&m_re, "Index", \&m_index, "Sqrt", \&m_sqrt, }); [download] Gives: Benchmark: timing 1000 iterations of Index, Regexp, Sqrt... Index: 1 wallclock secs ( 0.98 usr + 0.01 sys = 0.99 CPU) @ 1010.10/s (n=1000) Regexp: 13 wallclock secs (12.20 usr + 0.02 sys = 12.22 CPU) @ 81.83/s (n=1000) Sqrt: 2 wallclock secs ( 2.27 usr + 0.01 sys = 2.28 CPU) @ 438.60/s (n=1000) Rate Regexp Sqrt Index Regexp 81.8/s -- -81% -92% Sqrt 439/s 436% -- -57% Index 1010/s 1134% 130% -- (update: useless use of sub)	[reply] [d/l]
Re^3: Finding string for patterns by Roy Johnson (Monsignor) on Apr 05, 2005 at 22:33 UTC
It's a tradeoff. As the OP noted, if you increase the range of squares you're checking, precomputing gets slower (and it's time you don't count in your benchmark). Stick another 0 on MAX_NUMBER and run your bench again (change the first arg for cmpthese to -10), and you might get something like I did: `Rate Regexp Index Sqrt Regexp 3.82/s -- -88% -98% Index 32.4/s 748% -- -81% Sqrt 167/s 4281% 417% --` [download] One relatively easy optimization that I didn't do is to limit the length to what can actually be matched: sub m_sqrt { my $total_length; my %seen; my $max_length = length($MAX_NUMBER * $MAX_NUMBER); for my $start (0..length $string) { next if substr($string, $start, 1) eq '0'; my $this_max = length($string)-$start; $this_max = $max_length if $max_length < $this_max; for my $length (1..$this_max) { my $test = substr($string, $start, $length); my $sqrt = sqrt($test); if (!$seen{$test}++ and $sqrt == int $sqrt and $sqrt >= $MIN_NUMBER and $sqrt <= $MAX_NUMBER) + { $total_length += $length; } } } return ($total_length); } [download] Now the bench gives me `Rate Regexp Index Sqrt Regexp 3.64/s -- -88% -99% Index 31.2/s 758% -- -94% Sqrt 497/s 13542% 1489% --` [download] Caution: Contents may have been coded under pressure.	[reply] [d/l] [select]