![]() |
|
P is for Practical | |
PerlMonks |
mikeraz's scratchpadby mikeraz (Friar) |
on May 26, 2005 at 23:39 UTC ( #460907=scratchpad: print w/replies, xml ) | Need Help?? |
Palindrome codelet moved here Re: regex at word boundary Testing against QM's RE solution. Also an Update on the effect of using Regexp::Common, see bottom. mikeraz@tire:~/pal$ file tdata tdata: ISO-8859 mail text, with very long lines mikeraz@tire:~/pal$ wc tdata 216527 539118 13345250 tdata mikeraz@tire:~/pal$ time ./mr_pal.pl tdata > mr.out ; time ./qm_pal.pl tdata > qm.out real 0m9.914s user 0m9.799s sys 0m0.073s real 1m37.445s user 1m32.658s sys 0m0.135s mikeraz@tire:~/pal$ perl -v This is perl, v5.8.7 built for i486-linux-gnu-thread-multitdata is a mbox file with 591 messages. Repeated runs return essentially the same results. Note: these tests were performed earlier on a Solaris Sparc 5 with Perl v5.6.1 and the QM solution did not work at all, the RE used is not valid in that environment. I'm surprised here by the extreme difference in the run time of the two routines. Here is the acutal code snippets I'm using in the test. mr_pal.pl qm_pal.pl
Update: to not reinvent the wheel I installed Regexp::Common and modified my code to use it: mikeraz@tire:~/pal$ diff -u mr_pal.pl re_pal.pl --- mr_pal.pl 2005-12-12 08:24:57.000000000 -0800 +++ re_pal.pl 2005-12-13 06:47:01.080509464 -0800 @@ -1,6 +1,10 @@ #!/usr/bin/perl -w use strict; use POSIX; +use Regexp::Common qw /lingua/; + +# /^$RE{lingua}{palindrome}$/ and print "is a palindrome\n"; + # find palindromes in text file @@ -25,7 +29,7 @@ # test for palindrome $test = lc join "", @F0..$i; - if($test eq reverse $test) { + if($test =~ /^$RE{lingua}{palindrome}$/) { $pal = join " ", @F0..$i; print "$. $pal\n"; }and it's runtime is also slow, as in: mikeraz@tire:~/pal$ time ./re_pal.pl tdata > out.re real 1m3.960s user 1m2.199s sys 0m0.133sPerhaps I should unroll my logic a bit so the lingua test acts on the whole line, rather than my decomposing it first, So it seems that it is optimal to find a palindrome candidate, strip out punctuation and spacing, and test by comparing the string against a reverse of itself. What are the holes in this analysis? |
|