dannoura has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
I tried using the different methods proposed in this thread to speed up my regex. There are basically three methods (the code is posted below), and I timed them with dprof. The result is:
Total Elapsed Time = 0.719999 Seconds
  User+System Time = 0.186597 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 17.1   0.032  0.032   2335   0.0000 0.0000  main::method2
 16.6   0.031  0.031   2335   0.0000 0.0000  main::method3
 8.57   0.016  0.016      2   0.0080 0.0080  main::BEGIN
 0.00   0.000 -0.000      1   0.0000      -  strict::import
 0.00   0.000 -0.000      1   0.0000      -  strict::bits
 0.00   0.000 -0.000      1   0.0000      -  warnings::BEGIN
 0.00   0.000 -0.000      1   0.0000      -  Exporter::import
 0.00   0.000 -0.000      1   0.0000      -  warnings::import
 0.00   0.000 -0.000   2335   0.0000      -  main::method1
The question is: why is method1 so much faster than the others? I'm guessing the people who replied to my post are all experienced programmers, so I can see no reason why one method would be so much faster than the others. Does anyone have any ideas about this?
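#! c:\perl\bin
use strict;
use warnings;

my $text = "";
my @abstracts = ();

# Slurp the whole file and pull out every <ABSTRACT>...</ABSTRACT> body.
open(FH, "pc clean25.txt") or die("can't");
$text = join("", <FH>);
(@abstracts) = ($text =~ /<ABSTRACT>(.*?)<\/ABSTRACT>/sg);

foreach my $abstract (@abstracts) {
    print "method 1: ",  method1($abstract, "prostate"),
          " method 2: ", method2($abstract, "prostate"),
          " method 3: ", method3($abstract, "prostate"), "\n";
}

# Count matches by assigning the list-context match to an empty list.
sub method1 {
    my $text = shift;
    my $gene = shift;
    my $count = () = $text =~ /$gene/g;
    return $count;
}

# Count occurrences with index(); the loop ends when index() returns -1.
sub method2 {
    my $text = shift;
    my $gene = shift;
    my $count = 0;
    my $p = 0;
    ++$count while $p = 1 + index($text, $gene, $p);
    return $count;
}

# Count matches of a precompiled pattern. Note the \b anchors: unlike
# method1 and method2, this counts whole words only.
sub method3 {
    my $text = shift;
    my $gene = shift;
    my $count = 0;
    my $patn = qr/\b$gene\b/;
    $count++ while $text =~ /$patn/g;
    return $count;
}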
And the file simply contains a lot of abstracts.
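The whole profiled run used under 0.2 seconds of CPU time, and the exclusive times reported (0.016, 0.031 and 0.032 s) look like one or two clock ticks, so the gap between the methods may reflect timer granularity more than real cost. A possible cross-check is the core Benchmark module, which repeats each sub long enough to get a measurable time. The sketch below is only that: the sample text is synthetic (the real "pc clean25.txt" is not shown), the sub names `count_regex`, `count_index` and `count_word` are made up for illustration, and `\Q...\E` is added as a precaution in case the search term contains regex metacharacters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic stand-in for one abstract (assumption: the real data is not shown).
my $abstract = join ' ',
    ('prostate cancer and prostatectomy outcomes in prostate tissue') x 50;
my $gene = 'prostate';

# Like method1: list-context match assigned to an empty list, then counted.
sub count_regex {
    my ($text, $gene) = @_;
    my $count = () = $text =~ /\Q$gene\E/g;
    return $count;
}

# Like method2: plain substring search with index().
sub count_index {
    my ($text, $gene) = @_;
    my ($count, $pos) = (0, 0);
    while (($pos = index($text, $gene, $pos)) >= 0) {
        $count++;
        $pos++;    # resume the search one character past this match's start
    }
    return $count;
}

# Like method3: precompiled pattern with word boundaries (whole words only).
sub count_word {
    my ($text, $gene) = @_;
    my $patn  = qr/\b\Q$gene\E\b/;
    my $count = 0;
    $count++ while $text =~ /$patn/g;
    return $count;
}

# Show the counts first: the \b version matches fewer times because it
# skips "prostatectomy", so it is not doing the same work as the others.
printf "regex: %d, index: %d, word: %d\n",
    count_regex($abstract, $gene),
    count_index($abstract, $gene),
    count_word($abstract, $gene);

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    regex => sub { count_regex($abstract, $gene) },
    index => sub { count_index($abstract, $gene) },
    word  => sub { count_word($abstract, $gene) },
});
```

The rates printed by `cmpthese` depend on the machine and the input, so they should be read as relative figures for this synthetic text only, not as a prediction for the real abstracts.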
Replies:

Re: Regular expression speed issues
by Tomte (Priest) on Jul 22, 2003 at 10:04 UTC

Re: Regular expression speed issues
by gjb (Vicar) on Jul 22, 2003 at 08:36 UTC

Re: Regular expression speed issues
by Abigail-II (Bishop) on Jul 22, 2003 at 07:54 UTC
by TimToady (Parson) on Jul 22, 2003 at 16:08 UTC
by dannoura (Pilgrim) on Jul 22, 2003 at 08:02 UTC

Re: Regular expression speed issues
by TomDLux (Vicar) on Jul 22, 2003 at 08:09 UTC
by dannoura (Pilgrim) on Jul 22, 2003 at 08:28 UTC