in reply to RE: form parsing, hex, HTML formatting
in thread form parsing, hex, HTML formatting

Interesting Perl version you have. For me, your code is slower. On the use of $&, man perlvar states:
	The use of this variable anywhere in a program
	imposes a considerable performance penalty on all
	regular expression matches.  See the
	Devel::SawAmpersand module from CPAN for more
	information.
This is consistent with these results:
Benchmark: timing 500000 iterations of regexpway...
 regexpway: 22 wallclock secs (22.50 usr +  0.05 sys = 22.55 CPU)

and

Benchmark: timing 500000 iterations of myway, regexpway...
     myway: 50 wallclock secs (49.41 usr +  0.18 sys = 49.59 CPU)
 regexpway: 25 wallclock secs (23.24 usr +  0.09 sys = 23.33 CPU)

RE: RE: RE: form parsing, hex, HTML formatting
by mikfire (Deacon) on May 11, 2000 at 21:14 UTC
    /users/mfiresto/experiment)perl timeclean.pl
    Benchmark: timing 500000 iterations of myway, regexpway...
    myway: 24 wallclock secs (24.11 usr + 0.00 sys = 24.11 CPU)
    regexpway: 70 wallclock secs (67.60 usr + 0.00 sys = 67.60 CPU)

    /users/mfiresto/experiment)perl -v
    This is perl, version 5.005_03 built for sun4-solaris

    /users/mfiresto/experiment)uname -a
    SunOS XXXX 5.5.1 Generic_103640-28 sun4u sparc SUNW,Ultra-5_10

    /users/mfiresto/experiment)perl timeclean.pl
    Benchmark: timing 500000 iterations of myway, regexpway...
    myway: 19 wallclock secs (18.44 usr + 0.00 sys = 18.44 CPU)
    regexpway: 40 wallclock secs (40.65 usr + 0.00 sys = 40.65 CPU)

    /users/mfiresto/experiment)perl -v
    This is perl, version 5.005_03 built for i686-linux

    /users/mfiresto/experiment)uname -a
    Linux XXXX 2.0.36 #1 Tue Dec 29 13:11:13 EST 1998 i686 unknown

    And, just for something completely different, I tried this against perl 5.6.0, and the results are very similar.

    UPDATE
    It has dawned on me while reading this thread that this exposes one of the dangers of relying too heavily on Benchmark to determine the best algorithm. Notice that the man page for $& states it will impose a serious penalty on all other regexes. In my normal code this would indeed be disastrous, because I use a lot of regexes.

    However, in the Benchmark code, there are only two regex statements used. With a bit of cleverness, I was able to remove the $& reference and saw no real speed improvement. That is because Benchmark does not run in real-world conditions. We all tend to extract the part we wish to test and just run that. In this case, that may not work. If we were to test this code in a real-world situation, we might see a difference.
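For anyone wondering what removing a $& reference can look like: the code being benchmarked is not shown in this thread, so the following is only a sketch of the general idea, assuming the task is decoding %XX escapes in a form value. The sub name decode_escapes is made up for illustration. The capture group puts the hex digits in $1, so $& never has to appear anywhere in the program.

```perl
use strict;
use warnings;

# Hypothetical helper (not the code from this thread): decode %XX escapes
# using a capture group, so that $& is never mentioned.
sub decode_escapes {
    my ($text) = @_;
    # $1 holds the two hex digits matched inside the parentheses.
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

print decode_escapes('a%20b%21'), "\n";   # prints "a b!"
```

Whether this is faster than a $& version depends on how many other regexes the program runs, which is the whole point above.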

    Then again, we may not. According to some quick experiments, Benchmark cannot reliably measure the first method in fewer than (approximately) 9500 iterations. Personally, I have not seen a CGI parameter that contains 9500 'escaped' characters. Do the benchmarks at this point really mean anything?
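If picking an iteration count is the problem, Benchmark accepts a negative count, which it treats as a minimum number of CPU seconds to run each test instead of a fixed number of iterations. A minimal sketch follows; the two subs and the sample input are stand-ins I made up, not the myway and regexpway code from this thread.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Made-up sample input; a real CGI value would usually be far shorter.
my $input = 'name%3DMik%20%26%20friends' x 50;

# A negative count means "run each sub for at least that many CPU seconds".
my $results = timethese(-1, {
    capture_chr  => sub { (my $t = $input) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge },
    capture_pack => sub { (my $t = $input) =~ s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/ge },
});

# Print a comparison table of the measured rates.
cmpthese($results);
```

This only makes the measurement itself steadier. It does nothing about the bigger concern that an isolated snippet does not behave like the program it came from.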

    Shouldn't we be more concerned with good code? How about which one is easier to maintain? How about which one is more Perlish? Which one fits the coder better?

    Wow. Sorry to rant. I have been thinking about this too much.

    Mik

RE: RE: RE: form parsing, hex, HTML formatting
by muppetBoy (Pilgrim) on May 11, 2000 at 16:44 UTC
    That was exactly what I was expecting; I thought the $& would cause a big performance hit. I'm even more puzzled by my results now :-(
    Incidentally, I'm running v. 5.005003 on Solaris.