I'm not entirely sure what is going on here.
Although it's a particularly messy piece of code, it does actually work! I was concerned by the results you got, so I re-ran the code myself and was unable to reproduce them.
Because I am lazy I left the -w and use strict out of the code I posted (in my actual code everything is a little more strict). That omission could be what causes the effect you have seen, although with -w I do not get any errors and it works OK. Try this:
    #!/usr/local/bin/perl -w
    use strict;

    my $desc = "I %2Blike %3A cheese";
    my ($str, @ob);
    print "TEST: ",test(),"\n";

    sub test {
        $_ = $desc;
        while (/%[0-9A-Za-z]{2}/) {
            $_ = $&;
            /[0-9A-Za-z]{2}/;
            $ob[0] = hex($&);
            $str = pack("C*", @ob);
            $desc =~ s/%[0-9A-Za-z]{2}/$str/;
            $_ = $desc;
        }
        return $desc;
    }

returns:

    TEST: I +like : cheese
This definitely works OK for me.
UPDATE: My apologies, I've just noticed what was wrong with the code that I originally posted. $name is not initially set up, so the loop runs through one iteration and fails. As I didn't want to actually change $str I changed the code a little, but forgot to check it properly. This should work:
    ...
    'myway' => sub {
        $_ = $str;
        $name = $str;
        while (/%[0-9A-Fa-f]{2}/) {
            $_ = $&;
            ...
I guess the lessons learnt are:
BTW the new benchmark timings are:
    Benchmark: timing 500000 iterations of myway, regexpway...
         myway: 162 wallclock secs (162.07 usr + 0.00 sys = 162.07 CPU)
     regexpway: 58 wallclock secs (58.52 usr + 0.01 sys = 58.53 CPU)
which make a lot more sense.
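For anyone reading along without the earlier posts: the code behind regexpway is not repeated here, so the following is only a minimal sketch of what a single-substitution decoder usually looks like, not necessarily the version that was timed. The name decode_subst is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): decode %XX escapes in one pass.
sub decode_subst {
    my ($text) = @_;
    # /g replaces every escape; /e evaluates the replacement as Perl code.
    # Only two genuine hex digits are matched, so "%ZZ" is left untouched.
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

print "TEST: ", decode_subst("I %2Blike %3A cheese"), "\n";
# prints "TEST: I +like : cheese"
```

Working on a copy and using a capture group avoids both the globals and $&, which is part of why this style tends to be shorter than the loop.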
I now feel older, wiser and more than a little bit stupid.

RE: RE: RE: RE: form parsing, hex, HTML formatting
by takshaka (Friar) on May 23, 2000 at 02:12 UTC
    I figured you had a version that actually did work. FWIW, this is the only thing I could come up with that is as fast as the substitution. Unfortunately, it slows down when you put in the check for legit hex values (and it is much less readable).
        sub noregex {
            local $_ = $str;
            my $pos = 0;
            while ( (my $idx = index($_, '%', $pos)) > -1) {
                $pos = $idx + 1;
                my $code = substr($_, $pos, 2);
                substr $_, $idx, 3, pack("C*", hex $code);
            }
            return $_;
        }
        __END__
        Benchmark: timing 100000 iterations of NO_REGEX, REGEX...
          NO_REGEX:  8 wallclock secs ( 8.77 usr +  0.00 sys =  8.77 CPU) @ 11402.51/s (n=100000)
             REGEX:  9 wallclock secs ( 9.15 usr +  0.00 sys =  9.15 CPU) @ 10928.96/s (n=100000)
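    The check for legit hex values mentioned above is not shown in the post, so here is one way it might look; treat it as a sketch under that assumption. The name noregex_checked is hypothetical, it takes the string as an argument instead of reading a global $str, and it has not been benchmarked against the versions above.

```perl
use strict;
use warnings;

# Hypothetical variant of noregex (not from the thread) that only decodes
# an escape when '%' is followed by exactly two hex digits.
sub noregex_checked {
    my ($text) = @_;
    my $pos = 0;
    while ((my $idx = index($text, '%', $pos)) > -1) {
        $pos = $idx + 1;                      # resume just past this '%'
        my $code = substr($text, $pos, 2);
        # tr/// with an empty replacement only counts matching characters,
        # so a count of 2 means both characters are hex digits.
        next unless ($code =~ tr/0-9A-Fa-f//) == 2;
        substr($text, $idx, 3, chr(hex $code));
    }
    return $text;
}

print noregex_checked("I %2Blike %3A cheese"), "\n";
# prints "I +like : cheese"
```

    Because $pos always moves past the character just written, a decoded '%' (from %25) is not decoded a second time.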