
In a word, no.

Reversing the string and matching against a reversed regex is much faster than the greedy version.
Have a look at this benchmark:

#!/usr/bin/perl -w

use strict;
use Benchmark;

my $string = "<<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf> ;strip_me";

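sub reversed {
  # Reverse the string so the trailing ';word' sits at the front,
  # strip it there, then reverse the result back.
  my $reverse = reverse(shift);
  $reverse =~ s| \w* ; \s* > |>|x;
  return scalar reverse $reverse;
}

sub greedy {
  # Let .* run to the end and backtrack to the last '>' followed by ';word'.
  my $line = shift;
  $line =~ s|^ (.*>) \s* ; \w* |$1|x;
  return $line;
}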

print "Reversed: ", reversed($string), "\n";
print "Greedy: ", greedy($string), "\n";

timethese( -10,{
                reversed => sub { reversed( $string ) },
                greedy => sub { greedy( $string ) },
               } );

Output:

Reversed: <<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf>
Greedy: <<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf>
Benchmark: running greedy, reversed, each for at least 10 CPU seconds...
    greedy: 10 wallclock secs ( 9.98 usr + 0.02 sys = 10.00 CPU) @ 78480.80/s (n=784808)
  reversed: 11 wallclock secs (10.46 usr + 0.00 sys = 10.46 CPU) @ 167660.04/s (n=1753724)

As you can see, the reversed version runs at over twice the rate (about 167,660/s against 78,481/s). On longer strings, I'd expect the difference to be even greater.
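If you want to check the longer-string case yourself, here is a rough sketch. The filler text, the repeat count and the iteration count are my own assumptions, not something from the original thread, and I'm not quoting any timings for it since they will vary by machine and perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Same two approaches as in the benchmark above.
sub reversed {
    my $reverse = reverse(shift);
    $reverse =~ s| \w* ; \s* > |>|x;
    return scalar reverse $reverse;
}

sub greedy {
    my $line = shift;
    $line =~ s|^ (.*>) \s* ; \w* |$1|x;
    return $line;
}

# Hypothetical test data: the filler and the repeat count are assumptions.
my $long = ( "<tag> filler " x 1000 ) . "<xyzfdgfghgf> ;strip_me";

# Sanity check: both approaches should agree before we bother timing them.
die "results differ\n" unless reversed($long) eq greedy($long);

# A fixed iteration count keeps the run short; raise it for steadier numbers.
timethese( 20_000, {
    reversed => sub { reversed($long) },
    greedy   => sub { greedy($long) },
} );
```

Change the repeat count on $long and rerun to see how the two rates move relative to each other.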

Also, your regex is wrong. Read through perldoc:perlre (specifically, the section marked 'Warning on \1 vs $1') to discover why.
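To illustrate the perlre point in general terms (this is not a reconstruction of the regex from the earlier post), here is a minimal sketch. The sub name append_zeros and the sample input are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: append "000" to the first number.
sub append_zeros {
    my ($text) = @_;
    # On the replacement side, use $1 rather than \1. Writing ${1}000 makes
    # it unambiguous that we mean group 1 followed by the literal "000";
    # a bare \1000 cannot be disambiguated the same way.
    $text =~ s/(\d+)/${1}000/;
    return $text;
}

print append_zeros("price: 42"), "\n";    # prints "price: 42000"
```

Inside the pattern itself \1 is still the right way to write a backreference; the warning only concerns the right-hand side of s///.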

In reply to Re: Re: Re: parsing question by kilinrax
in thread parsing question by Washie101
