Hmm, isn't substr usually faster than a regex? If so, how about the following approach:
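A minimal sketch of the idea, assuming the value wanted is always the last space-delimited token before the closing `-->`. The helper name `last_field` is hypothetical and not from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: return the token that sits between the last two
# spaces of $str, or an empty return if $str has fewer than two spaces.
sub last_field {
    my ($str) = @_;
    my $end = rindex($str, ' ');               # space just before '-->'
    return if $end < 1;
    my $start = rindex($str, ' ', $end - 1);   # space just before the token
    return if $start < 0;
    # Length is $end - $start - 1 so that neither space is included.
    return substr($str, $start + 1, $end - $start - 1);
}

print last_field('<!-- USER 20 - donkey_pusher_6 -->'), "\n";   # prints donkey_pusher_6
```

This relies on the comment having exactly the spacing shown; a trailing space after `-->` or a missing space before it would break the assumption.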

Well, I'm sure that it would work, but would it be faster? I'll probably benchmark this myself when I have time to put together test data and code.

Anyway, that's my (Not-So-)Good Idea for the day.

Update: I just ran some benchmarks on a few of the suggested methods. Here are my code and results:

```perl
use Benchmark;

my $str = '<!-- USER 20 - donkey_pusher_6 -->';
my $data;
my $re = qr/--\s*USER\s+\d+\s*-\s*(\w+)/;
my ($start, $end);

# Regex using atomic groups (?>...) to rule out backtracking.
sub by_re_noback {
    ($data) = ($str =~ /
        ^ (?>\s*) <!-- (?>\s+) USER (?>\s+) (?>\d+) (?>\s+) - (?>\s+)
        (\S+?) (?>\s+) --> (?>\s*) $
    /ix);
}

# Plain inline regex.
# NOTE: as posted, this matched against $line, which is never assigned,
# so the 're' timing below was probably measured against an undefined value.
sub by_re {
    ($data) = ($str =~ m/<!-- USER \d+ - (\S+)/i);
}

# Regex precompiled with qr//.
sub by_re_comp {
    ($data) = ($str =~ $re);
}

# No regex: locate the last two spaces and take what lies between them.
sub by_substr {
    $end   = rindex($str, ' ');
    $start = rindex($str, ' ', $end - 1);
    # - 1 so the trailing space is not included in the result
    $data  = substr($str, $start + 1, $end - $start - 1);
}

timethese (100000, {
    subst     => \&by_substr,
    re_comp   => \&by_re_comp,
    re        => \&by_re,
    re_noback => \&by_re_noback,
});
```

--results--

```
Benchmark: timing 100000 iterations of re, re_comp, re_noback, subst...
       re:  1 wallclock secs ( 0.46 usr +  0.00 sys =  0.46 CPU) @ 217391.30/s (n=100000)
  re_comp:  4 wallclock secs ( 4.35 usr +  0.00 sys =  4.35 CPU) @ 22988.51/s (n=100000)
re_noback:  6 wallclock secs ( 6.27 usr +  0.00 sys =  6.27 CPU) @ 15948.96/s (n=100000)
    subst:  1 wallclock secs ( 1.40 usr +  0.00 sys =  1.40 CPU) @ 71428.57/s (n=100000)
```
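Timings like these only mean something if every routine extracts the same value, so a quick sanity check before benchmarking can be worthwhile. The following is a sketch under that assumption; `via_regex` and `via_substr` are hypothetical stand-ins for the regex and substr approaches, not the exact subs from the benchmark:

```perl
use strict;
use warnings;

my $str = '<!-- USER 20 - donkey_pusher_6 -->';

# Regex-based extraction: capture the non-space run after "USER <digits> - ".
sub via_regex {
    my ($s) = @_;
    my ($name) = $s =~ /<!-- USER \d+ - (\S+)/i;
    return $name;
}

# rindex/substr-based extraction: take the text between the last two spaces.
sub via_substr {
    my ($s) = @_;
    my $end   = rindex($s, ' ');
    my $start = rindex($s, ' ', $end - 1);
    return substr($s, $start + 1, $end - $start - 1);
}

# Print each result in quotes so stray whitespace is visible.
my %got = (regex => via_regex($str), substr => via_substr($str));
for my $name (sort keys %got) {
    printf "%-6s => '%s'\n", $name, $got{$name};
}
print $got{regex} eq $got{substr} ? "results agree\n" : "results differ\n";
```

Quoting the extracted value in the output makes an off-by-one in the substr length (a trailing space, for instance) easy to spot.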

--

There are 10 kinds of people -- those that understand binary, and those that don't.


In reply to Re: Regex simplification by mephit
in thread [untitled node, ID 192753] by Samn
