Would this work? I don't have the ops integrated into Parrot's normal set yet, but I'd like to know if I'm on the right track. re_match is like calling RE_0 as a subroutine, and re_finished is like returning from it. The set is an assignment (in this case to Integer register 0); pretty much everything else that may look odd in here has been explained inline.
    re_match RE_0, "afoobarz"

RE_0:   # /fo*?bar/s (and yes, the "s" flag is pointless)
    re_flags "s"
    re_minlength 4
$start:             #a 'local' label--it disappears the next time a normal
                    # label is seen
    re_pushindex    #in case we fail in $find_o or something
    re_literal "f", $find_bar
    re_popindex
    re_advance $failure
    branch $start   #think "goto"
$find_o:
    re_literal "o", $find_bar
    branch $start
$find_bar:
    re_literal "bar", $success
    branch $find_o
$success:
    set I0, 1
    branch $end
$failure:
    set I0, 0
$end:
    re_finished
Looking at this, it seems that most of the time we use the second parameter, we're trying to avoid a branch that is only there in case we fail. Is this just a quirk of this example, or might it make more sense for the second parameter to represent where to jump if we fail?
$start:
    re_pushindex
    re_literal "f", $advance
$find_bar:
    re_literal "bar", $find_o
    #if we made it this far, we're done!
    set I0, 1
    branch $end
$find_o:
    re_literal "o", $advance
    branch $find_bar
$advance:
    re_popindex
    re_advance $failure
    branch $start
$failure:
    set I0, 0
$end:
    re_finished
This would be a bigger win in the case of, say, several character classes in a row:
RE_1:   #/[az][by][cx]/
    re_flags ""
    re_minlength 3
$start:
    re_pushindex
    re_oneof "az", $advance
    re_oneof "by", $advance
    re_oneof "cx", $advance
    set I0, 1
    branch $end
$advance:
    re_popindex
    re_advance $fail
    branch $start
$fail:
    set I0, 0
$end:
    re_finished
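To sanity-check the jump-on-failure idea without real ops, here is a rough Python sketch of an interpreter for the second and third listings. Everything about it is an assumption on my part rather than settled Parrot behaviour: the tuple encoding and the run function are made up for illustration, re_advance is taken to move the index forward by one and jump to its label once no characters are left, re_flags and re_minlength are left out, and only I0 is modelled.

```python
def run(program, subject):
    """Interpret a list of ops and labels; return the final value of I0."""
    # Plain strings in the program are labels; tuples are ops.
    labels = {item: pos for pos, item in enumerate(program)
              if isinstance(item, str)}
    index, saved, i0, pc = 0, [], None, 0
    while pc < len(program):
        item = program[pc]
        pc += 1
        if isinstance(item, str):
            continue  # labels are only jump targets
        op, *args = item
        if op == "re_pushindex":
            saved.append(index)
        elif op == "re_popindex":
            index = saved.pop()
        elif op == "re_literal":
            text, on_fail = args
            if subject.startswith(text, index):
                index += len(text)      # matched: fall through
            else:
                pc = labels[on_fail]    # failed: jump (assumed semantics)
        elif op == "re_oneof":
            chars, on_fail = args
            if index < len(subject) and subject[index] in chars:
                index += 1
            else:
                pc = labels[on_fail]
        elif op == "re_advance":
            index += 1
            if index >= len(subject):   # assumed: nothing left to try
                pc = labels[args[0]]
        elif op == "branch":
            pc = labels[args[0]]
        elif op == "set":
            i0 = args[1]                # only I0 is modelled here
        elif op == "re_finished":
            break
        else:
            raise ValueError("unknown op: " + op)
    return i0

# /fo*?bar/ with the second parameter meaning "where to jump if we fail"
RE_0 = [
    "$start", ("re_pushindex",), ("re_literal", "f", "$advance"),
    "$find_bar", ("re_literal", "bar", "$find_o"),
    ("set", "I0", 1), ("branch", "$end"),
    "$find_o", ("re_literal", "o", "$advance"), ("branch", "$find_bar"),
    "$advance", ("re_popindex",), ("re_advance", "$failure"),
    ("branch", "$start"),
    "$failure", ("set", "I0", 0),
    "$end", ("re_finished",),
]

# /[az][by][cx]/
RE_1 = [
    "$start", ("re_pushindex",),
    ("re_oneof", "az", "$advance"),
    ("re_oneof", "by", "$advance"),
    ("re_oneof", "cx", "$advance"),
    ("set", "I0", 1), ("branch", "$end"),
    "$advance", ("re_popindex",), ("re_advance", "$fail"),
    ("branch", "$start"),
    "$fail", ("set", "I0", 0),
    "$end", ("re_finished",),
]

print(run(RE_0, "afoobarz"), run(RE_1, "qzycq"))
```

Under those assumptions, the three re_oneof ops in a row need no extra branches at all, which is the win I'm after.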

=cut
--Brent Dax
There is no sig.


In reply to Re: Re: Re: Re: Flattening REs into opcodes for Perl 6 by BrentDax
in thread Flattening REs into opcodes for Perl 6 by BrentDax
