Hrm. Actually, that's what the \G is for. It anchors the match at the position where the last /g match on that string left off (sorta like the ^ anchor, but at pos() instead of the start) and defaults to the start of the string if no previous match was made. So, theoretically, the first RE shouldn't match at all until after the second RE has matched at least once. And that's my problem: it isn't working that way.
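To make the expected behaviour concrete, here is a minimal sketch of how \G and pos() interact. It is not the actual tokenizer: the helper name trace_g_anchor and the sample string are made up for illustration, and it uses /gc (as the perlop scanner does) so that a failed match leaves pos() alone instead of resetting it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the real routine): logs what a sequence of
# \G-anchored matches does to a string and its pos().
sub trace_g_anchor {
    my ($str) = @_;
    my @log;

    # No /g match has run on $str yet, so pos() is undef and \G means offset 0.
    push @log, defined pos($str) ? pos($str) : 'undef';

    # Anchored at offset 0, where there is no ampersand, so this fails.
    # The /c modifier keeps pos() where it was after the failure.
    push @log, ($str =~ /\G&(\w+)/gc ? "amp:$1" : 'no-amp');

    # Consume the leading non-ampersand text; pos() advances past it.
    push @log, ($str =~ /\G([^&]+)/gc ? "text:$1" : 'no-text');
    push @log, pos($str);

    # Now \G sits directly on the ampersand, so the first pattern can match.
    push @log, ($str =~ /\G&(\w+)/gc ? "amp:$1" : 'no-amp');

    return @log;
}

print join(', ', trace_g_anchor('abc &foo')), "\n";
```

The point of the sketch is only the ordering: the ampersand pattern cannot succeed until the text pattern has moved pos() up to the ampersand.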

Also, there are several other REs in the actual tokenizer, so alternation isn't really an option.

I built my example more or less out of the snippet provided here under the section about the /g modifier, and the other parts of the "lex-like scanner" (as it's called in the perlop page) work just fine. The problem seems to be totally with the \\? (specifically the ? part).

I do thank you for the suggestion though. :-)

(PS: Just so it's clear: the idea is to match an identifier preceded by an ampersand and, optionally, a single backslash. The characters in front of the ampersanded identifier (except for the optional backslash) should be matched by the second RE, effectively breaking the string into "tokens". The actual routine is a bit more complicated and has a lot of other, unrelated code that I took out for (sanity|readability)'s sake.)

(PPS: Okay, so I used a dollar-sign in the code snippet. It's an ampersand in the actual routine, I promise. :-) And it doesn't matter anyway.)
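For reference, the intent described in the PS can be sketched roughly like this. This is an assumption-laden illustration, not the real routine: the sub name tokenize, the token labels (IDENT, ESCAPED, TEXT) and the exact "everything else" pattern are all invented here, and the /gc modifier follows the perlop scanner so that a failed alternative does not reset pos().

```perl
use strict;
use warnings;

# Hypothetical sketch of the two-RE tokenizer idea; names and token labels
# are illustrative only.
sub tokenize {
    my ($str) = @_;
    my @tokens;

    pos($str) = 0;    # start scanning explicitly at the beginning
    while (pos($str) < length $str) {
        # First RE: optional single backslash, ampersand, identifier.
        if ($str =~ /\G(\\?)&(\w+)/gc) {
            push @tokens, [ ($1 ? 'ESCAPED' : 'IDENT'), $2 ];
        }
        # Second RE: everything up to the next (possibly escaped)
        # ampersanded identifier. Stray backslashes and ampersands that do
        # not start such an identifier count as plain text.
        elsif ($str =~ /\G((?:[^\\&]|\\(?!&\w)|&(?!\w))+)/gc) {
            push @tokens, [ 'TEXT', $1 ];
        }
        else {
            last;     # safety net; one of the two patterns should always match
        }
    }
    return @tokens;
}

# Usage: print each token as LABEL:value
print join('|', map { "$_->[0]:$_->[1]" } tokenize('x = &foo + \&bar;')), "\n";
```

With several more REs in the real tokenizer, each would presumably become another elsif branch anchored with \G in the same way.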

bbfu
Seasons don't fear The Reaper.
Nor do the wind, the sun, and the rain.
We can be like they are.


In reply to (bbfu)(logical 'or' short-circut)Re: Re: RE prollem: \G, \\? and disappearing data by bbfu
in thread RE prollem: \G, \\? and disappearing data by bbfu
