in reply to Re^2: HOP::Lexer not doing what I expected
in thread HOP::Lexer not doing what I expected

Word boundaries? Hmm... interesting take. It's not something that's been mentioned in the docs, or in the perl.com article.

Where it really does go wrong, in my opinion, is that it makes no attempt to find the leftmost match, which is what every lexer is supposed to do. So while you can rightfully argue that it must find "select" in the string "selectx", it makes no sense to skip the first "x" in "xselectx". No other lexer or parser in the world would do that, at least not by design.
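
To make the difference concrete, here is a rough sketch of the two strategies. It is written in Python and is not HOP::Lexer's actual code: the RULES table, lex_by_split and lex_leftmost are names made up for illustration, and the second function assumes the conventional approach of trying each rule at the current position, with the first rule that matches there winning.

```python
import re

# Hypothetical rule table for illustration: (label, pattern), in priority order.
# The patterns must not contain capturing groups of their own.
RULES = [("KEYWORD", r"select"), ("WORD", r"\w+")]


def lex_by_split(text, rules=RULES):
    """Apply each rule to all remaining text in turn, splitting around its matches."""
    pieces = [text]  # str = not yet tokenized, tuple = finished token
    for label, pattern in rules:
        next_pieces = []
        for piece in pieces:
            if isinstance(piece, tuple):
                next_pieces.append(piece)
                continue
            # The capturing group makes re.split keep the matched text,
            # which then sits at the odd indexes of the result.
            for i, chunk in enumerate(re.split(f"({pattern})", piece)):
                if chunk == "":
                    continue
                next_pieces.append((label, chunk) if i % 2 else chunk)
        pieces = next_pieces
    return pieces


def lex_leftmost(text, rules=RULES):
    """Walk the input left to right; at each position the first matching rule wins."""
    tokens, pos = [], 0
    while pos < len(text):
        for label, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and m.end() > pos:
                tokens.append((label, m.group()))
                pos = m.end()
                break
        else:
            # No rule matches here: pass one character through untouched.
            tokens.append(text[pos])
            pos += 1
    return tokens


print(lex_by_split("xselectx"))
print(lex_leftmost("xselectx"))
print(lex_leftmost("selectx"))
```

With these rules the split-based version cuts "select" out of the middle of "xselectx" and leaves an "x" on either side, while the position-by-position version sees a single WORD. On "selectx" the second version still finds the keyword first, which is the behavior I would accept.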


Re^4: HOP::Lexer not doing what I expected
by cmarcelo (Scribe) on Nov 11, 2006 at 22:39 UTC

    The question of word boundaries doesn't show up in the article because the example its author uses doesn't need them, so everything works fine there. In your example, though, they make a difference.

    I know very little about lexers, but I agree that using split causes unexpected behavior (it doesn't match the leftmost rule). It has proven useful in the article's example, though, where it lets you write rules only for what matters (ignoring the = symbol, for example). I don't know how hard or easy it would be to do that with leftmost rule matching; the use of split is convenient here.

    And note that I didn't say x must be skipped (treated as garbage), at least not under the rules I mentioned: it is matched by WORD, and then KEYWORD matches select. HOP::Lexer knows nothing about boundaries, nor does it give any special meaning to \s; you must tell it explicitly if you want to match select only in " select " or " select, " but not in "selectx".
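
    As a small illustration of "telling it", the sketch below compares a bare keyword pattern with one wrapped in \b word-boundary assertions. It uses Python's re module rather than HOP::Lexer, and BARE, BOUNDED and keyword_hits are hypothetical names; I am assuming the same bounded pattern would be what you put in the KEYWORD rule.

```python
import re

# Hypothetical patterns for a KEYWORD rule.
BARE = r"select"         # matches "select" anywhere, even inside a longer word
BOUNDED = r"\bselect\b"  # matches "select" only as a whole word


def keyword_hits(pattern, text):
    """Return every piece of text that the keyword pattern matches."""
    return re.findall(pattern, text)


for sample in (" select ", " select, ", "selectx"):
    print(repr(sample), keyword_hits(BARE, sample), keyword_hits(BOUNDED, sample))
```

    The bounded pattern still matches in " select " and " select, " but no longer matches inside "selectx", whereas the bare pattern matches in all three.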