in reply to HOP::Lexer not doing what I expected

Order is important:

------------- 8< ---------
my $lexer = make_lexer(
    sub { shift @sql },    # iterator
    [ DQUOTED => qr/"[^"]+"/ ],
    [ QUOTED  => qr/'[^']*'/ ],
    [ DQWORD  => qr/"\w+"/ ],
    [ WORD    => qr/\w+/i ],
    [ COMMA   => qr/,/ ],
    [ SPACE   => qr/\s+/, sub {} ],
);
------------- 8< ---------

Prints:

['WORD','select'],
['WORD','case'],
['WORD','when'],
['WORD','a'],
'=',
['WORD','b'],
['WORD','then'],
['QUOTED','\'c\''],
['WORD','else'],
['QUOTED','\'d\''],
['WORD','end'],
['DQUOTED','"tough_one"'],
['COMMA',','],
['WORD','e'],
['WORD','as'],
['DQUOTED','"even tougher"'],
['WORD','from'],
['WORD','mytable']

DWIM is Perl's answer to Gödel

Re^2: HOP::Lexer not doing what I expected
by Corion (Patriarch) on Nov 11, 2006 at 18:50 UTC

    But your order is wrong:

    ['DQUOTED','"tough_one"'],

    shouldn't be a DQUOTED but a DQWORD - your code won't ever match a DQWORD, which I think was the original goal.

      Yes, but at least he got QUOTED to match, which is something I couldn't do.

      I just can't make sense of the ordering rules.

        This gives the output you wanted:
        [ DQWORD  => qr/"\w+"/ ],
        [ DQUOTED => qr/"[^"]+"/ ],
        [ QUOTED  => qr/'[^']*'/ ],
        [ WORD    => qr/\w+/i ],
        [ COMMA   => qr/,/ ],
        [ SPACE   => qr/\s+/, sub {} ],
        Note that DQWORD has to come before DQUOTED because the first is a subset of the second (everything DQWORD matches, DQUOTED matches too). QUOTED can go before or after the double-quoted rules. I still don't understand why WORD can't be the first one, though.
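
The subset point can be checked without HOP::Lexer at all. This is only a minimal sketch in plain Perl: `first_label` is a hypothetical helper (not part of the module) that tries each rule in order against a whole string and reports the first label that matches.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HOP::Lexer: try each rule in order
# against the whole string and return the label of the first that matches.
sub first_label {
    my ($text, @rules) = @_;
    for my $rule (@rules) {
        my ($label, $re) = @$rule;
        return $label if $text =~ /\A$re\z/;
    }
    return;    # no rule matched
}

my @dqword_first  = ( [ DQWORD => qr/"\w+"/ ], [ DQUOTED => qr/"[^"]+"/ ] );
my @dquoted_first = reverse @dqword_first;

for my $text ('"tough_one"', '"even tougher"') {
    printf "%-16s DQWORD first: %-8s DQUOTED first: %s\n",
        $text,
        first_label($text, @dqword_first),
        first_label($text, @dquoted_first);
}
```

With DQUOTED listed first, `"tough_one"` is labelled DQUOTED and DQWORD never gets a chance; `"even tougher"` is DQUOTED in either order because `\w+` cannot cross the space.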
Re^2: HOP::Lexer not doing what I expected
by ikegami (Patriarch) on Nov 11, 2006 at 18:49 UTC
    Why? WORD and DQUOTED can't match the same thing. Or is the match unanchored (i.e. preceded by .*?)?

      I'm wondering that myself! I installed HOP::Lexer to have a play and learn (I've not used the module before). It seemed that WORD was matching first and changing the order fixed that as I expected. It was not obvious why it should match first however.

      I've now skim read the documentation (including HOP::Lexer::Article) and still don't understand why WORD was matching! Maybe time to trawl through the code?
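
One possible explanation, offered as a guess and not checked against the HOP::Lexer source: if the lexer applies each rule in turn to all of the text that earlier rules left unlexed, instead of trying every rule at the current position, then an early WORD rule would claim `tough_one` from inside the quotes before DQUOTED ever runs. The sketch below models that assumption in plain Perl; `layered_lex` is a hypothetical function, and it assumes the rule patterns contain no capture groups of their own.

```perl
use strict;
use warnings;

# Hypothetical model, NOT HOP::Lexer's code: apply each rule in turn to
# every chunk of text that no earlier rule has claimed yet.
sub layered_lex {
    my ($text, @rules) = @_;
    my @items = ($text);    # plain strings are still unlexed
    for my $rule (@rules) {
        my ($label, $re) = @$rule;
        my @next;
        for my $item (@items) {
            if (ref $item) { push @next, $item; next }    # already a token
            # split with a capture group keeps the matched text at odd indexes
            my @parts = split /($re)/, $item;
            for my $i (0 .. $#parts) {
                next if $parts[$i] eq '';
                push @next, $i % 2 ? [ $label, $parts[$i] ] : $parts[$i];
            }
        }
        @items = @next;
    }
    return @items;
}

my @word_first   = ( [ WORD => qr/\w+/ ], [ DQUOTED => qr/"[^"]+"/ ] );
my @quoted_first = reverse @word_first;

for my $rules (\@word_first, \@quoted_first) {
    print join(' ', map { ref $_ ? "[$_->[0] $_->[1]]" : "'$_'" }
        layered_lex('"tough_one"', @$rules)), "\n";
}
```

Under this model, WORD first leaves two stray `"` strings around a WORD token, while DQUOTED first yields a single DQUOTED token. Leftover text passing through as a plain string would also account for the bare '=' in the output at the top of the thread.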


      DWIM is Perl's answer to Gödel