In my incoming text, HOP::Lexer seems to incorrectly advance to the next token on qualified names. Am I doing something wrong?

For example, when it encounters "Enterprise Warehouse"."Charge Date" AS "Charge Date"

I get the following output...

next FQNAME2 Enterprise Warehouse.Charge Date

current: FQNAME2 Enterprise Warehouse.Charge Date

next FQNAME1 Enterprise Warehouse

current: FQNAME1 Enterprise Warehouse

next FQNAME1 Enterprise Warehouse

current: FQNAME1 Enterprise Warehouse

Code below:

use strict;
use warnings;
use HOP::Lexer 'make_lexer';

my $sql = <<END_SQL;
DECLARE DIMENSION "Enterprise Warehouse"."Charge Date" AS "Charge Date" UPGRADE ID 8444734
ON ( "Enterprise Warehouse"."Charge Date"."Charge Date Total" )
DEFAULT ROOT "Enterprise Warehouse"."Charge Date"."Charge Date Total"
DESCRIPTION {This is the date hierarchy for the Charge date.}
PRIVILEGES ( READ);
END_SQL

my @keywords = (
    'DECLARE DIMENSION', 'AS', 'UPGRADE ID', 'DESCRIPTION',
    'PRIVILEGES', 'ON', 'DEFAULT ROOT'
);

my @sql   = $sql;
my $lexer = make_lexer(
    sub { shift @sql },
    [ 'KEYWORD',   qr/(?i:@{[join '|', map {$_} @keywords]})/ ],
    [ 'UPRGADEID', qr/\d+/ ],
    [ 'COMMA',     qr/,/ ],
    [ 'PAREN',     qr/\(/, sub { [shift, 1] } ],
    [ 'PAREN',     qr/\)/, sub { [shift, -1] } ],
    [ 'BRACE',     qr/\{/, sub { [shift, 1] } ],
    [ 'BRACE',     qr/\}/, sub { [shift, -1] } ],
    # [ 'TEXT',    qr/\([^\(]+\)\)/, \&text ],
    # [ 'TEXT',    qr/({[^{]+})/, \&text ],
    [ 'FQNAME3',   qr/("[^"]+".){2}\"[^"]+"/, \&text ],
    [ 'FQNAME2',   qr/("[^"]+".)\"[^"]+"/, \&text ],
    [ 'FQNAME1',   qr/("[^"]+")/, \&text ],
    [ 'PERIOD',    qr/\./ ],
    [ 'SPACE',     qr/\s*/, sub {} ],
);

sub text {
    my ( $label, $value ) = @_;
    $value =~ s/["']//;
    $value =~ s/["']$//;
    $value =~ s/\".\"/./;
    return [ $label, $value ];
}

my $inside_parens = 0;
while ( defined ( my $token = $lexer->() ) ) {
    my ( $label, $value ) = @$token;
    $inside_parens += $value if 'PAREN' eq $label;
    print "current: $label $value \n";
    my $next = $lexer->('peek');
    my ( $next_label, $next_value ) = @$next;
    print "next $next_label $next_value \n";
    # next if $inside_parens || 'TEXT' ne $label;
    if ( defined ( my $next = $lexer->('peek') ) ) {
        my ( $next_label, $next_value ) = @$next;
        if ( 'COMMA' eq $next_label ) {
            print "$value\n";
        }
        elsif ( 'KEYWORD' eq $next_label && 'from' eq $next_value ) {
            print "$value\n";
            last;    # we're done
        }
    }
}

In reply to hop:lexer question by Freewilly3d
