The usual approach to tracking line numbers is to make the attribute associated with each token slightly more complex: an anonymous array or hash. The following is an example of a lexer for a C-like language written in Parse::Eyapp. Each token attribute is an anonymous array holding the value and the line number. Thus, for a number we return
('INUM',[$1, $tokenbegin])
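To make the convention concrete before the full listing, here is a minimal, self-contained sketch of the same idea in plain Perl (not Parse::Eyapp): the lexer keeps a line counter and returns each token together with an anonymous array `[ value, line ]`. The names `next_token`, `$src` and `$line` are my own, chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: a tiny tokenizer whose token attribute is the
# anonymous array [ lexeme, line ], mirroring the convention above.
sub next_token {
    my ($input_ref, $line_ref) = @_;
    for (${$input_ref}) {              # alias $_ to the input string
        return ('', undef) if !defined($_) or $_ eq '';
        ${$line_ref}++ while s/^\n//;  # count and skip newlines
        s/^[ \t]+//;                   # skip horizontal whitespace
        s/^([0-9]+)//
            and return ('INUM', [$1, ${$line_ref}]);
        s/^([A-Za-z_]\w*)//
            and return ('ID',   [$1, ${$line_ref}]);
        die "Unexpected character at line ${$line_ref}\n";
    }
}

my $src  = "foo\n42";
my $line = 1;
my ($tok, $attr);
while ((($tok, $attr) = next_token(\$src, \$line)) && $tok ne '') {
    # $attr->[0] is the lexeme, $attr->[1] the line it started on
    print "$tok '$attr->[0]' at line $attr->[1]\n";
}
# prints:
#   ID 'foo' at line 1
#   INUM '42' at line 2
```

Because the line number travels inside the attribute, any later phase (semantic analysis, error reporting) can recover it without consulting the lexer again.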
The hash %reserved contains the reserved words of the language. Lines 41-51 deal with operator lexemes such as '=' and '**', trying the two-character operators before the one-character ones:

    $ sed -ne '513,567p' Types.eyp | cat -n
         1  sub _Lexer {
         2    my($parser)=shift;
         3
         4    my $token;
         5    for ($parser->YYData->{INPUT}) {    # warn! false for
         6      return('',undef) if !defined($_) or $_ eq '';
         7
         8      # Skip blanks
         9      s{\A
        10        ((?:
        11            \s+        # any white space char
        12          | /\*.*?\*/  # C like comments
        13        )+
        14       )
        15      }
        16      {}xs
        17      and do {  # Count line numbers
        18        my($blanks)=$1;
        19
        20        # Maybe at EOF
        21        return('', undef) if $_ eq '';
        22        $tokenend += $blanks =~ tr/\n//;
        23      };
        24
        25      $tokenbegin = $tokenend;
        26
        27      s/^('.')//
        28        and return('CHARCONSTANT', [$1, $tokenbegin]);
        29
        30      s/^([0-9]+(?:\.[0-9]+)?)//
        31        and return('INUM', [$1, $tokenbegin]);
        32
        33      s/^([A-Za-z][A-Za-z0-9_]*)//
        34        and do {
        35          my $word = $1;
        36          my $r;
        37          return ($r, [$r, $tokenbegin]) if defined($r = $reserved{$word});
        38          return('ID', [$word, $tokenbegin]);
        39        };
        40
        41      m/^(\S\S)/ and defined($token = $1) and exists($lexeme{$token})
        42        and do {
        43          s/..//;
        44          return ($token, [$token, $tokenbegin]);
        45        }; # do
        46
        47      m/^(\S)/ and defined($token = $1) and exists($lexeme{$token})
        48        and do {
        49          s/.//;
        50          return ($token, [$token, $tokenbegin]);
        51        }; # do
        52
        53      die "Unexpected character at $tokenbegin\n";
        54    } # for
        55  }

The hash %lexeme maps operator lexemes to token names and has entries like:
    my %lexeme = (
      '='  => "ASSIGN",
      .................
      ']'  => "RIGHTBRACKET",
      '==' => "EQ",
      '+=' => "PLUSASSIGN",
      ...................
      '--' => "DEC",
      '**' => "EXP",
    );
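The order of the two tests matters: checking the two-character prefix against %lexeme before the one-character prefix implements maximal munch, so '==' becomes a single EQ token rather than two ASSIGN tokens. A minimal, self-contained sketch (the function name `match_operator` and the reduced %lexeme are my own, for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reduced version of the %lexeme table from the post.
my %lexeme = (
    '='  => 'ASSIGN',
    '==' => 'EQ',
    '+'  => 'PLUS',
    '+=' => 'PLUSASSIGN',
);

# Sketch of lines 41-51 above: try a two-character operator first,
# then fall back to a single character (maximal munch).
sub match_operator {
    my ($src_ref) = @_;
    for (${$src_ref}) {
        if (m/^(\S\S)/ and exists $lexeme{$1}) {
            my $op = $1;
            s/..//;                  # consume the two characters
            return $lexeme{$op};
        }
        if (m/^(\S)/ and exists $lexeme{$1}) {
            my $op = $1;
            s/.//;                   # consume the single character
            return $lexeme{$op};
        }
        return undef;                # no operator at this position
    }
}

my $code = "+==";
print match_operator(\$code), "\n";  # PLUSASSIGN ("+=" wins over "+")
print match_operator(\$code), "\n";  # ASSIGN (only "=" remains)
```

Swapping the two tests would break '==' and '+=' into pairs of single-character tokens, which is why the listing checks `m/^(\S\S)/` first.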
Hope it helps
Casiano
In reply to Re: Parsing using Parse::YYLex package
by casiano
in thread Parsing using Parse::YYLex package
by Anonymous Monk