I have difficulty understanding the article. In my humble opinion it is rather theoretical. Maybe I am not worthy ;-)
No, it probably means that you are not the intended target audience. Or that I did a bad job at writing.
The "slightly" is an understatement. Some of my nightmare examples to illustrate the point:
You're right, I underestimated the complexity. I thought you could just process the input line by line, joining several lines only when they contained an unclosed quoted string.
Still, you should not give up hope. I wrote a simple lexer that works for the example you gave:
use strict;
use warnings;
use Data::Dumper;
use Math::Expression::Evaluator::Lexer qw(lex);

# Slurp the whole DATA section into one string.
my $d = do { local $/; <DATA> };

# Each entry is [name, regex, optional callback].
# A callback that returns nothing drops the token from the output.
my @tokens = (
    ['Comment',       qr{/\*.*?\*/}s, sub { return }],
    # \w* rather than \w+, so single-character identifiers match too
    ['Identifier',    qr{[a-zA-Z_]\w*}],
    ['Number',        qr{\d+}],
    # '-' goes last so it is a literal, not the range '+' to '/'
    ['Operator',      qr{[=(),+/*{}-]}],
    ['Quoted String', qr{"[^"]*"}],
    ['Newline',       qr{\n}],
    ['Whitespace',    qr{\s+}, sub { return }],
);

print Dumper lex($d, \@tokens);

__DATA__
/* A 2-dimensional sequence as the value is being called in ODL */
KEYWORD = ((1,2) (3,4) (5,8) /* some comment */
9,11))
/* A set as the value is being called in ODL */
KEYWORD = { RED, BLUE, /* some comment */
GREEN, HAZEL }
/* A text string spanning multiple lines */
KEYWORD = "some text /* not a comment but part of the value! */
more text
even more text" /* this is again a comment*/
This is far from ideal, but it does tokenize the data in a meaningful way. It strips comments, while comment-like text inside a quoted string is kept as part of the string.
(The lexer in Math::Expression::Evaluator::Lexer is quite simple and not iterator-like. If you don't want to read all input at once, you need to come up with something more sophisticated.)
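To illustrate what "more sophisticated" could look like, here is a minimal sketch of an iterator-style lexer in plain Perl that reads the input line by line and only pulls in further lines while a comment or quoted string is still open. It does not use Math::Expression::Evaluator::Lexer. The names make_lexer and needs_more and the "skip" flag in the token table are my own inventions for this sketch, and the Whitespace pattern is narrowed so that it never swallows a newline.

```perl
use strict;
use warnings;

# Token table: [name, regex, skip flag]. The skip flag is an assumption of
# this sketch; it replaces the callback used with the lex() function.
my @TOKENS = (
    [ 'Comment',       qr{/\*.*?\*/}s,  1 ],
    [ 'Identifier',    qr{[a-zA-Z_]\w*}   ],
    [ 'Number',        qr{\d+}            ],
    [ 'Operator',      qr{[=(),+/*{}-]}   ],
    [ 'Quoted String', qr{"[^"]*"}        ],
    [ 'Newline',       qr{\n}             ],
    [ 'Whitespace',    qr{[^\S\n]+},    1 ],   # whitespace except newline
);

# True if the buffer starts a comment or string that is not closed yet.
sub needs_more {
    my ($buf) = @_;
    return 1 if $buf =~ m{\A/\*} && $buf !~ m{\A/\*.*?\*/}s;
    return 1 if $buf =~ m{\A"}   && $buf !~ m{\A"[^"]*"};
    return 0;
}

# Returns a closure; each call yields the next [name, text] pair,
# or undef once the input is exhausted.
sub make_lexer {
    my ($fh) = @_;
    my $buf = '';
    return sub {
        while (1) {
            # Refill the buffer with one line when it runs empty.
            if ($buf eq '') {
                my $line = <$fh>;
                return unless defined $line;
                $buf = $line;
            }
            # Multi-line tokens: append lines until the closing delimiter.
            while (needs_more($buf)) {
                my $line = <$fh>;
                last unless defined $line;
                $buf .= $line;
            }
            # First pattern that matches at the start of the buffer wins.
            my $hit;
            for my $t (@TOKENS) {
                my ($name, $re, $skip) = @$t;
                next unless $buf =~ s/\A($re)//;
                $hit = [ $name, $1, $skip ];
                last;
            }
            die "Cannot tokenize near: " . substr($buf, 0, 20) . "\n"
                unless $hit;
            next if $hit->[2];              # skipped token, keep going
            return [ $hit->[0], $hit->[1] ];
        }
    };
}

# Usage: lex a small sample through an in-memory filehandle.
my $input = <<'ODL';
KEYWORD = { RED, /* some
comment */ GREEN }
NAME = "two
lines"
ODL

open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";
my $next = make_lexer($fh);
my @seen;
while (my $tok = $next->()) {
    push @seen, $tok;
    (my $text = $tok->[1]) =~ s/\n/\\n/g;   # make newlines visible
    printf "%-14s %s\n", $tok->[0], $text;
}
```

The buffer never holds more than the current line plus whatever is needed to complete one open comment or string, so memory use stays small for large files. An unterminated string at the end of the input ends in the die, which seems preferable to silently returning a partial token.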
In reply to Re^3: Writing an ODL parser?
by moritz
in thread Writing an ODL parser?
by dHarry