
Coincidentally enough, I've been working on a lexer/parser combo (in C++) for one of my classes, so I guess I'll take a stab at answering this.

The lexer has to be smart enough to recognize a quoted string as a single token; consequently, it would return something along the lines of

('my', '$x', '=', '"my $x = \"my $x\""', ';')

Some lexers might even be smart enough to classify tokens, and return something like this:

(['MY' => 'my'], ['VAR' => '$x'], ['OPERATOR' => '='], ['STRING' => '"my $x = \"my $x\""'], ['TERMINATOR' => ';'])

To clarify, the lexer needs to understand the nature of the tokens that it generates, at a minimum knowing things such as which characters can be used in an identifier, and what the quoting rules are. (Quick, what does perl -e 'for qw(foo bar) { print $_ }' do?) The parser's knowledge relates to what order tokens can legally come in, and what different combinations of tokens mean.
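To make that division of labor concrete, here is a minimal sketch in Python (not Perl, and not taken from any real Perl lexer). The names `TOKEN_SPEC`, `lex`, and `is_my_declaration` are made up for illustration, and the token classes only cover the one statement discussed above. The lexer half knows spelling rules, including backslash-escaped quotes inside a string; the "parser" half only looks at the order of token kinds.

```python
import re

# Hypothetical token classes for this one example, tried in order.
TOKEN_SPEC = [
    ("MY",         r"my\b"),
    ("VAR",        r"\$\w+"),
    # A double-quoted string: either an escaped character (\" or \\ etc.)
    # or any character that is neither a quote nor a backslash.
    ("STRING",     r'"(?:\\.|[^"\\])*"'),
    ("OPERATOR",   r"="),
    ("TERMINATOR", r";"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Lexer: turn characters into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def is_my_declaration(tokens):
    """Parser side: check the order of token kinds, not how they are spelled."""
    kinds = [kind for kind, _ in tokens]
    return kinds == ["MY", "VAR", "OPERATOR", "STRING", "TERMINATOR"]


source = r'my $x = "my $x = \"my $x\"";'
tokens = lex(source)
for kind, text in tokens:
    print(kind, text)
print(is_my_declaration(tokens))
```

Note that the `my` and `$x` inside the quotes never show up as separate tokens: the string rule consumes them before the other rules get a chance, which is exactly the "smart enough to recognize quoted strings" part.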

If you're interested in reading more about parsing Perl specifically, you should read merlyn's On Parsing Perl.

--
Ryan Koppenhaver, Aspiring Perl Hacker
"I ask for so little. Just fear me, love me, do as I say and I will be your slave."

In reply to Re: The Relation Between Lexers and Parsers by rlk
in thread The Relation Between Lexers and Parsers by beppu
