It is helpful (at least it was for me) to think of a parser as a pipelined process.
First you tokenize the string: once you have broken all your text into the right bits, you pass them to the lexical analyzer (aka the lexer).
The lexer then processes the tokens, analyzing them to determine their "type". Basically, token1 is a string, token2 is an operator, token3 is a bracket, and so on.
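To make those two steps concrete, here is a minimal sketch in Python (the thread is Perl-oriented, but the idea is language-neutral). The token types and the tiny expression language are illustrative assumptions, not anything from the RTF spec:

```python
import re

# Illustrative token classes -- "string", "operator", "bracket", as in
# the description above. The patterns are assumptions for this sketch.
TOKEN_PATTERNS = [
    ("STRING",   r"[A-Za-z_]\w*"),
    ("NUMBER",   r"\d+"),
    ("OPERATOR", r"[+\-*/=]"),
    ("BRACKET",  r"[()\[\]{}]"),
    ("SKIP",     r"\s+"),
]
MASTER_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

def lex(text):
    """Break text into bits and classify each one: returns (type, value) pairs."""
    tokens = []
    for match in MASTER_RE.finditer(text):
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
    return tokens

print(lex("x = [a + 1]"))
# → [('STRING', 'x'), ('OPERATOR', '='), ('BRACKET', '['),
#    ('STRING', 'a'), ('OPERATOR', '+'), ('NUMBER', '1'), ('BRACKET', ']')]
```

Here tokenizing and classifying happen in one pass, which (as noted below) is how it is commonly done in practice.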
Once you have a set of properly classified tokens, you can then build an abstract syntax tree to represent their structure. The result is what is commonly called a "parse tree". If you are familiar with the XML/HTML DOM, those are basically parse trees of the XML/HTML documents.
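The tree-building step can be sketched like this, assuming tokens arrive as (type, value) pairs. The bracket-driven nesting is an illustrative stand-in for the way RTF's { } groups (or XML elements) nest:

```python
def parse(tokens):
    """Build a parse tree from (type, value) pairs: opening brackets
    start a nested subtree, closing brackets end it, everything else
    becomes a leaf node."""
    def parse_group(pos):
        children = []
        while pos < len(tokens):
            kind, value = tokens[pos]
            if kind == "BRACKET" and value in "([{":
                subtree, pos = parse_group(pos + 1)
                children.append(subtree)
            elif kind == "BRACKET":          # a closing bracket
                return children, pos + 1     # ends the current group
            else:
                children.append(value)
                pos += 1
        return children, pos

    tree, _ = parse_group(0)
    return tree

tokens = [("BRACKET", "{"), ("STRING", "a"), ("BRACKET", "{"),
          ("STRING", "b"), ("BRACKET", "}"), ("BRACKET", "}")]
print(parse(tokens))  # → [['a', ['b']]]
```

The nested Python lists play the role of the tree: each sub-list is a subtree, just as each nested { } group in an RTF file is a child of the group that encloses it.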
At this point, you have your parse tree, and the parsing is complete. Now of course you need to figure out what to actually do with that parse tree :)
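What you do with the tree is usually some kind of recursive walk, the same way you would traverse a DOM. A minimal sketch (the nested-list tree shape and the depth-tagging are assumptions for illustration):

```python
def walk(tree, depth=0, out=None):
    """Recursively visit a parse tree represented as nested lists,
    collecting each leaf together with its nesting depth."""
    if out is None:
        out = []
    for node in tree:
        if isinstance(node, list):
            walk(node, depth + 1, out)   # descend into the subtree
        else:
            out.append((depth, node))    # record the leaf
    return out

tree = ["title", ["bold", ["italic"]], "end"]
print(walk(tree))
# → [(0, 'title'), (1, 'bold'), (2, 'italic'), (0, 'end')]
```

A real consumer would do something at each node (render text, emit HTML, evaluate an expression) instead of just recording it, but the traversal skeleton is the same.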
Now, the process I described is not the only way to parse. Many parsers do all this in one step, or combine a couple of steps together (tokenizing and lexical analysis are commonly combined). But breaking it down into these steps was what helped me learn how to write parsers. Hope this helps.
In reply to Re: Basics of parsing (using RTF as a testbed)
by stvn
in thread Basics of parsing (using RTF as a testbed)
by Mugatu