in reply to Seeking Advice: Writing a parser

One problem you will need to think about, which I think nobody else has pointed out yet, is that mathematics, as conventionally written, is ambiguous. For example, if you get an expression like b(x+1), it may not be clear whether the intent is to multiply the quantity b by x+1 or to give x+1 as an argument to the function b. Mathematicians use two things to determine which case holds. One is information from the surrounding mathematical context: if b has been previously described as being a function, then you know the second interpretation is correct. Making use of contextual details is tricky in a parser. The other thing mathematicians look at is subtle typographical distinctions, which aren't available to you here.

Computer languages get around this by requiring explicit multiplication symbols: The expression above becomes b*(x+1) if multiplication is intended. Your example above indicates that you want implicit multiplication, and that can be very tricky. If you see something like y = mx, how will you know how to parse it? Is mx one variable, or is it the product of m and x? These are the sorts of side problems that you'll have to solve to build a parser that works the way you want it to.
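
To make the y = mx problem concrete, here is a small sketch in Python (not from the original thread) that lists every way a run of letters can be split into names the parser already knows. The function name readings and the idea of passing in a set of known names are assumptions made for illustration only. If it returns more than one reading, the input is ambiguous and the parser needs some policy for choosing.

```python
def readings(word, known):
    """Return every way to split `word` into a sequence of names from `known`."""
    if word == "":
        return [[]]  # exactly one way to split the empty string: no names at all
    results = []
    for i in range(1, len(word) + 1):
        head, tail = word[:i], word[i:]
        if head in known:
            # head is a known name; try every reading of what is left over
            for rest in readings(tail, known):
                results.append([head] + rest)
    return results


# Only m and x are known: the sole reading is the product m*x.
print(readings("mx", {"m", "x"}))

# m, x, and mx are all known: two readings, so the parser must pick one.
print(readings("mx", {"m", "x", "mx"}))
```

An empty result means the letters cannot be explained by the known names at all, which a parser might treat as a new variable or as an error.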

You might want to consider first writing a parser for a simpler and less ambiguous language---say, arithmetic expressions involving only numerals, with explicit multiplication. Once you have some experience solving the simple problem, you can go back and embellish it to handle more complex expressions.
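
As a rough idea of what that simpler parser could look like, here is a minimal recursive-descent sketch in Python. It is one possible approach, not the only one, and every name in it (tokenize, Parser, evaluate) is made up for this example. It assumes the input contains only unsigned numerals, the operators + - * /, and parentheses; unary minus and exponents are left out to keep it short.

```python
import re

# A token is a numeral (with optional decimal part) or a single operator/paren.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/()])")


def tokenize(text):
    """Split the input into numeral and operator tokens, skipping whitespace."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input near position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*   lowest precedence, left to right
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term := factor (('*' | '/') factor)*   binds tighter than + and -
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        # factor := numeral | '(' expr ')'
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or tok in ("+", "-", "*", "/", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return float(tok)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        # Leftover tokens, e.g. "2 (3)": implicit multiplication is not allowed.
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("2 + 3 * 4"))
print(evaluate("(2 + 3) * 4"))
```

Each precedence level gets its own method, so adding a new level later (exponents, say) means adding one more method between term and factor rather than rewriting the whole thing.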

--
Mark Dominus
Perl Paraphernalia

Re: Re: Seeking Advice: Writing a parser
by belg4mit (Prior) on Feb 17, 2002 at 06:07 UTC
    Just to clarify your first point; for the benefit of those w/o an education from the American public schools ;-) it's PEMDAS!

    1. Parentheses (I would imagine trig functions go here)
    2. Exponents (I would imagine logarithms go here)
    3. Multiplication, Division
    4. Addition, Subtraction

    For the second part you can be smart. Declare your parser as not lazy; then if mx has not yet been seen but m and x have, there is implicit multiplication. On the gripping hand, say m, x, and mx are all defined. I'd opt for mx and force explicit multiplication for m and x.

    --
    perl -pe "s/\b;([st])/'\1/mg"