in reply to Re: Balancing braces without (??{})
in thread Balancing braces without (??{})

Ah, the Push down automata, I knew it would return to haunt me someday. Applied theory with Professor Decker...how I miss it q-; Thanks for this very helpful info John. Actually, my previous solution was on a similar track but instead of implementing a stack to push and pop I was doing something more hackish and just using a variable that would increment for each { and decrement for each } and return the brace that brought the var back to zero. This was a very hackish approach to the problem, and your stack suggestion has obvious advantages aside from its more poetic approach. I'll take a look at Parse::YAPP, sounds like a useful read. Aside from the fact that (??{}) sequence makes the regex code more compact than implementing a stack, are there advantages or disadvantages in terms of runtime, memory, etc? I guess what I'm asking is, if you had perl 5.7 running on your machine and a rainy afternoon to spend writing the best code you could, which technique would you choose? I am very interested in making my code the best it can be, since I hope others will find it useful and want to use it as well. Gotta work on my skills before that hallowed day, though.

Replies are listed 'Best First'.
Re: Re: Re: Balancing braces without (??{})
by John M. Dlugosz (Monsignor) on Jul 28, 2001 at 01:47 UTC
    I'd worry about the documentation that states, “WARNING: This extended regular expression feature is considered highly experimental, and may be changed or deleted without notice.”

    So, it might not work at all on Perl 5.8 when it comes out.

    If I had to write a good efficient implementation without using ??{}, I'd probably take what I showed you and do it in chunks rather than one char at a time. That is, chop the string into chunks where each chunk is a single ballancing char, or the whole run between them. Then perform the algorithm on that list, rather than every single char.

    Perhaps there should be a feature that elegantly does P.D.A., not just regexp, without getting into all the complexity of full LALR grammar. Perhaps stacking could be added as regex extenstions without generalized code blocks: a way to push something, and zero-width assertions and subexpressions that can refer to the stack top, and pop.

    —John

Re: Re: Re: Balancing braces without (??{})
by John M. Dlugosz (Monsignor) on Jul 29, 2001 at 00:18 UTC
    Thinking about it again, I'd make a slight change. Instead of chopping it up first, I'd use a search for brace characters with /g as the condition of a while loop, so the body can examine pos() and push and pop and stuff, and all the uninteresting characters are skipped over.