As far as I understand, the ability to backreference in Perl's regular expressions makes this problem difficult at best. NDFAs (Non-Deterministic Finite Automata) can be represented as a set of points (states) with certain lines (transitions) connecting them. (Note: if you don't understand that, then I'm not going to explain it. It's not that I can't... it's that I don't really have time right now. It's fairly complicated.) I read an article recently (can't remember where) describing Perl's regex engine as constructing NDFAs and working through them. The article described backreferencing as placing little recorders in certain places, with record, stop and play buttons scattered about. I think this breaks the NDFA analogy: a finite automaton has no memory beyond its current state, while a backreference has to remember text of arbitrary length. That makes the rest of your argument null and void.
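To make the "recorder" point concrete, here is a minimal sketch. It matches strings made of some non-empty chunk followed by an exact copy of itself, a language that is not regular over an alphabet of two or more letters, so no plain finite automaton can recognise it. The helper name `is_doubled` and the sample strings are made up for illustration; they are not from the article.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $s is some non-empty string
# immediately followed by an exact copy of itself.
# (.+) is the "record" button, \1 is the "play" button.
sub is_doubled {
    my ($s) = @_;
    return $s =~ /^(.+)\1$/ ? 1 : 0;
}

# Illustrative inputs only.
for my $s (qw(abab abcabc aa abba abc)) {
    printf "%-7s %s\n", $s, is_doubled($s) ? "matches" : "does not match";
}
```

The first three sample strings match and the last two do not. The engine has to compare the second half against whatever `(.+)` happened to capture, which is exactly the kind of unbounded memory a finite automaton lacks.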
All right... I was about to start going into a flashback of my programming theory courses (I started thinking about Context-Free Grammars, Turing Machines, regular languages and NP-completeness. It was ugly.) but I'll spare everyone that. In short... I think that if this were possible, and reasonably easy, someone would have done it by now. But if you think you can create something that'll do a reasonably good job, go for it and let me know how it goes.
jeff
Update: Thank you, BlaisePascal, for confirming my thoughts (see his point #2).