You need to read Mastering Regular Expressions, especially Chapter 4: The Mechanics of Expression Processing. There are basically two major types of regex engines: NFA and DFA. Each type has its own strengths and weaknesses. Perl uses NFA, because it offers the most capabilities (think of it as the more "general purpose, one size fits all" solution). Its one big weakness is that it is susceptible to excessive backtracking if you don't form the expression properly. Understanding the inner workings of the engine helps a LOT when it comes to actually writing a regex that does what you want efficiently. But it's also not hard to craft a regex that intentionally exploits a known weakness in a particular engine.
I think that putting a text box on a web page and letting users enter their own regex is actually a HUGE security risk. It would be much better to offer a few canned expressions for common things like "begins with", "ends with", "contains", etc. Then let the user enter some alphanumeric text (you'll have to filter out any non-alphanumerics), pick a "match recipe", and generate a sane regex behind the scenes.
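Here is a minimal sketch of that "match recipe" idea, written in Python rather than Perl purely for illustration. The recipe names, the `RECIPES` table, and the `build_regex` helper are all made up for this example; in Perl, `quotemeta` would play the role that `re.escape` plays here.

```python
import re

# Hypothetical recipe table: each template wraps an already-escaped literal.
RECIPES = {
    "begins with": r"\A{}",
    "ends with": r"{}\Z",
    "contains": r"{}",
    "equals": r"\A{}\Z",
}

def build_regex(recipe, text):
    """Build a compiled regex from a canned recipe and plain user text."""
    if recipe not in RECIPES:
        raise ValueError("unknown recipe: " + repr(recipe))
    # Reject anything that is not plain ASCII letters/digits (also rejects "").
    if not (text.isascii() and text.isalnum()):
        raise ValueError("only alphanumeric text is allowed")
    # re.escape is redundant after the check above; kept as a second line of defence.
    return re.compile(RECIPES[recipe].format(re.escape(text)))

if __name__ == "__main__":
    rx = build_regex("begins with", "foo")
    print(bool(rx.search("foobar")))
    print(bool(rx.search("barfoo")))
```

The user never gets to supply metacharacters, so the only patterns the engine ever sees are the handful you wrote and vetted yourself.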
You could, for this particular example, use a DFA engine instead. It would be faster (due to no backtracking), but much more limited in the types of regexes you could write (no backreferences or lookaround, for example).
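To make the backtracking cost concrete, here is a rough sketch using the textbook nested-quantifier pattern `(a+)+$`. It uses Python's `re` module, which is also a backtracking engine; Perl's engine has its own optimizations and may well behave differently on this exact pattern. The `time_failed_match` helper and the input sizes are just assumptions for the demo.

```python
import re
import time

# Nested quantifiers: a failing match forces the engine to try
# every possible way of splitting the run of "a"s between the groups.
EVIL = re.compile(r"(a+)+$")

def time_failed_match(n):
    """Match against n "a"s followed by "b"; return (match result, seconds)."""
    subject = "a" * n + "b"  # the trailing "b" guarantees the match fails
    start = time.perf_counter()
    result = EVIL.match(subject)
    elapsed = time.perf_counter() - start
    return result, elapsed

if __name__ == "__main__":
    # Sizes kept small on purpose; larger n gets slow very quickly.
    for n in (10, 14, 18):
        _, elapsed = time_failed_match(n)
        print(n, f"{elapsed:.4f}s")
```

On a backtracking engine the time for the failed match should grow roughly exponentially with `n`, which is exactly what a hostile user would be counting on.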
Perl's regex engine is one of its main strengths; you wouldn't want to reduce its capabilities.
A regex engine is a pretty complex computer science topic. Putting an "anything goes" text box on a web page and making it idiot-proof is a pretty difficult problem. I would think Google would have smarter developers on staff... but maybe that explains why my search results have been so poor lately? ;-)