Let me see if I can make this make sense without digging out a textbook...

Regular expressions are implemented internally by a 'finite state automaton'.

If you've never heard the term, I'll attempt to explain it... Picture a group of circles interconnected by various lines. The circles represent the current state, and the lines represent the next state to go to if a given input is seen. One circle is the start state, and some other number of circles are 'end' states (you can have more than one).

For a specific example try this:
take three circles, label them 'start', '1', and 'end'
draw an arrow from 'start' to '1', from '1' to 'end' and from 'end' to '1'
label the arrows 'a','b' and 'c' respectively.

Starting at 'start', take each character of input and follow the link with that label to the next state.
If at the end of the input you're at the 'end' state, then this automaton matches the input.
If you have a character of input that you don't have a link for from your current state, or you run out of input and aren't on an 'end' state, then the automaton doesn't match the input.
Using our example, the following will match: 'ab', 'abcb', 'abcbcb'.
And these will not: 'a', 'abc', 'abcbc', 'abd', 'ad', etc.
(The regular expression for this automaton would be /^ab(cb)*$/)
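If tracing the circles by hand gets confusing, here is a minimal sketch of the same walk in Python (my choice of language, nothing special about it). The transition table is just the three arrows exactly as drawn above; the names TRANSITIONS, ACCEPTING and matches are made up for this illustration.

```python
# The three arrows from the example, written as
# (current state, input character) -> next state.
TRANSITIONS = {
    ("start", "a"): "1",
    ("1", "b"): "end",
    ("end", "c"): "1",
}

# The set of 'end' states (there can be more than one in general).
ACCEPTING = {"end"}


def matches(text, transitions=TRANSITIONS, start="start", accepting=ACCEPTING):
    """Walk the automaton over text and report whether it ends on an 'end' state."""
    state = start
    for ch in text:
        key = (state, ch)
        if key not in transitions:
            # No link with this label from the current state: no match.
            return False
        state = transitions[key]
    # Out of input: it only counts as a match if we stopped on an 'end' state.
    return state in accepting
```

With the arrows as drawn, matches("ab") and matches("abcb") come back true, while matches("abc") comes back false because the walk stops on '1' rather than 'end'. Note that the loop only ever looks at the current state and the current character, nothing else.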

At each step the only thing the automaton is concerned with is what state it is in and what the next character of input is. There is no retained knowledge of what the previous characters were. Since finite state automata have no 'memory' of what input they've seen before, they have no way of counting how many '(' are still open, so they can't know whether the correct number of ')' has been found.
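To make the 'memory' point concrete, here is a rough sketch (Python again, function names invented for the example) of two paren checkers. The first keeps a counter that can grow without limit, which is exactly what a finite automaton lacks. The second behaves like an automaton with one circle per nesting depth, so it has to give up once the input nests deeper than the number of circles it was built with.

```python
def balanced(text):
    """Check parens with an unbounded counter (not a finite automaton)."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ')' arrived with no '(' left to close.
                return False
    return depth == 0


def balanced_up_to(text, max_depth):
    """Same check, but with only max_depth + 1 states (0..max_depth).

    This is what a finite automaton can do: the 'state' is the current
    depth, and there are only finitely many of them.
    """
    state = 0
    for ch in text:
        if ch == "(":
            if state == max_depth:
                # No circle left to move to: input nests too deeply.
                return False
            state += 1
        elif ch == ")":
            if state == 0:
                return False
            state -= 1
    return state == 0
```

However large you pick max_depth, someone can hand you input nested one level deeper, and the second version rejects a string the first one accepts. For instance, balanced("((()))") is true but balanced_up_to("((()))", 2) is false.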

Hopefully that made sense, but was probably FMTYWTK
/\/\averick


In reply to RE: Answer: What can regular expressions NOT do? by maverick
in thread What can regular expressions NOT do? by kryten
