in reply to Re^2: Perl regexp matching is slow??
in thread Perl regexp matching is slow??

Hi Russ,

I think it's a bit unfortunate how you phrased that response ("nothing useful about NFAs and DFAs"), since it doesn't make clear that you're talking about the theoretical potential of the techniques (which, indeed, my book does not cover). As such, it sounds like an attack rather than the simple citing of a fact.

You had it right in your (excellent) article:

Friedl's book teaches programmers how best to use today's regular expression implementations, but not how best to implement them.

I think a better response to Holli would be one noting that there's more potential offered by the underlying theory than is evident in the common implementations covered in the book.

Whether the underlying theory is my cup of tea is not particularly relevant to my book, because the reality of today's common implementations is that they are what they are. Perhaps you wouldn't have so much angst about my book if it were titled Mastering (what goes for) Regular Expressions ? :-)

In your article, I'm not sure what you mean when you say that I don't "respect" theory (?), but it's true that a lot of the theory behind what led to today's implementations is beyond my desire to stay awake. Here's an example of the type of thing that trying to understand makes my eyes glaze:

"An equivalence relation R over Σ* is right invariant if and only if whenever xRy, then xzRyz for all z ∈ Σ*. An equivalence relation is of finite index if and only if there are only finitely many equivalence classes under R."

(from Robert Constable's paper "The Role of Finite Automata in the Development of Modern Computing Theory" in The Kleene Symposium, North-Holland Publishing, 1980).

You know what they say: "Those who can, do, and those who can't, teach". My book teaches how to use what's there now. It would please me to no end if I could throw out all the chapters on optimization and such, and perhaps your article is a first step toward that goal. Some won't appreciate being told that the emperor has fewer clothes than has been thought, but don't let that deter you.

I do think that along the way you'll find some unexpected problems related to the semantics of capturing parentheses, for example, but heck, if you're half as good an engineer as you are a writer, I'll hold out hope, because the writing and presentation in your article is really top notch.

   Jeffrey
----------------------------
Jeffrey Friedl    http://regex.info/blog/

Re: Theory vs. Reality
by rsc (Acolyte) on Feb 02, 2007 at 13:28 UTC
    I wholeheartedly agree that the dry language of most finite automata papers would put anyone to sleep. It does serve an important purpose, namely making things precise enough to prove things mathematically. I certainly understand why you're falling asleep. Those papers are written for mathematicians, not programmers. ;-)

    I think the fundamental problem is the disconnect you pointed out in your blog post: when you (and, it turns out, most programmers) say DFA and NFA you mean something related to but different from what the mathematicians mean. It is true that saying things about the programming concepts, which are fuzzy and implementation-specific, is not too worthwhile, since they are always shifting, but it's not true of the mathematical ideas, which have strong mathematical proofs associated with them, telling what one can and cannot do as far as implementation strategies and efficiency.
Re: Theory vs. Reality
by ysth (Canon) on Feb 07, 2007 at 21:23 UTC
    It would please me to no end
    Nit: the common phrase is "please me no end", that is, please infinitely. "please me to no end" would mean the pleasing would be meaningless.
      ysth,
      Nit: the common phrase is "please me no end"...

      Did you intend to use the word "common", or did you mean to imply "more accurate"? I ask because I have always heard it as "please me to no end" as well. I asked Google Fight, which agreed. So while your correction is more accurate, I am not sure it is more common.

      Cheers - L~R

      I've never heard "please me no end", and I have no idea how to please a "me no end". I never heard of a verb that accepts two transitive clauses, and I can't think of any right now.

        I've used "please me no end" as a phrase since childhood. And it's certainly not a new usage, as I clearly remember both my father and grandmother using it as a part of everyday speech.

        Like ysth, I never heard the "please me to no end" variation.

        I guess that's the thing about natural languages. They evolved, and continue to evolve, on the basis of common usage. The codification of the "rules of grammar" came after the fact, hence the irregularities that burst the bubble of the rule-bound.

        I clearly remember an English lesson in which I was taught that you "take things there; and bring them back", but again just this evening in a US movie I heard an apparently well-educated character say, "Will you bring him with us?"

        This sounds entirely wrong to my ears, but it, and many other "grammar errors", crop up in US movies and books often enough that I can only assume that they are fairly normal speech patterns in the US. Whether they would be acceptable (to whom?) in written English is another matter entirely.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        It is actually a mispeling of "'it wood pleas Smee too know 'enned", from the pirate legend, of course. My research did turn up some cases of the "too" being dropped but I think we can all agree that such a phrase doesn't really make any sense in the context of the legend.

        I think I'll switch to "it would please me long time" to avoid confusion / controversy.

        Before doing the research that turned up the pirate legend origin, I too thought "I've never heard 'it would please me no end'" but thinking on it more I realized that I had heard that version just not much recently.

        And I really think the reason for the shift is a combination of "it would please me no end" having an awkward feel when you try to parse it grammatically (as you noted) and people finding "to no end" familiar (even though with a different meaning, especially since that other meaning is rather idiomatic anyway). So it can be more comfortable to say "it would please me to no end" when you aren't thinking about it, which is usually the case when someone utters a pat phrase. Just like it is easier to say "I could care less", avoiding that awkward "nt" sound that interrupts the flow of the phrase.

        - tye        

      Commonly I think it is actually "please me to no end" (at least in the western US :) ) with the concept being "I'll be pleased and it will never end" instead of the probably more correct "please me for no reason" interpretation you are taking. Either way it's like most sayings: people know what they are supposed to mean even if the saying doesn't literally mean the right thing. Oddly, "please me no end" sounds like broken English, on par with "I love you long time" (say it in an Asian accent and hopefully the humor comes across, or maybe that's just our distorted offices sense of humor).


      ___________
      Eric Hodges
        Did you mean "office's" or "orifices"?