Until now I had s/.pl// in a few places, and it was behaving very nicely, removing the .pl file extension from a group of strings. A week or more ago I came upon it and made a mental note to correct it so that the dot (.) was literal, since an unescaped dot matches any character before pl. I didn't correct it. Well, a few nights ago it finally bit me and forced me to make the correction, so now I have a well-behaved regex, s/\.pl//. The word that caused this change was Temple. Instead of getting The Temple of Doom, I got The Tee of Doom. (Don't golf with that thing, please!) When I realized what was causing my little problem, I started giggling, since I had told myself that not changing it would cause problems. Then today it bit me again elsewhere. This time the word was People. The input was The Tomorrow People; the output, however, was humorous. I just started giggling at both results; today's is just a bit funnier.

If you want to see my output from today...

use strict;
use warnings;

my $title = "The Tomorrow People";
$title =~ s/.pl//g;
print $title;
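
For comparison, here is a small sketch of how the unescaped and escaped patterns behave on the two titles above and on a file name. The helper names strip_loose and strip_literal and the file name index.pl are made up for illustration; they are not from my actual code.

```perl
use strict;
use warnings;

# Hypothetical helpers, named here only for illustration.

# Unescaped dot: matches ANY character (except newline) followed by "pl".
sub strip_loose {
    my ($text) = @_;
    $text =~ s/.pl//g;
    return $text;
}

# Escaped dot: matches only a literal ".pl".
sub strip_literal {
    my ($text) = @_;
    $text =~ s/\.pl//g;
    return $text;
}

# Print each input next to what the two patterns make of it.
for my $text ("The Temple of Doom", "The Tomorrow People", "index.pl") {
    printf "%-20s loose: %-18s literal: %s\n",
        $text, strip_loose($text), strip_literal($text);
}
```

Both versions turn index.pl into index, which is why the loose pattern looked fine for so long. Only the loose one touches the titles.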

This is not the first time that not thinking ahead with a less specific regex has caused me some trouble. I had /ssi/ for a match in a sort subroutine, and I did not take into account that those three letters could appear anywhere within a word. Sure enough, they appeared in Crossing Jordan, making that title sort incorrectly. It took me a while to figure that one out. I finally traced it back to the source of the problem, and I now have /^ssi/, which matches only at the start of the string.
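
A minimal sketch of the difference, assuming the pattern is tested against the whole string. The helper names and the sample strings other than Crossing Jordan are invented for the example; this is not the real sort subroutine.

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns 1 on a match and 0 otherwise.
sub matches_anywhere { my ($s) = @_; return $s =~ /ssi/  ? 1 : 0; }
sub matches_at_start { my ($s) = @_; return $s =~ /^ssi/ ? 1 : 0; }

# "ssi example" and "Mission" are made-up inputs for illustration.
for my $s ("Crossing Jordan", "ssi example", "Mission") {
    printf "%-16s /ssi/: %d   /^ssi/: %d\n",
        $s, matches_anywhere($s), matches_at_start($s);
}
```

The unanchored pattern matches all three strings, while the anchored one matches only the string that actually begins with ssi.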

This has been a little lesson to me in thinking a little ahead when writing regexes. Have you had any humorous results from a regex?

Have a cookie and a very nice day!
Lady Aleena

Re: Writing regexes, think ahead just a bit.
by ikegami (Patriarch) on Mar 23, 2011 at 07:24 UTC

    You could be even safer and anchor it to the end of the string.

    s/\.pl$//
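
    A quick sketch of what the anchor buys you, using a made-up file name (my.plan.pl) that contains .pl twice. The helper names are hypothetical.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helpers for illustration only.

    # Literal dot, but no anchor: removes the FIRST ".pl" found anywhere.
    sub strip_unanchored {
        my ($name) = @_;
        $name =~ s/\.pl//;
        return $name;
    }

    # Literal dot anchored with $: removes ".pl" only at the end.
    sub strip_anchored {
        my ($name) = @_;
        $name =~ s/\.pl$//;
        return $name;
    }

    for my $name ("my.plan.pl", "index.pl") {
        printf "%-12s unanchored: %-10s anchored: %s\n",
            $name, strip_unanchored($name), strip_anchored($name);
    }
    ```

    Note that $ also matches just before a trailing newline; \z is the stricter end-of-string anchor if that matters for the input.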

      You're right. :)

      Have a cookie and a very nice day!
      Lady Aleena
Re: Writing regexes, think ahead just a bit.
by Limbic~Region (Chancellor) on Mar 23, 2011 at 19:03 UTC
    Lady_Aleena,
    This is also an example of why writing modules to store commonly used code is far better than copy/paste strategies. I know it seems silly to write a function just to perform s/\.pl$// but if you had, you could have fixed it once and all of the other code that used it (even the code you didn't remember writing) would have been fixed as well.

    Regarding the humorous results of incorrect regexes - yes. I was writing code that used OCR to parse bibliographies from academic papers in the form of a PDF. I don't remember the exact details but due to the imperfect OCR process - there were a few journals that I would like to subscribe to if they were real ;-)

    Cheers - L~R

      L~R...the two regexes were in (separate) modules. After the first one, the second one was easy to figure out and find. I would love to see some of your results, but I have a feeling that they would be a bit difficult to reproduce and maybe even inappropriate. :)

      Have a cookie and a very nice day!
      Lady Aleena
Re: Writing regexes, think ahead just a bit.
by wind (Priest) on Mar 23, 2011 at 19:10 UTC

    As ikegami already pointed out, yes, anchors and boundary conditions are often the most important aspect of regexes. Second to understanding greediness, I suppose.

    Can't think of any unintentional funny parsings on my end. However, as a personal habit, I try to use File::Basename when pulling off file extensions and such. I don't always, but I feel it's a good way to self-document code.
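
    For anyone who hasn't used it, a minimal sketch with fileparse from File::Basename. The wrapper name name_without_pl and the path scripts/temple.pl are invented for the example.

    ```perl
    use strict;
    use warnings;
    use File::Basename qw(fileparse);

    # Hypothetical wrapper: returns the file name with the directory
    # and a trailing ".pl" suffix removed.
    sub name_without_pl {
        my ($path) = @_;
        # fileparse matches each suffix pattern against the END of the
        # file name, so no explicit anchor is needed here.
        my ($name, $dir, $suffix) = fileparse($path, qr/\.pl/);
        return $name;
    }

    # A made-up path and one of the titles from the original post.
    print name_without_pl("scripts/temple.pl"), "\n";
    print name_without_pl("The Temple of Doom"), "\n";
    ```

    Keep in mind that this also drops the directory part, so it is not a drop-in replacement if the strings are full paths that need to stay intact.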

    - Miller

Re: Writing regexes, think ahead just a bit.
by locked_user sundialsvc4 (Abbot) on Mar 23, 2011 at 19:17 UTC

    This is also an argument for not consuming too much beer the night before ...

Re: Writing regexes, think ahead just a bit.
by ack (Deacon) on Mar 24, 2011 at 16:17 UTC

    I just got a chance to read this and it made my whole day!

    Every now and then I experience that kind of situation; sometimes while programming and other times while doing other things.

    Regexes can be that way for me too. Can't remember any specifics since it's been a while; but they *do* happen and I always get a chuckle or two out of it...and get gently reminded to stop and think through what I'm doing when I get just a little too robotic in certain aspects of coding.

    My most recent nemesis is that the "f" key on my keyboard is failing (I know I should get it fixed or replaced but I'm a notorious procrastinator...and poor speller) and is increasingly not generating the character when it should. Being a touch-typist I have been having to train myself to look more closely while and after I've been typing to be sure I catch it...but the results are sometimes elusive since the results are actual words and my eye doesn't catch them as easily. Words like "fit" become "it", "for" becomes "or", "four" becomes "our", "flight" becomes "light" and "fear" becomes "ear". Occasionally (though I can't for the life of me remember any off the top of my head) they even become a bit obscene!

    Thanks Lady_Aleena, I needed the lightness today!

    ack Albuquerque, NM
Re: Writing regexes, think ahead just a bit.
by cavac (Prior) on Mar 28, 2011 at 17:37 UTC

    When you said "The Tee of Doom" I immediately had to think this was a title worthy of a Terry Pratchett novel - or at least one of his famous phrases. You know, like "Glom of Nit" from "Going Postal".

    Ok, on a more serious note (yeah...), problems like these can bite in quite unexpected ways. Here's a rather lengthy writeup of a similar well-meaning, not so well implemented "automate my language" script:

    The automated curse generator (TheDailyWTF)
      I managed to read it to the end, and it is hilarious!!!
Re: Writing regexes, think ahead just a bit.
by motokitn (Sexton) on Apr 06, 2011 at 03:40 UTC
    The main source of entertainment gained from my broken regexen is that of my coworkers watching what my face does as I watch my output break in horrible ways, generally throwing critical data on the floor. And making me twitch.