in reply to Having a regex error

I suspect you're confusing regexen with shell patterns. Hence * should become .* or
s/\*/.*/; # ;-)
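
A minimal sketch of the difference, assuming a made-up sample string (the START/END markers are placeholders, not the markup from the original question):

```perl
use strict;
use warnings;

# Hypothetical sample text, for illustration only.
my $text = 'START some payload END';

# Shell-style thinking: in a regex, " *" means "zero or more spaces",
# not "anything at all", so this does not bridge the two markers.
my $glob_style = $text =~ /START *END/ ? 1 : 0;

# Regex wildcard: "." is any character except newline, "*" repeats it.
my $regex_style = $text =~ /START.*END/ ? 1 : 0;

print "glob-style pattern matched:  $glob_style\n";
print "regex-style pattern matched: $regex_style\n";

# The substitution from the post, applied to a pattern held in a string.
(my $fixed = 'START*END') =~ s/\*/.*/;
print "converted pattern: $fixed\n";
```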

That said, weren't you told not to try to parse HTML yourself?

Re^2: Having a regex error
by Anonymous Monk on Oct 04, 2005 at 17:03 UTC
    I did get it fixed, I forgot the .* (yes I feel stupid and it's been a while). I was told NOT to use regexes to parse HTML a number of times but I generally pull through it okay. No idea what the issue is this time around but it won't pull back any data between the two sections of code listed above. I even applied /gix and it still comes back empty.

    I know some of the / were useless but I had to try to see if I could figure out what's wrong.. Anything in the code that might be the problem for not matching anything?

      "I even applied /gix and it still comes back empty."

      That doesn't surprise me. Do you know what /g, /i and /x do? The /x is certainly going to stop your regex from working, since it makes the regex engine ignore whitespace in the pattern, and whitespace is a crucial part of your regex.
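
      To illustrate the /x point, here is a small sketch. The HTML fragment is a made-up sample, not the original poster's data:

      ```perl
      use strict;
      use warnings;

      my $html = '<td align="left">hello</td>';   # hypothetical sample

      # Without /x, the space in the pattern must match a literal space.
      my $plain = $html =~ /<td align="left">/ ? 1 : 0;

      # With /x, the unescaped space in the pattern is ignored, so the
      # regex effectively looks for '<tdalign="left">' and fails.
      my $with_x = $html =~ /<td align="left">/x ? 1 : 0;

      # Under /x, a literal space has to be escaped (or written as [ ]).
      my $escaped = $html =~ /<td\ align="left">/x ? 1 : 0;

      print "plain: $plain, with /x: $with_x, escaped under /x: $escaped\n";
      ```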

      I'm guessing the problem is that the text includes newlines in between the two tags you're matching between, and the . metacharacter does NOT match newlines UNLESS you add the /s modifier to the regex.
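
      A short sketch of that guess, assuming a made-up two-line fragment between a pair of tags (the real page will differ):

      ```perl
      use strict;
      use warnings;

      # Hypothetical sample with a newline between the two tags.
      my $html = "<b>first line\nsecond line</b>";

      # Without /s, "." cannot cross the "\n", so nothing is captured.
      my ($without_s) = $html =~ /<b>(.*?)<\/b>/;

      # With /s, "." also matches "\n" and the capture spans both lines.
      my ($with_s) = $html =~ /<b>(.*?)<\/b>/s;

      print defined $without_s ? "matched without /s\n" : "no match without /s\n";
      print "with /s: [$with_s]\n";
      ```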

      Just tacking modifiers on all willy-nilly does not make a regex work.


      Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
      How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
      Greetings again,
      One other thing to try is the use of the

      's' => "single-line"
      vs.
      'm' => "multi-line"

      switches at the end of your match, which basically deal with newlines differently. In short 's' treats newlines differently (so they get treated as if they are a part of '.') and 'm' does not. So you might want to try
      m/\Q<td align="left" valign="bottom">\E(.*?)\Q<center><form action='gpost.phtml' method='post'>\E/s # Notice the 's' at the end here.


      -InjunJoel
      "I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo
        You're confused about /s and /m. The ONLY thing the /s modifier does is allow . to match newlines. The ONLY thing the /m modifier does is allow ^ to match after newlines and $ to match before newlines.
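
        A minimal sketch contrasting the two modifiers, using a made-up three-line string:

        ```perl
        use strict;
        use warnings;

        my $text = "alpha\nbeta\ngamma";   # hypothetical sample

        # /m changes only ^ and $: they may match at internal line boundaries.
        my @no_m   = $text =~ /^(\w+)$/g;    # no matches: the string is not one word
        my @with_m = $text =~ /^(\w+)$/mg;   # one match per line

        # /s changes only ".": it may now match "\n".
        my ($no_s)   = $text =~ /alpha(.*)gamma/;    # fails, "." stops at "\n"
        my ($with_s) = $text =~ /alpha(.*)gamma/s;   # captures across the newlines

        print "without /m: ", scalar(@no_m), " matches\n";
        print "with /m:    @with_m\n";
        print "without /s: ", (defined $no_s ? "matched" : "no match"), "\n";
        print "with /s:    ", length($with_s), " characters captured\n";
        ```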

        Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
        How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
      "I was told NOT to use regexes to parse HTML a number of times but I generally pull through it okay."
      Never mind. Every now and again I do it myself... although I guess I'm not supposed to say so here! So pssst...
      "I know some of the / were useless but I had to try to see if I could figure out what's wrong.. Anything in the code that might be the problem for not matching anything?"
      Hmmm... do you mean \ (a.k.a. backslashes)? Anyway, no: at first sight I don't see anything wrong. But I'm tired, and I suppose that an HTML-extracting module would do better than I would!