in reply to Regex infinite loop?

As you describe it, it gets all the matches and then doesn't move on to the next page. That suggests the problem isn't in while($content =~ m%pattern%s). It may be in "Do something". It may be in whatever you're using to get the next page. It may be a network problem. It may be a problem on the server you're fetching from. And if you haven't ruled it out already, it may be that the pattern takes a really long time to determine there's no match.
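
One way to rule out the slow-pattern possibility is to time the match on the page that hangs. What follows is only a rough sketch: time_match is a made-up helper name, and the pattern and content in the usage lines are placeholders, since the real ones weren't posted.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # sub-second resolution for time()

# Hypothetical helper: count all matches of $re in $content and
# report how long the matching took, in seconds.
sub time_match {
    my ($content, $re) = @_;
    my $start = time;
    my $count = () = $content =~ /$re/g;   # list assignment in scalar context counts matches
    return ($count, time - $start);
}

# Placeholder data; substitute the saved copy of the problem page
# and the real pattern.
my ($count, $elapsed) = time_match("<b>one</b> <b>two</b>", qr{<b>(.*?)</b>}s);
printf STDERR "%d matches in %.6f seconds\n", $count, $elapsed;
```

If the elapsed time is tiny even on the problem page, the pattern itself is probably not the culprit.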

But I have to say, you give extremely little information. All anyone can do is guess at what could be wrong with your program.

Re^2: Regex infinite loop?
by Ninth Prince (Acolyte) on Oct 16, 2008 at 18:31 UTC

    I understand that I have given you very little to go on. I'm thinking that it's neither a network problem nor a server problem because it always hangs on the same page. Even when I change the order in which I retrieve the pages, it still hangs on the same page. So, it seems like the problem is page-specific.

    I also have a line immediately following my {Do something} block that lets me know that I've exited the match. On the pages where my code hangs, it never exits the {Do something} block.

    On the time to match part, the match happens fairly quickly whether or not there are zero matches, one match, or multiple matches.

    One thing I should probably have mentioned, but neglected to, is that I've been running this code for a while now (it runs automatically once a day). Whoever runs the website that I'm pulling from made some minor changes to the HTML, so I needed to go back and change my matching pattern. But, like I said, it matches fine on some web pages, but then hangs on others.

      Talk code. Show us the regex, and the data on which it hangs.
      But if it "hangs", does it hang in the loop? You have code showing it exits the loop, but do you know it entered the loop? What I would do is first determine where the program "hangs": print a message before fetching a page; print a message after the page was retrieved; print a message before attempting the pattern; print a message when entering the body of the while; print a message just before exiting the body; print a message when exiting the while construct.

      And to avoid buffer problems, print those messages to STDERR.
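
The trace points listed above could be sketched roughly like this. Everything specific here is an assumption, since no code was posted: trace, fetch_page and process_page are made-up names, fetch_page just returns canned HTML instead of touching the network, and the pattern is a placeholder. Note the /g modifier, which a while loop over a match needs in order to advance through the string.

```perl
use strict;
use warnings;

# Write a timestamped message to STDERR, which is unbuffered by default.
sub trace {
    my ($msg) = @_;
    print STDERR scalar(localtime), " $msg\n";
}

# Hypothetical stand-in for the real fetching code; returns canned HTML.
sub fetch_page {
    my ($url) = @_;
    return qq{<a href="/one">One</a>\n<a href="/two">Two</a>\n};
}

# Placeholder pattern; substitute the real one.
my $pattern = qr{<a href="([^"]*)">([^<]*)</a>};

sub process_page {
    my ($url) = @_;

    trace("about to fetch $url");
    my $content = fetch_page($url);
    trace("fetched $url, " . length($content) . " bytes");

    trace("about to attempt the pattern");
    my $matches = 0;
    while ($content =~ m%$pattern%sg) {
        my ($href, $text) = ($1, $2);
        trace("entered loop body, pos=" . pos($content));

        # ... the real "Do something" goes here ...
        $matches++;

        trace("about to leave loop body");
    }
    trace("exited the while construct after $matches matches");

    return $matches;
}

process_page("http://example.com/page$_") for 1 .. 2;
```

The last message printed before the program stalls tells you which step to look at. If the pos value never changes between iterations, the loop is matching at the same place over and over.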