in reply to one line regex eating CPU

Replace
push(@titles, $1) while $source =~ m/<title>(.+)<\/title>/i;
with
push(@titles, $1) while $source =~ m/<title>(.+?)<\/title>/ig;
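to 1) avoid matching from the beginning of $source every time (without /g, each test restarts at position 0, so the loop never ends once a title is found), and 2) avoid matching too much (a greedy .+ runs on to the last </title> on the line).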

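push(@titles, $source =~ m/<title>(.+?)<\/title>/ig);
also works, since a /g match in list context returns every capture at once. It might even be faster. However, it uses more memory, because the whole list of matches is built before anything is pushed.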
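
A minimal sketch of the difference, assuming a made-up one-line $source that contains two titles (a real page should have only one). The sample string and the variable names $greedy, @loop_titles and @list_titles are illustrative, not from the original question:

```perl
use strict;
use warnings;

# Hypothetical sample input: two titles on a single line.
my $source = '<title>First</title><p>text</p><TITLE>Second</TITLE>';

# Greedy (.+) runs to the LAST closing tag, so one match swallows both titles.
my ($greedy) = $source =~ m/<title>(.+)<\/title>/i;

# Scalar context with /g: each test resumes at pos($source) instead of 0,
# so the loop ends once no further title is found.
my @loop_titles;
push(@loop_titles, $1) while $source =~ m/<title>(.+?)<\/title>/ig;

# List context with /g: every capture is returned in one go.
my @list_titles = $source =~ m/<title>(.+?)<\/title>/ig;

print "greedy: $greedy\n";
print "loop:   @loop_titles\n";
print "list:   @list_titles\n";
```

Both non-greedy forms should collect the same two titles; the greedy capture comes back as a single string that spans everything between the first opening tag and the last closing tag.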

Update: Actually, there should be at most one title, so you want
push(@titles, $1) if $source =~ m/<title>(.+?)<\/title>/i;

Re^2: one line regex eating CPU
by shmem (Chancellor) on Jun 23, 2006 at 20:18 UTC
    ikegami's advice seems to be better than mine - don't reset the regex-engine.

      Hum? Yours "resets the regex-engine". Mine doesn't. How can you say we gave the same advice?

      >perl -wle "$_='bacada'; print pos while /a/g"
      2
      4
      6
      >perl -wle "$_='bacada'; print pos while s/a//"
      Use of uninitialized value in print at -e line 1.
      Use of uninitialized value in print at -e line 1.
      Use of uninitialized value in print at -e line 1.
        I know; I thought I had made that clear. ikegami's advice seems to be better than mine - "don't reset the regex-engine". Clearer now?