in reply to Re: regex is too long
in thread regex is too long
I won't post the entire table because it's long and rude. However, you'll quickly get the gist.
I don't know what is special about some strings. I spent some time trying to debug it, and when I found that I could just break the regular expression into parts (i.e. $evil1, $evil2, $evil3) and it started working again, I didn't spend much more time on it. I could see no clear pattern in why it was going into the endless loop. I'd be happy to email you the complete code and the test data that causes it to break. I'm running Perl 5.005_03 (FreeBSD) and wondered whether upgrading to the new release of Perl would fix it (I read that it resolved some regex bugs).

    $evil=  ## list of re phrases t
    'barely legal
    Unsensored pics
    rated adult site
    (find out|learn|discover) ANYTHING about anyone
    (remove.*\@dcemail\.com)
    bagboy\@burmeses\.net
    \(a\)\s*\(2\)\s*\(C\).*1618
    this limited time free offer
    thousands of extra dollars
    earn a great monthly income
    \bfat absorber\b
    chain letter.*pyramid scheme
    pyramid scheme.*chain letter
    e-mail\w* work\w*\!
    Earn BIG \$\$\$
    block this
    remove account
    quit watching others get rich
    bulk email works!
    firmer erections
    vaginal lubrication
    s e x drive
    Enhances Orgasms
    eraseus@yahoo.com
    over \d+ million fresh email
    content-\w+: .* .*name\s*=\s*".*\.(exe|scr|pif|vbs)"
    And it\'s 100% LEGAL!
    No Hidden Fees';  ## actually is 3 x this long
    $evil=~s/\n/\|/g;
    $evil=~s/ +/ /g;
    # ... skipping ahead ...
    while($l=&getnextline) {
        $_="$l$lastline";  ## combine this line and the last
        s/\s+/ /g;         ## simplify white space matching
        $isSpam = $isSpam || /$evil/io;
        # ... etc ...
    }
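For anyone who wants to try the workaround, here is a minimal sketch of breaking the one big alternation into several smaller regexes and testing them in turn. The grouping of phrases, the `@groups` and `@evil` names, and the `is_spam` helper are all made up for illustration; the real table is much longer, and this is not a claim about what actually triggers the endless loop.

```perl
use strict;

# Hypothetical split of the phrase table into smaller groups.
# Each phrase is a regex fragment, as in the original list.
my @groups = (
    [ 'barely legal', 'rated adult site', 'No Hidden Fees' ],
    [ 'chain letter.*pyramid scheme', 'pyramid scheme.*chain letter' ],
    [ 'over \d+ million fresh email', '\bfat absorber\b' ],
);

# Compile one case-insensitive regex per group instead of one huge one.
my @evil = map { my $alt = join '|', @$_; qr/$alt/i } @groups;

# Hypothetical helper: true if any of the smaller regexes matches.
sub is_spam {
    my ($text) = @_;
    $text =~ s/\s+/ /g;    # simplify white space matching
    for my $re (@evil) {
        return 1 if $text =~ $re;
    }
    return 0;
}

print is_spam("Over 25   million fresh email addresses") ? "spam\n" : "ok\n";
print is_spam("Lunch at noon?") ? "spam\n" : "ok\n";
```

Compiling each group once with `qr//` plays the same role as the `/o` flag in the loop above, and stopping at the first match avoids running the remaining groups.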
Does that help?
Re (tilly) 3: regex is too long
by tilly (Archbishop) on May 09, 2001 at 05:17 UTC