Doesn't the presence of just one false hit exclude a document?
If so, the simplest optimisation might be to remove the /g modifier.
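To illustrate what dropping /g buys you, here is a minimal sketch. The pattern and the sample documents are made up for the example (they are not the regex from the thread); the point is only that a plain boolean match answers "is there at least one false hit?" and needs nothing more.

```perl
use strict;
use warnings;

# Hypothetical "false hit" pattern and sample texts, for illustration only.
my $false_hit = qr/\bnot\s+material\b/i;

my @docs = (
    "This weakness is not material to the statements.",
    "A material weakness was identified.",
);

# No /g: in boolean context the match stops at the first hit,
# which is all we need in order to exclude a document.
my @kept = grep { $_ !~ $false_hit } @docs;

print scalar(@kept), " document(s) kept\n";
```

With /g in list context the engine would go on to collect every match in the document, which is wasted work if a single hit already decides the outcome.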
Otherwise, if you'd just left the regex running from the point where you posted your question, until you posted your follow-up, you would have processed a little under 1 million documents of the size of those you've linked.
Whilst it may be possible to hand-optimise the supplied regex to cut runtime, you'd then be faced with having to do it all again for the next set of false matches.
Spread your load across the 4 cores of a typical current machine and you can cut your processing time to roughly 1/4.
Purchase $100 of Amazon's EC2 time and cut your processing time to 1/100th or less.
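As a rough sketch of spreading the load over 4 cores, the following uses plain fork from core Perl. The file names, the worker count, and process_doc() are placeholders of mine, not anything from the thread; on Windows fork is emulated with threads, so a threads-based version may suit better there.

```perl
use strict;
use warnings;

my $workers = 4;                            # assumed number of cores
my @docs = map { "doc$_.txt" } 1 .. 10;     # hypothetical file names

# Deal the documents out round-robin, one bucket per worker.
my @buckets;
push @{ $buckets[ $_ % $workers ] }, $docs[$_] for 0 .. $#docs;

my @pids;
for my $bucket (@buckets) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        process_doc($_) for @$bucket;       # child: handle its share only
        exit 0;
    }
    push @pids, $pid;                       # parent: remember the child
}

# Parent waits until every child has finished.
waitpid( $_, 0 ) for @pids;
print "all workers finished\n";

# Placeholder: the real version would read the file and apply the regex.
sub process_doc {
    my ($name) = @_;
    return length $name;
}
```

Each child works through its own bucket independently, so nothing has to be shared between processes; if results need collecting, having each child write to its own output file is the simplest route.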
In reply to Re^2: Help with speeding up regex
by BrowserUk
in thread Help with speeding up regex
by eversuhoshin