Re: An efficient way to parallelize loops

I'm surprised no one has asked you a) how many regexes there are; b) to show a few 'typical examples' of the regexes involved.
Beyond the 5.10 trie optimisation--which also has some limitations--there are other ways of optimising the use of multiple regexes against single buffers, but they do tend to vary with the nature of the regexes involved.
On the basis of what you've shown so far, it looks like this task might be effectively parallelised using threads.
However, your need to stick with 5.8.0--a time when threads were still quite flaky--means I would be reluctant to suggest that solution unless you can upgrade to at least 5.8.5--though 5.8.9 or 5.10+ would be far better.
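
As a rough illustration of the threaded approach mentioned above, here is a minimal sketch. It assumes a Perl built with ithreads and recent enough for threads to be stable (so not 5.8.0). The names `@regexes`, `@lines`, `$n_workers` and `worker` are hypothetical stand-ins, not anything taken from the original code.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-ins for the real patterns and input lines.
my @regexes = ( qr/MDR/, qr/CDR/ );
my @lines   = ( 'a MDR', 'b CDR', 'c MDR CDR', 'd none' );

my $n_workers = 2;    # assumed value; tune to the number of cores
my $queue     = Thread::Queue->new;

# Each worker pulls lines until it dequeues undef, counting hits per regex.
sub worker {
    my @hits = (0) x @regexes;
    while ( defined( my $line = $queue->dequeue ) ) {
        for my $i ( 0 .. $#regexes ) {
            $hits[$i]++ if $line =~ $regexes[$i];
        }
    }
    return @hits;
}

# Created inside map, so each thread runs in list context and
# join() returns the full list of counts.
my @workers = map { threads->create( \&worker ) } 1 .. $n_workers;

$queue->enqueue(@lines);
$queue->enqueue(undef) for @workers;    # one end marker per worker

# Sum the per-worker counts into one total per regex.
my @total = (0) x @regexes;
for my $thr (@workers) {
    my @hits = $thr->join;
    $total[$_] += $hits[$_] for 0 .. $#regexes;
}
print "regex $_: $total[$_] matching lines\n" for 0 .. $#regexes;
```

Queueing one line at a time keeps the sketch short; with millions of lines, passing batches of lines per queue item would probably cut the locking overhead considerably.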

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: An efficient way to parallelize loops by Deus Ex (Scribe) on Jun 03, 2010 at 19:29 UTC
Hi BrowserUK,

There are plenty of regexes. Say, at least 30. Each of them is built like this:

```
...
records => [ 'MDR', 'TCBMDR', 'INSS7MDR', 'TCBINSS7MDR' ],
...
for ( my $i = 0; $i < scalar @{$categories{$k}->{tracciati}}; $i++ ) {
    my $TestReStr = join("\|", map { "${_}" } @{$categories{$k}->{traces}[$i]->{records}} );
    $categories{$k}->{traces}[$i]->{regex} = qr/$TestReStr/;
}
```

Note that there are many 'records' keys. I would really like to upgrade to a higher version than 5.8.0, but it's not really possible, because the sysadmins don't do that on these machines. Thanks for your help though.
Re^3: An efficient way to parallelize loops by BrowserUk (Patriarch) on Jun 03, 2010 at 22:27 UTC
"Note that there are many 'records' keys."

Sorry, but "many" is not a number. 4? 40? 4e40?

"I would really like to upgrade to a higher version than 5.8.0, but it's not really possible, due to the fact that the sysadmins don't do that on these machines."

If you have your own personal machine, then installing Perl locally is very easy to do.

RIP an inspiration; A true Folk's Guy
Re^4: An efficient way to parallelize loops by Deus Ex (Scribe) on Jun 04, 2010 at 06:36 UTC
Sorry, you're right: say about 40, which have to be checked against some millions of lines. About installing on my own machine, sure I can. The fact is that it wouldn't be useful, as this is a real work problem I have to solve at my office. Thanks for the help.
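
For about 40 alternations of plain literal strings checked against millions of lines, one possible alternative to running every regex against every line is a single combined alternation that is matched once per line. The sketch below is an illustration under stated assumptions, not the poster's code. It assumes the 'records' entries are literal strings, and that a line can be attributed to the first (leftmost, then longest) literal found in it. The `%records_for` groups, the `cdr` entries and the `classify` helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical groups, modelled on the 'records' lists shown in the thread.
my %records_for = (
    mdr => [ 'MDR', 'TCBMDR', 'INSS7MDR', 'TCBINSS7MDR' ],
    cdr => [ 'CDR', 'TCBCDR' ],
);

# Map each literal back to the group that owns it.
my %group_of;
for my $group ( keys %records_for ) {
    $group_of{$_} = $group for @{ $records_for{$group} };
}

# One alternation of all literals, longest first, so that at a given
# position 'TCBMDR' is tried before the shorter 'MDR'.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) or $a cmp $b }
    keys %group_of;
my $any_record = qr/($alternation)/;    # compiled once, reused per line

# Return the group of the first literal found in the line, or undef.
sub classify {
    my ($line) = @_;
    return $line =~ $any_record ? $group_of{$1} : undef;
}

my %count;
for my $line ( 'x TCBMDR y', 'CDR 42', 'nothing here' ) {
    my $group = classify($line);
    $count{$group}++ if defined $group;
}
print "$_: $count{$_}\n" for sort keys %count;
```

This changes the semantics if a line is supposed to be counted for several groups at once, so it only fits when one match per line is enough. On 5.10+ the trie optimisation mentioned earlier in the thread may apply to such an alternation; on 5.8.0 it does not, although one regex per line is still fewer match attempts than 40.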