in reply to Efficient regex matching with qr//; Can I do better?

You could try out the study function. However, you'll have to refactor your code so that the outer loop iterates over the texts, and of course pull the regex compilation out of that loop.

Assuming that %hash1 and %hash2 are different from the hash behind $hash1_ref, this could become something like (untested):

    sub slow_match {
        my ( $hash_ref_1, $hash_ref_2 ) = @_;
        # %texts, %patterns coming from global (was: %hash1, %hash2)

        # Compile every pattern once, before looping over the texts.
        my %matches;
        foreach my $pattern ( keys %patterns ) {
            $matches{$pattern} = qr/\b$pattern\b/;
        }

        while ( my ( $text_id, $text ) = each %texts ) {
            study $text;
            while ( my ( $pattern, $high_lvl_id ) = each %patterns ) {
                if ( $text =~ $matches{$pattern} ) {
                    $$hash_ref_1{$text_id} .= ':' . $high_lvl_id;
                    foreach my $part ( split( /\s/, $pattern ) ) {
                        $$hash_ref_2{$text_id}->{$part} = 0;
                    }
                }
            }
        }
    }

Update: added code

Re^2: Efficient regex matching with qr//; Can I do better?
by kruppy (Initiate) on Jul 14, 2008 at 05:34 UTC
    Again, thanks all for your suggestions. To avoid extra work I think I'll first try to upgrade to version 10.0. The only problem is that my system administrator is away for four weeks, so I'll have to wait for that... Frustrating indeed.
      I thought I had already written this, weird...

      Working along moritz' suggestions of using larger regexes, in combination with some additional pre-processing and an upgrade to v.10.0, I am now down to 2.5 minutes on my local machine. Thus I am happy, and you do not need to post any more in this thread for the sake of helping me out.
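
For later readers, here is a minimal sketch of what the "larger regexes" idea could look like. It is not the code actually used in this thread. The sample %patterns and %texts, the %found hash, and the longest-first ordering are all assumptions made for illustration. Note that a single pass with /g only finds non-overlapping matches, so it is not a drop-in replacement for testing every pattern separately.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real %patterns and %texts.
my %patterns = (
    'red apple'  => 'fruit',
    'green pear' => 'fruit',
    'blue car'   => 'vehicle',
);
my %texts = (
    t1 => 'I ate a red apple and saw a blue car',
    t2 => 'nothing to see here',
);

# Build one alternation out of all patterns. quotemeta protects any regex
# metacharacters; sorting longest-first lets longer phrases win over
# shorter ones that share a prefix.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) || $a cmp $b } keys %patterns;
my $big_re = qr/\b($alternation)\b/;

# One scan per text instead of one match attempt per (text, pattern) pair.
my %found;    # text id => list of high-level ids, in order of appearance
for my $text_id ( sort keys %texts ) {
    while ( $texts{$text_id} =~ /$big_re/g ) {
        push @{ $found{$text_id} }, $patterns{$1};
    }
}

for my $text_id ( sort keys %found ) {
    print "$text_id: @{ $found{$text_id} }\n";
}
```

Since the captured text in $1 is the pattern itself, it can be used directly as the key into %patterns. This only holds as long as the patterns are plain strings, which is why they are passed through quotemeta here.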

      Thanks all of you who wrote here.