in reply to Fast sublist generation

Two ideas:

  1. The sort() is probably the most expensive thing you are doing. With the code snippet you've provided, there doesn't seem to be any need for it. If you really need the sort, sort the matching results rather than the input list of keys.
  2. Are you really doing anything with the (.*) portions of the regexen? If not, reducing the first regex to /^$combine/ (and the other one to /$combine$/) would speed it up.
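
A minimal sketch of the second idea, assuming the captured text is not needed. The list @keys and the value of $combine are hypothetical sample data, and the \Q...\E quoting is an addition that is not in the original snippet; it keeps any regex metacharacters in $combine from being interpreted.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @keys    = qw(unhappy happy undo redo doing);
my $combine = 'do';

# Anchored tests without the (.*) capture.
# \Q...\E quotes any regex metacharacters in $combine (an assumption:
# the original snippet interpolates $combine unquoted).
my @prefixed = grep { /^\Q$combine\E/ } @keys;
my @suffixed = grep { /\Q$combine\E$/ } @keys;

print "prefix: @prefixed\n";
print "suffix: @suffixed\n";
```

If the remainder of the key is needed after all, the capture has to stay, and only the sort() advice applies.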

Re: Re: Fast sublist generation
by PetaMem (Priest) on Jul 28, 2001 at 19:04 UTC
    Consider the snippet to be just that, a snippet, not the
    full code. The actual code looks something like this:
    foreach my $morph (sort keys %$self) {
        if ($key =~ /^$morph(.*)$/) {
            print "Found (prefix $morph): $1\n";
            return ($self->find('DE',$1), "P:$morph");
        }
        if ($key =~ /^(.*)$morph$/) {
            print "Found (suffix $morph): $1\n";
            return ($self->find('DE',$1), "S:$morph");
        }
    }
    And yes, the sort isn't necessary at the moment,
    but if I could use the sorted order to cut the list
    down to a sublist (roughly 1/30 of the search space), then
    the speedup would be an order of magnitude.

    Ciao

      Well, you could reduce the list that goes into the sort, but at the cost of more comparisons: in this case, sort only the keys that contain the desired $string. Something like:
      foreach my $key (sort grep(/$string/, keys %hash)) {
          if ($key =~ /^$string(.*)$/) { blah; }
          if ($key =~ /^(.*)$string$/) { blah; }
      }
      This seems to speed things up for large hashes, provided the matching list is a fairly small subset of the input.

      Update: Since you are return()ing the first time you find a match, the sort() is doing more work than you need. You really need a min() function. There's a node that discussed various ways to implement a min().
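
      One way the min() idea could look, as a sketch rather than a drop-in replacement: it follows the loop above (fixed $string, iterating over hash keys), uses minstr from the core List::Util module as the string-wise min(), and runs on hypothetical sample data. The \Q...\E quoting is an assumption that is not in the code above.

```perl
use strict;
use warnings;
use List::Util qw(minstr);

# Hypothetical sample data, for illustration only.
my %hash   = map { $_ => 1 } qw(unhappy undo redo doing untie);
my $string = 'un';

# Keep only the keys that start or end with $string, i.e. the keys
# that would pass one of the two tests in the loop.
my @candidates = grep { /^\Q$string\E|\Q$string\E$/ } keys %hash;

# minstr returns the smallest string by string comparison, which is
# the key a default sort() would have put first, found in one pass.
# It returns undef for an empty list.
my $first = minstr(@candidates);

print defined $first ? "first match: $first\n" : "no match\n";
```

      This replaces the O(n log n) sort with a single linear scan, which is all that is needed when only the first match is returned.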