in reply to Re^2: Need Speed:Search Tab-delimited File for pairs of names
in thread Need Speed:Search Tab-delimited File for pairs of names

The journey to a better program (for some definition of 'better', in this case faster) begins with a program that works and that one can understand. As suggested elsewhere, the OP code is a spaghetti monster that dare not enable strictures and warnings lest it reveal a host of naughty practices and lurking bugs.

kcott's shorter and cleaner code, assuming it actually does what mnnb wants, is much more likely to be a good starting point for improvement. I haven't studied it closely, but it seems to me that the regexes, if insufficiently speedy, could fairly easily be replaced by the use of index. In any event, while the use of regexes will not improve performance, it is also unlikely, IMHO, to significantly degrade it versus index in this case. But only benchmarking will determine the trade-offs.
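To make the regex-versus-index swap concrete, here is a minimal sketch. It is not kcott's code: the helper names `has_pair_regex` and `has_pair_index` and the sample record are hypothetical, and it assumes the names are to be matched as literal substrings anywhere in the line.

```perl
use strict;
use warnings;

# Hypothetical helpers: true if both names occur somewhere in the line.
# \Q...\E quotes regex metacharacters so the names match literally.
sub has_pair_regex {
    my ($line, $name1, $name2) = @_;
    return $line =~ /\Q$name1\E/ && $line =~ /\Q$name2\E/ ? 1 : 0;
}

# index returns -1 when the substring is absent, so >= 0 means "found".
sub has_pair_index {
    my ($line, $name1, $name2) = @_;
    return index($line, $name1) >= 0 && index($line, $name2) >= 0 ? 1 : 0;
}

my $line = "Smith, J.\tJones, A.\t42";    # made-up sample record
for my $pair (['Smith, J.', 'Jones, A.'], ['Smith, J.', 'Brown, K.']) {
    printf "%-25s regex=%d index=%d\n", join(' + ', @$pair),
        has_pair_regex($line, @$pair), has_pair_index($line, @$pair);
}
```

Both versions do plain substring matching, so they share the same caveat: a name that is a prefix of a longer name will also match. If whole-field matches are required, splitting on tabs and comparing with eq may be the safer route.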

Update: Minor wording changes; no semantic change.

Re^4: Need Speed:Search Tab-delimited File for pairs of names
by wazat (Monk) on Dec 16, 2013 at 23:29 UTC

    I second the notion that regular expressions are a better choice, especially using precompiled patterns.

    I vaguely recall that an RE search without metacharacters should be fast. There is a short statement implying this in the Efficiency section of my Camel Book.

    You can always do some performance benchmarking to verify.

      Yes, indeed, a RE search without meta-characters is fast. But index is still faster:
      $ perl index_regex_bench.pl
                 Rate Regex Index
      Regex 5010020/s    --  -23%
      Index 6544503/s   31%    --
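The source of index_regex_bench.pl is not shown in the thread. A minimal sketch of how such a comparison might be set up with the core Benchmark module follows; the test line, the target string, and the sub names `find_regex` and `find_index` are assumptions, so the rates it prints will differ from the figures quoted above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up test data: a tab-delimited line and a literal name to find.
my $line   = join "\t", map { "name$_" } 1 .. 50;
my $target = 'name42';
my $re     = qr/\Q$target\E/;    # precompiled, metacharacters quoted

sub find_regex { return $line =~ $re ? 1 : 0 }
sub find_index { return index($line, $target) >= 0 ? 1 : 0 }

# A negative count asks Benchmark to run each sub for at least that
# many CPU seconds, then print a table of rates and relative speeds.
cmpthese(-1, {
    Regex => \&find_regex,
    Index => \&find_index,
});
```

Results depend heavily on the Perl version, the line length, and where the match falls in the line, so it is worth running this against data shaped like the real file.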
Re^4: Need Speed:Search Tab-delimited File for pairs of names
by Laurent_R (Canon) on Dec 17, 2013 at 09:31 UTC
    I definitely agree with you, AnomalousMonk, and my very first comment in my post above was that kcott's code was much cleaner and shorter.