
Well, I ran 10 iterations of the benchmark, as you saw. I have also run greps on fresh data files (which were certainly not cached), and the difference in speed compared with subsequent searches was very small (~1%). I do believe grep is far faster than even running the data files through an empty Perl loop. I definitely wish there were a better pure Perl solution.

This is, of course, because I'm using grep as a "dumb" tool to get context around the match. I then feed that output into Perl, where the real parsing is done (to remove irrelevant lines I don't wish to see). If I could do everything inside a Perl loop, I would imagine it would be more efficient. In this case, however, Perl needs to find the data header line before the match, and continue after the match until the next header. I just haven't found a better way than "pre-searching" the file with grep. It's fast enough, but could it be faster? :D I'm turning into an efficiency addict now.
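
For reference, here is a minimal sketch of what the single-pass, pure Perl version might look like: buffer each record from its header line up to the next header, and keep the record only if some line in it contains the search string. The sub name `matching_records` and the header pattern are hypothetical (the real header format isn't shown here), and I have not benchmarked this against the grep pre-search.

```perl
use strict;
use warnings;

# Return every record (header line through the line before the next header)
# that contains $needle somewhere in it.
# $header_re is a hypothetical pattern; adjust it to the real header format.
# Lines that appear before the first header are treated as one record too.
sub matching_records {
    my ($fh, $header_re, $needle) = @_;
    my (@found, @record);
    my $hit = 0;
    while (my $line = <$fh>) {
        if ($line =~ $header_re) {
            # A new header closes the previous record.
            push @found, join('', @record) if $hit;
            @record = ();
            $hit    = 0;
        }
        push @record, $line;
        # index() does a literal substring test, no regex engine involved.
        $hit = 1 if index($line, $needle) >= 0;
    }
    # Flush the final record, which has no following header.
    push @found, join('', @record) if $hit;
    return @found;
}
```

Called as `matching_records($fh, qr/^==/, 'foo')`, it returns each whole record as one string, so the later parsing step can work on complete records instead of grep's context lines.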

Re^7: search array for closest lower and higher number from another array
by flexvault (Monsignor) on Feb 07, 2011 at 14:42 UTC

    bigbot

    I don't think anyone here would tell you that a Perl script could or would be faster than a "highly" optimized C program (I could be wrong). And there is nothing wrong with using the system command 'grep' to produce your solution.

    But if you still want to improve the execution time, some things to look at:

    • In your test script you used the regex match

      $matchCount++ if ($_ =~ /$string/o);

      however, in this case a simple "index" test should be faster.

    • If you read blocks of data ( read or sysread ), you would decrease the difference between grep and Perl, but now the complexity of your script has increased.
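
    A rough sketch of both suggestions combined, assuming the search string is a literal (no regex metacharacters) and that you want a count of matching lines. The sub name `count_matching_lines` and the 1 MiB default block size are my own choices for illustration, not something from your script, and I have not timed it.

```perl
use strict;
use warnings;

# Count the lines in $path that contain the literal string $needle,
# reading the file in large blocks with sysread instead of line by line.
sub count_matching_lines {
    my ($path, $needle, $block_size) = @_;
    $block_size ||= 1 << 20;    # 1 MiB; an arbitrary default

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my ($count, $tail) = (0, '');
    while (1) {
        my $got = sysread($fh, my $buf, $block_size);
        die "Read error on $path: $!" unless defined $got;
        last if $got == 0;      # end of file

        # Prepend the partial line left over from the previous block.
        $buf = $tail . $buf;
        my $last_nl = rindex($buf, "\n");
        if ($last_nl < 0) {     # no complete line yet, keep accumulating
            $tail = $buf;
            next;
        }
        # Hold back the incomplete last line for the next block.
        $tail = substr($buf, $last_nl + 1);
        substr($buf, $last_nl + 1) = '';

        for my $line (split /\n/, $buf) {
            # index() returns -1 when the substring is absent.
            $count++ if index($line, $needle) >= 0;
        }
    }
    # A final line without a trailing newline is still a line.
    $count++ if length($tail) && index($tail, $needle) >= 0;
    close $fh;
    return $count;
}
```

    The carry-over of the partial last line is exactly the extra complexity mentioned above: without it, a line split across two blocks would be counted wrongly.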
    All of these improvements are nice, but is it worth spending the time to debug the improved script to gain 'nn' seconds at execution time? That is your call, balancing your time efficiency against the script's efficiency. Chapter 24 of the Camel book does an excellent job of explaining the trade-offs.

    Good Luck

    "Well done is better than well said." - Benjamin Franklin

      Thank you, Flex. I will certainly look into that book!