in reply to UPDATED: Forking On Foreach Keys In Hash, Passing Hash To Sub, And Speed/Efficiency Recommendations

3.) It seems like there would be some room for improvement on this code for efficiency to speed things along, but I can't seem to find it. Any suggestions would be very appreciated!!!
Unless I misunderstood your code, you don't use %minhash for anything except creating @npanxxarray and %npanxxhash. Maybe it's possible to replace these three variables with only one? Let's see. Processing large flat files is usually done within a single loop which reads the file and does the computations, caching as little data as possible. Why not trim and count your values as you get them? Like this:
my %counts;
while (<IN>) {
    next unless /^\{.*$/;               # skip non-matching lines
    my $min = (split ',')[1];           # get the value
    $counts{substr($min,0,6)} += 1;     # found another one!
}
# at this point %counts is like your %npanxxhash, but without all the temporary variables
(code is untested, sorry)
Counting values like this will take much less time than the I/O required to read the file, so there is no need to use multi-threading here.
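For anyone who wants to try the idea without the original data file, here is a minimal self-contained sketch of the same single-pass counting. The sample lines, the field layout (lines starting with `{`, value in the second comma-separated field) and the helper name `count_prefixes` are assumptions made up for illustration, not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real file layout is an assumption.
my @lines = (
    "{A,3125551234,foo\n",
    "{B,3125559876,bar\n",
    "header line to skip\n",
    "{C,2125550000,baz\n",
);

# Count occurrences of each 6-character prefix in a single pass.
# count_prefixes is an illustrative helper, not part of the original code.
sub count_prefixes {
    my @input = @_;
    my %counts;
    for my $line (@input) {
        next unless $line =~ /^\{/;           # skip lines not starting with '{'
        my $min = (split /,/, $line)[1];      # second comma-separated field
        next unless defined $min;             # skip lines with no second field
        $counts{ substr($min, 0, 6) }++;      # found another one
    }
    return %counts;
}

my %counts = count_prefixes(@lines);
print "$_ => $counts{$_}\n" for sort keys %counts;
```

With a real file, the `for` loop over `@lines` would become a `while (my $line = <$fh>)` loop, so only the hash of counts is ever held in memory.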

Aside from that, a child process can't modify variables in its parent; you would need to use threads and shared variables instead.
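A small sketch of the fork behaviour, assuming a Unix-like system where fork() is a real process fork. The helper name `parent_value_after_child_update` and the sample hash are made up for this demonstration.

```perl
use strict;
use warnings;
use POSIX ();

my %counts = (312555 => 1);    # hypothetical sample data

# Fork a child that updates its own copy of the hash, wait for it to finish,
# then return the value the parent still sees for the given key.
sub parent_value_after_child_update {
    my ($hash, $key) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        $hash->{$key} += 100;  # changes only the child's copy of the data
        POSIX::_exit(0);       # leave the child without running END blocks
    }
    waitpid($pid, 0);          # make sure the child has finished
    return $hash->{$key};      # the parent's value is untouched
}

print parent_value_after_child_update(\%counts, 312555), "\n";
```

The parent still sees the original value because the child works on its own copy of the process memory. Getting results back from a child needs some explicit channel (a pipe or a temporary file, for example), or threads with shared variables as mentioned above.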

Re^2: Forking On Foreach Keys In Hash, Passing Hash To Sub, And Speed/Efficiency Recommendations
by ImJustAFriend (Scribe) on Aug 08, 2014 at 16:10 UTC

    aitap, you are a GENIUS!! It never occurred to me to go this route... consider me slapping my forehead. The "counts" hash line was the key, and something I had not thought of! I am testing now, but it looks REAL promising!!

    Thank you SO much!!