in reply to Optimization of script
First, "tell" operates on bytes, not characters. These files are opened in UTF8 mode.

=========== code extract ==========

    $file_seek_calendario_pago[0] = tell $LoanPaymentCalendarFileHandle;
    }
    # did not find an account match with the loan payment calendar. But were we in a block of "found accounts"?
    elsif ( $found_calendario_pago[0] == 1){
        if ( $printoutput[0] == 1 ) {
            print @found_calendario_pago . "\n";
            print "did not find??? \n";
        }
        # go back to the previous line
        seek $LoanPaymentCalendarFileHandle, -$file_seek_calendario_pago[0], 1;

========= code extract =========

    $file_seek_tran_prestamo[0] = tell $LoanTransactionFileHandle;
    }
    # account not found to go back a line
    elsif ( $found_trans[0] == 1){
        # reset flag to not found for next account in control file
        $found_trans[0]= 0;
        # go back a line, leave filehandle in same position
        seek $LoanTransactionFileHandle, -$file_seek_tran_prestamo[0], 1;
        # get out of loop
        last;
The statement below says to move backwards from the current position by the number of bytes that "tell" reported, and "tell" reports the absolute offset from the start of the file at the moment it was called. I suspect that lands at the beginning of the file or near to it!

    seek $LoanTransactionFileHandle, -$file_seek_tran_prestamo[0], 1;
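Since a value from "tell" is an absolute byte offset, the usual way to return to it is a seek with whence 0 (SEEK_SET) rather than a negative relative seek. Here is a minimal sketch of that idea; the temp file, the sample records and the variable names are all made up for illustration and are not from the original script.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);          # exports SEEK_SET, SEEK_CUR, SEEK_END
use File::Temp qw(tempfile);

# Hypothetical sample data; "\x{f1}" is a character that takes 2 bytes in UTF-8.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
binmode $tmp_fh, ':encoding(UTF-8)';
print {$tmp_fh} "100|a\x{f1}o 1\n", "200|a\x{f1}o 2\n", "300|a\x{f1}o 3\n";
close $tmp_fh or die "close: $!";

open my $fh, '<:encoding(UTF-8)', $path or die "open $path: $!";

my $first      = <$fh>;
my $line_start = tell $fh;    # byte offset of the start of line 2, saved BEFORE reading it
my $second     = <$fh>;

# "Un-read" line 2: jump to the saved absolute offset (whence = SEEK_SET, i.e. 0)
seek $fh, $line_start, SEEK_SET or die "seek: $!";
my $again = <$fh>;

print "re-read the same line\n" if $again eq $second;
close $fh;
```

The point is only that the offset is saved before the line is read and then used as an absolute position. It still costs a seek per "un-read", so it fixes the correctness problem, not necessarily the speed problem.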
File operations like "seek" can be "expensive" in execution time if buffers are flushed and an actual disk operation has to be performed. Seeking this far backwards will definitely do that. Then a whole bunch of I/O is required to get back to where you were.
I would certainly consider keeping the 2 files that you "seek" on in memory if possible. If they are just a few hundred MB, that is certainly feasible on a 32 bit machine. Of course you could build your own cache so that the seek isn't necessary, but that takes more coding, for code that perhaps won't be used again.
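As a rough idea of what "in memory" could look like: read each file once and group the lines by account, so that the lookup per control-file account is a hash access instead of a seek. This is only a sketch. The "account|rest of record" layout, the sample data and the name load_by_account are my assumptions, since the real record format isn't shown.

```perl
use strict;
use warnings;

# Hypothetical layout: "account|rest of record".
my $data = "1001|2011-01-15|250.00\n"
         . "1001|2011-02-15|250.00\n"
         . "1002|2011-01-20|90.50\n";

# Read the whole file once and group its lines by account number.
sub load_by_account {
    my ($fh) = @_;
    my %by_account;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ($account) = split /\|/, $line, 2;
        push @{ $by_account{$account} }, $line;
    }
    return \%by_account;
}

# An in-memory filehandle stands in for the real file in this demo.
open my $fh, '<', \$data or die "open: $!";
my $payments = load_by_account($fh);
close $fh;

# Accounts as they might come from the control file (1003 has no records).
for my $account (qw(1001 1002 1003)) {
    my $records = $payments->{$account} || [];
    printf "%s: %d record(s)\n", $account, scalar @$records;
}
```

With a real file you would open it with the same UTF8 layer as before and pass that handle in. Memory use will be somewhat more than the file size because of Perl's per-scalar overhead.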
On the other hand, if you have something that you are sure produces the right result and it can complete over the weekend, then maybe what you have is "good enough" for this one-time job? Consider throwing hardware at it with a 64 bit machine and a lot of memory. Just a thought.
Update: this is odd: elsif ( $found_trans[0] == 1); perhaps != 0 is right? Also, there is no way to "seek back one line of text". You can only seek back some number of bytes, which in the case of UTF8 is not necessarily the same as the number of characters. If the lines are held in an array in memory, this is of course much easier: move back one index, or unshift the line back onto the array so that it can be "shifted back out again" in certain while constructs.
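To show what "move back an index" might look like with the lines in an array: the trick is simply not to advance the index when the current line belongs to a later account, so the next pass sees the same line again. A sketch only, assuming both the records and the control list are sorted by numeric account; the "account|amount" layout and all the names here are invented.

```perl
use strict;
use warnings;

# Hypothetical records, sorted by account.
my @lines    = ( "1001|10", "1001|20", "1002|30", "1003|40" );
my @accounts = ( 1001, 1002, 1003 );    # stand-in for the control file

my %total;
my $i = 0;    # index of the next unread line

for my $account (@accounts) {
    while ( $i < @lines ) {
        my ( $acct, $amount ) = split /\|/, $lines[$i];

        # Line for an account that is not in the control list: skip it.
        if ( $acct < $account ) { $i++; next; }

        # Line belongs to a later account: stop WITHOUT advancing $i,
        # so the line is still "unread" for the next account.
        last if $acct > $account;

        $total{$account} += $amount;
        $i++;
    }
}

printf "%s => %s\n", $_, $total{$_} for sort keys %total;
```

No seek, no tell, and no byte arithmetic: leaving $i where it is does the job of "going back a line".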