I liked Cristoforo's comments++ about the seek statements.
=========== code extract ==========
        $file_seek_calendario_pago[0] = tell $LoanPaymentCalendarFileHandle;
    }
    # did not find an account match with the loan payment calendar.
    # But were we in a block of "found accounts"?
    elsif ( $found_calendario_pago[0] == 1 ) {
        if ( $printoutput[0] == 1 ) {
            print @found_calendario_pago . "\n";
            print "did not find??? \n";
        }
        # go back to the previous line
        seek $LoanPaymentCalendarFileHandle, -$file_seek_calendario_pago[0], 1;
========= code extract =========
        $file_seek_tran_prestamo[0] = tell $LoanTransactionFileHandle;
    }
    # account not found to go back a line
    elsif ( $found_trans[0] == 1 ) {
        # reset flag to not found for next account in control file
        $found_trans[0] = 0;
        # go back a line, leave filehandle in same position
        seek $LoanTransactionFileHandle, -$file_seek_tran_prestamo[0], 1;
        # get out of loop
        last;
First, "tell" operates on bytes, not characters. These files are opened in UTF8 mode.
seek $LoanTransactionFileHandle, -$file_seek_tran_prestamo[0], 1;
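This statement says: move backwards from the current position (that is what the final 1 means) by the number of bytes that "tell" reported, and "tell" reports the absolute offset from the start of the file, not the length of a line. So this does not step back one line. I suspect it lands at the beginning of the file or near to it!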
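To make the difference concrete, here is a minimal sketch of the usual way to "go back a line": record the byte offset with tell before reading the line, then seek to that absolute offset with SEEK_SET (whence 0). The temporary file, its contents, and the sub name reread_second_line are made up for illustration; this is not the OP's code.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical sample data: three short records containing non-ASCII characters.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh, ':encoding(UTF-8)';
print {$tmp_fh} "1001;pr\x{e9}stamo\n1002;a\x{f1}o\n1003;pago\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Read line 1, remember where line 2 starts, read line 2,
# then seek back to the remembered offset and read line 2 again.
sub reread_second_line {
    my ($path) = @_;
    open my $fh, '<:encoding(UTF-8)', $path or die "Cannot open $path: $!";
    my $first      = <$fh>;
    my $line_start = tell $fh;    # absolute offset in BYTES, not characters
    my $second     = <$fh>;
    seek $fh, $line_start, SEEK_SET or die "Cannot seek in $path: $!";
    my $again = <$fh>;
    close $fh;
    return ($line_start, $second, $again);
}

my ($offset, $second, $again) = reread_second_line($tmp_name);
printf "line 2 starts at byte %d; re-read %s\n",
    $offset, ($second eq $again ? "matches" : "differs");
```

The first line is 14 characters long but occupies 15 bytes because of the accented character, which is why the offset is only safe to use when it came from tell and is passed back with SEEK_SET.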

File operations like "seek" can be "expensive" in execution time if buffers are flushed and an actual disk operation has to be performed - seeking this far backwards will definitely "do that". Then a whole bunch of I/O is required to get back to "where you were".

I would certainly consider keeping the 2 files that you use "seek" upon in memory if possible. If they are just a few hundred MB that is certainly feasible on a 32 bit machine. Of course you can build your own cache so that the seek isn't necessary, but that takes more coding that perhaps won't be used again.
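As a rough sketch of the in-memory idea: read the file once into a hash keyed by account number, and the lookup per account needs no seek at all. I am assuming here that each record is one line and that the account number is the first semicolon-separated field; the real file layout is not shown in the thread, and load_by_account is a made-up name.

```perl
use strict;
use warnings;

# Assumption: one record per line, account number is the first ';' field.
sub load_by_account {
    my ($fh) = @_;
    my %by_account;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ($account) = split /;/, $line, 2;
        push @{ $by_account{$account} }, $line;    # keep file order per account
    }
    return \%by_account;
}

# Hypothetical sample data, read through an in-memory filehandle.
my $sample = "1001;2015-01-15;250.00\n"
           . "1001;2015-02-15;250.00\n"
           . "1002;2015-01-20;90.50\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $calendar = load_by_account($fh);
close $fh;

for my $account (qw(1001 1002 1003)) {
    my $rows = $calendar->{$account} // [];    # no entry means no lines found
    printf "%s: %d payment line(s)\n", $account, scalar @$rows;
}
```

Memory use is roughly the file size plus per-line overhead, so whether this fits depends on the actual files.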

On the other hand, if you have something that you are sure produces the right result and it can complete over the weekend, then maybe what you have is "good enough" for this one time job? Consider throwing hardware at it with a 64 bit machine and a lot of memory. Just a thought.

Update: this is odd: elsif ( $found_trans[0] == 1), perhaps !=0 is right? There is no way to "seek back one line of text". You can only seek back some number of bytes, which in the case of UTF8 probably isn't the same as the number of characters. If this is an array of lines in memory, then of course this is much easier: move back an index number, or perhaps unshift the line so that it can be "shifted back out again" in certain while constructs.
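A small sketch of the array version, again assuming the account number is the first semicolon-separated field (the data and the sub name take_account_lines are hypothetical). "Going back a line" becomes simply not advancing the index past the line that did not match:

```perl
use strict;
use warnings;

# Hypothetical data already read into memory, sorted by account.
my @lines = ( "1001;a", "1001;b", "1002;c", "1003;d" );

# Collect the run of lines for $account starting at index $$pos_ref.
# The index is left pointing at the first line that does NOT match,
# so the next call sees that line again.
sub take_account_lines {
    my ( $lines, $pos_ref, $account ) = @_;
    my @taken;
    while ( $$pos_ref < @$lines ) {
        my ($this_account) = split /;/, $lines->[$$pos_ref], 2;
        last if $this_account ne $account;    # do not consume the unmatched line
        push @taken, $lines->[$$pos_ref];
        $$pos_ref++;
    }
    return @taken;
}

my $pos    = 0;
my @first  = take_account_lines( \@lines, \$pos, '1001' );
my @second = take_account_lines( \@lines, \$pos, '1002' );
print scalar(@first), " line(s) for 1001, ",
      scalar(@second), " line(s) for 1002, next index $pos\n";
```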


In reply to Re: Optimization of script by Marshall
in thread Optimization of script by JulioRD
