I deleted local $/ and the program runs, but the output is voluminous and full of duplicated entries: the output file grew from about 3 MB to 26 MB. I'm clueless here. I am grateful for your help, and I am not surprised that the code is poorly written.

Re^3: Read one line at a time
by Athanasius (Archbishop) on Dec 27, 2014 at 16:36 UTC

    Hello wrkrbeee,

    program run, but is voluminous, duplicated entries, expanded the file size from about 3 MB to 26 MB

    With the whole file slurped into $data in one go, the line:

    print OUTPUT "$cik,$form_type,$report_date,$file_date,$name,$sic,$filesize,$sb\n";

    runs once per file. With local $/; commented out, each file is read line-by-line, and the output code is called once for each line. No surprise, then, that the output file increases dramatically in size!
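To make the difference concrete, here is a minimal sketch that counts how many reads it takes to consume the same input in each mode. The `count_reads` helper and the three-line sample are made up for illustration, and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;

# Count how many reads are needed to consume the whole input.
# $slurp true  => $/ is undefined, so a single read returns everything.
# $slurp false => $/ is "\n" (the default), so each read returns one line.
sub count_reads {
    my ($text, $slurp) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = $slurp ? undef : "\n";
    my $reads = 0;
    while (defined(my $chunk = <$fh>)) {
        $reads++;    # anything placed here runs once per read
    }
    close $fh;
    return $reads;
}

my $sample = "line one\nline two\nline three\n";    # hypothetical input
printf "slurp mode:   %d read(s)\n", count_reads($sample, 1);
printf "line by line: %d read(s)\n", count_reads($sample, 0);
```

A print statement inside that loop would therefore fire once in slurp mode and three times in line mode for this sample.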

    If you’re going to read each file line-by-line (as you should), you’re going to have to change the logic of the code accordingly. Exactly what the new logic should be depends on what the output file is supposed to look like. Note that, at present, each of the variables $cik, $form_type, etc., is initialised once per file, but if the file is read line-by-line, these variables are all reset (without being re-initialised) each time a line is read.
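One possible shape for the line-by-line logic is sketched below: initialise the fields once per file, fill them in as matching lines go by, and build the output row only after the loop ends. The "LABEL: value" header format, the `%label_for` table, and the `summarise_file` helper are all assumptions, since the real input format has not been posted.

```perl
use strict;
use warnings;

# Hypothetical "LABEL: value" header lines; the real format is not shown in this thread.
my %label_for = (
    cik       => 'CIK',
    form_type => 'FORM TYPE',
);

# Read one file line by line and return a single CSV row for it.
sub summarise_file {
    my ($fh) = @_;
    my %field = map { $_ => '' } keys %label_for;    # initialised once per file
    while (my $line = <$fh>) {
        chomp $line;
        for my $name (keys %label_for) {
            # Keep the first value seen for each field.
            if ($field{$name} eq '' && $line =~ /^\Q$label_for{$name}\E:\s*(.+)$/) {
                $field{$name} = $1;
            }
        }
    }
    # The row is built once, after every line has been examined.
    return join(',', @field{qw(cik form_type)}) . "\n";
}

my $text = "CIK: 0000123\nFORM TYPE: 10-K\nbody line\nmore body\n";    # made-up sample
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print summarise_file($fh);
close $fh;
```

However many body lines the file contains, this produces exactly one row per file, which is what the slurping version did.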

    It will help a lot if you can provide a small amount of sample input, together with the corresponding output you wish to obtain.

    Hope that helps,

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,

      Thank you! Question: there is a comment in the code about slurping the entire file into $data. I attempted to get around this by deleting the line $data = <SLURP>; so my question is whether or not the program is still reading the entire file. In short, the comment is misleading. :-(
Re^3: READ one line at a time
by poj (Abbot) on Dec 27, 2014 at 16:25 UTC
    Move these 3 lines down outside the while
    my $filesize = -s $direct . '/' . $file;
    my $sb = stat($direct . '/' . $file)->size;
    print OUTPUT "$cik,$form_type,$report_date,$file_date,$name,$sic,$filesize,$sb\n";
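A side note on the first two of those lines, as a small sketch: `stat(...)->size` only works when the core module File::stat is loaded (presumably it is, somewhere in the full program), and it then reports the same byte count as `-s`. The `sizes_for` helper and the temporary file are made up for the demonstration.

```perl
use strict;
use warnings;
use File::stat;                 # makes stat() return an object with a size method
use File::Temp qw(tempfile);

# Return the size of a file as seen by -s and by File::stat.
sub sizes_for {
    my ($path) = @_;
    my $filesize = -s $path;
    my $sb       = stat($path)->size;
    return ($filesize, $sb);
}

# Hypothetical test file: a temporary file holding 12 bytes.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "twelve bytes";
close $tmp_fh;

my ($filesize, $sb) = sizes_for($path);
print "$filesize,$sb\n";
```

If the two columns are always identical in the real output, one of them could probably be dropped.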
    poj
      Moved the three lines outside of the WHILE. Left local $/ in the code. Program runs. Currently using small subset of test data. Will execute on larger dataset now. May take some time to reply. Okay?
      Working now! I am forever grateful for your knowledge and expertise! Hope 2015 is the very best for you!