Hello wrkrbeee,

program run, but is voluminous, duplicated entries, expanded the file size from about 3 MB to 26 MB

With the whole file slurped into $data in one go, the line:

print OUTPUT "$cik,$form_type,$report_date,$file_date,$name,$sic,$filesize,$sb\n";

runs once per file. With local $/; commented out, each file is read line-by-line, and the output code is called once for each line. No surprise, then, that the output file increases dramatically in size!
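To see the difference in isolation, here is a minimal sketch that counts how many times a read loop runs in each mode. The three-line in-memory input and the helper name count_reads are made up for illustration; they are not taken from your script.

```perl
use strict;
use warnings;

# Hypothetical three-line input standing in for one of the real files.
my $text = "line one\nline two\nline three\n";

# Count how many times the loop body runs for a given read mode.
sub count_reads {
    my ($slurp) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = $slurp ? undef : "\n";    # undef means slurp mode
    my $count = 0;
    while (defined(my $chunk = <$fh>)) {
        $count++;                        # the print would go here
    }
    close $fh;
    return $count;
}

printf "slurp mode:   %d read(s)\n", count_reads(1);
printf "line-by-line: %d read(s)\n", count_reads(0);
```

Anything placed inside that loop, such as the print statement, runs once in slurp mode and once per line otherwise.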

If you’re going to read each file line-by-line (as you should), you’re going to have to change the logic of the code accordingly. Exactly what the new logic should be depends on what the output file is supposed to look like. Note that, at present, each of the variables $cik, $form_type, etc., is initialised once per file, but if the file is read line-by-line, these variables are all reset (without being re-initialised) each time a line is read.
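One possible shape for the new logic, assuming you still want a single output row per file: initialise the fields once, update each one only when a line actually contains it, and print after the loop has finished. The "LABEL: value" input layout, the patterns, and the name extract_row below are assumptions for the sake of the sketch, since I haven't seen your real data.

```perl
use strict;
use warnings;

# Hypothetical input: the "LABEL: value" layout is an assumption,
# not the real file format.
my $text = <<'END';
CIK: 0000123456
FORM TYPE: 10-K
some other line
SIC: 1234
END

# Read one file line by line, collect fields, and return a single CSV row.
sub extract_row {
    my ($fh) = @_;

    # Initialised once per file, before any line is read.
    my %field = (cik => '', form_type => '', sic => '');

    while (my $line = <$fh>) {
        chomp $line;
        # Update a field only when the current line actually contains it.
        if    ($line =~ /^CIK:\s*(\S+)/)       { $field{cik}       = $1 }
        elsif ($line =~ /^FORM TYPE:\s*(\S+)/) { $field{form_type} = $1 }
        elsif ($line =~ /^SIC:\s*(\S+)/)       { $field{sic}       = $1 }
    }

    # Build the row after the loop, so there is one row per file.
    return join(',', @field{qw(cik form_type sic)}) . "\n";
}

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print extract_row($fh);
close $fh;
```

If you actually want one row per record within a file, the print would instead move to wherever a record ends, which is why the sample data matters.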

It will help a lot if you can provide a small amount of sample input, together with the corresponding output you wish to obtain.

Hope that helps,

Athanasius <°(((>< contra mundum


In reply to Re^3: Read one line at a time by Athanasius
in thread READ one line at a time by wrkrbeee
