Hello Monks,
I need to parse a CSV file with 50,000 lines, extract the computer name from each line, look up the associated data in a second inventory CSV file (over 50,000 lines), process it, and generate output files.
My problem is that the search operation takes too long (a couple of hours to complete). I changed from a linear search to a binary search, but with little success. I have also tried reading the whole file into arrays as well as processing it line by line.
Following is the code I use. The problem may be bad logic or the way I implemented it. Please provide your valuable suggestions.
Joseph
use strict;
use warnings;

open my $inv_fh, '<', 'Inventory.csv' or die "Inventory.csv: $!";
my @Inventory = <$inv_fh>;    # binary search below assumes this is sorted by field 0
close $inv_fh;

my @Data;
open my $cli_fh, '<', 'clients.txt' or die "clients.txt: $!";
while (my $line = <$cli_fh>) {
    chomp $line;
    my @fields = split /,/, $line;
    # reset per client line so an unmatched name does not reuse stale values
    my ($company, $username, $extra) = ('', '', '');
    my ($beg, $end) = (0, $#Inventory);
    while ($beg <= $end) {
        my $mid = int(($beg + $end) / 2);
        my @sys = split /,/, $Inventory[$mid];
        if ($fields[3] eq $sys[0]) {
            ($username, $company, $extra) = @sys[1, 3, 4];
            # remove the matched line from inventory
            splice @Inventory, $mid, 1;
            last;
        }
        elsif ($fields[3] lt $sys[0]) { $end = $mid - 1 }
        else                          { $beg = $mid + 1 }
    }
    push @Data, join ',', $fields[3], $company, @fields[6, 7, 4, 11], $username, $extra;
}
close $cli_fh;
# processing @Data after this
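For comparison with the binary search above, here is a minimal sketch of a hash-based lookup. It is not what the code above does: it assumes each computer name appears at most once in Inventory.csv, that neither file contains quoted fields with embedded commas, and that Perl 5.10 or later is available. The sub names build_index and join_clients are hypothetical, chosen only for illustration. Likely slow spots in the loop above are rebuilding @Inventory on every match and re-splitting an inventory line on every probe; indexing the inventory once avoids both.

```perl
use strict;
use warnings;

# Read the inventory once and index each row by computer name (field 0).
# Assumption: names are unique; a duplicate name overwrites the earlier row.
sub build_index {
    my ($fh) = @_;
    my %index;
    while (my $line = <$fh>) {
        chomp $line;
        my @sys = split /,/, $line;
        $index{ $sys[0] } = \@sys;
    }
    return \%index;
}

# For each client line, fetch the inventory row by computer name (field 3).
sub join_clients {
    my ($fh, $index) = @_;
    my @data;
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /,/, $line;
        # delete() returns the removed row, so a matched entry is used only once
        my $sys = delete($index->{ $fields[3] }) || [];
        # missing rows or fields become empty strings
        my ($username, $company, $extra) = map { $_ // '' } @{$sys}[1, 3, 4];
        push @data, join ',', $fields[3], $company, @fields[6, 7, 4, 11],
            $username, $extra;
    }
    return \@data;
}
```

To use it, open Inventory.csv and pass the handle to build_index, then open clients.txt and pass that handle together with the returned index to join_clients; the result is a reference to an array holding the same comma-joined records as @Data. The inventory does not need to be sorted for this approach.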

In reply to Data parsing - takes too long by josephjohn
