
Hi guys, I have this code working almost perfectly. I read file "A", look for a match in file "B", and decrement a value from file "A". My big problem is that I do a full scan of file "B" for every line of file "A"; file "A" has 75k lines and file "B" has 880 lines. Do you have any idea how to avoid the full scan? Here is an example of the two files.

File "A"

l100101,aaaaaaa,a_0100,loc,10,1

l100101,aaaaaaa,a_0100,loc,11,1

l100101,aaaaaaa,a_0100,loc,12,6

File "B"

l103709,bbbbbbb,c_0200,929

l100109,bbbbbbb,b_0100,442

l100107,bbbbbbb,c_0300,389

#!/usr/bin/perl

use strict;
use warnings;

$|=1;

my $filea = $ARGV[0];
my $fileb = $ARGV[1];
my $FileC = "result.csv";


open ( FA, '<', $filea) || die ( "File $filea Not Found!" );
open ( FB, '<', $fileb) || die ( "File $fileb Not Found!" );
open ( FC, ">", $FileC) || die ( "File $FileC Not Found!" );

my @B;
while ( <FB> ) {
    chomp;
    my($look, $sec, $cls, $max) = split ",";
    push @B, [$look, $sec, $cls, $max];
}

my @A;
while ( <FA> ) {
    chomp;
    my($look, $sec, $cls, $att, $idx, $qtd) = split ",";
    push @A, [$look, $sec, $cls, $att, $idx, $qtd];
}

my $i = 1;
my $j = 0;
my $k = 0;
my $count=0;
while ( 1 ) {
    # -- keep looping til nothing is modified --
    my $modified=0;
    $j = 0;
    foreach my $row ( @A ) {
        # -- loop through FileA, $j is rowcount --
        $j++;
        $k=0;
        # -- loop through FileB, $k is linecount --
        foreach my $line ( @B ) {
            $k++;
            my $idx1= @$line[0].@$line[1].@$line[2];
            my $idx2= @$row[0].@$row[1].@$row[2];
            # -- has to match on the index fields --
            if ($idx1 eq $idx2) {
                my $max = @$line[3];
                my $tot = @$row[5] -1;
                last if $count == $max;
                if ( $tot >= 0 ) {
                    #print "FileA[".$j."]: ".join(",", @$row[0],@$row[1],@$row[3],@$row[4],@$row[5]  )."\n";
                    print FC join(",", @$row[0],@$row[1],@$row[3],@$row[4],$max  )."\n";
                    $count++;

                    @$row[5]=$tot;
                    $modified = 1;
                }
             }
        }
    }
    if ((! $modified ) || ($i > 10)) {
        last;
    }
}
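
One possible way to avoid rescanning file "B" for every row of file "A" is to index "B" once in a hash keyed on the three matching fields, so that each lookup becomes a single hash access instead of an 880-line loop. The following is only a minimal sketch, not the code from the post: the helper names build_index and lookup_max are hypothetical, it assumes each key appears at most once in file "B", and the first "A" row in the usage lines is made up so that a match occurs (the sample rows shown above share no key).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a lookup table from the lines of file "B".
# Key: the first three fields joined with commas; value: the max field.
# Assumption: each key occurs at most once in file "B" (a later duplicate
# would overwrite an earlier one).
sub build_index {
    my (@b_lines) = @_;
    my %max_for;
    for my $line (@b_lines) {
        my ($look, $sec, $cls, $max) = split /,/, $line;
        $max_for{ join ',', $look, $sec, $cls } = $max;
    }
    return \%max_for;
}

# Return the max for one line of file "A", or undef when no line of
# file "B" shares its first three fields.
sub lookup_max {
    my ($index, $a_line) = @_;
    my ($look, $sec, $cls) = split /,/, $a_line;
    return $index->{ join ',', $look, $sec, $cls };
}

my $index = build_index(
    'l100109,bbbbbbb,b_0100,442',
    'l100107,bbbbbbb,c_0300,389',
);

# Hypothetical "A" row whose key matches a "B" row.
my $hit  = lookup_max($index, 'l100109,bbbbbbb,b_0100,loc,10,1');
# "A" row from the sample data above; it has no partner in "B".
my $miss = lookup_max($index, 'l100101,aaaaaaa,a_0100,loc,10,1');

print defined $hit  ? "hit: $hit\n"  : "no match\n";   # prints "hit: 442"
print defined $miss ? "hit: $miss\n" : "no match\n";   # prints "no match"
```

With an index like this, the inner foreach over @B could be replaced by one lookup per row of @A; the decrement and counting logic would still need to be carried over from the original loop.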

In reply to how to avoid full scan in file. by EBK
