Yes, I remember this code from: Issues with Column headings.

In the original problem statement, there was a need to check whether some id exists in file2 but not in file1. That is why %file1 was created.

If you look at the code you posted, there are 3 main steps: (1) make the hash %file1 (ids in file1), (2) make %file2 (ids in file2), (3) process the keys (all unique ids) in %file1. Step (4), processing all unique ids in %file2, is not there anymore, so the data structure that supported it is not needed either.

So, the %file1 hash is not needed. The idea is to combine steps (1) and (3) into a new step (3) and get rid of step (1) altogether.

Take out the step (1) code. Then modify step (3): instead of foreach my $id1 (keys %file1){...}, just use the first part of what was step (1):

while (my $row = $csv->getline($FILE1))
{
                        # $row is a reference to a row
    my @fields = @$row; # this explicitly de-references
    my $id1 = $fields[1];

    # print() takes the filehandle and an array reference, not a flat list
    if (exists $file2{$id1})
    {
        $csv->print($FILE3, ["HL", @fields]);     # id is in both files
    }
    else
    {
        $csv->print($FILE3, ["NOT_HK", @fields]); # id is in file1 only
    }
}

I didn't test this, but it should write one output line for every line of file1, so an id that repeats on a different line in file1 gives you a repeated line in the output.
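To make the repeated-id behavior concrete, here is a minimal sketch of just the lookup logic, without the CSV module. The data, the %file2 contents and the helper name tag_rows are all made up for illustration; the array references stand in for what $csv->getline($FILE1) would hand back.

```perl
use strict;
use warnings;

# Hypothetical stand-in for step (2): the ids seen in file2.
my %file2 = map { $_ => 1 } qw(A17 B42);

# Hypothetical stand-ins for the rows of file1, one array reference per line.
# Note that id A17 appears on two different lines.
my @file1_rows = (
    [ 'r1', 'A17', 'x' ],
    [ 'r2', 'C99', 'y' ],
    [ 'r3', 'A17', 'z' ],
);

# tag_rows is a hypothetical helper: it labels every row in file order,
# rather than looping over the unique keys of a hash.
sub tag_rows {
    my ($seen, $rows) = @_;
    my @tagged;
    for my $row (@$rows) {
        my $id1   = $row->[1];    # id is in the second column, as in the post
        my $label = exists $seen->{$id1} ? 'HL' : 'NOT_HK';
        push @tagged, [ $label, @$row ];
    }
    return @tagged;
}

my @tagged = tag_rows(\%file2, \@file1_rows);
print join(',', @$_), "\n" for @tagged;
```

Because the loop walks the rows and not the hash keys, both A17 lines come out, each with its own label, and the output keeps the order of file1.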

I do not know why you added "chomp $row;". It is not needed: $row is a reference to an array that the CSV module creates when it reads a line from the file. The program won't bomb, but this line doesn't do anything useful.
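If you want to convince yourself, a small sketch like this should do it. The row here is a hand-made array reference standing in for what getline returns, and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a row returned by $csv->getline: an array reference.
my $row    = [ 'r1', 'A17', 'x' ];
my $before = "$row";    # stringified form of the reference

# chomp only removes a trailing newline; the stringified reference
# has none, so nothing is changed.
chomp $row;

my $still_ref = ref($row) eq 'ARRAY' ? 1 : 0;
my $same_ref  = "$row" eq $before    ? 1 : 0;
print "still an array ref: $still_ref, fields: @$row\n";
```

The fields themselves never pass through chomp here, so any line-ending handling is left to the CSV parser.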

In reply to Re: Matching two files based on one column common in each by Marshall
in thread Matching two files based on one column common in each by bluray
