in reply to Very basic question while reading a file line by line

"Very basic question ..."

Unfortunately, the question itself is too basic. You have omitted information which, if provided, would have resulted in a better answer for you.

Your input appears to be a tab-separated CSV file, but you did not say so.

You've said nothing about the encoding of your data. I've used "UTF-8" for both input and output; you may need something else.

Your data seems very simplistic. Is what you posted truly representative of your real data?

I added an extra record to your posted input:

    $ cat test_in.csv
    id      name
    123     john
    34      john
    567     john
    11      peter
    899     peter
    87      helen
    961     Anonymous Monk

Read as plain whitespace-separated text, with no special format defined (and to the extent that a webpage can represent it), that last record has three fields; however, read as tab-separated CSV, it has only two columns, just like all of the other records. Here's the CSV format revealed ('^I' represents a tab; '$' represents the end of a line):

    $ cat -vet test_in.csv
    id^Iname$
    123^Ijohn$
    34^Ijohn$
    567^Ijohn$
    11^Ipeter$
    899^Ipeter$
    87^Ihelen$
    961^IAnonymous Monk$
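To make the two-versus-three field difference concrete, here is a minimal sketch using only core Perl. The record string is copied from the extra line I added; the variable names are just for illustration.

```perl
use strict;
use warnings;

# The extra record, with a real tab between the two columns.
my $record = "961\tAnonymous Monk";

# Splitting on any whitespace treats the space in the name as a separator.
my @by_whitespace = split ' ', $record;

# Splitting on tabs only keeps "Anonymous Monk" together.
my @by_tab = split /\t/, $record;

printf "whitespace: %d fields\n", scalar @by_whitespace;   # 3 fields
printf "tab only:   %d fields\n", scalar @by_tab;          # 2 fields
```

A plain split on tabs is only shown here to illustrate the field count; it is not a substitute for a real CSV parser.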

Parsing CSV files has many gotchas. Don't try writing your own code to deal with all of these: Text::CSV has already done so; its use is highly recommended. Note that if you, or your users, have Text::CSV_XS installed, it will run faster (without requiring any change to the "use Text::CSV;" statement).
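As one illustration of such a gotcha, consider a quoted field that itself contains the separator. The record below is hypothetical (nothing like it appears in your posted data), and this sketch leaves quoting at Text::CSV's default so that the quotes are honoured.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical record: the second field is quoted and contains a tab.
my $line = qq{42\t"Monk,\tAnonymous"};

# Naive approach: split on every tab, ignoring the quoting.
my @naive = split /\t/, $line;

# Text::CSV honours the quotes, so the embedded tab stays inside the field.
my $csv = Text::CSV->new({ binary => 1, sep_char => "\t" })
    or die "Cannot create Text::CSV object: " . Text::CSV->error_diag;
$csv->parse($line) or die "Parse failed: " . $csv->error_diag;
my @parsed = $csv->fields;

printf "naive split: %d fields\n", scalar @naive;    # 3 fields
printf "Text::CSV:   %d fields\n", scalar @parsed;   # 2 fields
```

Whether your real data needs quoting enabled depends on how it was produced, which is one more reason to tell us about it.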

The code for performing the filtering is fairly straightforward:

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use autodie;

    use constant NAME => 1;

    my $infile = 'test_in.csv';
    my $outfile = 'test_out.csv';

    use Text::CSV;

    my %seen;

    {
        my $csv = Text::CSV::->new({
            binary => 1, sep_char => "\t", quote_char => undef,
        });

        open my $fh_in, '<:encoding(UTF-8)', $infile;
        open my $fh_out, '>:encoding(UTF-8)', $outfile;

        (undef) = scalar <$fh_in>; # skip & discard header record

        while (my $row = $csv->getline($fh_in)) {
            $csv->say($fh_out, $row) unless $seen{$row->[NAME]}++;
        }
    }

Running that gives:

    $ cat test_out.csv
    123     john
    11      peter
    87      helen
    961     Anonymous Monk

Revealing CSV format:

    $ cat -vet test_out.csv
    123^Ijohn$
    11^Ipeter$
    87^Ihelen$
    961^IAnonymous Monk$

— Ken