You could remove all duplicate lines with something like
this:
my %seen;
while (<>) {
    next if $seen{$_};
    print;
    $seen{$_}++;
}
---- amphiplex
You can do it in one shot, so you might as well.
That makes your second snippet
my $prev;
while (<>) {
    next if ($_ eq $prev);
    print $prev = $_;
}
:^)
Wait, we can shorten that..
my $prev;
while (<>) {
    print $prev = $_ unless $_ eq $prev;
}
Hmm..
my $prev;
$_ ne $prev and print $prev = $_ while <>;
Err.. sorry, got carried away for a second.. Perl is just too seductive. Sigh. :-)
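One caveat worth spelling out: the $prev variants only drop a line when it repeats the line immediately before it, so they assume duplicates sit next to each other (e.g. sorted input), whereas the %seen version catches repeats anywhere. A minimal sketch of the difference, assuming made-up in-memory data; the sub names dedup_adjacent and dedup_all are hypothetical and exist only for this illustration:

```perl
use strict;
use warnings;

# Drop a line only when it equals the line right before it,
# like the $prev loops above.
sub dedup_adjacent {
    my @out;
    my $prev;
    for my $line (@_) {
        next if defined $prev && $line eq $prev;
        push @out, $prev = $line;
    }
    return @out;
}

# Drop every repeat regardless of position, like the %seen loop.
sub dedup_all {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @lines = qw(a b a a);    # made-up sample data
print "adjacent: @{[ dedup_adjacent(@lines) ]}\n";    # adjacent: a b a
print "all:      @{[ dedup_all(@lines) ]}\n";         # all:      a b
```

So if the input is not already sorted, the %seen approach is probably the one you want.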
Makeshifts last the longest.
This won't do exactly what the AM wants. Some lines will be duplicates to the user, but not to Perl:
$ more file.txt
N AB TX NC
AB N TX NC
FOO BAR
N AB TX NC
$ perl test.pl file.txt
N AB TX NC
AB N TX NC
FOO BAR
The first two lines of file.txt are "the same" to the user, but not to your program. zejames' solution meets the AM's needs, as it creates a unique key for the hash, based on the AM's definition of a duplicate.
Jason
# We are modifying the $/ variable, so we limit the scope
# of the change by wrapping the code in a bare block
{
    local $/ = '';   # paragraph mode: records are separated by blank lines
    $^I = '.bak';    # edit in place, keeping a backup; see perlrun and the -i switch
    @ARGV = ('data.txt');
    my %seen;
    while (<>) {
        # The order of the fields is not important, so we sort them to
        # obtain a unique id
        my $sorted = join ':', sort split ' ';
        print unless $seen{$sorted}++;
    }
}
HTH
Update: added comments to the code
--
zejames
Depending on your database setup and how you insert rows, you could also impose a unique key constraint. Failing this, you will want to sort the fields of each record and then test for equality. For example (warning: untested):
my %hash;
while (<>) {
    my $key = join " ", (sort (split " "));
    $hash{$key} = 1;
}
# Now iterate over the keys of the hash, and either print them out
# or do your insert into the database
Mind you that this is feasible for small files, for certain values of small. If your file is large, you may want to just do the join line, write it to another file, and then let a sort -u do your bidding. That assumes that you are on Unix or one of its derivatives (unless there is sort for Windoze... :)
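A rough sketch of that "just do the join line" idea, assuming whitespace-separated fields whose order does not matter. The sub name normalize_line is hypothetical, and the sample records are borrowed from the file.txt example earlier in the thread:

```perl
use strict;
use warnings;

# Build an order-independent form of one record: split on whitespace,
# sort the fields, and rejoin them with single spaces.
sub normalize_line {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return join ' ', sort @fields;
}

# Sample records taken from the file.txt example in this thread.
my @sample = ("N AB TX NC", "AB N TX NC", "FOO BAR", "N AB TX NC");

# Emit one normalized record per line; duplicates become identical lines.
print normalize_line($_), "\n" for @sample;
```

For a real file you would presumably swap the @sample loop for a while (<>) loop, redirect the output to another file, and run sort -u on that file. Note that what comes out is the normalized form of each record, not the original field order.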
thor