in reply to Remove duplicate entries
cp ($FH_A, $FH_B);
should be:
cp ($file_name_a, $file_name_b);
#!/usr/bin/perl
use strict;
use warnings;
use autodie;

my %seen;

open my $FHIN,  '<', $ARGV[0]               or die $!;
open my $FHNEW, '>', "$ARGV[0].new.csv"     or die $!;
open my $FHDEL, '>', "$ARGV[0].deleted.csv" or die $!;

# read one line at a time instead of slurping the whole file
while (my $line = <$FHIN>) {
    my ($key, $rest) = split /,/, $line, 2;
    $key =~ s/ [-&_+'] / /msx;                    # punctuation to a space
    $key =~ s/ ( [a-z] ) ( [A-Z] )/$1 $2/msx;     # GroupOne to Group One
    ($seen{$key}++) ? print $FHDEL "DUP, $line" : print $FHNEW "$key,$rest";
}

# close takes a single filehandle, so close each one separately
close $FHNEW;
close $FHDEL;

This works great if the search key is repeated exactly. What if I have a key that is misspelled etc., e.g.:

__DATA__
Group Onne,Captain,Phone Number,League Pos,etc.
Group Oneffdfadsf,Captain,Phone Number,League Pos,etc.
GroupOneeroneouskunk,Captain,Phone Number,League Pos,etc.
Group Two,Captain,Phone Number,League Pos,etc.
Group Three,Captain,Phone Number,League Pos,etc.

where the first part of the name is correct but there is potentially more junk at the end of the name. Is there a way to match part of the string and, if part of the string matches, call it a dup? Something like:
$seen{$key} =~ m/$key+,/ ? print DUP : print NEW;
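One possible way to get that partial matching, as a minimal sketch rather than a definitive fix: normalize each key, then call a line a dup if its key is identical to an earlier key or shares a long enough leading run of characters with one. The names normalize_key, common_prefix_length, split_duplicates and $MIN_PREFIX are made up for this example, and the threshold of 8 is an assumption tuned to the sample data above ("Group Two" and "Group Three" share 7 leading characters, so anything lower would merge them).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical threshold: normalized keys sharing at least this many
# leading characters are treated as the same group.
my $MIN_PREFIX = 8;

# Cleanup in the spirit of the script above, applied globally,
# plus whitespace folding and lowercasing.
sub normalize_key {
    my ($key) = @_;
    $key =~ s/[-&_+']/ /g;              # punctuation to a space
    $key =~ s/([a-z])([A-Z])/$1 $2/g;   # GroupOne to Group One
    $key =~ s/\s+/ /g;                  # collapse runs of whitespace
    $key =~ s/^\s+|\s+$//g;             # trim both ends
    return lc $key;
}

# Number of leading characters two strings have in common.
sub common_prefix_length {
    my ($s, $t) = @_;
    my $max = length($s) < length($t) ? length($s) : length($t);
    my $n   = 0;
    $n++ while $n < $max && substr($s, $n, 1) eq substr($t, $n, 1);
    return $n;
}

# Split lines into (kept, dups). A line is a dup when its key equals
# an earlier kept key or shares at least $min_prefix leading characters.
sub split_duplicates {
    my ($min_prefix, @lines) = @_;
    my (@seen, @kept, @dups);
    for my $line (@lines) {
        my ($key) = split /,/, $line, 2;
        my $norm = normalize_key($key);
        my $is_dup = grep {
            $norm eq $_ || common_prefix_length($norm, $_) >= $min_prefix
        } @seen;
        if ($is_dup) {
            push @dups, $line;
        }
        else {
            push @seen, $norm;
            push @kept, $line;
        }
    }
    return (\@kept, \@dups);
}

my @lines = (
    'Group Onne,Captain,Phone Number,League Pos,etc.',
    'Group Oneffdfadsf,Captain,Phone Number,League Pos,etc.',
    'GroupOneeroneouskunk,Captain,Phone Number,League Pos,etc.',
    'Group Two,Captain,Phone Number,League Pos,etc.',
    'Group Three,Captain,Phone Number,League Pos,etc.',
);

my ($kept, $dups) = split_duplicates($MIN_PREFIX, @lines);
print "KEEP: $_\n" for @$kept;
print "DUP:  $_\n" for @$dups;
```

The first spelling seen wins, so with this data the misspelled "Group Onne" is the one that is kept. A fixed prefix length is fragile on real data; an edit-distance comparison would be the more robust swap-in, but that needs a non-core module, so it is left out here.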
Re^2: Remove duplicate entries
by kcott (Archbishop) on Nov 17, 2010 at 07:52 UTC
by PyrexKidd (Monk) on Nov 17, 2010 at 16:06 UTC
by kcott (Archbishop) on Nov 17, 2010 at 17:11 UTC