in reply to perl to remove duplacate based on columnb,d &
You could try something like this:
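#!/usr/bin/perl
use warnings;
use strict;

my %uniq;    # key: columns 2-4, value: the line kept for that key

while (<DATA>) {
    next unless /^\s*\d/;    # skip lines that do not start with a number
    chomp;
    my $line = $_;
    my @f    = split /,/, $line;

    # Join with a comma so that "ab","c" and "a","bc" give different keys.
    my $key = join ',', @f[1 .. 3];

    if ( exists $uniq{$key} ) {
        # Keep whichever line has the string-wise smaller fifth column.
        my $stored = ( split /,/, $uniq{$key} )[4];
        my $new    = $f[4];
        if ( $new lt $stored ) {
            $uniq{$key} = $line;
        }
    }
    else {
        $uniq{$key} = $line;
    }
}

print $_ . "\n" for ( values %uniq );

__DATA__
1,ken,james,smith,s
11,ken,james,smith,f
0,ken,james,smith,s
5,ken,arthur,wesson,g
7,ken,arthur,wesson,a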
For the provided DATA section, it produces the following output (the two lines may appear in either order, since Perl does not guarantee hash order):
11,ken,james,smith,f
7,ken,arthur,wesson,a
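Which should be the behavior you want: for each combination of columns 2 to 4, only the line whose last column sorts first (string comparison with lt) is kept.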
Consider looking at dedicated CSV modules, like Text::CSV_XS.
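As a rough sketch of what that could look like, here is the same idea with Text::CSV_XS doing the parsing, so quoted fields and embedded commas are handled for you. The helper name dedup_rows and the in-memory sample data are my own assumptions for illustration. Unlike the script above, it keeps the groups in first-seen order and does not skip lines that fail to start with a digit.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical helper (not from the post above): reads CSV rows from a
# filehandle and returns one row per distinct (col 2, col 3, col 4) key,
# keeping the row whose fifth column sorts first as a string.
sub dedup_rows {
    my ($fh) = @_;
    my $csv = Text::CSV_XS->new( { binary => 1, auto_diag => 1 } );
    my ( %best, @order );
    while ( my $row = $csv->getline($fh) ) {
        next unless @$row >= 5;    # ignore blank or short rows
        my $key = join "\0", @{$row}[ 1 .. 3 ];
        if ( !exists $best{$key} ) {
            push @order, $key;     # remember first-seen order
            $best{$key} = $row;
        }
        elsif ( $row->[4] lt $best{$key}[4] ) {
            $best{$key} = $row;
        }
    }
    return map { $best{$_} } @order;
}

# Sample input: the same rows as the DATA section above.
my $data = <<'END';
1,ken,james,smith,s
11,ken,james,smith,f
0,ken,james,smith,s
5,ken,arthur,wesson,g
7,ken,arthur,wesson,a
END

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my $out = Text::CSV_XS->new( { binary => 1, auto_diag => 1 } );
for my $row ( dedup_rows($fh) ) {
    $out->combine(@$row) or die "combine failed";
    print $out->string, "\n";
}
```

Because the keys are tracked in an array as they are first seen, the output order here is stable, which makes the result easier to diff against the input.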
It's already 2015 in my time zone, and so I wish you all the best in 2015. May your code produce the output you desire, and your input be as you think it is.
- Luke