As poj suggested, these days importing the data into a relational database such as Microsoft Access or SQLite might give you an easy SQL solution to your problem. If you would rather treat your files as a text database, the problem fits an old text-database processing tool: Awk. Perl has an awk-like autosplit mode, the '-a' switch, which splits each input line into @F and suggests:

#!/usr/bin/perl -an

use strict;
use warnings;

our %recs;
my $k = join "\t", @F[0 .. 3];
if ($recs{$k}) {
    $recs{$k}->{key_count}++;
    $recs{$k}->{rec}->[$_] += $F[$_] for 4, 5;
}
else {
    # careful to copy with [ @F ] not \@F here
    $recs{$k} = {key_count => 1, rec => [ @F ]};
}

END {
    # turn the summed column 4 into an average; column 5 stays a sum
    $recs{$_}->{rec}->[4] /= $recs{$_}->{key_count} foreach keys %recs;
    print join("\t", @{$recs{$_}->{rec}}), "\n" foreach sort {
        $recs{$a}->{rec}->[ 0 ] cmp $recs{$b}->{rec}->[ 0 ] ||
        $recs{$a}->{rec}->[ 1 ] <=> $recs{$b}->{rec}->[ 1 ] ||
        $recs{$a}->{rec}->[ 2 ] cmp $recs{$b}->{rec}->[ 2 ] ||
        $recs{$a}->{rec}->[ 3 ] cmp $recs{$b}->{rec}->[ 3 ]
    } keys %recs;
}
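
If you would rather not rely on the command-line switches, the same grouping can be sketched as an ordinary subroutine. This is only an illustration of the idea in the script above: merge_lines is a name I made up, the sample rows are invented (they are not K_Edw's data), and I am assuming six whitespace-separated columns where the first four form the key, the fifth is averaged and the sixth is summed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: merge lines whose first four fields match,
# averaging field 5 and summing field 6 (same rule as the script above).
sub merge_lines {
    my @lines = @_;
    my %recs;
    for my $line (@lines) {
        my @F = split ' ', $line;          # the split that -a does for you
        my $k = join "\t", @F[0 .. 3];
        if ($recs{$k}) {
            $recs{$k}{key_count}++;
            $recs{$k}{rec}[$_] += $F[$_] for 4, 5;
        }
        else {
            # copy the fields so later lines cannot clobber them
            $recs{$k} = { key_count => 1, rec => [@F] };
        }
    }
    # turn the summed column 4 into an average
    $_->{rec}[4] /= $_->{key_count} for values %recs;
    return map { join("\t", @{ $_->{rec} }) }
           sort {
               $a->{rec}[0] cmp $b->{rec}[0] ||
               $a->{rec}[1] <=> $b->{rec}[1] ||
               $a->{rec}[2] cmp $b->{rec}[2] ||
               $a->{rec}[3] cmp $b->{rec}[3]
           } values %recs;
}

# Made-up sample rows, not the original poster's data
print "$_\n" for merge_lines(
    "chr1 100 A T 2 10",
    "chr1 100 A T 4 20",
    "chr2 50 G C 1 5",
);
```

With those three invented rows, the two chr1 lines collapse into one (fifth column averaged to 3, sixth summed to 30) and the chr2 line passes through unchanged. Keeping the logic in a sub also makes it easy to feed it lines from a file handle instead of a hard-coded list.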

Ron

In reply to Re: Merging partially duplicate lines by mr_ron
in thread Merging partially duplicate lines by K_Edw

Pathologically Eclectic Rubbish Lister
	PerlMonks