Can it safely be assumed that the data file is already in time order? If so, you can process it second by second: accumulate lines until the time, truncated to whole seconds, changes, then search the accumulated lines for the one with the largest amplitude. You do not say what you want to do when more than one line has the maximum amplitude.

use strict;
use warnings;

# Skip headings line(s).
my $discard = <DATA> for 1 .. 1;

my $currentTimeStr = q{};
my @currentLines   = ();

while( <DATA> )
{
    # Field 3 is the timestamp; strip the fractional seconds.
    my $timeStr = ( split m{,} )[ 3 ];
    $timeStr =~ s{\..*}{};
    if( $timeStr ne $currentTimeStr )
    {
        # A new second has started: report on the previous group.
        processLines( @currentLines ) if @currentLines;
        $currentTimeStr = $timeStr;
        @currentLines = ( $_ );
    }
    else
    {
        push @currentLines, $_;
    }
}
# Report on the final group.
processLines( @currentLines );

sub processLines
{
    # Sort the lines by amplitude (last field), largest first.
    my @sortedLines =
        map  { $_->[ 0 ] }
        sort { $b->[ 1 ] <=> $a->[ 1 ] }
        map  { [ $_, ( split m{,|\n} )[ -1 ] ] }
        @_;
    print $sortedLines[ 0 ];
}

__END__
X,Y,Z,Time,Amplitude
2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:08.151,1
2550,531,66,10-12-2007 07:03:09.069,1
2549,529,62,10-12-2007 07:03:09.151,2

The output:

2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:09.151,2
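As an aside, a full sort is not strictly needed just to find a maximum. Below is a minimal sketch of a single-pass variant that keeps a running "best" line for each second. It assumes the same comma-separated layout (time in the fourth field, amplitude last), well-formed lines, and input already in time order. The name max_per_second and the tie rule (keep the earliest line of equal amplitude) are my own assumptions, since the tie case was left unspecified.

```perl
use strict;
use warnings;

# For each run of lines sharing the same whole-second timestamp, return
# the line with the largest amplitude. Ties keep the earliest line
# (an assumed rule; the original question does not give one).
sub max_per_second
{
    my @lines = @_;
    my @kept;
    my ( $currentSec, $bestLine, $bestAmp ) = ( q{} );

    for my $line ( @lines )
    {
        chomp( my $copy = $line );
        my @fields = split m{,}, $copy;
        ( my $sec = $fields[ 3 ] ) =~ s{\..*}{};    # drop fractional seconds
        my $amp = $fields[ -1 ];

        if( $sec ne $currentSec )
        {
            # New second: emit the winner of the previous group, if any.
            push @kept, $bestLine if defined $bestLine;
            ( $currentSec, $bestLine, $bestAmp ) = ( $sec, $line, $amp );
        }
        elsif( $amp > $bestAmp )
        {
            # Strictly larger only, so the earliest line wins a tie.
            ( $bestLine, $bestAmp ) = ( $line, $amp );
        }
    }
    push @kept, $bestLine if defined $bestLine;
    return @kept;
}

# Sample lines taken from the data above.
my @sample = (
    "2550,531,66,10-12-2007 07:03:08.069,2\n",
    "2549,529,62,10-12-2007 07:03:08.151,1\n",
    "2550,531,66,10-12-2007 07:03:09.069,1\n",
    "2549,529,62,10-12-2007 07:03:09.151,2\n",
);
print max_per_second( @sample );
```

To prefer the latest line on a tie instead, change the comparison to `>=`.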

I hope this is useful.

Cheers,

JohnGG

Re^4: Compare fields in a file
by Not_a_Number (Prior) on Feb 10, 2009 at 19:19 UTC

    If I add a few events to the input, this breaks. Try it with these data:

    2550,531,66,10-12-2007 07:03:08.069,2
    2549,529,62,10-12-2007 07:03:08.151,1
    2550,531,66,10-12-2007 07:03:09.069,1
    2549,529,62,10-12-2007 07:03:09.151,2
    2550,531,66,10-12-2007 07:03:10.001,6
    2550,531,66,10-12-2007 07:03:11.099,7

    The output:

    2550,531,66,10-12-2007 07:03:08.069,2
    2549,529,62,10-12-2007 07:03:09.151,2
    2550,531,66,10-12-2007 07:03:10.001,6
    2550,531,66,10-12-2007 07:03:11.099,7

    Ignore. I misread the specs.

      Perhaps I've misunderstood the requirement but that output looks as I would have expected.

      • Two events at time 07:03:08, event with amplitude 2 chosen.

      • Two events at time 07:03:09, event with amplitude 2 chosen.

      • A single event at time 07:03:10, the only event (with amplitude 6) chosen.

      • A single event at time 07:03:11, the only event (with amplitude 7) chosen.

      So far as I can tell, that satisfies the requirement "I'd like to keep only the largest magnitude within each second" to the letter.

      Perhaps you could explain further in which way it is "broken."

      Cheers,

      JohnGG

        Perhaps I've misunderstood the requirement (...)

        Oops, no, it seems that it's me who completely misread the OP's requirements.

        Sorry :(