in reply to comparing numbers from previous lines in a file?
Perhaps a better approach would be to find the "mode" of all the values, i.e. the most common value, and then compare each value with that.
Update: You can ignore all of this; "mode" is not the way to go.
With thanks to Mastering Algorithms with Perl - Chapter 15: Statistics - for the mode() and crazy odd_median() functions:

    use strict;
    use warnings;

    my @lines;       # fourth column of each record
    my @full_lines;  # the complete record, for reporting

    while (my $line = <DATA>) {
        chomp $line;
        next unless $line;
        my @cols = split(" ", $line);
        push(@full_lines, $line);
        push(@lines, $cols[3]);
    }

    my $mode = mode(\@lines);
    print qq|Mode is $mode\n|;

    # Flag every record whose value is more than 0.5 away from the mode
    for my $i (0 .. $#lines) {
        if ( abs($lines[$i] - $mode) > 0.5 ) {
            print qq|$full_lines[$i] is deviant\n|;
        }
    }

    # Most common value; ties are passed on to odd_median()
    sub mode {
        my $array = shift;
        my (%count, @result);
        foreach (@$array) { $count{$_}++ }
        foreach (sort { $count{$b} <=> $count{$a} } keys %count) {
            last if @result && $count{$_} != $count{$result[0]};
            push(@result, $_);
        }
        return odd_median(\@result);
    }

    sub odd_median {
        my $array = shift;
        # Numeric sort; a plain sort would compare the values as strings
        my @ary = sort { $a <=> $b } @$array;
        return $ary[(@ary - (0,0,1,0)[@ary & 3]) / 2];
    }

    __DATA__
    A15 26.62 765 27.30 4.3
    A11 26.63 763 27.28 4.2
    A12 26.68 767 27.29 4.3
    A16 26.64 768 27.30 4.2
    A11 26.62 761 27.31 4.1
    A15 26.62 765 27.30 4.3
    A15 26.63 763 27.28 4.2
    A16 26.68 767 2.29 4.3
    A17 26.64 768 27.30 4.2
    A18 26.62 761 27.31 4.1
Update: My suggestion also falls foul of Grandfather's altered data. However this could perhaps be solved by breaking up the lines into batches of four (one step at a time), and testing each batch.
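For what it's worth, the batch idea could look roughly like the sketch below. It is only an illustration under my own assumptions: window_deviants() is a made-up helper, the window size of four and the 0.5 tolerance are carried over from the description and the code above, and mode() here is a simplified version that resolves ties by taking the middle of the tied values rather than calling odd_median(). The sample values are the fourth column of the data above.

```perl
use strict;
use warnings;

# Most common value; ties are resolved by taking the middle of the tied values
# (a simplification, not the book's odd_median()).
sub mode {
    my ($values) = @_;
    my %count;
    $count{$_}++ for @$values;
    my ($max) = sort { $b <=> $a } values %count;
    my @top = sort { $a <=> $b } grep { $count{$_} == $max } keys %count;
    return $top[ $#top / 2 ];
}

# Hypothetical helper: slide a window of $size values along the list one
# step at a time and collect the indices of values that are more than
# $tolerance away from the mode of any window that contains them.
sub window_deviants {
    my ($values, $size, $tolerance) = @_;
    my %deviant;
    for my $start (0 .. @$values - $size) {
        my $end  = $start + $size - 1;
        my $mode = mode([ @{$values}[$start .. $end] ]);
        for my $i ($start .. $end) {
            $deviant{$i} = 1 if abs($values->[$i] - $mode) > $tolerance;
        }
    }
    return sort { $a <=> $b } keys %deviant;
}

my @values  = qw(27.30 27.28 27.29 27.30 27.31 27.30 27.28 2.29 27.30 27.31);
my @flagged = window_deviants(\@values, 4, 0.5);
print "Deviant indices: @flagged\n";
```

Because a value is flagged if it deviates in any window that covers it, a window whose mode happens to be a bad value would flag the good values around it, so treat this as a starting point rather than a fix.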
Update 2: Actually no, breaking them into batches still doesn't work on that, so now I'm going to question if Grandfather's altered data is actually a possible scenario for the OP.