i want a way of counting matches. in this case, it is per column of a csv file (but i've run across this doing different things). obviously, this should (and will) be in a db, but i want something strictly perl for the time being (or some xs theory might work).
so, is there a way to sort data as i match? the obvious answer is something like:
$data->{ criteria }->{ $record } += 1;
and then:
@sorted = sort { $data->{ criteria }->{ $a } <=> $data->{ criteria }->{ $b } } keys %{ $data->{ criteria } };
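to make the "obvious answer" concrete, here is a minimal runnable sketch of counting per column and sorting once at the end. the sub names (`count_columns`, `sorted_by_count`) and the sample lines are made up for illustration, and it assumes a naive `split` on commas is good enough (no quoted fields). it sorts highest count first, which is a choice of mine rather than something fixed above.

```perl
use strict;
use warnings;

# count how often each value occurs in each column of some csv lines.
# returns an array ref holding one hash ref (value => count) per column.
# (hypothetical helper; plain split, so no quoted-field handling.)
sub count_columns {
    my @lines = @_;
    my @counts;
    for my $line (@lines) {
        chomp( my $copy = $line );
        my @cols = split /,/, $copy, -1;    # -1 keeps trailing empty fields
        $counts[$_]{ $cols[$_] }++ for 0 .. $#cols;
    }
    return \@counts;
}

# values of one column ordered by count, highest first; ties by value.
sub sorted_by_count {
    my ($col) = @_;
    return sort { $col->{$b} <=> $col->{$a} or $a cmp $b } keys %$col;
}

my $counts = count_columns( "a,x\n", "b,x\n", "a,y\n" );
for my $i ( 0 .. $#$counts ) {
    for my $value ( sorted_by_count( $counts->[$i] ) ) {
        print "column $i: $value seen $counts->[$i]{$value} times\n";
    }
}
```

the sort only runs once per column, over the distinct values rather than over every row, so it may be less wasteful than it first looks when the data has many repeats.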
... but, this seems wasteful. though, i'm almost out of ideas. i thought of using an array and splice to sort as i go but, of course this is slow as hell as it moves elements.
i had a glimmer of another idea, but i'm not sure how sound it is. let me try to elaborate on it (and if it is not understandable, there's a good chance it's not very good :) ). i create the same counter hash (of course). but, then i create a separate lookup table like:
$pointer = $data->{ criteria }->{ $record };
undef $lookup->{ ( $counter - 1 ) . '-' . $pointer } if( $counter > 1 );
$lookup->{ $counter . '-' . $pointer } = $pointer;
then, i could just do something like:
foreach my $key ( sort( keys( %{ $lookup } ) ) ) { print $data->{ $lookup->{ $key } }; }
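here is one way the lookup-table idea could be made to run, as a sketch under a few assumptions of mine: the key is "count-record", the count is zero-padded so a plain string sort orders it numerically (the width of 10 is arbitrary), and the stale key is removed with `delete` rather than `undef`, since `undef` on a hash element leaves the key in place and it would still show up in `keys`. the names `bump` and `key_for` are hypothetical.

```perl
use strict;
use warnings;

my ( %count, %lookup );

# zero-pad the count so a plain string sort orders the keys numerically.
sub key_for {
    my ( $n, $record ) = @_;
    return sprintf '%010d-%s', $n, $record;
}

# bump the count for $record and move its %lookup entry
# from the old "count-record" key to the new one.
sub bump {
    my ($record) = @_;
    my $old = $count{$record}++;    # count before this hit (0 the first time)
    delete $lookup{ key_for( $old, $record ) } if $old;
    $lookup{ key_for( $count{$record}, $record ) } = $record;
}

bump($_) for qw( a b a c a b );

# still a sort, but only over one key per distinct record
for my $key ( sort keys %lookup ) {
    my $record = $lookup{$key};
    print "$record: $count{$record}\n";
}
```

this keeps exactly one lookup key per distinct record, but the final `sort keys` is still there, so it mostly trades a numeric comparison block for a default string sort.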
but, half of me feels that i'm chasing my tail here as i'm still sorting (and i'm using two data structures, obfuscating a little and yielding a prettier sort). any thoughts on this?
UPDATE btw, i recently had an idea of how to do this sort in-place and wanted to run it by y'all. i haven't gone to debugging this, but it is sort of my attempt at a proof of concept (i think it shows decently what i'm looking to do) and wanted to know if this idea was worth anything?
my $data;
my $sorted;
while( <> ) {
    my @cols = split /,/, $_;
    for my $i ( 0 .. $#cols ) {
        # counter for the unique element
        $data->[ $i ]->{ $cols[ $i ] }->[ 0 ]++;
        # undefine the array element in sorted if a $data reference was previously defined
        undef $sorted->[ $i ]->{ \$data->[ $i ]->{ $cols[ $i ] }->[ 1 ] }
            if $data->[ $i ]->{ $cols[ $i ] }->[ 1 ];
        # elements in $sorted
        my $stack = $#{ $sorted->[ $i ] };
        # store the new reference to the $data record at the top of $sorted's stack
        $sorted->[ $i ]->[ $stack ] = \$data->[ $i ]->{ $cols[ $i ] };
        # reference to place in $sorted so that it may be undefined later if necessary
        $data->[ $i ]->{ $cols[ $i ] }->[ 1 ] = \$sorted->[ $i ]->[ $stack ];
    }
}
# then, you could just loop through sorted. bypassing:
#   sort { $data->[ $i ]->{ $a } <=> $data->[ $i ]->{ $b } }
#       keys %{ $data->[ $i ] }
# with something like
for my $i ( 0 .. $#{ $sorted } ) {
    foreach my $j ( 0 .. $#{ $sorted->[ $i ] } ) {
        print "column ". $i . ":" . $sorted->[ $i ]->[ $j ]->[ 1 ] . " had " . $sorted->[ $i ]->[ $j ]->[ 0 ] . " duplicates\n"
            if( $sorted->[ $i ]->[ $j ] );
    }
}
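for comparison, here is a different technique that gets at the same goal of keeping things ordered while counting: buckets indexed by count. this is not the reference-juggling code above, just a sketch of a related idea, for a single column. each value lives in the bucket for its current count and moves up one bucket per hit, so reading the buckets in order gives the values by count without sorting on the counts. the names `hit`, `by_count_desc`, `%count` and `@bucket` are made up for illustration.

```perl
use strict;
use warnings;

my %count;     # value => times seen
my @bucket;    # $bucket[$n] is a hash of the values seen exactly $n times

# record one more sighting of $value and move it up one bucket.
sub hit {
    my ($value) = @_;
    my $n = ++$count{$value};
    delete $bucket[ $n - 1 ]{$value} if $n > 1;
    $bucket[$n]{$value} = 1;
}

# walk the buckets from the highest count down.
# the inner sort only fixes the order of ties within one bucket.
sub by_count_desc {
    my @out;
    for my $n ( reverse 1 .. $#bucket ) {
        next unless $bucket[$n];
        push @out, map { [ $_, $n ] } sort keys %{ $bucket[$n] };
    }
    return @out;
}

hit($_) for qw( x y x z x y );
print "$_->[0] seen $_->[1] times\n" for by_count_desc();
```

each hit costs one hash delete and one hash store. the price is that `@bucket` grows as long as the highest count, so it suits data where no single value dominates by a huge margin.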