in reply to Comparing elements of array...
As for your question: it's likely that using a hash of some sort will be useful, but it's not clear what you really need. In your sample, you have two instances of "096000BP" as the "order_id", with time_mins values of "85" and "32". Do you want an output like "096000BP, 53"? What if that order_id value occurs a third (fourth...) time -- what sort of differences should be reported then?
Depending on what you really want, a simple hash might do, or you might need a hash of arrays (for each order_id value, store the list of time_mins values seen for it). Something like this will load such a structure, and check for cases where a given hash key occurred multiple times in the data:
    my %orders;
    while (<DATA>) {
        my ( $id, $mins ) = split /[\s,]+/;
        push @{$orders{$id}}, $mins if ( $id );
    }
    for my $id ( keys %orders ) {
        if ( @{$orders{$id}} > 1 ) {
            my @times = @{$orders{$id}};
            # do something with @times
        }
    }
    __DATA__
    096000BN, 32
    096000BP, 85
    096000BG, 132
    096000Be, 85
    096000BP, 32

I don't know what you mean by "extremely large". If it's a matter of hundreds of millions of data rows, you might need some sort of "dbm" file as your hash (see AnyDBM_File), or maybe it's time to use a real database.
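As for the "do something with @times" step, here is one possible way to fill it in. This is only a sketch, and it assumes that what you want for a repeated order_id is the spread between its largest and smallest time_mins value (which would give "096000BP, 53" for your sample). The time_spread helper is a name I made up for illustration, and the sample rows are inlined instead of being read from DATA:

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper (not part of the snippet in this post): given all
# time_mins values seen for one order_id, return largest minus smallest.
sub time_spread {
    my @times = @_;
    return max(@times) - min(@times);
}

# Build the same kind of hash of arrays, from inline sample rows.
my %orders;
for my $row ( '096000BN, 32', '096000BP, 85', '096000BG, 132',
              '096000Be, 85', '096000BP, 32' ) {
    my ( $id, $mins ) = split /[\s,]+/, $row;
    push @{ $orders{$id} }, $mins if $id;
}

# Report only the order_id values that occurred more than once.
for my $id ( sort keys %orders ) {
    my @times = @{ $orders{$id} };
    next unless @times > 1;
    print "$id, ", time_spread(@times), "\n";
}
```

With three or more occurrences of the same order_id this still reports a single number (max minus min). If you need pairwise or consecutive differences instead, only the body of time_spread would have to change.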