angelfish has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

for the life of me I can't figure this out. Say I have the following @array:

order_id, time_mins 096000BN, 32 096000BP, 85 096000BG, 132 096000Be, 85 096000BP, 32
& etc.......

I would like to match on the array value at position 0, i.e. $array[$i][0], and, if a match is found, find the difference between the values in the corresponding position 1. So basically, I want to find matching values in the order_id column and, where there is a match, calculate the difference in time_mins.

Should I use a hash for this? The number of lines in the array is extremely large. Thanks for your help!!!

Re: Comparing elements of array...
by b4swine (Pilgrim) on Oct 28, 2007 at 00:28 UTC
    Hello,

    For the life of me, I can't figure out what you meant to write. I presume that is because you formatted what you wrote nicely in your window, but did not include the tags that make paragraphs and separate code from text. So all of it became one long line.

    You can try and re-edit your question using the tags to make your question more readable (See Writeup Formatting Tips). Trying to guess what you may have intended, I believe that perhaps there were two columns, something like:

    order_id, time_mins
    096000BN, 32
    096000BP, 85
    096000BG, 132
    096000Be, 85
    096000BP, 32
    & etc.... ...
    I would like to perform a match on array value at position 0 i.e.: $array[$i][0] and if the match found find a difference between values in corresponding position 1.

    I guess that you mean: where there are matches in column 1, you want to see the time difference. Yes, hashes would be the way to go here. For each hash key, keep an array of times. For example, if we had this data in a file, we could read the file and do:

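    my %hash;
    while (<>) {
        chomp;                                  # drop the trailing newline
        my ($id, $time) = split /\s*,\s*/;      # tolerate spaces around the comma
        push @{$hash{$id}}, $time;              # one array of times per order_id
    }
    for my $id (keys %hash) {                   # $id must be the loop variable
        print "@{$hash{$id}}\n";
    }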
    This will get you started. You can then compute the differences as you like. If you want to know the positions of the matches, you need to do some extra work.
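    One possible way to take the differences, as a minimal sketch: it assumes the data is already in an array of [order_id, time_mins] pairs, and that "difference" means each time minus the previous time seen for the same order_id. The helper names group_times and consecutive_diffs are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: group time_mins by order_id, keeping input order.
sub group_times {
    my @rows = @_;                       # each row is [ order_id, time_mins ]
    my %times_for;
    push @{ $times_for{ $_->[0] } }, $_->[1] for @rows;
    return \%times_for;
}

# Hypothetical helper: differences between consecutive times for one id.
# Assumption: "difference" means current time minus the previous one.
sub consecutive_diffs {
    my @times = @_;
    return map { $times[$_] - $times[ $_ - 1 ] } 1 .. $#times;
}

my @array = (
    [ '096000BN', 32 ],
    [ '096000BP', 85 ],
    [ '096000BG', 132 ],
    [ '096000Be', 85 ],
    [ '096000BP', 32 ],
);

my $times_for = group_times(@array);
for my $id ( sort keys %$times_for ) {
    my @diffs = consecutive_diffs( @{ $times_for->{$id} } );
    print "$id: @diffs\n" if @diffs;     # only ids that occurred more than once
}
```

    With the sample rows, only 096000BP occurs twice, so it is the only id printed. An id that occurs three times would get two differences.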
Re: Comparing elements of array...
by graff (Chancellor) on Oct 28, 2007 at 04:11 UTC
    Welcome to the Monastery. You can update your post to make it more legible: put "<code>" on a line above "order_id, time_mins", and then put "</code>" on the line after the "etc...".

    As for your question: it's likely that using a hash of some sort will be useful, but it's not clear what you really need. In your sample, you have two instances of "096000BP" as the "order_id", with time_mins values of "85" and "32". Do you want an output like "096000BP, 53"? What if that order_id value occurs a third (fourth...) time -- what sort of differences should be reported then?

    Depending on what you really want, a simple hash might do, or you might need a hash of arrays (for each order_id value, store the list of distinct time_mins values). Something like this will load such a structure, and check for cases where a given hash key occurred multiple times in the data:

    my %orders;
    while (<DATA>) {
        my ( $id, $mins ) = split /[\s,]+/;
        push @{$orders{$id}}, $mins if ( $id );
    }
    for my $id ( keys %orders ) {
        if ( @{$orders{$id}} > 1 ) {
            my @times = @{$orders{$id}};
            # do something with @times
        }
    }
    __DATA__
    096000BN, 32
    096000BP, 85
    096000BG, 132
    096000Be, 85
    096000BP, 32
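    As one guess at what "do something with @times" could be: the sketch below reports the spread between the largest and smallest time for each repeated order_id, which gives a single number no matter how many times an id occurs. This is only one reading of the question, and time_spread is a made-up helper name. min and max come from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: spread between the largest and smallest time.
sub time_spread {
    my @times = @_;
    return max(@times) - min(@times);
}

# Same shape as a hash of arrays keyed by order_id.
my %orders = (
    '096000BN' => [32],
    '096000BP' => [ 85, 32 ],
);

for my $id ( sort keys %orders ) {
    my @times = @{ $orders{$id} };
    next unless @times > 1;              # skip ids seen only once
    print "$id, ", time_spread(@times), "\n";
}
```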
    I don't know what you mean by "extremely large". If it's a matter of hundreds of millions of data rows, you might need some sort of "dbm" file as your hash (see AnyDBM_File), or maybe it's time to use a real database.
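    A rough sketch of the "dbm file as your hash" idea, under several assumptions: it ties the hash to SDBM_File (a DBM module bundled with Perl) rather than going through AnyDBM_File, it writes to a throwaway temporary directory, and it stores each id's times as one comma-joined string because DBM values are plain strings, not array references. add_time is a made-up helper name. SDBM limits the combined size of a key and its value, so ids with very long lists would need a different backend or a real database.

```perl
use strict;
use warnings;
use Fcntl;                                # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical helper: append one time to the comma-joined list for an id.
sub add_time {
    my ( $orders, $id, $mins ) = @_;
    $orders->{$id} = defined $orders->{$id} ? "$orders->{$id},$mins" : $mins;
    return;
}

my $dir = tempdir( CLEANUP => 1 );        # throwaway location for this sketch
tie my %orders, 'SDBM_File', "$dir/orders", O_RDWR | O_CREAT, 0666
    or die "Cannot tie DBM file: $!";

add_time( \%orders, @$_ )
    for ( [ '096000BP', 85 ], [ '096000BN', 32 ], [ '096000BP', 32 ] );

# The hash now lives on disk; walk it and report ids seen more than once.
while ( my ( $id, $list ) = each %orders ) {
    my @times = split /,/, $list;
    print "$id: @times\n" if @times > 1;
}
untie %orders;
```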
Re: Comparing elements of array...
by TOD (Friar) on Oct 28, 2007 at 00:21 UTC
    what if you try formatting you post?
    --------------------------------
    masses are the opiate for religion.
      (This'll cost me karma, but I can't resist)

      Why don't you try editing you(r) post for grammar? :)

      The next 2 top-level responses are much better examples of new user guidance - they point out that the post needs editing, as you did, but they also point out HOW to edit the post. PM markup isn't completely intuitive...


      Mike