Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

comparing multiple lines in an array.

by optiontrader (Novice)
on Oct 15, 2002 at 15:40 UTC ( [id://205403]=perlquestion: print w/replies, xml ) Need Help??

optiontrader has asked for the wisdom of the Perl Monks concerning the following question:

newbie asks: i have a multi dimensional array:
@AoA = ( [ "AAA", "BUY", "98", "0" ], [ "BBB", "SEL", "27", "1" ], [ "FFF", "BUY", "43", "4" ], [ "AAA", "SEL", "98", "0" ], [ "CCC", "SEL", "98", "0" ], );
Question: I'd like to have code that compares the first line against every other line in the the set, then moves to the second line and compares it to every other line in the set, and so on. My goal is to be able to find rows where element 0,2, and 3 in one array match element 0,2, and 3 in another array but element 1 is different in the 2 arrays. thanks much

Replies are listed 'Best First'.
Re: comparing multiple lines in an array.
by Abigail-II (Bishop) on Oct 15, 2002 at 16:00 UTC
    Comparing all elements with all elements would be quadratic. Here's a solution that runs in time O (n log n + k) where n is the number of rows and k the number of pairs for which the fields 0, 2 and 3 are equal.

    This could still be inefficient, k could be n^2/2 but still nothing is reported because we only need to report the pairs for which field 1 differs. OTOH, if it's known that if the fields 0, 2, and 3 are equal that then field 1 is different, the algorithm used is optimal:

    use strict; use warnings; my @AoA; while (<DATA>) { push @AoA => [split]; } my @sort = sort {$a -> [0] cmp $b -> [0] || $a -> [2] <=> $b -> [2] || $a -> [3] <=> $b -> [3]} @AoA; foreach my $i (0 .. $#sort - 1) { foreach my $j ($i + 1 .. $#sort) { last unless $sort [$i] -> [0] eq $sort [$j] -> [0] && $sort [$i] -> [2] == $sort [$j] -> [2] && $sort [$i] -> [3] == $sort [$j] -> [3]; next if $sort [$i] -> [1] eq $sort [$j] -> [1]; print "[@{$sort[$i]}] and [@{$sort[$j]}]\n"; } } __DATA__ AAA BUY 98 0 BBB SEL 27 1 FFF BUY 43 4 AAA SEL 98 0 CCC SEL 98 0

    Abigail

Re: comparing multiple lines in an array.
by broquaint (Abbot) on Oct 15, 2002 at 16:03 UTC
    The spiffy Algorithm::Diff module should help you out here (at least with the differences)
    my @AoA = ( [ "AAA", "BUY", "98", "0" ], [ "BBB", "SEL", "27", "1" ], [ "FFF", "BUY", "43", "4" ], [ "AAA", "SEL", "98", "0" ], [ "CCC", "SEL", "98", "0" ], ); use strict; use warnings; use Algorithm::Diff qw(diff); use Data::Dumper; for (0 .. $#AoA - 1) { print Dumper( diff(@AoA[$_, $_ + 1]) ); }

    HTH

    _________
    broquaint

Re: comparing multiple lines in an array.
by cluka (Sexton) on Oct 15, 2002 at 21:22 UTC
    The above solutions seem a bit complex for what you're doing. How about...

    my %dups; $dups{ join('~',@$_[0,2,3]) }{$_->[1]}++ for @AoA;
    The second line does all the work, storing duplicated keys into the hash. If any of the 0th, 2nd or 3rd array elements contain '~' you'll need to change the delimiter in the join method to avoid keying errors in %dups, but I'm guessing from your data that you're sticking to number and buy/sell directives.

    Outputting the data is fairly trivial, and by the looks of things this code only has to pass through the array a single time. I'll leave the implementation details for homework...
Re: comparing multiple lines in an array.
by rdfield (Priest) on Oct 15, 2002 at 15:47 UTC
    That seems like a pretty detailed description. So, all we need to see now is your attempt at implementation. Then we can help you.

    rdfield

Re: comparing multiple lines in an array.
by George_Sherston (Vicar) on Oct 15, 2002 at 15:50 UTC
    There's some good stuff about pulling out duplicate elements in an array here - you could probably adapt any of these solutions for your case, perhaps by using a hash of hashes of hashes.

    § George Sherston
Re: comparing multiple lines in an array.
by hotshot (Prior) on Oct 15, 2002 at 16:04 UTC
    I think you can start from this:
    for (my $i = 0; $i < @AoA; $i ++) { for (my $j = $i+1; $j < @AoA; $j++) { # start from the following e +ntry my $tmp = join(',', $AoA[$j]); # join each inner array to a stri +ng if ($AoA[$j] =~ /$AoA[$i][0]\,(\w+)\,$AoA[$i][2]\,$AoA[$i][3]/) { # 0, 2 and 3 are equal, check 1 if (AoA[$i][1] eq $1) { # found what we needed, do something } } } }


    Hotshot
Re: comparing multiple lines in an array.
by bprew (Monk) on Oct 15, 2002 at 20:15 UTC
    I would probaby use a hash of lists.
    Note: Please take my code with a grain of salt... I do :)
    #!/usr/bin/perl -w use strict; my %arr; my @AoA = ( [ "AAA", "BUY", "98", "0" ], [ "BBB", "SEL", "27", "1" ], [ "FFF", "BUY", "43", "4" ], [ "AAA", "SEL", "98", "0" ], [ "CCC", "SEL", "98", "0" ], ); #Walk through the array, for(my $i = 0; $i < scalar(@AoA); $i++) { # hashkey on items 0, 2 and 3. my $key = $AoA[$i][0] . $AoA[$i][2] . $AoA[$i][3]; # store item 1 in a list if(defined($arr{$key})) { my $lines = $arr{$key}; push(@$lines, $AoA[$i][1]); $arr{$key} = $lines; } else { my @lists = ($AoA[$i][1]); $arr{$key} = \@lists; } } #print out sames my @keys = keys %arr; foreach my $key (@keys) { my $options = $arr{$key}; print join(' ', @$options) . " with this info: $key \n"; }

    I'm sure there are some optimizations that can be made, so if anybody sees anything, I would love to hear feedback. Thanks
    --
    Ben

Re: comparing multiple lines in an array.
by Aristotle (Chancellor) on Oct 16, 2002 at 13:58 UTC
    Despite Abigail-II's notes, it is possible to do this in linear time. Hashes are the answer; here, you need a nested one. The following will consume some memory in exchange for runing in O(n).
    #!/usr/bin/perl -w use strict; use Data::Dumper; my @AoA = ( [ qw(AAA BUY 98 0) ], [ qw(BBB SEL 27 1) ], [ qw(FFF BUY 43 4) ], [ qw(AAA SEL 98 0) ], [ qw(CCC SEL 98 0) ], ); my %directive_for; push @{ $directive_for{$_->[0]}{$_->[2]}{$_->[3]} }, $_->[1] for @AoA; my @grouped; for my $i (keys %directive_for) { for my $j (keys %{$directive_for{$i}}) { for my $k (keys %{$directive_for{$i}{$j}}) { push @grouped, [ $i, $directive_for{$i}{$j}{$k}, $j, $k ]; } } } print Dumper(\@grouped);
    The result is an array of unique arrays where element 1 is itself an array with all the values that element had in the various non-unique copies.

    Makeshifts last the longest.

      Much appreciation for All of your Help. I have a solution that works now and several others to learn a lot from. Thanks!!!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://205403]
Approved by xtype
Front-paged by krisahoch
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others sharing their wisdom with the Monastery: (5)
As of 2024-03-29 15:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found