in reply to Searching parallel arrays.

Why not do just what you said?

Sort them into a single array and retain knowledge of where they came from.

i.e. use a hash (or even an array) to map number -> [ $ref_to_source_array, $index_in_source_array ].

Then you can look for a sequence in the sorted array and, for each number in the sequence, look up where it came from.

No code, but it's straightforward I think.

Update: If the memory consumption is excessive, you could encode an identifier for the source array and the index in a single scalar, e.g. "a1:1", "a3:4".

You should then be using about 4 times your original memory, if that's too much you could destroy your original arrays and recreate them afterwards from the lookup info :-)

Replies are listed 'Best First'.
Re^2: Searching parallel arrays.
by jbert (Priest) on Dec 08, 2006 at 11:52 UTC
    And some (mediocre :-) code:
    #!/usr/bin/perl -w use strict; my @source_data = ( [ 100, 204, 312 ], [ 102, 313, 409 ], [ 205, 206, 315 ], [ 207, 210, 314 ], ); my @sorted_numbers = sort map { @{$_} } @source_data; # Using hash for the lookup. If your N numbers are fairly dense over # 0..N then use an array for reduced mem consumption my %from_whence; foreach my $source_array (@source_data) { my $index = 0; foreach my $number (@$source_array) { $from_whence{$number} = [$source_array, $index]; ++$index; } }; # Look for 4 contig values my @recent; foreach my $number (@sorted_numbers) { push @recent, [ $number, $from_whence{$number} ]; next unless scalar @recent > 4; shift @recent; check_and_handle_contig(\@recent); } exit 0; sub check_and_handle_contig { my $numbers_info = shift; my $lastnum = undef; foreach my $numinfo (@$numbers_info) { my $num = $numinfo->[0]; if ($lastnum) { return unless $num == $lastnum + 1; } $lastnum = $num; } print "Found sequence:\n"; foreach my $numinfo (@$numbers_info) { my $num = $numinfo->[0]; my $info = $numinfo->[1]; print "$num from $info->[0]:$info->[1]\n"; } }
    which prints:
    Found sequence: 204 from ARRAY(0x814bc28):1 205 from ARRAY(0x8188e94):0 206 from ARRAY(0x8188e94):1 207 from ARRAY(0x8188ed0):0 Found sequence: 312 from ARRAY(0x814bc28):2 313 from ARRAY(0x8188c30):1 314 from ARRAY(0x8188ed0):2 315 from ARRAY(0x8188e94):2
    Update: Limbic~Region++ noted that the sort in the above (more mediocre than I thought :-) code should be numeric, rather than the default sort. Update 2:And there's another bug, found by Not_a_Number, fixes below.

      Not sure why, but I get no output from your code when I use these test data:

      my @source_data = ( [ 1, 9, 17 ], [ 2, 10, 18 ], [ 3, 11, 19 ], [ 4, 12, 20 ], );
        Thanks. Looks like there are two bugs. The '1, 2, 3, 4' is being missed because the test in the loop is incorrect (due to my trying to avoid writing the test twice). The loop should be:
        foreach my $number (@sorted_numbers) { push @recent, [ $number, $from_whence{$number} ]; shift @recent if @recent > 4; check_and_handle_contig(\@recent) if @recent == 4; }
        the other sequences are missed because of the numeric sort problem L~R noticed. The sort should be:
        my @sorted_numbers = sort { $a <=> $b } map { @{$_} } @source_data;
        with these changes in, it seems to find them all. Thanks for the test cases :-)