in reply to Re^2: Converting Arrays into Matrix
in thread Converting Arrays into Matrix

Me again. I have another problem with the script, and since I frankly don't fully understand it, I am asking here again:
Please look at the output of:

use strict;
use warnings;

use Algorithm::Diff qw( );

my @seqs = (
    [qw( A B C D E F G H I )],
    [qw( A D C X F G H I )],
    # [qw( A )],
    # [qw( A B C )],
);

my @combined;
my @grid;
for my $col_idx (0..$#seqs) {
    my $seq = $seqs[$col_idx];

    my $diff = Algorithm::Diff->new(\@combined, $seq);

    my @new_combined;
    my @new_grid;
    while ($diff->Next()) {
        if ($diff->Same()) {
            for ($diff->Range(1)) {
                push @new_combined, $combined[$_];
                push @new_grid, $grid[$_];
                $new_grid[-1][$col_idx] = 1;
            }
        } else {
            for ($diff->Range(1)) {
                push @new_combined, $combined[$_];
                push @new_grid, $grid[$_];
            }
            for ($diff->Range(2)) {
                push @new_combined, $seq->[$_];
                push @new_grid, [];
                $new_grid[-1][$col_idx] = 1;
            }
        }
    }

    @combined = @new_combined;
    @grid = @new_grid;
}

for my $row_idx (0..$#combined) {
    my $ch = $combined[$row_idx];
    for my $col_idx (0..$#seqs) {
        print($grid[$row_idx][$col_idx] ? $ch : " ", " ");
    }
    print("\n");
}

It outputs

A A
B
C
D D
E
  C
  X
F F
G G
H H
I I

What I want is:

A A
B B
C C
D
  X
E
F F
G G
H H
I I

The rows must be, so to speak, unique: there must not be a "C" in row 3 and another one in row 6. What I also wonder is this: the documentation of Algorithm::Diff says that it finds the LCS, but for my example it finds 6 common rows while I count 7 ...
Greetings,
Jan

Re^4: Converting Arrays into Matrix
by Ratazong (Monsignor) on Apr 27, 2011 at 10:57 UTC
    [qw( A B C D E F G H I )], [qw( A D C X F G H I )],
    Your second array does not contain a 'B', but starts with A D C. Looks like a typo.
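
    The 6-versus-7 count can be checked independently of Algorithm::Diff. The following is only a minimal sketch: lcs_length is a hypothetical helper (not part of any module) that uses the textbook dynamic-programming recurrence, and @assumed is a guess at the intended second array, inferred from the desired output in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: length of the longest common subsequence of two
# array refs, computed row by row with the textbook DP recurrence.
sub lcs_length {
    my ($x, $y) = @_;
    my @prev = (0) x (@$y + 1);          # DP row for the empty prefix of $x
    for my $i (0 .. $#$x) {
        my @curr = (0);
        for my $j (0 .. $#$y) {
            $curr[$j + 1] = $x->[$i] eq $y->[$j]
                ? $prev[$j] + 1
                : ($prev[$j + 1] > $curr[$j] ? $prev[$j + 1] : $curr[$j]);
        }
        @prev = @curr;
    }
    return $prev[-1];
}

my @first     = qw( A B C D E F G H I );
my @as_posted = qw( A D C X F G H I );   # second array exactly as posted
my @assumed   = qw( A B C X F G H I );   # assumption: what was probably meant

print lcs_length(\@first, \@as_posted), "\n";   # prints 6
print lcs_length(\@first, \@assumed),   "\n";   # prints 7
```

    With the array as posted, C and D appear in opposite order in the two arrays, so only one of them can be part of a common subsequence; that is where the seventh common row goes missing.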
Re^4: Converting Arrays into Matrix
by janDD (Acolyte) on Apr 27, 2011 at 18:53 UTC

    Dear community
    Well, you are right, it was a typo. However, the problem I faced still persists (though I was not able to describe it because of my typo). It is also hard to reproduce, which is why I would like to show you a sample of my data. Please execute the script and look at the output:

    You will see that the letter "V" appears in both row 19 AND row 25 ... That is not correct, is it?

    They should all be in row 25, or even 28, depending on the situation with the 3 and the U...

    I am really sorry for the amount of data, but I cannot reproduce this behavior otherwise. With many smaller data sets (like those I showed you), it worked ...
    Greetings, Jan

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    use Algorithm::Diff qw( );

    my @seqs = (
        [ qw ( A B C D E Z F G H I J K L M N O V W X Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P Q V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P Q R S T V W Y ) ],
        [ qw ( 3 W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W 2 4 ) ],
        [ qw ( N O V Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( N O P Q R S V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( 7 2 4 ) ],
        [ qw ( A B C D E 1 F G H I J K L M N O P V 5 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T U W 2 Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O V 5 ) ],
        [ qw ( A B C D E Z F 6 G I J K V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W X 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W Y ) ],
        [ qw ( A O P Q R V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V 5 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P Q V W 4 ) ],
        [ qw ( A B C D E 1 F G H I J K L M 5 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F 6 G H I J K L M N O P Q R V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A B C D E 1 F G H I J K L M N O P V 5 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T U V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W Y ) ],
        [ qw ( 3 W 2 4 ) ],
        [ qw ( 3 W 2 4 ) ],
        [ qw ( A B C D E 1 F G H I J K L M N O P V 5 ) ],
        [ qw ( A O P Q V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F 6 G I J K L M N P V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V X Y ) ],
        [ qw ( A B C D E 1 F G H I J K L M N O P V 5 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( N O P V W Y ) ],
        [ qw ( A B C D E 1 F G H I J K L M N O P V 5 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F 6 G I J K L M N O P Q R V W Y ) ],
        [ qw ( A O P Q V W Y ) ],
        [ qw ( 3 W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A O P Q R V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V X Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T U V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O 5 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T U V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( N O P V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W Y ) ],
        [ qw ( A B C D E Z F 6 G I J K L M N P V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P Q R S T U V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V X Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A O P Q R V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P V W Y ) ],
        [ qw ( N O P V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E Z F 6 G I J K L M N O P Q R V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W Y ) ],
        [ qw ( A B C D E 1 F G H I J K L M N O P V 5 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P Q R S T V W Y ) ],
        [ qw ( N O P V Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S W X 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P Q R S T U V W Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V W Y ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P V Y ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T V W 2 4 ) ],
        [ qw ( A B C D E 1 Z F G H I J K L M N O P Q R S T W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O P Q V W 2 4 ) ],
        [ qw ( A B C D E Z F G H I J K L M N O V W Y ) ],
    );

    my @combined;
    my @grid;
    for my $col_idx (0..$#seqs) {
        my $seq = $seqs[$col_idx];

        my $diff = Algorithm::Diff->new(\@combined, $seq);

        my @new_combined;
        my @new_grid;
        while ($diff->Next()) {
            if ($diff->Same()) {
                for ($diff->Range(1)) {
                    push @new_combined, $combined[$_];
                    push @new_grid, $grid[$_];
                    $new_grid[-1][$col_idx] = 1;
                }
            } else {
                for ($diff->Range(1)) {
                    push @new_combined, $combined[$_];
                    push @new_grid, $grid[$_];
                }
                for ($diff->Range(2)) {
                    push @new_combined, $seq->[$_];
                    push @new_grid, [];
                    $new_grid[-1][$col_idx] = 1;
                }
            }
        }

        @combined = @new_combined;
        @grid = @new_grid;
    }

    for my $row_idx (0..$#combined) {
        my $ch = $combined[$row_idx];
        for my $col_idx (0..$#seqs) {
            print($grid[$row_idx][$col_idx] ? $ch : " ", " ");
        }
        print("\n");
    }
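
    Spotting a letter that occupies two rows, like the "V" in rows 19 and 25, is tedious by eye on a grid this size. As a rough sketch, a row-label list such as @combined could be passed to a small checker like the one below. duplicate_rows is a hypothetical helper name, and the sample list is simply the row order of the small two-column example from earlier in the thread, not the large data set.

```perl
use strict;
use warnings;

# Hypothetical helper: return a hash ref mapping every label that occurs
# in more than one row to the list of its 1-based row numbers.
sub duplicate_rows {
    my @labels = @_;
    my %rows;
    push @{ $rows{ $labels[$_] } }, $_ + 1 for 0 .. $#labels;

    my %dups;
    for my $label (keys %rows) {
        $dups{$label} = $rows{$label} if @{ $rows{$label} } > 1;
    }
    return \%dups;
}

# Row labels of the small example (the output that shows "C" twice).
my @combined = qw( A B C D E C X F G H I );

my $dups = duplicate_rows(@combined);
print "$_: rows @{ $dups->{$_} }\n" for sort keys %$dups;   # prints "C: rows 3 6"
```

    In the script above, the same call could be made on @combined right before the printing loop; an empty result would mean every row label is unique.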