in reply to Extensive array compression

You'd need to use the algorithm used by Algorithm::Diff. In fact, you could serialize each row of the array and actually use Algorithm::Diff.

Replies are listed 'Best First'.
Re^2: Extensive array compression (a solution)
by ikegami (Patriarch) on Jul 16, 2009 at 19:15 UTC
    use strict; use warnings; use Algorithm::Diff qw( ); sub parse_data { map [ (split)[1,2] ], split /^/m, $_[0] } my @old = parse_data(<<'__EOI__'); 1 b Uniq 2 f Uniq 3 c Uniq 4 a Uniq 5 e Uniq 6 c Uniq 7 a Uniq 8 b Uniq 9 d Burn 10 d Burn 11 e Burn 12 a Burn 13 f Burn 14 b Burn __EOI__ my @new = parse_data(<<'__EOI__'); 1 e Uniq 2 a Uniq 3 f Uniq 4 c Uniq 5 a Uniq 6 a Uniq 7 b Uniq 8 e Uniq 9 c Uniq 10 f Uniq 11 a Uniq 12 b Uniq 13 d Burn 14 d Burn 15 b Burn 16 d Burn 17 c Burn 18 c Burn 19 b Burn 20 d Burn 21 e Burn 22 a Burn 23 f Burn 24 b Burn __EOI__ sub serialize { return join ',', @{ $_[0] }; } my @flattened_old = map serialize($_), @old; my @flattened_new = map serialize($_), @new; my $diff = Algorithm::Diff->new( \@flattened_old, \@flattened_new ); while ( $diff->Next() ) { if ( my $same_count = $diff->Same() ) { printf("Keep %s items\n", $same_count); } else { if ( my $del_count = $diff->Range(1) ) { printf("Remove %s items\n", $del_count); } printf("Insert \"%s\"\n", $_) for $diff->Items(2); } }
    Remove 1 items Insert "e,Uniq" Insert "a,Uniq" Keep 3 items Insert "a,Uniq" Insert "b,Uniq" Keep 2 items Insert "f,Uniq" Keep 3 items Insert "d,Burn" Insert "b,Burn" Insert "d,Burn" Insert "c,Burn" Insert "c,Burn" Insert "b,Burn" Keep 5 items
      Dear ikegami,
      Thank you very much for your replay, this is some wonderful stuff.
      This will give me almost all the information needed but how can I tell which element moved where?
      Regards.
        You do three kinds of changes to your list:
        • Movements off the list (deletions)
        • Movements from off the list (insertions)
        • Movements within the list

        The program found a series of movements that if repeated, would take list A and make it list B. I don't see how the indexes would be useful, so I didn't output them. If you want them, use ->Range() in list context.

Re^2: Extensive array compression (a solution optimised to reduce data size)
by ikegami (Patriarch) on Jul 16, 2009 at 21:16 UTC
    use strict; use warnings; use Algorithm::Diff qw( LCSidx ); use Data::Dumper qw( Dumper ); sub parse_data { map [ (split)[1,2] ], split /^/m, $_[0] } my @old = parse_data(<<'__EOI__'); 1 b Uniq 2 f Uniq 3 c Uniq 4 a Uniq 5 e Uniq 6 c Uniq 7 a Uniq 8 b Uniq 9 d Burn 10 d Burn 11 e Burn 12 a Burn 13 f Burn 14 b Burn __EOI__ my @new = parse_data(<<'__EOI__'); 1 e Uniq 2 a Uniq 3 f Uniq 4 c Uniq 5 a Uniq 6 a Uniq 7 b Uniq 8 e Uniq 9 c Uniq 10 f Uniq 11 a Uniq 12 b Uniq 13 d Burn 14 d Burn 15 b Burn 16 d Burn 17 c Burn 18 c Burn 19 b Burn 20 d Burn 21 e Burn 22 a Burn 23 f Burn 24 b Burn __EOI__ sub write_compactly { my ($keep, $del, $ins, @inserted) = @_; $_ ||= '' for $keep, $del, $ins; local $Data::Dumper::Useqq = 1; local $Data::Dumper::Terse = 1; local $Data::Dumper::Indent = 0; my $inserted = Dumper(\@inserted); $inserted = substr($inserted, 1, -1); print("$keep,$del,$inserted\n"); } sub keygen { return join ',', @{ $_[0] }; } my ($oil, $nil) = LCSidx(\@old, \@new, \&keygen); push @$oil, 0+@old; push @$nil, 0+@new; my ($oi, $ni) = (0, 0); do { my $s = @$oil; while (@$oil > 1 && $oi == $oil->[0] && $ni == $nil->[0] ) { shift @$oil; ++$oi; shift @$nil; ++$ni; } my $keep = $s - @$oil; $oi += my $del = $oil->[0] - $oi; $ni += my $ins = $nil->[0] - $ni; write_compactly($keep, $del, $ins, @new[ $ni-$ins .. $ni-1 ]) } while @$oil > 1;
    ,1,["e","Uniq"],["a","Uniq"] 3,,["a","Uniq"],["b","Uniq"] 2,,["f","Uniq"] 3,,["d","Burn"],["b","Burn"],["d","Burn"],["c","Burn"],["c","Burn"],[" +b","Burn"] 5,,

    Feel free to adjust write_compactly.