I looked at this some more last night, but I never came up with a solution. I did come up with a test framework to make it easier to try things, so I'll post that.

    use Test::More;

    my @test_data = (
        [
            'set 1',
            'SALMWN DE EGENNHSEN TON BOOZ EK THS RAXAB BOOZ DE EGENNHSEN TON WBHD EK THS ROUQ WBHD DE EGENNHSEN TON IESSAI',
            'SALMWN DE EGENNHSEN TON BOES EK THS RAXAB BOES DE EGENNHSEN TON IWBHD EK THS ROUQ IWBHD DE EGENNHSEN TON IESSAI',
            [
                'SALMWN DE EGENNHSEN TON ',
                'DE EGENNHSEN TON IESSAI ',
                'EK THS RAXAB ',
                'DE EGENNHSEN TON ',
                'EK THS ROUQ '
            ]
        ],
        [
            'set 2',
            'IOUDAS DE EGENNHSEN TON FARES KAI TON ZARA EK THS QAMAR FARES DE EGENNHSEN TON ESRWM ESRWM DE EGENNHSEN TON ARAM',
            'IOUDAS DE EGENNHSEN TON FARES KAI TON ZARA EK THS QAMAR FARES DE EGENNHSEN TON ESRWM ESRWM DE EGENNHSEN TON ARAM',
            [
                'IOUDAS DE EGENNHSEN TON FARES KAI TON ZARA EK THS QAMAR FARES DE EGENNHSEN TON ESRWM ESRWM DE EGENNHSEN TON ARAM '
            ]
        ],
        [
            'set 3',
            'PASAI OUN AI GENEAI APO ABRAAM EWS DABID GENEAI DEKATESSARES KAI APO DABID EWS THS METOIKESIAS BABULWNOS GENEAI DEKATESSARES KAI APO THS METOIKESIAS BABULWNOS EWS TOU XRISTOU GENEAI DEKATESSARES',
            'PASAI OUN AI GENEAI APO ABRAAM EWS DAUID GENEAI DEKATESSARES KAI APO DAUID EWS THS METOIKESIAS BABULWNOS GENEAI DEKATESSARES KAI APO THS METOIKESIAS BABULWNOS EWS TOU XRISTOU GENEAI DEKATESSARES',
            [
                'EWS THS METOIKESIAS BABULWNOS GENEAI DEKATESSARES KAI APO THS METOIKESIAS BABULWNOS EWS TOU XRISTOU GENEAI DEKATESSARES ',
                'PASAI OUN AI GENEAI APO ABRAAM EWS ',
                'GENEAI DEKATESSARES KAI APO '
            ]
        ],
    );

    plan 'tests' => scalar @test_data;

    foreach my $test (@test_data) {
        my $name   = $test->[0];
        my @input  = @{$test}[ 1, 2 ];
        my $wanted = $test->[3];
        my @result = all_new(@input);
        is_deeply( \@result, $wanted, $name );
    }
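If you want to see the harness pattern run on its own before wiring in your sub, here is a minimal self-contained sketch of the same table-driven idea. Note that shared_words is a made-up stand-in (it is not your all_new, and it does not try to solve your problem), and the three test cases are invented purely for illustration.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the sub under test: returns the words of
# the first string that also occur somewhere in the second string.
sub shared_words {
    my ( $first, $second ) = @_;
    my %in_second = map { $_ => 1 } split ' ', $second;
    return grep { $in_second{$_} } split ' ', $first;
}

# Each case: [ name, input 1, input 2, expected result (array ref) ]
my @cases = (
    [ 'identical',      'A B C', 'A B C', [qw( A B C )] ],
    [ 'one differs',    'A B C', 'A X C', [qw( A C )] ],
    [ 'nothing shared', 'A B',   'X Y',   [] ],
);

plan tests => scalar @cases;

for my $case (@cases) {
    my ( $name, $first, $second, $wanted ) = @{$case};
    my @got = shared_words( $first, $second );

    # is_deeply compares the whole structure and reports where it differs
    is_deeply( \@got, $wanted, $name );
}
```

Adding a case is then just one more row in the table, which is what makes it cheap to try things.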

I also refactored a little.

This code of yours (about 44 lines)...

    my @substr_mat = ();
    my $substr_tmp1 = ();
    my $substr_tmp2 = ();
    my $start = ();
    my $end = ();
    my %map1 = ();
    my %map2 = ();
    my $m = 0;
    foreach my $str (sort {($substrings{$b}[1]-$substrings{$b}[0]) <=> ($substrings{$a}[1]-$substrings{$a}[0]) || $substrings{$a}[0] <=> $substrings{$b}[0]} keys %substrings){
        $substr_tmp1 = '';
        $substr_tmp2 = '';
        $start = -1;
        $end = -1;
        for( my $i = $substrings{$str}[0]; $i <= $substrings{$str}[1]; $i++){
            if( ! exists $map1{$i}){
                $map1{$i} = 1;
                $substr_tmp1 .= "$s1[$i] ";
                if($start == -1){
                    $start = $end = $i;
                } else {
                    $end = $i;
                }
            }
        }
        next if $start == -1;
        $start = -1;
        $end = -1;
        for( my $i = $substrings{$str}[2]; $i <= $substrings{$str}[3]; $i++){
            if( ! exists $map2{$i}){
                $map2{$i} = 1;
                $substr_tmp2 .= "$s2[$i] ";
                if($start == -1){
                    $start = $end = $i;
                } else {
                    $end = $i;
                }
            }
        }
        next if $start == -1;
        if( length($substr_tmp1) <= length($substr_tmp2) ){
            $substr_mat[$m++] = $substr_tmp1;
        } else {
            $substr_mat[$m++] = $substr_tmp2;
        }
    }

Became this instead (about 35 lines)...

    my @substr_mat = ();
    my %map1 = ();
    my %map2 = ();
    foreach my $str (
        sort {
            ( $substrings{$b}[1] - $substrings{$b}[0] )
                <=> ( $substrings{$a}[1] - $substrings{$a}[0] )
                || $substrings{$a}[0] <=> $substrings{$b}[0]
        } keys %substrings
      )
    {
        my $substr_tmp1 = '';
        my $substr_tmp2 = '';
        foreach my $i ( $substrings{$str}[0] .. $substrings{$str}[1] ) {
            if ( !$map1{$i}++ ) {
                $substr_tmp1 .= "$s1[$i] ";
            }
        }
        next if !$substr_tmp1;
        foreach my $i ( $substrings{$str}[2] .. $substrings{$str}[3] ) {
            if ( !$map2{$i}++ ) {
                $substr_tmp2 .= "$s2[$i] ";
            }
        }
        next if !$substr_tmp2;
        push @substr_mat,
            ( length $substr_tmp1 <= length $substr_tmp2 )
            ? $substr_tmp1
            : $substr_tmp2;
    }

...and it does the same thing, with fewer variables.
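In case the !$map1{$i}++ test looks cryptic: the post-increment returns the old value, which is false the first time an index is visited and true on every later visit, so the test and the bookkeeping happen in one step. Here is a small sketch of just that idiom. The sub name first_visits and the sample ranges are made up for illustration and have nothing to do with your data.

```perl
use strict;
use warnings;

# Hypothetical helper: for each [from, to] index range, return the
# indices that no earlier range has already visited.
sub first_visits {
    my @ranges = @_;
    my %seen;
    my @claimed;
    for my $range (@ranges) {
        my ( $from, $to ) = @{$range};

        # $seen{$_}++ yields the old count (undef on the first visit),
        # so the negation is true exactly once per index.
        my @fresh = grep { !$seen{$_}++ } $from .. $to;
        push @claimed, \@fresh;
    }
    return @claimed;
}

# Made-up overlapping ranges
my @claimed = first_visits( [ 2, 5 ], [ 4, 7 ], [ 0, 2 ] );
print join( ' ', @{$_} ), "\n" for @claimed;

# prints:
# 2 3 4 5
# 6 7
# 0 1
```

The range operator (..) also removes the need for a C-style loop counter, and because the refactored code only cares whether anything was collected, the $start and $end flags can go away too.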

At this point, I suspect that whatever algorithm you're using just isn't doing what you expect it to do. Since I still can't tell what it's supposed to be doing, I can't be sure.

I strongly suggest working on your variable names. %substrings never contains any strings, neither as keys nor as values. @matrix is not very descriptive. I never did figure out what %map1 and %map2 were really supposed to do.

I wish I could offer more help here. Good luck with your problem.


In reply to Re: **reopened**Re: weird subroutine behavior by kyle
in thread weird subroutine behavior by flaviusm
