Dear Masters,

I am trying to delimit regions of an array using a string, which is later decomposed into region markers.
Let me give an example:

my $delim = 'A -3 C -4 B';
my $array =
     ['A -4 A','A -1 C','C -4 D','D -4 B','B -3 C','C -2 B','B -1 E'];
#           +   *    +   *    +   *    +   *    +   *    +   *

# Note that in $array, from the second element $array->[1]
# through the last element $array->[-1], the first letter
# (asterisk *) of each element is a repeat of the last letter
# (plus +) of the previous element.
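The chaining property noted in the comment can be checked mechanically. This is only a minimal sketch: `broken_links` is a hypothetical helper name, and it assumes every element has the form "letter number letter" separated by whitespace.

```perl
use strict;
use warnings;

my $array = ['A -4 A','A -1 C','C -4 D','D -4 B','B -3 C','C -2 B','B -1 E'];

# Hypothetical helper: return the indices at which the chain is broken,
# i.e. where the first letter of element $i differs from the last letter
# of element $i - 1.
sub broken_links {
    my ($tuples) = @_;
    my @bad;
    for my $i (1 .. $#{$tuples}) {
        my $prev_last  = (split ' ', $tuples->[$i - 1])[2];
        my $this_first = (split ' ', $tuples->[$i])[0];
        push @bad, $i if $prev_last ne $this_first;
    }
    return @bad;
}

my @bad = broken_links($array);
print @bad ? "chain broken at: @bad\n" : "chain holds\n";
```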

Using $delim, I wish to identify regions in $array. First, we decompose $delim into sub-region markers.
I already have a function that does this. The decomposition looks like this:

my $delim       = 'A -3 C -4 B';
my $delim_reg   = ['A -3 C','C -4 B'];
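For readers who prefer something plainer than a regex, the decomposition can be sketched with split. `decompose_delim` is a hypothetical name, not the function I actually use, and it assumes the tokens in $delim strictly alternate letter, number, letter and are separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: split the delimiter on whitespace and emit every
# overlapping "letter number letter" triple, stepping two tokens at a time.
sub decompose_delim {
    my ($delim) = @_;
    my @tok = split ' ', $delim;
    my @regions;
    for (my $i = 0; $i + 2 <= $#tok; $i += 2) {
        push @regions, join ' ', @tok[$i .. $i + 2];
    }
    return \@regions;
}

my $delim_reg = decompose_delim('A -3 C -4 B');
print join(', ', @$delim_reg), "\n";   # prints: A -3 C, C -4 B
```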

Once we know $delim_reg, we can identify the matching regions in $array.
There are 6 possibilities (instances) that we can identify. They are:

my $delim_reg   = ['A -3 C','C -4 B'];

                    \__i__/  \__j__/ --> sub-regions


my $array   =  
   ['A -4 A','A -1 C','C -4 D','D -4 B','B -3 C',  'C -2 B','B -1 E'];
                 
     |\_____i_______/                              \__j__/|-> Inst 1
     |____________________________________________________|


             |\_i__/  \_____j________/ |-------> Instance 2
             |_________________________|
                          
                         ... till
                          

    |\_____i_______/  \_______________j__________________/ |--> Inst 6
    |______________________________________________________|

Let me try to describe the picture above. For example, Instance 1 contains two sub-regions (i and j), each corresponding to one marker from $delim_reg. Sub-region "i" starts with A and ends with C. Similarly, sub-region "j" starts with C and ends with B.

Thus, we define a sub-region 'A -3 C' as a collection of tuples in which the first letter of the first tuple is 'A' and the last letter of the last tuple is 'C'.
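To make the definition concrete, here is a minimal sketch of it as a predicate. `is_subregion` is a hypothetical name, and following the wording above it compares only the two letters of the marker, ignoring the number in the middle; that is an assumption on my part.

```perl
use strict;
use warnings;

# Hypothetical predicate: true if the run of tuples starts with the marker's
# first letter and ends with the marker's last letter.
sub is_subregion {
    my ($run, $marker) = @_;
    return 0 unless @$run;
    my ($want_first, $want_last) = (split ' ', $marker)[0, 2];
    my $got_first = (split ' ', $run->[0])[0];
    my $got_last  = (split ' ', $run->[-1])[2];
    return ($got_first eq $want_first && $got_last eq $want_last) ? 1 : 0;
}

print is_subregion(['A -4 A', 'A -1 C'], 'A -3 C') ? "yes\n" : "no\n";   # yes
print is_subregion(['C -4 D'],           'C -4 B') ? "yes\n" : "no\n";   # no
```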

Now the main task is to come up with a function that takes $delim and $array above as input and captures those 6 instances. The final result should look like this (I did it by hand):

my $VAR =  [ 

[['A -4 A', 'A -1 C'], ['C -2 B']],                            # Ins1
[['A -4 A', 'A -1 C'], ['C -4 D','D -4 B']],                   # Ins2
[['A -4 A', 'A -1 C'], ['C -4 D','D -4 B','B -3 C','C -2 B']], # Ins3
[['A -4 A', 'A -1 C', 'C -4 D','D -4 B','B -3 C'],['C -2 B']], # Ins4
[['A -1 C'], ['C -4 D', 'D -4 B']],                            # Ins5
[['A -1 C'], ['C -4 D', 'D -4 B','B -3 C','C -2 B']],          # Ins6

]

I need to keep the result as an AoA (array of arrays) so that I can add other values to it later. In other cases, $delim can contain more or fewer than 3 letters, and the size of $array may also vary, which produces more or fewer sub-regions and instances.
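One possible way to enumerate such instances as an AoA is sketched below. It is not a definitive solution: `letters`, `spans_from` and `instances` are hypothetical names, and the sketch assumes the literal reading of the definition (only the first and last letters matter, sub-regions appear left to right without overlapping, and gaps between them are allowed). Under that reading it returns eight instances for the example, namely the six listed by hand plus [['A -1 C'], ['C -2 B']] and [['A -1 C','C -4 D','D -4 B','B -3 C'], ['C -2 B']], so the hand-made list may rely on an extra rule that I have not stated.

```perl
use strict;
use warnings;
use Data::Dumper;

# First and last letter of a "letter number letter" string.
sub letters { my @t = split ' ', $_[0]; return @t[0, 2] }

# All [start, end] runs beginning at or after $from whose first tuple
# starts with $first and whose last tuple ends with $last.
sub spans_from {
    my ($tuples, $first, $last, $from) = @_;
    my @spans;
    for my $s ($from .. $#{$tuples}) {
        next unless (letters($tuples->[$s]))[0] eq $first;
        for my $e ($s .. $#{$tuples}) {
            push @spans, [$s, $e] if (letters($tuples->[$e]))[1] eq $last;
        }
    }
    return @spans;
}

# Recursively pick one run per marker, left to right, without overlap.
# Works for any number of markers, not just two.
sub instances {
    my ($tuples, $regions, $from) = @_;
    $from = 0 unless defined $from;
    return ([]) unless @$regions;          # nothing left to match
    my ($head, @rest) = @$regions;
    my ($first, $last) = letters($head);
    my @found;
    for my $span (spans_from($tuples, $first, $last, $from)) {
        my ($s, $e) = @$span;
        for my $tail (instances($tuples, \@rest, $e + 1)) {
            push @found, [ [ @{$tuples}[$s .. $e] ], @$tail ];
        }
    }
    return @found;
}

my $array  = ['A -4 A','A -1 C','C -4 D','D -4 B','B -3 C','C -2 B','B -1 E'];
my @result = instances($array, ['A -3 C', 'C -4 B']);
print scalar(@result), " instances\n";     # 8 under the literal reading
print Dumper \@result;
```

Each instance is an array of sub-regions and each sub-region is an array of tuples, so further values can be pushed onto any level later.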

My code below is still far from achieving the desired result above. I am facing two main difficulties at the moment:

1. I cannot yet capture the sub-regions and the region properly.
2. I cannot yet capture all 6 instances of that region.

I really don't know how to go about it. Here is the code:

use Data::Dumper;
my $delim   = 'A -3 C -4 B';
my $array   =  
     ['A -4 A','A -1 C','C -4 D','D -4 B','B -3 C','C -2 B','B -1 E'];

get_delim_region($delim,$array);

sub get_delim_region{

    my ($delim,$array)    = @_;
    my $delim_reg = decomp_str($delim);

    # I'm really stuck from here.....

    my @instances;
   
    OUT:
    foreach  my $delim_rg ( @{$delim_reg}  ){
        my @delimreg;
        my $st1 = (split(" ",$delim_rg))[0];
        my $st2 = (split(" ",$delim_rg))[2];
 
        IN:
        foreach  my $i ( 0 ..  @{$array}-1  ){
            my $tr1 = (split(" ",$array->[$i]))[0];
            my $tr2 = (split(" ",$array->[$i]))[2];

            if($st1 eq $tr1){
                push @delimreg,$array->[$i];
            }
            elsif($st2 eq $tr2){
                push @delimreg,$array->[$i];
                next OUT; 
            }
            push @instances, [ @delimreg ]; 
        }     
     }
      print Dumper \@instances;
      return ;
}    

sub decomp_str{

    # This subroutine decomposes a string into sub-regions, i.e.
    # from:  $delim       = 'A -3 C -4 B';
    # into:  $delim_reg   = ['A -3 C','C -4 B'];

    # Credit Roy Johnson - fastest

    [$_[0] =~ /(?=([a-z]\s*(?:\S+\s*){2}))\S+\s*/gi ]
}

Dear fellow monks, I humbly seek your enlightenment in this matter. Thank you very much in advance.

---
neversaint and everlastingly indebted.......

In reply to Identifying Delimited Regions of an Array by neversaint
