http://qs1969.pair.com?node_id=296915

nylon has asked for the wisdom of the Perl Monks concerning the following question:

I'm a Perl newbie (and I know it), which is why I seek wisdom here.
The following problem came up: I have an array in which each element holds some data divided by the same words. The ("border") words are always present, but the length and content between the words differ. I need to extract the data between the ("border") words and put each piece in its own array element.

eg:

#!/usr/bin/perl
@accounts = ("A x1 B y1 C z1 D v1 E w1 F",
             "A x2 B y2 C zzz2 D v2 E w2 F",
             "A x3 B y3 C z3 D v3 E w3 F",
             "A x4 B y4 C z4 D v4 E wwww4 F",
             "A x5 B y5 C z5 D v5 E w5 F",
             "A x6 B y6 C z6 D v6 E F");
@fields = ("A", "B", "C", "D", "E");
@fields_plus = ("B", "C", "D", "E", "A");
$end = $#accounts + 1;
$end_1 = $#fields + 1;
$i = 0;
$j = 0;
$teller = 1;
foreach $jump (@accounts) {
    foreach $test (@fields) {
        $accounts_2[$j][$i] =~ (/$test(.*?)$fields_plus[$i] /sgm);
        push @accounts_2, ("$teller", "***");
        $teller++;
        $i++;
        if ($fields[$i] == $end_2) {last}
    }
    $j++;
}
for ($j = 0, $j > $end, $j++) {
    for ($i = 0, $i > $end_1, $i++) {
        print ("$accounts_2[$j][$i]");
    }
}
The result should be something like:
$accounts_2[0][0] = x1
$accounts_2[0][1] = y1
$accounts_2[][] = ...
$accounts_2[1][0] = x2
$accounts_2[][] = ...
$accounts_2[3][4] = wwww4
$accounts_2[][] = ...

I got some spooky stuff.
Thanks in advance for helping me :-)

Nylon

Re: Array in Array
by Abigail-II (Bishop) on Oct 06, 2003 at 12:37 UTC
    I'm not sure what you want (and I've no idea what you find 'spooky'), but does this do it for you?
    #!/usr/bin/perl
    use strict;
    use warnings;
    use YAML;

    my @accounts = ("A x1 B y1 C z1 D v1 E w1 F",
                    "A x2 B y2 C zzz2 D v2 E w2 F",
                    "A x3 B y3 C z3 D v3 E w3 F",
                    "A x4 B y4 C z4 D v4 E wwww4 F",
                    "A x5 B y5 C z5 D v5 E w5 F",
                    "A x6 B y6 C z6 D v6 E F");
    my @fields = ("A", "B", "C", "D", "E");

    no warnings 'misc';
    my @accounts_2 = map {[@{{split}}{@fields}]} @accounts;

    print Dump \@accounts_2;
    __END__
    --- #YAML:1.0
    - - x1
      - y1
      - z1
      - v1
      - w1
    - - x2
      - y2
      - zzz2
      - v2
      - w2
    - - x3
      - y3
      - z3
      - v3
      - w3
    - - x4
      - y4
      - z4
      - v4
      - wwww4
    - - x5
      - y5
      - z5
      - v5
      - w5
    - - x6
      - y6
      - z6
      - v6
      - F

    Short explanation of the map line: Take each element of @accounts, split it on white space, make an anonymous hash out of the results, and take a hash slice of that, indexed on @fields. Create an anonymous array of the result of the slice.
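The steps in that explanation can be spelled out one statement at a time. This is only a sketch: the helper name values_for_fields is made up for illustration, and instead of switching off the 'misc' warning it pads an odd-length token list with undef. Like the one-liner, it assumes every tag is followed by exactly one whitespace-free value.

```perl
use strict;
use warnings;

my @fields = qw(A B C D E);

# Hypothetical helper: the one-line map, unrolled step by step.
sub values_for_fields {
    my ($line) = @_;
    my @tokens = split ' ', $line;        # split on white space
    push @tokens, undef if @tokens % 2;   # pad, so the lone trailing word does not warn
    my %pairs = @tokens;                  # tag => value pairs
    return [ @pairs{@fields} ];           # hash slice, wrapped in an anonymous array
}

my $row = values_for_fields("A x1 B y1 C z1 D v1 E w1 F");
print join(",", @$row), "\n";             # x1,y1,z1,v1,w1
```

Calling it inside map { values_for_fields($_) } @accounts should build the same array of array references as the one-liner.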

    Abigail

      Abigail,
      Thanks for the help.
      I get
      ARRAY(0x253a350)
      ARRAY(0x2536024)
      ARRAY(0x2532bb8)
      ARRAY(0x248e328)
      ARRAY(0x248c1d0)
      ARRAY(0x2488e40)
      if I run :
      sub tweezers_sub {
          my @accounts_2 = map {[@{{split}}{@fields}]} @accounts;
          foreach $rec (@accounts_2) {
              print FH_temp ("$rec\n");
          } # Prints all array records to the output file
      }
      That's what I call the "spooky stuff". :-)
      Nylon
      (PS: have installed the YAML module. Nice :-)
        Your $rec variable is actually a reference to an array. You should dereference it, for example by writing my @rray = @$rec;, or, TIMTOWTDI:
        #!/usr/bin/perl
        use strict;
        use warnings;

        my @accounts = ("A x1 B y1 C z1 D v1 E w1 F",
                        "A x2 B y2 C zzz2 D v2 E w2 F",
                        "A x3 B y3 C z3 D v3 E w3 F",
                        "A x4 B y4 C z4 D v4 E wwww4 F",
                        "A x5 B y5 C z5 D v5 E w5 F",
                        "A x6 B y6 C z6 D v6 E F");
        my @fields = ("A", "B", "C", "D", "E");

        sub tweezers_sub {
            my @accounts_2 = map {
                [@{{/([A-Z])\s+([a-z]+\d|\s)/g}}{@fields} ]
            } @accounts;
            for my $rec (@accounts_2) {
                print "'" . join("' - '", @$rec) . "'\n";
            }
        }

        tweezers_sub;
        This uses a RegEx instead of a split, but you can still see how $rec is dereferenced by writing @$rec.
        Hope this helped.
        CombatSquirrel.
        Update: Mixed up referencing and dereferencing. Fixed.
        Entropy is the tendency of everything going to hell.
        That's what you are supposed to get - you are printing out a reference to an array. What did you expect to get?

        Abigail
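A minimal sketch of the difference between printing the reference and printing what it points to. The sample data and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $rec = [ 'x1', 'y1', 'z1' ];   # sample data: a reference to an anonymous array

my $as_ref  = "$rec";             # the reference itself, stringified: ARRAY(0x...)
my $as_list = "@$rec";            # dereferenced: the elements, joined by spaces

print "$as_ref\n";
print "$as_list\n";               # x1 y1 z1
print "$rec->[1]\n";              # a single element via the arrow: y1
```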

Re: Array in Array
by Zaxo (Archbishop) on Oct 06, 2003 at 15:24 UTC

    Abigail-II's solution fails for the last string because of the empty entry E. Here's another way to parse this:

    my @accounts = (
        "A x1 B y1 C z1 D v1 E w1 F",
        "A x2 B y2 C zzz2 D v2 E w2 F",
        "A x3 B y3 C z3 D v3 E w3 F",
        "A x4 B y4 C z4 D v4 E wwww4 F",
        "A x5 B y5 C z5 D v5 E w5 F",
        "A x6 B y6 C z6 D v6 E F",
    );

    # Transform @accounts in place
    for (@accounts) {
        tr/ //d;                      # get rid of those confusing spaces
        my @acct = split /([A-F])/;   # use the tags
        shift @acct;                  # get rid of the undef pre-A entry...
        pop @acct;                    # ... and the 'F'
        my %stuff = (@acct);          # hash what's left
        $_ = [@stuff{'A'..'E'}];      # keep ordered values
    }

    {
        local $" = "\t";
        print "|@$_|", $/ for @accounts;
    }
    __END__
    |x1 y1 z1 v1 w1|
    |x2 y2 zzz2 v2 w2|
    |x3 y3 z3 v3 w3|
    |x4 y4 z4 v4 wwww4|
    |x5 y5 z5 v5 w5|
    |x6 y6 z6 v6 |
    The transformation of @accounts could be shortened a lot, but I wanted to show it step-by-step for clarity. Your data format is not of the handiest.

    After Compline,
    Zaxo

      Dear all,

      I have to study all of this. This is all new stuff for me.
      Thanks for all the help. :-) :-) :-)
      Nylon
        PS: The borders are not always [A..B]; they could be [A G L Z * .. boe ..] etc.
        I used [A..B] as an example.
        Sorry that I was not clear.
        Nylon
        To my regret it is not working.
        The script needs to use the array (@fields) (and not the  [A..F] ) as separator. I got empty results. :-(
        It looked so simple in the beginning (when I started the script) but for a newbie it is not.
        Thx,
        Nylon
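One way to drive the parsing from @fields rather than a hard-coded [A-F] might look like the sketch below. It rests on assumptions not stated in the thread: the border words appear in every line in the order given, each border stands alone as a whitespace-delimited word, and the closing border ("F" in the sample data) is passed in separately. The names split_on_borders and $end are hypothetical.

```perl
use strict;
use warnings;

my @fields = ("A", "B", "C", "D", "E");
my $end    = "F";   # assumption: the word that closes the last field

# Hypothetical helper: returns a reference to an array holding one value
# per border word, or a reference to an empty array if the line does not match.
sub split_on_borders {
    my ($line, $borders, $last) = @_;
    my @parts;
    for my $word (@$borders, $last) {
        # quotemeta lets borders such as "*" be matched literally;
        # the lookarounds require the border to stand alone as a word
        push @parts, '(?<!\S)' . quotemeta($word) . '(?!\S)';
    }
    # between two consecutive borders, capture as little text as possible
    my $pattern = join '\s*(.*?)\s*', @parts;
    my @row = $line =~ /$pattern/s;
    return \@row;
}

my @accounts = ("A x1 B y1 C z1 D v1 E w1 F",
                "A x6 B y6 C z6 D v6 E F");
for my $account (@accounts) {
    my $row = split_on_borders($account, \@fields, $end);
    print "'", join("' - '", @$row), "'\n";
}
```

An empty field, like the E of the last account, comes back as an empty string, so the later columns do not shift. A value that itself contains one of the border words as a separate word would confuse this sketch.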
Re: Array in Array
by kesterkester (Hermit) on Oct 06, 2003 at 14:31 UTC
    You could read your data into a hash table to make things a little easier on yourself, I think:

    use warnings;
    use strict;
    use Data::Dumper;

    my @accounts = ("A x1 B y1 C z1 D v1 E w1 F",
                    "A x2 B y2 C zzz2 D v2 E w2 F",
                    "A x3 B y3 C z3 D v3 E w3 F",
                    "A x4 B y4 C z4 D v4 E wwww4 F",
                    "A x5 B y5 C z5 D v5 E w5 F",
                    "A x6 B y6 C z6 D v6 E F");

    my %fields = map { $_ => 1 } qw/A B C D E F/;

    my $cur_field;
    my %hash;

    foreach ( @accounts ) {
        foreach ( split ) {
            $cur_field = $_ if exists $fields{$_};
            push @{$hash{$cur_field}}, $_ if $_ ne $cur_field;
        }
    }

    print Dumper \%hash;
    __END__
    $VAR1 = {
        'A' => [ 'x1', 'x2', 'x3', 'x4', 'x5', 'x6' ],
        'B' => [ 'y1', 'y2', 'y3', 'y4', 'y5', 'y6' ],
        'C' => [ 'z1', 'zzz2', 'z3', 'z4', 'z5', 'z6' ],
        'D' => [ 'v1', 'v2', 'v3', 'v4', 'v5', 'v6' ],
        'E' => [ 'w1', 'w2', 'w3', 'wwww4', 'w5' ]
    };