Syntenty has asked for the wisdom of the Perl Monks concerning the following question:

Hi all! This is my first time asking a question so please direct me to the appropriate place to ask questions if this isn't it.

I currently have two files that I need to extract data from and manipulate. The final step outputs the manipulated data to a file. The data formats of the two files are similar. The first file's format is like:


dog23=bird209=cat25
dog33=bird211=cat45
dog53=bird26=cat67

The second file format is:

cat25=cat99
cat25=cat100
cat25=cat449
cat33=cat42
cat55=cat109

In each of these files, the dog=bird=cat are individual elements that were clumped together with a '=' sign in between them.

What I need to do, for example using the first line (dog23=bird209=cat25), is to split the data on the '=' and put each of the three elements (dog, bird and cat) into its own index.

This would then become a two-dimensional array, where each row has three columns of elements, [0][0] = dog23, [0][ 1 ] = bird209 and [0][ 2 ] = cat25.

I need to do the same thing with the other file's data, that is, split the elements and put them in a 2D array.
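The splitting step described above could be sketched roughly as follows. This is only an illustration: `parse_rows` is a hypothetical helper name, and the sample lines are taken from the first file's format shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "a=b=c" lines into an array of arrays (AoA).
sub parse_rows {
    my @lines = @_;
    my @rows;
    for my $line (@lines) {
        chomp(my $copy = $line);      # work on a copy, strip the newline
        next unless length $copy;     # skip blank lines
        push @rows, [ split /=/, $copy ];
    }
    return @rows;
}

my @AoA = parse_rows("dog23=bird209=cat25\n", "dog33=bird211=cat45\n");
print "$AoA[0][0] $AoA[0][1] $AoA[0][2]\n";   # prints: dog23 bird209 cat25
```

Splitting directly on `/=/` avoids the intermediate step of replacing '=' with spaces.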

My task is to take each line from FILE 1, take its last element (i.e. the cat#), and compare it to the first element of each line in FILE 2. If the two elements are equal, I have to put elements 0, 1 and 2 from FILE 1 and the LAST element from FILE 2 on a single line and store it in a single index of a new array.

So line 1 from FILE1 and Line 1 from FILE2

dog23=bird209=cat25. cat25=cat99

I'll take the last element, cat25, from the first file and scan all the lines from FILE2. Wherever I see a match, I need to append the last element of that FILE2 line, i.e. cat99, to the first line of FILE1.

So my final output must be: IN @NEW ARRAY,
Index[0]: dog23=bird209=cat25=cat99
Index[ 1 ]: dog23=bird209=cat25=cat100
Index[ 2 ]: dog23=bird209=cat25=cat449
...etc

Here is my code:
@array = ();
$NAME = "$ARGV[0]"; #FILE 1.
open NAME or die "No file 0B $NAME\n";
$NAME1 = "$ARGV[1]"; #FILE 2.
open NAME1 or die "No file C0 $NAME1\n";
@array = <NAME>;
$cc = scalar @array;

#FILE 1 array.
@AoA = ();
for $i ( 0 .. $cc ) {
    $line = <>;
    $line =~ s/=/ /g;
    chomp $line;
    $AoA[$i] = [ split ' ', $line ];
}

#FILE 2 array.
@temp = <NAME1>;
$cd = scalar @temp;
@AoB = ();
for $k ( 0 .. $cd ) {
    $line2 = $temp[$k];
    $line2 =~ s/=/ /g;
    chomp $line2;
    $AoB[$k] = [ split ' ', $line2 ];
}
close NAME;
close NAME1;

@NEW = ();
$x = 0;
$y = 0;
for $aref ( @AoA) {
    for $bref (@AoB) {
        if ($aref[-1] eq $bref[0]) {
            my $z = 0;
            foreach $AoA (@AoA) {
                push @{ $NEW[$y] }, $aref[$z];
                $z++;
            }
            push @{ $NEW[$y] }, $bref[-1];
            $y++;
        }
        else {}
    }
}
open (OUTFILE, ">V");
print OUTFILE \@NEW;
close OUTFILE;
This is my code, but so far it doesn't work. The last double loop compares the elements for equality. If they are equal, it should take line 1 from FILE ONE and append element 2 from FILE TWO. Could anyone please tell me what I'm doing wrong?

Replies are listed 'Best First'.
Re: Help needed please: Data Manipulation in 2D Array
by pc88mxer (Vicar) on May 05, 2008 at 05:27 UTC
Re: Help needed please: Data Manipulation in 2D Array
by ikegami (Patriarch) on May 05, 2008 at 05:42 UTC
    If you want to locate a string quickly from a list of strings, use a hash.
    # Assumes each cat1 appears only once in file2.
    my %cat2_by_cat1;
    while (<$fh2>) {
        chomp;
        my ($cat1, $cat2) = split /=/;
        $cat2_by_cat1{$cat1} = $cat2;
    }

    while (<$fh1>) {
        chomp;
        my ($dog, $bird, $cat1) = split /=/;
        exists( $cat2_by_cat1{$cat1} ) or die;
        my $cat2 = $cat2_by_cat1{$cat1};
        print $fh_out (join('=', $dog, $bird, $cat1, $cat2), "\n");
    }

      Hi, I'd like to thank you for your fast response.

      In regards to your solution, I've refrained from using hashes because I have elements/keys that appear more than once and I need to include them all in my final output.

      For example, in file 2, the first three lines, would have the same key and different values. example:


      cat1=cat2
      cat1=cat9
      cat1=cat10
      FILE 1: dog1=bird2=cat1

      So what I would have to do, is find a way to create three lines from FILE 1, push these into a new 2D array, and then join the values to the keys, so my new array will be.
      index 0 - dog1=bird2=cat1=cat2
      index 1 - dog1=bird2=cat1=cat9
      index 2 - dog1=bird2=cat1=cat10

      The reason I cannot use a hash is because in this case it would only create the first line:
      index 0 - dog1=bird2=cat1=cat2
      since hashes cannot have identical keys for different values.

        No problem, just store more than one value per hash key by making each hash value an array. This is known as a hash of arrays (HoA).
        my %cat2s_by_cat1;
        while (<$fh2>) {
            chomp;
            my ($cat1, $cat2) = split /=/;
            push @{ $cat2s_by_cat1{$cat1} }, $cat2;
        }

        while (<$fh1>) {
            chomp;
            my ($dog, $bird, $cat1) = split /=/;
            exists( $cat2s_by_cat1{$cat1} ) or die;
            for my $cat2 ( @{ $cat2s_by_cat1{$cat1} } ) {
                print $fh_out (join('=', $dog, $bird, $cat1, $cat2), "\n");
            }
        }
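To try the hash-of-arrays idea without any files on disk, a self-contained sketch along these lines may help. It is an illustration only: `join_on_cat` is a hypothetical helper, the input is read from in-memory filehandles holding sample data from the question, and (unlike the snippet above, which dies) file-1 rows with no match are simply skipped, which is an assumption about the desired behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: join file-1 rows to file-2 rows on the shared cat field.
sub join_on_cat {
    my ($fh1, $fh2) = @_;

    # Build the hash of arrays: first field of file 2 => list of second fields.
    my %cat2s_by_cat1;
    while (my $line = <$fh2>) {
        chomp $line;
        my ($cat1, $cat2) = split /=/, $line;
        next unless defined $cat2;            # skip blank or malformed lines
        push @{ $cat2s_by_cat1{$cat1} }, $cat2;
    }

    my @joined;
    while (my $line = <$fh1>) {
        chomp $line;
        my @fields = split /=/, $line;
        next unless @fields;
        # Assumption: rows without a match produce no output instead of dying.
        for my $cat2 ( @{ $cat2s_by_cat1{ $fields[-1] } || [] } ) {
            push @joined, join '=', @fields, $cat2;
        }
    }
    return @joined;
}

my $file1 = "dog23=bird209=cat25\ndog33=bird211=cat45\n";
my $file2 = "cat25=cat99\ncat25=cat100\ncat25=cat449\ncat33=cat42\n";
open my $fh1, '<', \$file1 or die "Cannot open in-memory file 1: $!";
open my $fh2, '<', \$file2 or die "Cannot open in-memory file 2: $!";
print "$_\n" for join_on_cat($fh1, $fh2);
```

With this sample data the three printed lines are the cat25 row from file 1 followed by cat99, cat100 and cat449 respectively, which matches the output asked for in the question.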
Re: Help needed please: Data Manipulation in 2D Array
by GrandFather (Saint) on May 05, 2008 at 22:42 UTC

    The following coding style comments may help for future work:

    Always use strictures (use strict; use warnings;).

    You don't need to "flush" an empty variable so @array = (); is better written my @array;.

    You almost never need to interpolate a variable into a double quoted string before assigning it some place. You can simply write $NAME = $ARGV[0]; instead.

    A missing file is not the only failure mode for open, so report the actual error (held in $!) instead:

    open NAME or die "Failed opening $NAME: $!\n";
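As a related sketch, the same error reporting works with the three-argument form of open and a lexical filehandle. The helper name `open_for_reading` and the file name are hypothetical, chosen only for illustration; returning the error instead of dying is also just one possible design.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading, or return an error message
# that includes the system error text from $!.
sub open_for_reading {
    my ($path) = @_;
    open my $fh, '<', $path
        or return (undef, "Failed opening $path: $!");
    return ($fh, undef);
}

# 'no-such-file.txt' is an assumed name for a file that does not exist.
my ($fh, $err) = open_for_reading('no-such-file.txt');
print defined $fh ? "opened\n" : "$err\n";
```

The exact text of `$!` depends on the operating system, so the message will vary.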

    @array = <NAME>; "slurps" the file, which is ok for small files, or when it is essential to re-read a file, or when you want to randomly access lines in a file. Generally, however, you are better off reading one line at a time:

    while (<NAME>) { ... }
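A minimal sketch of line-by-line reading, assuming an in-memory filehandle in place of a real file; `count_records` is a hypothetical helper used only to show the loop shape:

```perl
use strict;
use warnings;

# Hypothetical helper: count non-blank lines without slurping the whole file.
sub count_records {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {    # reads one line per iteration
        chomp $line;
        next unless length $line; # ignore blank lines
        $count++;
    }
    return $count;
}

my $data = "cat25=cat99\ncat25=cat100\n\ncat33=cat42\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print count_records($fh), "\n";   # prints: 3
```

Only one line is held in memory at a time, which matters once the input gets large.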

    In $cc = scalar @array; you don't need scalar and really you don't even need $cc. Instead write your for loop over the valid indices (0 through $#array, the last index) as

    for my $i (0 .. $#array) { ... }

    or even better as:

    for my $elt (@array) { ... }
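The difference between the loop forms can be seen in a small sketch like the one below (sample data assumed from the question). Note that a range ending at the element count, as in the original `0 .. $cc`, runs one index past the end of the array, whereas `$#array` is the last valid index.

```perl
use strict;
use warnings;

my @array = ("dog23=bird209=cat25", "dog33=bird211=cat45");

# Index-based loop: $#array is the last valid index (1 here).
my @by_index;
for my $i (0 .. $#array) {
    push @by_index, "$i:$array[$i]";
}

# Element-based loop: no index bookkeeping at all.
my @by_element;
for my $elt (@array) {
    push @by_element, $elt;
}

# A range ending at the element count yields one index too many: 0, 1, 2.
my @too_many = (0 .. @array);

print scalar(@by_index), " ", scalar(@by_element), " ", scalar(@too_many), "\n";
# prints: 2 2 3
```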

    Note that I haven't addressed the "bigger picture" errors and refinements. ikegami has already done a good job of that.
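For completeness, here is a hedged sketch of how the nested-loop comparison from the original post could look once the rows are treated as array references. `nested_join` is a hypothetical helper, not code from the thread. Two things differ from the original: elements of a row are reached with `$aref->[...]` (plain `$aref[...]` refers to an unrelated array `@aref`), and each result row is printed by joining its fields, since printing `\@NEW` would only output a reference string such as ARRAY(0x...).

```perl
use strict;
use warnings;

# Hypothetical helper: nested-loop join of two arrays of arrays.
sub nested_join {
    my ($AoA, $AoB) = @_;
    my @NEW;
    for my $aref (@$AoA) {
        for my $bref (@$AoB) {
            # Compare last field of the file-1 row with first field of the file-2 row.
            next unless $aref->[-1] eq $bref->[0];
            # New row: all file-1 fields plus the last file-2 field.
            push @NEW, [ @$aref, $bref->[-1] ];
        }
    }
    return @NEW;
}

my @AoA = ( [qw(dog23 bird209 cat25)], [qw(dog33 bird211 cat45)] );
my @AoB = ( [qw(cat25 cat99)], [qw(cat25 cat100)], [qw(cat33 cat42)] );
print join('=', @$_), "\n" for nested_join(\@AoA, \@AoB);
```

This compares every row of one list against every row of the other, so for large inputs a hash lookup does far less work.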


    Perl is environmentally friendly - it saves trees