in reply to Printing out multiple array lists and more!

A couple of things I noticed with your code...

Well, here's my go at it. It may be a bit obtuse, but it works.

#!/usr/bin/perl -w
use strict;

my @master_list = ();

readfile("f1.lst", \@master_list);
readfile("f2.lst", \@master_list);
readfile("f3.lst", \@master_list);

printf "%s | %s | %s\n", @{$_}[0..2] for(@master_list);

sub readfile {
  my $filename = shift or die "Need filename.\n";
  my $listref = shift; # List pointed to is modified in place.

  open my $file, "< $filename" or die "Can't open $filename: $!\n";

  my $header = <$file>;

  my $c = 0;
  local $_;
  while(<$file>) {
    $listref->[$c] ||= []; # use strict doesn't like auto-viv.
    chomp;
    # Compare the new value with the first value stored in the list.
    # First value to be read in for any row is assumed to be
    # correct.  All subsequent values must match that first one.
    unless(@{$listref->[$c]} and $_ != $listref->[$c][0]) {
      push @{$listref->[$c]}, $_;
    } else {
      push @{$listref->[$c]}, ' ';
    }

    ++$c;
  }

  close $file or die "Can't close $filename: $!\n";
}

HTH.

bbfu
Seasons don't fear The Reaper.
Nor do the wind, the sun, and the rain.
We can be like they are.

Re: Re: Printing out multiple array lists and more!
by snafu (Chaplain) on May 17, 2001 at 18:36 UTC
    First off I would like to say thank you! This code mostly works but I get some uninitialized value errors toward the end and nothing seems to be coming from file2 in the output. I'll show you that in a minute. First, discussion.

    Believe it or not, the next if ( $c == 0 ) and $c++; # skip the file header worked! :) It would go to the next iteration the first time through and then it would increment $c. After that first hit though it was pretty much ignored for the duration of the script. However, I like your solution better. It is much cleaner.

    You are absolutely right about the usage of a subroutine for the repetitive read()'ing. The "$_ |" was intentional. In my example I was trying to get a list in the form "var | var | var" and so I was placing the delimiter in the array with the value for the output to be correct. However, once again, the way you do it is much cleaner.

    Indeed, I need to polish up my knowledge of arrays. Arrays of arrays make my head hurt just mentioning them. There are a ton of things I need to work on in Perl but I feel I am catching on fast considering this is my second month of coding in Perl. So, without further ado...I start my questioning for the learning process...

    Reference code...

     1  #!/usr/bin/perl -w
     2  use strict;
     3
     4  my @master_list = ();
     5
     6  readfile("f1.lst", \@master_list);
     7  readfile("f2.lst", \@master_list);
     8  readfile("f3.lst", \@master_list);
     9
    10  printf "%s | %s | %s\n", @{$_}[0..2] for(@master_list);
    11
    12  sub readfile {
    13    my $filename = shift or die "Need filename.\n";
    14    my $listref = shift; # List pointed to is modified in place.
    15
    16    open my $file, "< $filename" or die "Can't open $filename: $!\n";
    17
    18    my $header = <$file>;
    19
    20    my $c = 0;
    21    local $_;
    22    while(<$file>) {
    23      $listref->[$c] ||= []; # use strict doesn't like auto-viv.
    24      chomp;
    25      # Compare the new value with the first value stored in the list.
    26      # First value to be read in for any row is assumed to be
    27      # correct.  All subsequent values must match that first one.
    28      unless(@{$listref->[$c]} and $_ != $listref->[$c][0]) {
    29        push @{$listref->[$c]}, $_;
    30      } else {
    31        push @{$listref->[$c]}, ' ';
    32      }
    33
    34      ++$c;
    35    }
    36
    37    close $file or die "Can't close $filename: $!\n";
    38  }
  • Question 1:
  • In line 18, my $header = <$file>; -- I assume that this strips the header of each file? If so, how?
  • Question 2:
  • In line 21, why did you local'ize $_? What benefit does this give me?
  • Question 3:
  • I understand what you are doing with the readfile() to an extent. One thing I have not quite grasped is the whole (line 23)  $something_here->[$something_else_here] statement. What is that doing?? I know that the ||= is creating a default value ( $this = "$that" ||= "this" ) but I do not understand what the '[ ]' is doing afterward. I don't understand the open brackets by themselves. I know that brackets denote an element in an array but this eludes me. What is auto-viv? :)

  • Question 4:
  • unless(@{$listref->[$c]} and $_ != $listref->[$c][0]) {
        push @{$listref->[$c]}, $_;
    } else {
        push @{$listref->[$c]}, ' ';
    }
    I have a few questions here.
    Ok, I get the unless statement's purpose. I don't get how it works. This is mostly because @{$listref->[$c]} totally loses me. OTOH, I have pieced together that the ... $_ != $listref->[$c][0] is probably what is actually checking to make sure we skip the 0 element (the header) in the file (which makes me even more curious about my $header = <$file>;). Now the meat! I see push() and pop() all the time. I know the basics of what they do but I have no practical understanding of their usage. I see that you are push()'ing the value from $_ to whatever @{$listref->[$c]} is :) otherwise you make @{$listref->[$c]} = to nothing (my blank if a value does not exist)?

    So, this script is great! I am still struggling to figure what everything is doing but I am going to figure it out. Now, for the part we all hate...debugging.

    When I used the code it looked like it ran beautifully, however, I started getting errors toward the end of the run and something mysteriously eludes me. Let me give you a snapshot of my output:

    1653 |   | 1653
    1654 |   | 1654
    1655 |   | 1655
    1656 |   | 1656
    Use of uninitialized value in printf at try2.pl line 10.
    1657 | 1657 |
    Use of uninitialized value in printf at try2.pl line 10.
    1658 | 1658 |
    Use of uninitialized value in printf at try2.pl line 10.
    1659 | 1659 |
    
    One thing that seems to be wrong here besides the obvious is that there is nothing being returned in the middle of the list (f2.lst). Therefore, for everybody's coding pleasure I am providing the lists to work with. Aren't I nice?! =P I will try and work with the code you have provided as well to see if I can learn something.

    Again, I appreciate your help!

    ----------
    - Jim

      Well, I'm not sure how you got the next if... line to work. On my machine (Perl 5.6.0), it seems to get executed as:

      next if( $c == 0 and $c++ );

      That is, the next never gets executed because it's basically saying if( $c == 0 and $c != 0 ). Frankly, I expected it to execute the next/if part first and short-cut the ++ so that it went into an infinite loop. Regardless, it seems like a very odd piece of code that is not consistent between systems. *shrug*
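
      To make the parsing concrete, here is a small sketch. The loop data and the variable names are made up for illustration; only the next if line comes from the thread. The second loop shows one possible way to write the skip so that it does fire, which is not the approach used in my code above (that one reads the header before the loop).

```perl
use strict;
use warnings;

my $c = 0;
my @kept;
for my $line ('header', 'a', 'b') {
    # Parsed as: next if ( ($c == 0) and $c++ );
    # On the first pass $c++ returns the old value (0, false),
    # so the whole condition is false and next never fires.
    next if ( $c == 0 ) and $c++;
    push @kept, $line;
}
print "kept: @kept; c = $c\n";

# A possible alternative: compare the old value of the counter directly.
my $d = 0;
my @body;
for my $line ('header', 'a', 'b') {
    next if $d++ == 0;    # true only on the first pass
    push @body, $line;
}
print "body: @body\n";
```

      The first loop keeps the header line and leaves $c at 1; the second one skips it.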

      My point about "$_ |" was that you added the trailing pipe character to the third file's values (ie, the third column) so you probably would've ended up with something more like: "var | var | var |" instead.

      Answer to Question 1: my $header = <$file>; does, in fact, strip off the header line. It does this by, basically, reading in the first line of the file. You obviously are familiar with the diamond operator for reading from a file. It just reads in a single line of the file (when called in scalar context) and returns it. There's special magic when it's used as the condition of a while loop (as in your code, and later in mine) that it automagically stores the returned line in $_. But that's just special. Here, we're just manually storing the returned line into $header.
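
      A minimal sketch of the two ways of reading, using an in-memory file so it runs without f1.lst. The file contents here are invented, and in-memory filehandles assume Perl 5.8 or later.

```perl
use strict;
use warnings;

# In-memory stand-in for one of the .lst files (contents are made up).
my $data = "ID\n1653\n1654\n";
open my $file, '<', \$data or die "Can't open in-memory file: $!\n";

# Scalar context: reads exactly one line, which is the header.
my $header = <$file>;
chomp $header;

# The loop picks up where the previous read stopped.
my @values;
while (my $line = <$file>) {    # explicit variable instead of the implicit $_
    chomp $line;
    push @values, $line;
}
close $file;

print "header=$header values=@values\n";
```

      Because the header has already been consumed, the loop only ever sees the data lines.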

      The purpose of localizing $_ is so that, if the code is called by someone who happens to be using $_ (such as within a foreach or while loop), we don't go and clobber their value in it. You should almost always do this whenever you use global built-ins in a subroutine like this (unless you're using $_ as the default iterator for a for(?:each)? loop, since for loops automatically localize $_ for ya). It's just good practice.
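
      Here is a sketch of what can go wrong without the local. Both subroutine names are hypothetical; they just count lines of an in-memory file (Perl 5.8 or later assumed). The caller iterates with for, so $_ is aliased to each array element, and the unprotected while(<$fh>) overwrites those elements.

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines, but writes to the caller's $_.
sub count_lines_clobbering {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!\n";
    my $n = 0;
    while (<$fh>) { $n++ }    # assigns each line (and finally undef) to $_
    close $fh;
    return $n;
}

# Same thing, but with $_ localized first.
sub count_lines_safe {
    my ($text) = @_;
    local $_;                 # caller's $_ is restored when the sub returns
    open my $fh, '<', \$text or die "Can't open in-memory file: $!\n";
    my $n = 0;
    while (<$fh>) { $n++ }
    close $fh;
    return $n;
}

my @clobbered = ('f1.lst', 'f2.lst');
count_lines_clobbering("a\nb\n") for @clobbered;

my @safe = ('f1.lst', 'f2.lst');
count_lines_safe("a\nb\n") for @safe;

my $left_clobbered = grep { defined } @clobbered;
my $left_safe      = grep { defined } @safe;
print "defined after clobbering sub: $left_clobbered, after safe sub: $left_safe\n";
```

      With the unprotected version the caller's list should end up holding undef in every slot, while the localized version leaves it alone.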

      The $var->[$index] notation means that $var is not actually an array but, rather, a reference to an array (kinda like a pointer in C but safer). You have to use the -> notation to dereference $var so that perl knows you mean to use $var instead of @var. You could also use the curly-brace form: ${$var}[$index] which means the same thing. Luckily, it looks like we won't have to worry about this anymore once Perl 6 comes around. I'm definitely looking forward to that. ;-) Auto-vivification is where Perl tries to DWYM when you use an undefined value as an array (or hash) reference. Basically, it creates the anonymous array for you.
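
      A short sketch of both dereferencing forms and of auto-vivification. The variable names and values are only for illustration.

```perl
use strict;
use warnings;

my @list    = (10, 20, 30);
my $listref = \@list;            # reference to @list

my $arrow = $listref->[1];       # arrow form
my $curly = ${$listref}[1];      # curly-brace form, same element
my $count = @{$listref};         # whole array in scalar context: element count

# Auto-vivification: $rows starts out undefined. Using it as an array
# reference in a modifying context creates the arrays on the fly.
my $rows;
push @{ $rows->[2] }, 'x';       # $rows becomes [ undef, undef, ['x'] ]

print "arrow=$arrow curly=$curly count=$count rows=", scalar @{$rows}, "\n";
```

      Note that this kind of auto-vivification happens when you store into the structure (push, assignment). Merely reading @{...} of an undefined value under use strict is a different matter, which is why my code sets up the empty row explicitly.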

      As for unless(@{$listref->[$c]} and $_ != $listref->[$c][0])... Think of $listref as a two dimensional array, with $c as the row and the second index (0, in this case) as the column. So, the $_ != $listref->[$c][0] basically compares the current value ($_) to the value in the first column of the $c'th row. If there are no columns in the current row (ie, @{$listref->[$c]} == 0, or is false), or if the current value is equal to the value in the first column of that row, the push adds another column to the row containing the current value. Otherwise, the other push adds a blank value to that column.
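
      The same test may be easier to read as a plain if. This is only a sketch: add_column is a hypothetical helper that is not in my code above, and the values are made up. Like the original it compares with a numeric operator, so non-numeric data would need eq/ne instead.

```perl
use strict;
use warnings;

# Hypothetical helper: append one column to row $c of the 2-D list.
sub add_column {
    my ($rows, $c, $value) = @_;
    $rows->[$c] ||= [];          # make sure the row exists
    my $row = $rows->[$c];
    # unless (A and B) is the same as if (not A or not B):
    # the row is still empty, or the value equals the first column.
    if (!@{$row} or $value == $row->[0]) {
        push @{$row}, $value;
    } else {
        push @{$row}, ' ';       # mismatch: store a blank in this column
    }
}

my @rows;
add_column(\@rows, 0, $_) for (5, 1, 5);   # row 0: values from files 1, 2, 3
add_column(\@rows, 1, $_) for (6, 6, 7);   # row 1

my @printed = map { join '|', @{$_} } @rows;
print "$_\n" for @printed;
```

      Row 0 keeps the first and third values and blanks the middle one; row 1 keeps the first two and blanks the third.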

      For more information on array references and multidimensional arrays, check out `perldoc perlreftut` and perlref.

      You're getting the undefined value warnings because the second file has fewer lines in it than the other two files. I didn't put in any checks anywhere to account for files of differing lengths.
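
      One way to quiet the warnings on the printing side is to substitute a blank for any missing column. This is a sketch with made-up rows, not a full fix: since readfile pushes values in the order the files are read, a row that file 2 never reached still has file 3's value sitting in the second column (as in the "1657 | 1657 |" line of your output).

```perl
use strict;
use warnings;

# Made-up rows; the second one is short, as happens when a file runs out.
my @master_list = ( [1656, ' ', 1656], [1657, 1657] );

my @lines;
for my $row (@master_list) {
    # Replace any missing column with a blank before formatting.
    my @cols = map { defined $row->[$_] ? $row->[$_] : ' ' } 0 .. 2;
    push @lines, sprintf "%s | %s | %s", @cols;
}
print "$_\n" for @lines;
```

      Getting the columns to line up as well would mean padding each row to the right width inside readfile, which would need the column number passed in.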

      As for the values not getting printed in the second column, that's because the values in file 2 don't match the values in file 1 or 3. Perhaps there was a misunderstanding of the goal. I understood you to mean that you wanted to compare the files on a line-by-line basis, comparing the lines in the file and printing the ones that were equal. Such that the following files (ignoring header lines):

      File 1  File 2  File 3
      5       1       5
      6       2       6
      7       5       7
      8       6       8

      Would, in fact, produce the following output:

      5 |   | 5
      6 |   | 6
      7 |   | 7
      8 |   | 8

      Perhaps you meant for it to actually compare the files on a value-by-value basis. Such that the same files as above would instead produce this output:

      5 | 5 | 5
      6 | 6 | 6
      7 |   | 7
      8 |   | 8

      If you meant it to be the latter, let me know and I will try to come up with a solution for that. Perhaps I can even make it more understandable. ;-) Also, I'd like to know if the data is guaranteed to be numeric and, if so, what the range is. That would make the solution a bit easier.

      Anyway, I'm glad to help. I'm sorry my code is so confusing. It's definitely not the best I've ever written. :-P At any rate, HTH...

      bbfu
      Seasons don't fear The Reaper.
      Nor do the wind, the sun, and the rain.
      We can be like they are.