madd has asked for the wisdom of the Perl Monks concerning the following question:

I have data in a free-format list. Each block's title begins at the first character of a line and is one or two characters long. It is followed by a series of numbers (one per line, beginning in the second column) that I need to read into an array before the next block begins. It isn't strictly delimited. The following is example input:
    C
     -2.3242E-003
     1.32423
     0.34243E+002
     ..etc...
     -3.23134
    H
     more numbers
I need to get it as an array of arrays, i.e.
    C  -2.3242E-003  1.32423  0.34243E+002  ...  -3.23134
    H  numbers
I don't necessarily know the length of each array, and they aren't necessarily the same length. I've been trying to test on the first character of the line, but it seems to produce garbage. Any help appreciated.
    my $i=0;
    while <$input>{
        if (/^[A-Z]){
            $AA[$i][0] = $_;
            my $j=1;
        }
        else{
            $AA[$i][$j] = $_;
            $j++;
        }
        $i++;
    }

Replies are listed 'Best First'.
Re: Read fortran list output
by moritz (Cardinal) on Jul 30, 2008 at 10:15 UTC
    Try this:
        use strict;
        use warnings;
        use Data::Dumper;

        my @matrix;
        while (<DATA>){
            chomp;
            if (m/^[A-Z]/){
                push @matrix, [$_];
            } else {
                push @{$matrix[-1]}, $_;
            }
        }
        print Dumper \@matrix;

        __DATA__
        C
         -2.3242E-003
         1.32423
         0.34243E+002
         ..etc...
         -3.23134
        H
         more numbers

    For each line that begins with an upper case letter it pushes a new array reference onto @matrix, and for all other lines it simply pushes the data onto the last array in @matrix.
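
The same idea can be wrapped in a subroutine that reads from any filehandle. This is only a sketch: the name read_blocks and the in-memory filehandle are illustrative assumptions, not part of the code above. It also skips any data line that appears before the first title, because $matrix[-1] does not exist at that point.

```perl
use strict;
use warnings;

# Hypothetical helper: read title/number blocks from a filehandle
# and return a reference to an array of arrays.
sub read_blocks {
    my ($fh) = @_;
    my @matrix;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;          # ignore blank lines
        if ($line =~ /^[A-Z]/) {
            push @matrix, [$line];         # a title starts a new row
        }
        elsif (@matrix) {
            $line =~ s/^\s+//;             # drop the leading whitespace column
            push @{ $matrix[-1] }, $line;  # append to the current row
        }
        # data lines before the first title are skipped
    }
    return \@matrix;
}

# In-memory filehandle standing in for the real input file.
my $text = "C\n -2.3242E-003\n 1.32423\nH\n 0.5\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $blocks = read_blocks($fh);
close $fh;

# Element 0 of each row is the title, so the last index is the value count.
printf "%s has %d value(s)\n", $_->[0], $#{$_} for @$blocks;
```

Returning a reference keeps the caller's code the same whether the rows are two values long or two thousand.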

    If you have to deal more with Fortran output, take a look at Fortran::Format.

      I'd use Scalar::Util::looks_like_number in addition:
          #!/usr/bin/perl
          use warnings;
          use strict;
          use Data::Dumper;
          use Scalar::Util qw(looks_like_number);

          my $data = [];
          while (<DATA>) {
              chomp;
              if (/^(\w{1,2})$/) {
                  push(@$data, [$1]);
              }
              elsif (looks_like_number($_)) {
                  push(@{$$data[-1]}, $_);
              }
          }
          print Dumper($data);

          __DATA__
          C
           -2.3242E-003
           1.32423
           0.34243E+002
           -3.23134
          MD
           0.34243E+002
           -3.23134
        A nice addition, but you have to take care, because not everything that is a number to Fortran is also a number to Perl. For example, Fortran has an output format that uses a D instead of an E for the exponent.

        Which is why I lazily both recommended Fortran::Format and left the decision on what a number is as an exercise to the reader, who hopefully knows more details about the data format than I do.
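
One way to cope with the D exponent mentioned above is to rewrite it as an E before testing the value. The following is a sketch only, assuming that a D (or d) exponent marker is the sole Fortran-specific form in the data; the helper name fortran_to_number is hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: convert a Fortran-style number such as
# "0.34243D+002" into a Perl number, or return undef if it is not one.
sub fortran_to_number {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    # Turn a D/d exponent marker into E. It must follow a digit or a dot
    # and be followed by an optional sign and a digit.
    $text =~ s/(?<=[0-9.])[dD](?=[-+]?[0-9])/E/;
    return looks_like_number($text) ? $text + 0 : undef;
}

for my $raw ('0.34243D+002', '-2.3242E-003', ' 1.32423', 'MD') {
    my $n = fortran_to_number($raw);
    printf "%-16s => %s\n", "'$raw'", defined $n ? $n : 'not a number';
}
```

The lookbehind keeps a title such as MD from being mistaken for an exponent marker, so titles still fail the numeric test.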

Re: Read fortran list output
by apl (Monsignor) on Jul 30, 2008 at 11:26 UTC
    A good practice is to have
    use strict; use warnings;
    at the top of a Perl script. Then you would've seen that you have problems in your use of $j. (It appears to be defined solely in the scope of your initial if, and not defined in the else.)

    Other Monks have addressed a better way to solve your problem, but in the case of this specific program, you'd want to say my $j; before the while, and remove the my from the if.

    Your code would then look like

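        my $i = -1;    # becomes 0 at the first title line
        my $j;
        while (<$input>) {
            if (/^[A-Z]/) {
                $i++;      # a title line starts a new row
                $j = 0;
            }
            else {
                $j++;
            }
            $AA[$i][$j] = $_;
        }
    or
        my $i = -1;
        my $j;
        while (<$input>) {
            $j = /^[A-Z]/ ? 0 : $j + 1;
            $i++ if $j == 0;    # a title line starts a new row
            $AA[$i][$j] = $_;
        }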
Re: Read fortran list output
by swampyankee (Parson) on Jul 30, 2008 at 11:15 UTC

    I'd suggest reading up on perlref, as Perl doesn't have multi-dimensional arrays: it has arrays of references to arrays. Could you show us the code you're using to output your results? I suspect that you may be chasing a problem in printing your AoA.

    I do find Data::Dumper to be very useful for debugging things like multi-dimensional arrays (Arrays of Arrays, or AoA), hashes of arrays, etc.
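
To make the "arrays of references to arrays" point concrete, here is a small sketch showing how the rows of an AoA are dereferenced and printed. The values and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

# Each element of @AA is a reference to an anonymous array (one row).
my @AA = (
    [ 'C', '-2.3242E-003', '1.32423' ],
    [ 'H', '0.5' ],
);

print "First title: $AA[0][0]\n";    # shorthand for $AA[0]->[0]
print "Rows: ", scalar(@AA), "\n";

for my $row (@AA) {
    my ($title, @values) = @$row;    # dereference the row
    print "$title: ", scalar(@values), " value(s): @values\n";
}

# Printing a row directly shows the reference, not its contents,
# which is one common source of "garbage" output.
print "Row as a scalar: $AA[0]\n";   # something like ARRAY(0x...)

# Data::Dumper shows the full nested structure.
$Data::Dumper::Indent = 1;
print Dumper(\@AA);
```

Rows of different lengths need no special handling, since each row is its own array.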

