sas429s has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to read a file as a multidimensional array. The file looks like this:
maps_c7 190_520 3195927 00 40 0K8632
maps_c7 210_520 3195928 00 41 0K8701
maps_c7 210_620 3195929 00 42 0K8702
maps_c7 230_560 3195930 00 43 0K8635
maps_c7 230_620 3195931 00 44 0K8703
maps_c7 230_660 3195932 00 45 0K8704
maps_c7 250_660 3195933 00 46 0K8638
maps_c7 250_800 3195934 00 47 0K8705
maps_c7 275_800 3195935 00 48 0K8640
maps_c7 300_860 3195936 00 49 0K8706
maps_c7 300_860_ER 3195937 00 50 0K8642
maps_c7 330_860_ER 3195938 00 51 0K8643
maps_c7 350_860_ER 3195939 00 52 0K8707
maps_c7 300_860_RV 3195940 00 53 0K8645
maps_c7 330_860_RV 3195941 00 54 0K8646
maps_c7 350_860_RV 3195942 00 55 0K8647
maps_c7 360_925_RV 3195943 00 56 0K9048
I am trying to read this as a multidimensional array. All the strings in each of the rows and columns are of fixed length. I would like to check that each field has a valid value. What I have come up with so far:
#!/usr/bin/perl
use strict;
use warnings;

my @array;
my $line="";
my @data;
my @list;
my $maps_dir="";
my $dir="";
my $part_no="";
my $chg_lvl="";
my $int_lock="";
my $t_spec="";

open (FILE,"$file");
@data=<FILE>;
while(<FILE>) {
    push @list, [split/s+/];
}
I am unable to read the multidimensional array. Is this the correct way, or am I missing something? I want to read it like this: $list[$x][$y], where $x is the row index and $y the column index. Also, say for instance $list[1][2] has no value: how can I validate it? Just say if($list[$x][$y])? Thanks in advance.

Replies are listed 'Best First'.
Re: Reading a file as an multi dimensional array Please help
by ikegami (Patriarch) on Feb 02, 2008 at 15:15 UTC
    In your code,
        @data=<FILE>;     <-- Puts the whole file in @data
        while(<FILE>)     <-- There's nothing left to read here.
    You simply want
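        #!/usr/bin/perl
        use strict;
        use warnings;

        my $file = shift(@ARGV);   # assumption: file name comes from the command line
        my @list;
        open(my $fh, '<', $file)
            or die("Unable to open file \"$file\": $!\n");
        while (<$fh>) {
            push @list, [split/\s+/];   # \s matches whitespace; a bare s matches the letter "s"
        }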
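
    A minimal sketch of why the original loop never ran, using an in-memory filehandle and two sample rows so it runs without a real file. The helper names slurp_then_loop and read_rows are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two sample rows, read through an in-memory filehandle (illustrative only).
my $sample = "maps_c7 190_520 3195927 00 40 0K8632\n"
           . "maps_c7 210_520 3195928 00 41 0K8701\n";

# Returns (lines read by the slurp, lines seen by a while loop run afterwards).
sub slurp_then_loop {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die("Unable to open in-memory file: $!\n");
    my @data = <$fh>;                   # list context: reads every line up to EOF
    my $left = 0;
    while (my $line = <$fh>) {          # handle is already at EOF, body never runs
        $left++;
    }
    close($fh);
    return (scalar(@data), $left);
}

# Reads the rows once, splitting each line on whitespace.
sub read_rows {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die("Unable to open in-memory file: $!\n");
    my @rows;
    while (my $line = <$fh>) {
        chomp $line;
        push @rows, [split /\s+/, $line];
    }
    close($fh);
    return @rows;
}

my ($slurped, $left) = slurp_then_loop($sample);
print "slurp read $slurped lines, the loop afterwards saw $left\n";

my @list = read_rows($sample);
print "row 1, column 2: $list[1][2]\n";   # indices start at 0
```

    The point of the sketch is that the file should be consumed exactly once: either slurp it and loop over the array, or loop over the handle, but not both.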
Re: Reading a file as an multi dimensional array Please help
by stiller (Friar) on Feb 02, 2008 at 15:24 UTC
    You might find the articles on Data Structures in Tutorials interesting.

        use strict;
        use warnings;
        use Data::Dumper; # very handy to understand your data!

        my $datafilename = 'map.txt';
        my @list;
        open my $DATAFILE, '<', "$datafilename" or die "$datafilename: $!";
        while(<$DATAFILE>){
            chomp;
            push @list, [split /\s+/];
        }
        # uncomment next line to see data structure
        #print Dumper( @list ), "\n\n";
        print $list[1][5], "\n"; # prints sixth field on second row
    You can test if an element has valid data like this:

        if ($list[$x][$y] =~ /^\d+$/) # consists only of digits

    But for these data, you will have a hard time identifying missing data, because you split on spaces, unless missing data is encoded in some special way, like for example: 'n/a' .

    If some records miss data (has spaces instead of numbers) in say the third column, your split will give you one element less for the line, rather than an array where the third element is undef, which is what you want.

    If you can get at these data in another format, like tab-separated or something, that would help.
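
    A small sketch of the problem described above, assuming a made-up record whose third column holds only spaces. The helper names fields_ws and fields_tab are hypothetical.

```perl
use strict;
use warnings;

# A complete record and a hypothetical one whose third column is blank.
my $complete = 'maps_c7 190_520 3195927 00 40 0K8632';
my $blank3   = 'maps_c7 190_520         00 40 0K8632';

# Split on runs of whitespace: a blank column simply disappears.
sub fields_ws  { my ($line) = @_; return split /\s+/, $line; }

# Split on single tabs; the limit -1 keeps trailing empty fields too.
sub fields_tab { my ($line) = @_; return split /\t/, $line, -1; }

my @full   = fields_ws($complete);
my @short  = fields_ws($blank3);
print scalar(@full), " fields, then ", scalar(@short), " fields\n";
print "third element is now '$short[2]'\n";   # later columns have shifted left

my @tabbed = fields_tab("maps_c7\t190_520\t\t00\t40\t0K8632");
print scalar(@tabbed), " fields, third is ",
      (length($tabbed[2]) ? 'set' : 'empty'), "\n";
```

    With a single-character delimiter such as a tab, the empty column keeps its position, so it can be tested for directly.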

      Sorry, I got it to work. I was using the file handle wrongly. Thank you.
Re: Reading a file as an multi dimensional array Please help
by olus (Curate) on Feb 02, 2008 at 15:21 UTC

    You got it almost right. To match a whitespace character you must use \s, so your split should look like

    push @list, [split/\s+/];
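
    To see the difference, here is a short sketch that runs both patterns over the first row of the sample file. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $line = 'maps_c7 190_520 3195927 00 40 0K8632';

my @on_letter_s   = split /s+/,  $line;   # splits on the letter "s"
my @on_whitespace = split /\s+/, $line;   # splits on runs of whitespace

print scalar(@on_letter_s),   " pieces: ", join('|', @on_letter_s),   "\n";
print scalar(@on_whitespace), " pieces: ", join('|', @on_whitespace), "\n";
```

    The first split breaks the line only at the "s" in "maps_c7", which is why the rows never came out as six separate fields.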
Re: Reading a file as an multi dimensional array Please help
by wfsp (Abbot) on Feb 03, 2008 at 07:38 UTC
    All the strings in each of the rows and columns are of fixed length.
    In that case it may be worth considering unpack. This way if you have a blank field represented by a space it will still work.

    I've assumed each field is separated by a space (the x). You may have to adjust the field widths to taste.

        #!/bin/perl5
        use strict;
        use warnings;

        # fields are space-padded to the widths given in the unpack template
        my $rec = q{maps_c7    360_925_RV     3195943 00    56     0K9048};
        my @flds = unpack qq{A10xA14xA7xA5xA6xA6}, $rec;
        print qq{*$_*\n} for @flds;
    output:
        *maps_c7*
        *360_925_RV*
        *3195943*
        *00*
        *56*
        *0K9048*
    See also perlpacktut:
    two of the most misunderstood and most often overlooked functions that Perl provides
    :-)
      Hi, how can I get the index of the row and column where the data is missing or corrupt in the table? Here is what I was trying:
          #!/usr/bin/perl
          #use strict;
          #use warnings;
          use Data::Dumper;

          my @array;
          my $line="";
          my @data;
          my @list;
          my $maps_dir="";
          my $dir="";
          my $part_no="";
          my $chg_lvl="";
          my $int_lock="";
          my $t_spec="";
          my $file="";
          my $FILE;
          my $i=0;
          my $num_of_columns;
          my $x;
          my $y;
          my $temp="No";
          my $temp1;
          my $temp2;

          if ($#ARGV == -1) {
              print("Enter a part number file name:\n");
              chomp($file=<>);
          }
          else {
              $file=$ARGV[0];
          }
          if(!-e $file) {
              print "File:$file does not exist";
          }
          open ($FILE,"<$file");
          while(<$FILE>) {
              $i++;
              chomp;
              @data=split/\s+/;
              push @list, [split/\s+/];
              $num_of_columns=@data;
          }
          #print Dumper( @list ), "\n\n";
          #print "Number of rows is $i\n";
          #print "Number of columns is $num_of_columns\n";
          for($x=0;$x<=$i;$x++) {
              for($y=0;$y<=$num_of_columns;$y++) {
                  if(!$list[$x][$y]) {
                      $temp1=$x;
                      $temp2=$y;
                  }
              }
          }
          print $temp1;   # <----- Here it prints the last row instead of the index where there is no data
          #print $list[1][5], "\n"; # print
          close($FILE);

        I think you might have misunderstood this paragraph from stiller's reply.

        If some records miss data (...) in say the third column, your split will give you one element less for the line, rather than an array where the third element is undef, which is what you want.

        Allow me to rephrase that:

        If some records miss data (...) in say the third column, your split will give you one element less for the line, rather than what you want, which is an array where the third element is undef.

        In other words, splitting on whitespace will cause problems if any column lacks data. That's why stiller mentions trying to get your data in another format, and wfsp suggests unpack.

        You can tell which rows have one or more missing columns, because split will then return fewer items.

        The

            $num_of_columns=@data;

        works, because assigning an array to a scalar yields its element count, but the intent is clearer as

            $num_of_columns= scalar( @data );

        Then, with $wanted_num_of_columns set to the number of fields you expect (six for this file), you can do

            warn "line $i has only $num_of_columns columns\n" if $num_of_columns < $wanted_num_of_columns;
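
        A minimal sketch of that check, assuming six fields per complete record and a made-up second row with a blank field. The helper name short_lines is hypothetical.

```perl
use strict;
use warnings;

my $wanted_num_of_columns = 6;   # assumption: every complete record has six fields

# Returns the 1-based numbers of the lines that split into too few fields.
sub short_lines {
    my ($wanted, @lines) = @_;
    my @short;
    for my $i (0 .. $#lines) {
        my @data = split /\s+/, $lines[$i];
        my $num_of_columns = scalar(@data);
        push @short, $i + 1 if $num_of_columns < $wanted;
    }
    return @short;
}

my @lines = (
    'maps_c7 190_520 3195927 00 40 0K8632',
    'maps_c7         3195928 00 41 0K8701',   # hypothetical row, second field blank
);
warn "line $_ has too few columns\n"
    for short_lines($wanted_num_of_columns, @lines);
```

        This reports which rows are short, but it still cannot say which column went missing.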

        But, there is no simple way to find out which column is missing, so you really should try the code wfsp gave you, unpack is your friend here :o)

        Try his code, and notice that you can do with a lot fewer variables than you use. Put the unpack into the loop where you read the file, and then you can test which, if any, of the columns is empty. With the A format, unpack strips trailing spaces, so an empty column comes back as an empty string and will match /^\s*$/

        ...the index of the row and column where the data is missing...
            #!/bin/perl5
            use strict;
            use warnings;
            use Data::Dumper;
            $Data::Dumper::Indent = 1;
            $Data::Dumper::Sortkeys = 1;

            # two passes
            # first, load the db
            my @db;
            while (my $rec = <DATA>){
              my @flds = unpack qq{A10xA14xA7xA5xA6xA6}, $rec;
              push @db, \@flds;
            }

            # second, iterate over it
            for my $i (0..$#db){ # each row
              my @row = @{$db[$i]};
              for my $j (0..$#row){ # each field
                if (not length $row[$j]){ # empty string, so a literal "0" is not flagged
                  printf qq{row: %s col: %s is missing\n}, $i+1, $j+1;
                }
              }
            }

            # a field has been removed from each of the last two records
            # (fields are space-padded to the template widths; data lines must start in column 1)
            __DATA__
            maps_c7    190_520        3195927 00    40     0K8632
            maps_c7    210_520        3195928 00    41     0K8701
            maps_c7    210_620        3195929 00    42     0K8702
            maps_c7    230_560        3195930 00    43     0K8635
            maps_c7    230_620        3195931 00    44     0K8703
            maps_c7    230_660        3195932 00    45     0K8704
            maps_c7    250_660        3195933 00    46     0K8638
            maps_c7    250_800        3195934 00    47     0K8705
            maps_c7    275_800        3195935 00    48     0K8640
            maps_c7    300_860        3195936 00    49     0K8706
            maps_c7    300_860_ER     3195937 00    50     0K8642
            maps_c7    330_860_ER     3195938 00    51     0K8643
            maps_c7    350_860_ER     3195939 00    52     0K8707
            maps_c7    300_860_RV     3195940 00    53     0K8645
            maps_c7    330_860_RV     3195941 00    54     0K8646
            maps_c7                   3195942 00    55     0K8647
            maps_c7    360_925_RV             00    56     0K9048
        output:
            row: 16 col: 2 is missing
            row: 17 col: 3 is missing
        ...or corrupt...
        For this you will need a set of regexes to define what the fields should contain/look like. You would then validate each field against each regex.
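
        One way this could look, as a rough sketch: the patterns below are guesses based only on the sample rows in this thread, and the helper name bad_columns is made up. The real rules for each field may well differ.

```perl
use strict;
use warnings;

# Hypothetical patterns, one per column, guessed from the sample rows.
my @rules = (
    qr/^maps_c\d+$/,                    # col 1, e.g. maps_c7
    qr/^\d{3}_\d{3}(?:_[A-Z]{2})?$/,    # col 2, e.g. 190_520 or 300_860_ER
    qr/^\d{7}$/,                        # col 3, e.g. 3195927
    qr/^\d{2}$/,                        # col 4, e.g. 00
    qr/^\d{2}$/,                        # col 5, e.g. 40
    qr/^\d[A-Z]\d{4}$/,                 # col 6, e.g. 0K8632
);

# Returns the 1-based column numbers whose value fails its pattern.
# A missing (undefined) field is treated as an empty string.
sub bad_columns {
    my (@flds) = @_;
    return grep {
        my $value = $flds[$_ - 1];
        (defined $value ? $value : '') !~ $rules[$_ - 1];
    } 1 .. scalar(@rules);
}

my @ok  = bad_columns(qw(maps_c7 300_860_ER 3195937 00 50 0K8642));
my @bad = bad_columns('maps_c7', '300_860_ER', '31959X7', '00', '', '0K8642');

print "clean row: ", scalar(@ok), " bad columns\n";
print "damaged row: bad columns @bad\n";
```

        Because an empty field also fails its pattern, the same check covers both the missing and the corrupt case once the fields come from unpack.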

        hth