I am using PDL in some modules, and I want to import numerical data from a text file using the rcols function found in PDL::IO::Misc. The rcols function will import the text file into piddles that correspond to the columns of the text file.
I stumbled across an issue when I was trying to import data from a tab-delimited text file. Some positions in my input data file will contain blank entries. It seems that handling of blank entries is inconsistent. If a blank entry is in the last column of the file then the $PDL::undefval is used in the piddle. If a blank appears elsewhere, then it appears that a value of "0" is used in the piddle.
Here is an example.

The data.txt file is tab-delimited and has no blank entries.

    1	6	11
    2	7	12
    3	8	13
    4	9	14
    5	10	15

The data_missing.txt file is tab-delimited but contains some blank entries.

    1	6	11
    2	7
    3	8	13
    4		14
    5	10	15

I use the following script to test the contents of the pdls created by rcols:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use PDL;
    use PDL::IO::Misc;

    my $file_name = shift;
    die 'No file given.' unless defined($file_name);
    open(my $fh, '<', $file_name) or die "Can not open file: $!";

    my @pdls = rcols $fh, { COLSEP => "\t" };
    foreach (@pdls) {
        print "$_\n";
    }
    exit;

The output for <data.txt> is:

    [1 2 3 4 5]
    [6 7 8 9 10]
    [11 12 13 14 15]

The output for <data_missing.txt> is:

    [1 2 3 4 5]
    [6 7 8 0 10]
    [11 0 13 14 15]

So far, so good. However, if I change the value for $PDL::undefval, I get a strange result. First, the default value of $PDL::undefval is zero.

    perl -MPDL -E 'say $PDL::undefval'
    0

Here is the code with $PDL::undefval set to -999.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use PDL;
    use PDL::IO::Misc;

    my $file_name = shift;
    die 'No file given.' unless defined($file_name);
    open(my $fh, '<', $file_name) or die "Can not open file: $!";

    local $PDL::undefval = -999;
    my @pdls = rcols $fh, { COLSEP => "\t" };
    foreach (@pdls) {
        print "$_\n";
    }
    exit;

The output for <data.txt> is:

    [1 2 3 4 5]
    [6 7 8 9 10]
    [11 12 13 14 15]

The output for <data_missing.txt> is:

    [1 2 3 4 5]
    [6 7 8 0 10]
    [11 -999 13 14 15]

The value of $PDL::undefval is used in one case (where the 12 was deleted at the end of a row in the input file), but a zero is used in the other (where the 9 was deleted in the middle of a row in the input file).
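One possible explanation, which I have not confirmed against the PDL::IO::Misc source: if rcols breaks each line apart with something like Perl's built-in split on COLSEP, then a blank at the end of a row and a blank in the middle of a row would look different. Plain split drops trailing empty fields (so the last column would be missing, i.e. undef), while a middle blank comes back as an empty string, which is 0 in numeric context. The sketch below only demonstrates that core Perl behavior. The fill_blank_fields helper is a hypothetical name of my own, not part of PDL, showing how the file could be pre-processed so that every blank gets the same fill value before rcols sees it.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Core Perl behavior: split drops trailing empty fields by default.
my @trailing = split /\t/, "2\t7\t";    # 2 fields; the trailing blank is gone
my @middle   = split /\t/, "4\t\t14";   # 3 fields; the middle one is ''

# Hypothetical helper (not part of PDL): replace every blank field in a
# tab-delimited line with $fill, padding short rows out to $ncols columns.
sub fill_blank_fields {
    my ($line, $fill, $ncols) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;      # limit of -1 keeps trailing empties
    push @fields, '' while @fields < $ncols; # row had no trailing tab at all
    return join "\t", map { length($_) ? $_ : $fill } @fields;
}

print scalar(@trailing), " fields\n";   # 2 fields
print scalar(@middle),   " fields\n";   # 3 fields
print fill_blank_fields("2\t7",    -999, 3), "\n";   # 2, 7, -999
print fill_blank_fields("4\t\t14", -999, 3), "\n";   # 4, -999, 14
```

If this guess is right, writing the filled lines to a temporary file and passing that to rcols would give -999 in both positions, regardless of how rcols treats blanks internally.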
This looks like a bug to me. Does anyone else have experience using this feature of PDL?