kgherman has asked for the wisdom of the Perl Monks concerning the following question:

I have a program in Perl that reads one line at a time from an input data file and computes certain statistics. Then, it writes the result in a separate output file and reads a new line from the input data file, and so on... Every now and then, while the program reads through my input data file, I get a warning about an "uninitialized value" and I would like to know which line in the input data file generates this warning. Is there any way I can tell Perl to print (to screen or file) the data point that is generating the error? My actual code is pretty long so I'm posting just a portion of it to give an idea (together with the error message that I get when running the program.)
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use Data::Dumper;

open (F, "input.csv");
my @lines=<F>;
close(F);

open (F1,'>>output.csv');

# define my variables

for (my $i=1; $i<@lines; $i++){
    # read in one line of data
    # do stuff
    # write to file
}
close(F1);

The type of error I get when running the program is: "Use of uninitialized value $Top5 in division (/) at builder.pl line 403 (#1)"

Replies are listed 'Best First'.
Re: Find data point generating Error in Perl code
by Anonymous Monk on Mar 17, 2015 at 00:44 UTC

    Could you show some example code? Normally, Perl tracks the last file and line read, and will usually add that information to the warning messages:

    use warnings;
    while(<DATA>) { warn if /X/ }
    __DATA__
    foo
    bar
    quzX

    This should generate a message like "Warning: something's wrong at - line 2, <DATA> line 3.". It might be interesting to find out why that's not happening in your case.

    But there are ways that you can do this yourself, for example this node shows how to use a __WARN__ %SIG handler to get custom information into warning messages.

      I'm not sure I understand how to implement your suggestion: what is the DATA that you use in your while loop?

        Quoting perldata:

        The two control characters ^D and ^Z, and the tokens __END__ and __DATA__ may be used to indicate the logical end of the script before the actual end of file. Any following text is ignored.

        Text after __DATA__ may be read via the filehandle PACKNAME::DATA, where PACKNAME is the package that was current when the __DATA__ token was encountered. The filehandle is left open pointing to the line after __DATA__. The program should close DATA when it is done reading from it. (Leaving it open leaks filehandles if the module is reloaded for any reason, so it's a safer practice to close it.) For compatibility with older scripts written before __DATA__ was introduced, __END__ behaves like __DATA__ in the top level script (but not in files loaded with require or do) and leaves the remaining contents of the file accessible via main::DATA.

        See SelfLoader for more description of __DATA__, and an example of its use. Note that you cannot read from the DATA filehandle in a BEGIN block: the BEGIN block is executed as soon as it is seen (during compilation), at which point the corresponding __DATA__ (or __END__) token has not yet been seen.

        Executive summary: it's a special filehandle that allows you to embed data in a Perl script itself, after a __DATA__ token indicating the end of the script proper.

        In the above example, reading from DATA will successively read the lines "foo\n", "bar\n" and "quzX\n", exactly as if you'd opened a file containing them and read from the filehandle you'd got.

        AppleFritter already gave you the docs, just to add to that: DATA doesn't directly relate to your problem, it was just a quick way to show an example of what the warning message might look like. The data can very well reside in a file and the messages would be almost the same.

      I have added a schematic of what my code does; the actual code is pretty long so I'm not sure how to make it available for you.
Re: Find data point generating Error in Perl code ( fatal warnings)
by Anonymous Monk on Mar 17, 2015 at 00:23 UTC
    Sure
    $ perl
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Data::Dump qw/ dd /;
    open my $fake, '<', \"row\nrow\n\n\n";
    while( <$fake> ){
        my( $word ) = /(\w+)/;
        use warnings qw/ FATAL all /;
        eval { print "$word\n"; 1 } or dd({ OOPS => $@, line => $_ });
    }
    __END__
    row
    row
    {
      line => "\n",
      OOPS => "Use of uninitialized value \$word in concatenation (.) or string at - line 9, <\$fake> line 3.\n",
    }
    {
      line => "\n",
      OOPS => "Use of uninitialized value \$word in concatenation (.) or string at - line 9, <\$fake> line 4.\n",
    }

      As a possibly interesting side note, just today a patch was submitted to P5P making FATAL warnings officially discouraged, due to multiple issues with them.

        That is stupid

      I'm trying to implement what you posted but I'm pretty lost: what is "$fake"? what is this code supposed to do? Thanks for all your help!

        $fake is a filehandle. There is no physical file behind it; it's an in-memory file, so I called it $fake.

        What this code does is demonstrate how to trap warnings by making them fatal, so that you can print out the current line of the file.

        It does this by promoting warn to die, which is then caught with eval.

        It's an answer to your question.

Re: Find data point generating Error in Perl code
by Anonymous Monk on Mar 17, 2015 at 19:25 UTC

    Thanks for the update, now it's clear why Perl doesn't automagically know the input file information: You're reading the entire file into @lines, and by the time you write to output.csv, input.csv is already closed.
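
The difference can be seen in a small sketch. It uses an in-memory filehandle and made-up data ("foo", "bar", "quz") purely for illustration, and it collects the warning messages in a hypothetical @messages array instead of printing them straight away. The first warning is raised while the handle is still open, the second after slurping and closing it:

```perl
use strict;
use warnings;

# Made-up input, read through an in-memory filehandle for illustration.
my $text = "foo\nbar\nquz\n";
my @messages;

# Collect warning messages so the two cases can be compared afterwards.
local $SIG{__WARN__} = sub { push @messages, $_[0] };

# Case 1: warn while the input handle is still open.
open my $fh, '<', \$text or die $!;
while ( my $line = <$fh> ) {
    warn "found a z" if $line =~ /z/;
}
close $fh;

# Case 2: slurp everything, close the handle, then warn.
open $fh, '<', \$text or die $!;
my @lines = <$fh>;
close $fh;
for my $line (@lines) {
    warn "found a z" if $line =~ /z/;
}

print for @messages;
```

The first message should end with the input handle and line number, while the second should only name the position in the script, because closing the handle resets its line counter.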

    The quickest fix would be to implement the %SIG __WARN__ handler as described here. Here's a quick example (my input.txt contains just four lines, "foo", "bar", "quz" and "baz"):

    open my $infh, '<', 'input.txt' or die $!;
    my @lines = <$infh>;
    close $infh;
    # open output file here
    my $i;
    local $SIG{__WARN__} = sub {
        my $msg = shift;
        $msg =~ s/\.?\s*$//;
        warn "$msg (input line $i)\n";
    };
    for($i=0;$i<@lines;$i++) {
        # do calculations and write to output file here
        # example warning:
        warn "found a z" if $lines[$i]=~/z/;
    }
    __END__
    # Script Output:
    found a z at - line 14 (input line 2)
    found a z at - line 14 (input line 3)

    Note that the "line numbers" reported here are 0-based, as it's actually an index into the array of lines. But the advantage is that here, the message is fully customizable, so you're free to print $i+1 or whatever else you like.

    However, unless you've oversimplified your example, your code could be made more efficient if it were to keep the input file open while writing the output file. Also, as long as the input file stays open, Perl will normally add the input file line numbers to warning messages by itself. So for example:

    open my $infh, '<', 'input.txt' or die $!;
    # open output file here
    while( my $line = <$infh> ) {
        # do calculations and write to output file here
        # example warning:
        warn "found a z" if $line=~/z/;
    }
    close $infh;
    __END__
    # Script Output:
    found a z at - line 6, <$infh> line 3.
    found a z at - line 6, <$infh> line 4.

    By the way, your code would be better if you used the three-argument open and error handling (or die), as in the above examples. Also, you really should look into Text::CSV for better handling of CSV files!
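
As a rough sketch of what that could look like with Text::CSV: the column layout (name, total, top5), the sample data, and the @problems array are all invented here for illustration, and the in-memory filehandle stands in for the real input.csv. Instead of waiting for the warning, the loop checks the divisor first and reports which record is at fault:

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical data: a header row plus three records. The middle record
# has an empty "top5" field, the kind of gap that causes trouble once
# the field is used in a division.
my $data = "name,total,top5\nalpha,10,2\nbeta,7,\ngamma,9,3\n";

my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

# Three-argument open with error handling; \$data is an in-memory file.
open my $in, '<', \$data or die "Cannot open input: $!";

my $header = $csv->getline($in);   # read and discard the header row
my $record = 1;                    # our own record counter (header is 1)
my @problems;                      # record numbers that were skipped

while ( my $row = $csv->getline($in) ) {
    $record++;
    my ( $name, $total, $top5 ) = @$row;

    # Check the divisor before using it.
    if ( !defined $top5 or $top5 eq '' or $top5 == 0 ) {
        push @problems, $record;
        warn "Record $record ($name): missing or zero top5, skipped\n";
        next;
    }
    printf "%s,%.2f\n", $name, $total / $top5;
}
close $in;
```

A separate counter is used rather than $. because a CSV record with quoted embedded newlines can span several physical lines.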

      Thank you, thank you, thank you! Your post is a goldmine of valuable information.

      I have followed your suggestion and now I keep the input file handle open while writing the output. As you had guessed, this added the input file's line number to the warning message! Thank you!

      I will definitely look into the Text::CSV module since I only work with CSV files. Also, I will implement the three-argument open and error handling notation.

      I have one last question about the last piece of code in your answer: does your notation mean that you read one line at a time from the CSV file, or do you, as in mine, load the entire file into memory and then work on it?

        A while loop over a filehandle like while(my $line = <$fh>) { ... } reads the file line-by-line* (one line is read on each iteration of the loop), as opposed to foreach my $line (<$fh>) { ... } or my @lines = <$fh>;, both of which read the entire file first (obviously not particularly friendly on memory for large inputs).

        For more information, see I/O Operators in perlop and readline.

        * Perl's definition of what a "line" is may be changed via the input record separator $/.
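
To illustrate the footnote, here is a small sketch with made-up data read from an in-memory filehandle. With the default $/ each read returns one newline-terminated line; setting $/ to the empty string switches to "paragraph mode", where each read returns a block of text up to the next run of blank lines:

```perl
use strict;
use warnings;

# Made-up input: two blocks of two lines, separated by a blank line.
my $text = "a\nb\n\nc\nd\n";

# Default: $/ is "\n", so each read returns a single line.
open my $fh, '<', \$text or die $!;
my $line_count = 0;
while ( my $line = <$fh> ) {
    $line_count++;               # one record per iteration
}
close $fh;

# Paragraph mode: $/ = "" makes blank lines the record separator.
my @paragraphs;
{
    local $/ = "";               # restored automatically at end of block
    open my $fh2, '<', \$text or die $!;
    @paragraphs = <$fh2>;        # list context: reads all records at once
    close $fh2;
}

print "lines: $line_count\n";
print "paragraphs: ", scalar(@paragraphs), "\n";
```

Using local keeps the change to $/ confined to the block, so the rest of the program still reads line by line.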

        ... since I only work with CSV files.

        Text::CSV!

        I will definitely look into the Text::CSV module...

        Yes! ;-)