Find data point generating Error in Perl code

kgherman has asked for the wisdom of the Perl Monks concerning the following question:

Re: Find data point generating Error in Perl code by Anonymous Monk on Mar 17, 2015 at 00:44 UTC
Could you show some example code? Normally, Perl tracks the last file and line read, and will usually add that information to the warning messages:

    use warnings;
    while(<DATA>) { warn if /X/ }
    __DATA__
    foo
    bar
    quzX

Should generate a message like "`Warning: something's wrong at - line 2, <DATA> line 3.`". Might be interesting to try and find out why that's not happening in your case. But there are ways that you can do this yourself, for example this node shows how to use a `__WARN__` %SIG handler to get custom information into warning messages.
Re^2: Find data point generating Error in Perl code by kgherman (Novice) on Mar 17, 2015 at 18:20 UTC
I'm not sure I understand how to implement your suggestion: what is the DATA that you use in your while loop?
Re^3: Find data point generating Error in Perl code by AppleFritter (Vicar) on Mar 17, 2015 at 19:07 UTC
Quoting perldata:

The two control characters `^D` and `^Z`, and the tokens `__END__` and `__DATA__` may be used to indicate the logical end of the script before the actual end of file. Any following text is ignored.

Text after `__DATA__` may be read via the filehandle `PACKNAME::DATA`, where `PACKNAME` is the package that was current when the `__DATA__` token was encountered. The filehandle is left open pointing to the line after `__DATA__`. The program should close `DATA` when it is done reading from it. (Leaving it open leaks filehandles if the module is reloaded for any reason, so it's a safer practice to close it.) For compatibility with older scripts written before `__DATA__` was introduced, `__END__` behaves like `__DATA__` in the top level script (but not in files loaded with require or do) and leaves the remaining contents of the file accessible via `main::DATA`.

See SelfLoader for more description of `__DATA__`, and an example of its use. Note that you cannot read from the `DATA` filehandle in a BEGIN block: the BEGIN block is executed as soon as it is seen (during compilation), at which point the corresponding `__DATA__` (or `__END__`) token has not yet been seen.

Executive summary: it's a special filehandle that allows you to embed data in a Perl script itself, after a `__DATA__` token indicating the end of the script proper. In the above example, reading from `DATA` will successively read the lines "`foo\n`", "`bar\n`" and "`quzX\n`", exactly as if you'd `open`ed a file containing them and read from the filehandle you'd got.
Re^3: Find data point generating Error in Perl code by Anonymous Monk on Mar 17, 2015 at 19:09 UTC
AppleFritter already gave you the docs, just to add to that: `DATA` doesn't directly relate to your problem, it was just a quick way to show an example of what the warning message might look like. The data can very well reside in a file and the messages would be almost the same.
Re^2: Find data point generating Error in Perl code by kgherman (Novice) on Mar 17, 2015 at 18:24 UTC
I have added a schematic of what my code does; the actual code is pretty long so I'm not sure how to make it available for you.
Re: Find data point generating Error in Perl code ( fatal warnings) by Anonymous Monk on Mar 17, 2015 at 00:23 UTC
Sure

    $ perl
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Data::Dump qw/ dd /;
    open my $fake, '<', \"row\nrow\n\n\n";
    while( <$fake> ){
        my( $word ) = /(\w+)/;
        use warnings qw/ FATAL all /;
        eval { print "$word\n"; 1 } or dd({ OOPS => $@, line => $_ });
    }
    __END__
    row
    row
    {
      line => "\n",
      OOPS => "Use of uninitialized value \$word in concatenation (.) or string at - line 9, <\$fake> line 3.\n",
    }
    {
      line => "\n",
      OOPS => "Use of uninitialized value \$word in concatenation (.) or string at - line 9, <\$fake> line 4.\n",
    }
Re^2: Find data point generating Error in Perl code ( fatal warnings) by Anonymous Monk on Mar 17, 2015 at 01:01 UTC
As a possibly interesting side note, just today a patch was submitted to P5P making `FATAL` warnings officially discouraged, due to multiple issues with them.
Re^3: Find data point generating Error in Perl code ( fatal warnings) by Anonymous Monk on Mar 17, 2015 at 02:31 UTC
"As a possibly interesting side note, just today a patch was submitted to P5P making FATAL warnings officially discouraged, due to multiple issues with them."

That is stupid
Re^4: Find data point generating Error in Perl code ( fatal warnings) by Anonymous Monk on Mar 17, 2015 at 18:09 UTC
Re^5: Find data point generating Error in Perl code ( fatal warnings) by Anonymous Monk on Mar 17, 2015 at 23:50 UTC
Re^2: Find data point generating Error in Perl code ( fatal warnings) by kgherman (Novice) on Mar 17, 2015 at 19:17 UTC
I'm trying to implement what you posted but I'm pretty lost: what is "$fake"? what is this code supposed to do? Thanks for all your help!
Re^3: Find data point generating Error in Perl code ( fatal warnings) by Anonymous Monk on Mar 17, 2015 at 23:55 UTC
"I'm trying to implement what you posted but I'm pretty lost: what is `$fake`? what is this code supposed to do? Thanks for all your help!"

`$fake` is a filehandle. There is no physical file; it's an in-memory file, so I called it `$fake`. The code demonstrates how to trap warnings by making them fatal, so that you can print out the current line of the file. It does this by promoting `warn` to `die`, which is then caught with `eval`. It's an answer to your question.
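Since FATAL warnings are reported elsewhere in this thread as being discouraged, here is a rough sketch of a different technique that gets a similar result without them: a `local $SIG{__WARN__}` handler that records each warning together with the record that triggered it. The helper name `collect_warnings` and the hash keys are made up for illustration, and the sample data just mirrors the in-memory "row" lines above.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code on every line and collect any warnings,
# each paired with the 1-based line number and the text of the line.
sub collect_warnings {
    my ($lines, $code) = @_;
    my @problems;
    my $n = 0;
    for my $line (@$lines) {
        $n++;
        # The handler is only active for the duration of this iteration.
        local $SIG{__WARN__} = sub {
            my $msg = shift;
            chomp $msg;
            push @problems, { line_no => $n, text => $line, warning => $msg };
        };
        $code->($line);
    }
    return @problems;
}

# Same data as the in-memory example: two good rows, two empty lines.
my @lines = ("row\n", "row\n", "\n", "\n");

my @problems = collect_warnings(\@lines, sub {
    my ($word) = $_[0] =~ /(\w+)/;
    my $out = "$word\n";    # warns when $word is undefined
});

printf "input line %d: %s\n", $_->{line_no}, $_->{warning} for @problems;
```

With this data, the two empty lines (input lines 3 and 4) should be the ones reported. Unlike the `eval` approach, processing of a line carries on after the warning, which may or may not be what you want.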
Re: Find data point generating Error in Perl code by Anonymous Monk on Mar 17, 2015 at 19:25 UTC
Thanks for the update, now it's clear why Perl doesn't automagically know the input file information: You're reading the entire file into `@lines`, and by the time you write to `output.csv`, `input.csv` is already closed.

The quickest fix would be to implement the %SIG `__WARN__` handler as described here. Here's a quick example (my `input.txt` contains just four lines, "foo", "bar", "quz" and "baz"):

    open my $infh, '<', 'input.txt' or die $!;
    my @lines = <$infh>;
    close $infh;
    # open output file here
    my $i;
    local $SIG{__WARN__} = sub {
        my $msg = shift;
        $msg =~ s/\.?\s*$//;
        warn "$msg (input line $i)\n";
    };
    for($i=0;$i<@lines;$i++) {
        # do calculations and write to output file here
        # example warning:
        warn "found a z" if $lines[$i]=~/z/;
    }
    __END__
    # Script Output:
    found a z at - line 14 (input line 2)
    found a z at - line 14 (input line 3)

Note that the "line numbers" reported here are 0-based, as it's actually an index into the array of lines. But the advantage is that here, the message is fully customizable, so you're free to print `$i+1` or whatever else you like.

However, unless you've oversimplified your example, your code could be made more efficient if it were to keep the input file open while writing the output file. Also, as long as the input file stays open, Perl will normally add the input file line numbers to warning messages by itself. So for example:

    open my $infh, '<', 'input.txt' or die $!;
    # open output file here
    while( my $line = <$infh> ) {
        # do calculations and write to output file here
        # example warning:
        warn "found a z" if $line=~/z/;
    }
    close $infh;
    __END__
    # Script Output:
    found a z at - line 6, <$infh> line 3.
    found a z at - line 6, <$infh> line 4.

By the way, your code would be better if you used the three-argument open and error handling (`or die`), as in the above examples. Also, you really should look into Text::CSV for better handling of CSV files!
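To make the "keep the input open while writing the output" idea a bit more concrete, here is a minimal sketch. It assumes a made-up calculation (doubling a numeric value) and uses in-memory filehandles in place of `input.csv` and `output.csv` so it can be run as-is; the helper name `double_values` is hypothetical. It relies on `$.`, which holds the current line number of the most recently read filehandle, and on `looks_like_number` from the core module Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: write doubled values from $in to $out and return
# the input line numbers of the records that were not numeric.
sub double_values {
    my ($in, $out) = @_;
    my @bad;
    while (my $line = <$in>) {
        chomp $line;
        if (looks_like_number($line)) {
            print {$out} 2 * $line, "\n";
        }
        else {
            push @bad, $.;    # $. is the line number of the last-read handle
        }
    }
    return @bad;
}

# In-memory filehandles stand in for the real input and output files.
my $input = "1\n2\noops\n4\n";
open my $in,  '<', \$input     or die "Cannot open input: $!";
open my $out, '>', \my $output or die "Cannot open output: $!";

my @bad = double_values($in, $out);

close $in;
close $out;

print "bad input lines: @bad\n";
```

Checking the data explicitly like this avoids depending on a warning being raised at all, at the cost of having to decide up front what counts as a bad data point.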
Re^2: Find data point generating Error in Perl code by kgherman (Novice) on Mar 17, 2015 at 20:45 UTC
Thank you, thank you, thank you! Your post is a goldmine of valuable information. I have followed your suggestion and now I keep the input file handle open while writing the output. As you had guessed, this allowed me to have the number of the line in the input file added to the warning message! Thank you! I will definitely look into the Text::CSV module since I only work with CSV files. Also, I will implement the three-argument open and error handling notation.

I have one last question: I was looking at the last code that you wrote in your answer and I was wondering: does your notation mean that you read one line at a time from the CSV file or, like in mine, do you load the entire file into memory and then work on it?
Re^3: Find data point generating Error in Perl code by Anonymous Monk on Mar 17, 2015 at 21:25 UTC
A `while` loop over a filehandle like `while(my $line = <$fh>) { ... }` reads the file line-by-line* (one line is read on each iteration of the loop), as opposed to `foreach my $line (<$fh>) { ... }` or `my @lines = <$fh>;`, which reads the entire file first (obviously not particularly friendly on memory for large inputs). For more information, see I/O Operators in perlop and readline.

* Perl's definition of what a "line" is may be changed via the input record separator `$/`.

"... since I only work with CSV files."

Text::CSV!

"I will definitely look into the Text::CSV module..."

Yes! `;-)`
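As a small illustration of the footnote about `$/`, the sketch below reads "lines" that are separated by a semicolon instead of a newline. The helper name `read_records` is made up for this example, and an in-memory filehandle is used so that no real file is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into records using $sep as the
# input record separator, reading one record per loop iteration.
sub read_records {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = $sep;    # changed only until this sub returns
    my @records;
    while (my $rec = <$fh>) {
        chomp $rec;     # chomp removes a trailing $/, not just "\n"
        push @records, $rec;
    }
    close $fh;
    return @records;
}

my @r = read_records("a;b;c", ";");
print join(", ", @r), "\n";
```

Using `local` keeps the change to `$/` from leaking into the rest of the program, which is usually what you want since `$/` is a global variable.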
Re^4: Find data point generating Error in Perl code by kgherman (Novice) on Mar 19, 2015 at 18:06 UTC
Re^5: Find data point generating Error in Perl code by Anonymous Monk on Mar 19, 2015 at 18:23 UTC