in reply to Re^6: tab delimited extraction, formatting the output
in thread tab delimited extraction, formatting the output

I'm a little surprised that the loop exits on null values, but the fix is easy. In a boolean context, Perl treats undef, the empty string, and zero (0 or "0") as false. The simple solution is to test with the defined function instead of testing the value itself, i.e. replace while (my $data_ref = $csv->getline($fh)) with while (defined(my $data_ref = $csv->getline($fh))). The defined test returns false only if the tested value is undef.
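
To make the difference between a plain truth test and a defined test concrete, here is a minimal sketch. The describe helper is hypothetical and exists only for this illustration. One caveat: as far as I know, getline returns an array reference on success, and a reference is always true, so for that particular call the two loop forms may well behave the same.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how a value behaves in boolean context
# and whether defined() considers it defined.
sub describe {
    my ($value) = @_;
    my $truth   = $value ? 'true' : 'false';
    my $defined = defined($value) ? 'defined' : 'undef';
    return "$truth/$defined";
}

print describe(0),     "\n";   # false/defined
print describe(''),    "\n";   # false/defined
print describe('0'),   "\n";   # false/defined
print describe(undef), "\n";   # false/undef
print describe([]),    "\n";   # true/defined (a reference is always true)
```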

Re^8: tab delimited extraction, formatting the output
by zzgulu (Novice) on Feb 12, 2009 at 18:31 UTC
    mmm, I guess that was not the problem. The text file that I am processing is very large: it has over 8 million lines, and the utterance at which the script aborts (without any error message) is at line 43,100. The only unusual thing I noticed near the last utterance processed was that the column in front of p ($p_value) was empty. The last utterance was "History of present illness:"; my NLP software (MMTX) parses it into three phrases: "History", "of present illness" and "", which is actually the punctuation ":". At first I thought ":" was the problem, but apparently it is not, since there are multiple instances before this one, starting from the very first lines. Do you think this is a size/buffer issue?
      It's possible, though the posted code shouldn't be caching any information. You can make sure it's not an issue with Text::CSV by running the code after removing the contents of the while loop.
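
      A minimal sketch of that test, assuming the Text::CSV module is installed and the input is tab-delimited. The count_rows helper and its return values are hypothetical names for this illustration, not part of the posted code. It runs the same loop with an effectively empty body and then asks the parser whether it stopped at end of file or on a parse error.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper: reads a tab-delimited file with an (almost) empty
# loop body and reports how many rows parsed and why the loop stopped.
sub count_rows {
    my ($path) = @_;
    my $csv = Text::CSV->new({ sep_char => "\t", binary => 1 })
        or die "Cannot create Text::CSV object: " . Text::CSV->error_diag();
    open my $fh, '<', $path or die "Unable to open $path: $!";

    my $rows = 0;
    while (defined(my $data_ref = $csv->getline($fh))) {
        $rows++;    # no other work, so only the parser is exercised
    }

    # getline returns undef both at end of file and on a parse error;
    # eof() distinguishes the two cases.
    my $reason = $csv->eof
        ? 'end of file'
        : 'parse error: ' . $csv->error_diag();
    close $fh;
    return ($rows, $reason);
}
```

      If the reported reason is a parse error rather than end of file, the row count should point at the input line where the parser gave up, which would suggest a data problem rather than a size or buffer problem.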
        Can someone help me understand why, in the following code, $mc_value is printed on the line after $p_value rather than on the same line, one tab away from $p_value? Removing "\n" didn't help.
        Also, if I want to send the output to a file, should I put OUT in front of every print function? Thank you so much for your hints.

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $file = "c:/ubuntu/regular.txt";
        #open OUT,">C:/output/filded_processed.txt";
        open my $fh, "<", $file or die "Unable to open $file: $!";

        my($u_value, $p_value, $mc_value) = (undef) x 3;
        while (my $line=<$fh>) {
            if ($line=~/\n\n\n/){
                ($u_value, $p_value, $mc_value) = (undef) x 3;
                print "\n";
            }
            elsif ($line=~/\bProcessing\s/) {
                $line=~s/\bProcessing\s\d+\.tx\.\d+: //;
                $u_value = $line;
                print "\n$u_value\n";
                undef $p_value;
            }
            elsif ($line=~/\bPhrase/) {
                $line=~s/\bPhrase: //;
                $line=~s/\"//g;
                if ($p_value) {
                    print "\n" . ' ' x length $u_value;
                }
                $p_value=$line;
                print "\t$p_value";
                undef $mc_value;
            }
            elsif ($line=~/\s\s/ ) {
                if ($mc_value) {
                    print "\t" . ' ' x length $p_value;
                }
                $mc_value=$line;
                print "\t$mc_value";
            }
            else {
                #die "Unexpected line format encountered, $file, @data";
            }
        }
        close $fh;
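
        One possible explanation, offered as a guess from reading the code: each $line read from the file still carries its own trailing newline, so printing $p_value ends the output line no matter what the print statement appends. A minimal sketch of stripping it with chomp follows. The clean_field helper and the sample strings are invented for illustration, and STDOUT stands in for a real output file.

```perl
use strict;
use warnings;

# Hypothetical helper: strips the trailing newline so that the next
# tab-separated value stays on the same output line.
sub clean_field {
    my ($line) = @_;
    chomp $line;
    return $line;
}

# Invented sample input lines, each ending in a newline as read from a file.
my $p_value  = clean_field("of present illness\n");
my $mc_value = clean_field("sample candidate\n");

# Printing to a filehandle: the handle goes between print and the list.
# STDOUT stands in here for a handle opened on an output file.
my $out = \*STDOUT;
print {$out} "\t$p_value\t$mc_value\n";
```

        On the second question, passing the filehandle to each print is one option. Another is to call select on the output handle once, after which a bare print writes to that handle.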