Re^2: A problem with Text::CSV

Hi, thank you for the fast response:)

I actually changed the while loop and found that some of the bad records were caused by the trailing ^M. After I removed those, I am still missing 137 records, but I can now find the problematic records by line number anyway :-) I will check whether there are embedded newlines within these fields.

many thanks

lihao

while (my $record = <$csvfile>) {
    $line_no++;
    chomp $record;
    $record =~ s/\cM$//;
    if ($csv->parse($record)) {
        my @columns = $csv->fields();
        my $value  = "$columns[9], $columns[2]";
        printf {$fout} "[good][%06d] %s\n", $line_no, $value;
    } else {
        printf {$fout} "[bad][%06d] %s\n", $line_no, $record;
    }
}

Replies are listed 'Best First'.

Re^3: A problem with Text::CSV
by Tux (Canon) on Mar 27, 2008 at 07:36 UTC

This is changing your script from good to bad. This is exactly what you should NOT do. The trailing ^M is part of the field and should not be removed.

Use the comma-counting code from Narveson, and check whether the lines that have a trailing ^M also have fewer commas than the lines that seem to be correct. Note that even that is unreliable, as commas can be part of a field when correctly quoted.
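Narveson's snippet is not quoted in this node, so the following is only a rough stand-in for that kind of check, not his code. The sub name comma_histogram and the sample lines are made up for illustration. It groups line numbers by how many commas each line contains, and the second sample line shows why the count can mislead: it is valid CSV, yet it has one comma more than its neighbours because of a quoted field.

```perl
use strict;
use warnings;

# Hypothetical helper: count the commas on each line and group the
# 1-based line numbers by that count. Lines whose count differs from
# the most common one deserve a closer look.
sub comma_histogram {
    my (@lines) = @_;
    my %lines_by_count;
    my $line_no = 0;
    for my $line (@lines) {
        $line_no++;
        my $commas = ($line =~ tr/,//);    # tr/// returns the number of matches
        push @{ $lines_by_count{$commas} }, $line_no;
    }
    return \%lines_by_count;
}

# Invented sample data, each line ending in ^M plus newline.
my @sample = (
    qq{1,foo,bar\r\n},
    qq{2,"Smith, John",baz\r\n},    # valid CSV, but one extra comma
    qq{3,foo\r\n},                  # genuinely short record
);

my $hist = comma_histogram(@sample);
for my $count (sort { $a <=> $b } keys %$hist) {
    printf "%d comma(s): line(s) %s\n", $count, join ', ', @{ $hist->{$count} };
}
```

A raw count like this only narrows down where to look; it cannot tell a quoted comma from a field separator.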

The best way to find the problematic lines (if any) is to call the new () constructor with no arguments at all and see where the parsing stops. Then use Text::CSV's diagnostics to see what caused the stop.
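A minimal sketch of that approach, assuming Text::CSV is installed and that the data can be read through a filehandle with getline. The sub name find_parse_stop and the sample data are invented for this example. getline, eof and error_diag are the module's documented methods. The record counter counts CSV records, which will differ from physical line numbers if a record spans several lines.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper: read records until the parser gives up. Returns
# nothing if the whole input parsed, otherwise a hash reference that
# describes where and why parsing stopped.
sub find_parse_stop {
    my ($fh) = @_;
    my $csv = Text::CSV->new();    # no arguments at all: default settings
    my $records = 0;
    $records++ while $csv->getline($fh);
    return if $csv->eof;           # getline stopped at end of input

    my ($code, $message, $position) = $csv->error_diag();
    return {
        record   => $records + 1,  # the record that failed, counting from 1
        code     => $code,
        message  => $message,
        position => $position,
    };
}

# Invented sample: the second record has text after a closing quote.
my $data = qq{id,name\n1,"bad"x\n2,ok\n};
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
if (my $stop = find_parse_stop($fh)) {
    printf "Stopped at record %d: error %d (%s) at position %d\n",
        $stop->{record}, $stop->{code}, $stop->{message}, $stop->{position} // 0;
}
close $fh;
```

Once the first stop is understood, the constructor options can be adjusted one at a time instead of editing the input lines by hand.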

Enjoy, Have FUN! H.Merijn
