in reply to Re^2: Best Way To Parse Concordance DAT File Using Modern Perl?
in thread Best Way To Parse Concordance DAT File Using Modern Perl?
Since U+FEFF is (a) unlikely to appear anywhere other than the beginning of the input file and (b) interpreted as a "zero width no-break space" if it does turn up elsewhere in the file (i.e., it carries no linguistic meaning whatsoever), it should suffice to just delete it. Here's a version of your snippet that doesn't produce an error:
    use strict;
    use warnings;
    use Encode qw( decode_utf8 );
    use Text::CSV_XS;

    # UTF-8 encoded BOM (bytes EF BB BF) followed by one CSV record
    my $csv_bytes = qq/\xEF\xBB\xBF"Field One","Field 2",3,4,"Field 5"\r\n/;
    my $csv_record = decode_utf8($csv_bytes);

    $csv_record =~ tr/\x{feff}//d;  ## Add this, and there's no error.

    my $csv = Text::CSV_XS->new( { auto_diag => 1 } );
    $csv->parse($csv_record);

I suppose it's kind of sad that you need to do that for Text::CSV to work, but at least it works.
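If deleting every U+FEFF feels too broad, a narrower variant would strip only a leading one. The following is just a sketch under that assumption: `decode_without_bom` is a hypothetical helper name of mine, not something from Encode or Text::CSV_XS, and it uses only the core Encode module.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper (not part of any module): decode UTF-8 octets and
# drop a single leading U+FEFF, leaving any later occurrence untouched.
sub decode_without_bom {
    my ($octets) = @_;
    my $text = decode( 'UTF-8', $octets );
    $text =~ s/\A\x{FEFF}//;    # anchored, so only a BOM at the very start goes
    return $text;
}

# Same sample record as above: UTF-8 BOM bytes, then the CSV fields.
my $record = decode_without_bom(
    qq/\xEF\xBB\xBF"Field One","Field 2",3,4,"Field 5"\r\n/
);
```

The resulting `$record` could then be handed to `$csv->parse` the same way as in the snippet above. Whether the anchored substitution or the blanket `tr` is preferable probably depends on whether you trust the rest of the file to be free of stray U+FEFF characters.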