As others have suggested, some lines in your data probably contain more colons than you expect, and a field like "notes_comments" has a decent chance of containing colons as data.
Try this one-liner on your "gleandata.csv" file, and see what you get:
perl -ne '$n=tr/://;$h{$n}++; END{
print "$h{$_} lines have $_ colons\n" for(sort{$a<=>$b} keys %h)}' < gleandata.csv
(You should put that on the shell command line as a single line -- I just broke it up to avoid the bothersome "+" in the node display.)
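If the one-liner is hard to read, here is a rough sketch of the same counting logic written out as a script. The sample lines and the names @lines, %count_of and @report are made up for illustration; in practice you would read the lines from gleandata.csv instead.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the contents of gleandata.csv.
my @lines = ("a:b:c\n", "a:b:c:d\n", "x:y:z\n");

# Tally how many lines contain each colon count.
my %count_of;
for my $line (@lines) {
    # tr with an empty replacement list counts matches without changing $line.
    my $n = ($line =~ tr/://);
    $count_of{$n}++;
}

# One report line per distinct colon count, in numeric order.
my @report = map { "$count_of{$_} lines have $_ colons" }
             sort { $a <=> $b } keys %count_of;
print "$_\n" for @report;
```

If every line in the real file is well formed, the report should have exactly one entry.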
If there were only one field that contained colons as data, and you could figure out how to make that field come last on each line, you could try doing your split like this:
my @row = split(/:/, $line, 24);
That way, at most 24 elements will be returned, and any "extra" delimiters will just be kept inside the 24th element. But a better approach would be:
- confirm that appropriate quotes and/or escapes are used in the file when fields contain delimiter characters as data, and parse the file with Text::CSV or Text::xSV, OR
- use a delimiter character that never shows up as field data (tab is usually good) OR
- condition field contents as needed to replace delimiter characters inside the field data with "safe" alternatives (e.g. semi-colon or comma or hyphen instead of colon)
(updated last bullet point in hopes of making it clearer)
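To illustrate the split-with-limit idea mentioned earlier, here is a minimal sketch. It assumes a made-up three-field line (so the limit is 3 rather than 24) in which only the last field holds colons as data; the sample data is hypothetical, not taken from your file.

```perl
use strict;
use warnings;

# Hypothetical sample line: three fields, the last one contains colons as data.
my $line = "42:widget:see note: call at 10:30";

# A plain split breaks the last field apart at every colon.
my @naive = split /:/, $line;

# With a limit of 3, split returns at most three elements,
# so the remaining colons stay inside the last one.
my @row = split /:/, $line, 3;

printf "plain split:   %d fields\n", scalar @naive;
printf "limited split: %d fields\n", scalar @row;
print  "last field: '$row[2]'\n";
```

This only helps when the colon-bearing field is the last one on the line, which is why the approaches in the list above are usually the safer choice.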