Re: Recommendations for parsing invalid CSV

by mscharrer (Hermit)
on Apr 21, 2008 at 14:54 UTC

in reply to Recommendations for parsing invalid CSV

You could use look-behind (?<= ) and look-ahead (?= ) expressions to match only those quotes that are neither next to a comma nor at the start or end of the line:

    s/(?<=.)(?<!,)"(?!,|$)/\\"/g

A simple test on the command line gives me:

    user@machine$ perl -pe 's/(?<=.)(?<!,)"(?!,|$)/\\"/g'
    "call from "friend"","call from "friend"","call from "friend""
    "call from \"friend\"","call from \"friend\"","call from \"friend\""

The 2nd line is input, the 3rd output.

Looks good to me. Some special cases might not be matched correctly, for example when a comma inside a string sits directly next to an embedded quote. You should check your data for this.

Re^2: Recommendations for parsing invalid CSV
by markjugg (Curate) on Apr 21, 2008 at 15:24 UTC

    I didn't mean for anyone else to do my work for me, but I certainly don't mind the help! Thank you! That's the kind of code I would need to write if the new release of Text::CSV_XS turns out not to work for some reason.

