in reply to Convert XLSX to TSV and remove CRLF in cells

The embedded carriage returns can be replaced with '\n' using this simple script on the Red Hat box:
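use strict;
use warnings;
while (<>) {
    chomp;
    if (/^M$/) { print "$_\n"; }    # line ends in CR: real end of row, keep the line break
    else       { print "$_\\n"; }   # line break inside a cell: emit a literal \n instead
}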
'^M' is a single character, entered by pressing 'ctrl-v enter'.

Update: The direct conversion does not seem to be hopeless. You can simply omit the $converter if you do not need encoding conversion. I created an xlsx file with ~1,000,000 rows and only two columns (file size ~10 MB), and it was converted to csv in 140 seconds.

Re^2: Convert XLSX to TSV and remove CRLF in cells
by MidLifeXis (Monsignor) on Jun 16, 2015 at 12:16 UTC

    '^M' is single character, entered pressing 'ctrl-v enter'
    in some editors. In others it may end up displaying as some sort of line feed. [emphasis added]

    A better, more portable way of encoding this is \015, \o{015}, \cM, \x0d, or some other encoded form that won't potentially be messed up by an editor, printer, code pretty-printer, ….

    --MidLifeXis
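
For illustration, here is a minimal sketch of the same loop written with the escaped form \x0d instead of a literal control character. It assumes, as the original script appears to, that real rows end in CRLF and that line breaks inside cells are bare line feeds. The name flatten_row_breaks and the sample data are made up for this example and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch).
# Assumption: rows end in CRLF, breaks inside cells are bare LF.
sub flatten_row_breaks {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;                    # removes the trailing "\n" only
        if ($line =~ /\x0d$/) {         # \x0d is the carriage return (what ^M stands for)
            print {$out} "$line\n";     # real end of row: keep the line break
        } else {
            print {$out} "$line\\n";    # break inside a cell: emit a literal backslash-n
        }
    }
    return;
}

# Made-up sample: the first row has a line break inside its second cell.
my $input  = "a\tfoo\nbar\x0d\nb\tbaz\x0d\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot read sample: $!";
open my $out, '>', \$output or die "cannot write result: $!";
flatten_row_breaks($in, $out);
close $out;
print $output;
```

To use it as a filter like the original, call flatten_row_breaks(\*STDIN, \*STDOUT) instead of the in-memory handles. Because the pattern contains only printable characters, it should survive copy and paste between editors.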