in reply to Converting CSV to tab-delimited

And you're sure that M$ doesn't export embedded new-lines, carriage-returns or other binary or special characters?

There is a very good reason for the Text::CSV module (and the underlying Text::CSV_XS and Text::CSV_PP) to be around, and installing it isn't that hard.

cpan Text::CSV
use strict;
use warnings;
use Text::CSV;

my $if = shift;
# Derive the output name from the input name: file.csv -> file.tab
(my $of = $if) =~ s/\.csv$/.tab/ or die "usage: csv2tab file.csv";

open my $fh_i, "<", $if or die "$if: $!";
open my $fh_o, ">", $of or die "$of: $!";

my $csv = Text::CSV->new ({ binary => 1 });
# eol is needed on the writer: without it, print () adds no line ending
my $tsv = Text::CSV->new ({ binary => 1, sep_char => "\t", eol => "\n" });

while (my $row = $csv->getline ($fh_i)) {
    $tsv->print ($fh_o, $row);
    }

close $fh_i or die "$if: $!";
close $fh_o or die "$of: $!";

Enjoy, Have FUN! H.Merijn

Re^2: Converting CSV to tab-delimited
by Tux (Canon) on Apr 14, 2008 at 14:35 UTC

    Strawberry Perl comes with a shipload of useful bundled modules and a working cpan.bat. I've been playing with it over the weekend, and the only problems I had installing new modules from CPAN were with the SSL-related modules and OS-specific modules like BSD::Resource.

    ActivePerl comes with ppm, which makes the most-used modules available in a few keystrokes. Nothing is holding you back from increasing your possibilities here. Try to imagine the time you will have to waste explaining to the end-user why this oh-so-simple script suddenly stopped working. I can assure you it is more than the time you need to convince him/her to install something good.

    We've entered an era where updating or installing basic modules of proven value is made very, very easy, and doing so will pay off over writing code that, simple as it may seem, will give you headaches in the future.


    Enjoy, Have FUN! H.Merijn
Re^2: Converting CSV to tab-delimited
by ambrus (Abbot) on Apr 15, 2008 at 14:49 UTC
    And you're sure that M$ doesn't export embedded new-lines, carriage-returns or other binary or special characters?

    FYI, line breaks in cells, which are the most common case, are exported as an LF character, whereas rows are separated by CRLF. (This of course might not apply to all versions of Excel.)
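
    If that CRLF-versus-LF behaviour holds for your export, records can be read with the input record separator set to CRLF, so that a lone LF inside a cell does not end the record. The snippet below is only a sketch under that assumption: the sample data is made up, and it only splits records; it does not parse the fields.

```perl
use strict;
use warnings;

# Made-up sample in the shape described above: CRLF between rows,
# a bare LF inside the quoted cell of the second row.
my $data = qq{id,note\r\n1,"first line\nsecond line"\r\n2,plain\r\n};

# Read from an in-memory file handle; :raw leaves the CR/LF bytes untouched.
open my $fh, "<:raw", \$data or die "in-memory open: $!";

my @records;
{
    local $/ = "\r\n";          # a record ends at CRLF, not at a lone LF
    while (my $rec = <$fh>) {
        chomp $rec;             # chomp strips the current $/, i.e. the CRLF
        push @records, $rec;
    }
}
close $fh;

printf "%d records\n", scalar @records;
```

    With a real file, open it with the same "<:raw" layer so that Windows text-mode translation does not turn CRLF into LF before Perl sees it.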

Re^2: Converting CSV to tab-delimited
by PhilHibbs (Hermit) on Apr 14, 2008 at 14:22 UTC
    Sure, I know that installing Perl modules isn't that hard (once you've tracked down a make program for Windows - and once you've learned to avoid anything that needs a C compiler) but some people just don't want to know. All you have to do is accidentally say "non-standard" and a seemingly-sane business analyst starts to have nightmares about viruses or hackers. I have learned to avoid anything that doesn't come bundled with ActiveState Perl (and thank the g0ds, they've started including Term::ReadKey!!!)

    Most people here are Unix hackers. Believe me, working with Windows - and habitual Windows users - is really, really frustrating.

    And there are no newlines in the files - I know, it's our software that's creating them.

      Text::CSV can be installed without a C compiler, and even without resorting to the command line, via ActiveState's ppm application.

      And there are no newlines in the files - I know, it's our software that's creating them.

      I suggest amending the introduction to the code to mention this, in case someone wants to use the code you posted and isn't sure whether there are newlines in their own files.

      Software speaks in tongues of man.
      Stop saying 'script'. Stop saying 'line-noise'.
      We have nothing to lose but our metaphors.

        I suggest amending the introduction to the code and mentioning this,
        You're probably right. Done.
      All well and fine. I have a situation again where ROOTman (as we will call him) does not like anyone adding Perl modules to the production server. Though he's a Perlist himself, requests for additional modules are rebuffed as he cannot risk this machine having a hiccup. We also may be tied to an older release for quite some time, until the new hardware arrives and then we get whatever Perl comes on the RH Enterprise install.

      So, a hand-wired CSV solution is sought by those of us not in a position to "simply ppm or CPAN Text::CSV into place". Good material is sparse - even the CookBook example isn't all that great. I did track down a regex which I have needed to follow up with several checks and edits to patch things up...

      This then is a starting point (ugly/rough code):

      my @inList = split /,(?!(?:[^",]|[^"],[^"])+")/;

      # and further on a bit of a mess:
      my @outList = ();
      for (my $i=0; $i<$flds; $i++) {
          if (! defined $inList[$i] ) { $inList[$i] = ""; }
          if ($inList[$i] =~ m/\D/) { $inList[$i] = '"'.$inList[$i].'"'; }
          $inList[$i] =~ s/^""/"/;
          $inList[$i] =~ s/""$/"/;
          $inList[$i] =~ s/^"$/""/;
          push @outList, $inList[$i];
      }

      I eventually got to a point with my data that I simply sanitize all the crap in a field like ",", "'" and """ in self defense, straight after dealing with any nulls.

      I hope this is useful for someone.
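
      For anyone in the same no-modules position, a character-by-character scanner tends to be easier to get right than a split regex. The following is a minimal sketch using core Perl only, not a replacement for Text::CSV: parse_csv_line is a made-up name, and it assumes each record is already a single complete line (no embedded newlines) with the line ending removed.

```perl
use strict;
use warnings;

# Split one CSV record into fields. Handles quoted fields, commas inside
# quotes, and doubled quotes (""). Assumes the record is complete and has
# no trailing line ending.
sub parse_csv_line {
    my ($line) = @_;
    my @fields;
    my $field     = "";
    my $in_quotes = 0;
    my @chars     = split //, $line;

    for (my $i = 0; $i < @chars; $i++) {
        my $c = $chars[$i];
        if ($in_quotes) {
            if ($c eq '"') {
                if ($i + 1 < @chars && $chars[$i + 1] eq '"') {
                    $field .= '"';      # "" inside quotes is a literal quote
                    $i++;
                }
                else {
                    $in_quotes = 0;     # closing quote
                }
            }
            else {
                $field .= $c;           # anything else, commas included
            }
        }
        elsif ($c eq '"') {
            $in_quotes = 1;             # opening quote
        }
        elsif ($c eq ',') {
            push @fields, $field;       # unquoted comma ends the field
            $field = "";
        }
        else {
            $field .= $c;
        }
    }
    die "unterminated quoted field\n" if $in_quotes;
    push @fields, $field;               # last field has no trailing comma
    return @fields;
}

my @f = parse_csv_line('1,"Smith, John","say ""hi""",,x');
print join("\t", @f), "\n";
```

      Note that joining on tabs like this is only safe if no field contains a tab itself; embedded newlines and other binary data are exactly the cases where the module earns its keep.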