use strict;
use warnings;
# Read DATA line by line, split on '~', quote any field that contains
# a comma, and join the fields with commas. Output goes to STDOUT.
print join ",", map {s/(.*,.*)/"$1"/; $_} split /~/ while <DATA>;
__DATA__
col1~col2~col3~col4~col5
data11~data12~data13~data14~data15
data21~data22~data23~data24~data25
data31~data32~data33~data34~data35
data,data41~data42~data43~data44~data45
data51,data52,data,junk,specialchar,sometingdata53~data54~data55
Prints:
col1,col2,col3,col4,col5
data11,data12,data13,data14,data15
data21,data22,data23,data24,data25
data31,data32,data33,data34,data35
"data,data41",data42,data43,data44,data45
"data51,data52,data,junk,specialchar,sometingdata53",data54,data55
Perl's payment curve coincides with its learning curve.
use strict ; use warnings ;
print join ",", map { if (m/[,"]/) { s/(\A|"|\n|(?<!\n)\Z)/"$1/g } ; $_ }
           split /~/ while <DATA>;
__DATA__
col1~col2~col3~col4~col5
data11~data12~data13~data14~da,data15
data21~"data22"~d"ata"23~data24~da"ta"25
data31~data32~data33~data34~"data35"
data,data41~data42~data43~data44~data45
data51,data52,data,junk,specialchar,sometingdata53~data54~data55
which gives:
col1,col2,col3,col4,col5
data11,data12,data13,data14,"da,data15"
data21,"""data22""","d""ata""23",data24,"da""ta""25"
data31,data32,data33,data34,"""data35"""
"data,data41",data42,data43,data44,data45
"data51,data52,data,junk,specialchar,sometingdata53",data54,data55
(The last item on each line has the line ending attached to it, so you have to box a little clever to get the trailing '"' right in all cases. GrandFather used (.*), which won't match a line ending unless you tell it to.)
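One way to sidestep the line-ending problem is to chomp the line before splitting and add the newline back after joining. The following is only a minimal sketch of that idea, not code from the posts above; the helper names csv_field and tilde_to_csv are made up for illustration, and the -1 limit on split (which keeps trailing empty fields) is an assumption about what you want.

```perl
use strict;
use warnings;

# Hypothetical helper: quote a single field CSV-style, only when needed.
sub csv_field {
    my ($field) = @_;
    return $field unless $field =~ /[,"\r\n]/;   # plain fields pass through
    (my $quoted = $field) =~ s/"/""/g;           # double any embedded quotes
    return qq{"$quoted"};
}

# Hypothetical helper: convert one '~'-separated line into one CSV line.
sub tilde_to_csv {
    my ($line) = @_;
    chomp $line;                                 # remove the line ending first
    # The -1 limit keeps trailing empty fields (an assumption; drop it if unwanted).
    return join(",", map { csv_field($_) } split /~/, $line, -1) . "\n";
}

print tilde_to_csv($_) for "data21~\"data22\"~d\"ata\"23~data24~da\"ta\"25\n";
```

Because the line ending is removed up front, the last field needs no special treatment in the regex.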
Or you can just stick '"' around every item:
print join '",', map { s/(\A|"|\n)/"$1/g ; $_ } split /~/ while <DATA>;
which is slightly more straightforward:
"col1","col2","col3","col4","col5"
"data11","data12","data13","data14","da,data15"
"data21","""data22""","d""ata""23","data24","da""ta""25"
"data31","data32","data33","data34","""data35"""
"data,data41","data42","data43","data44","data45"
"data51,data52,data,junk,specialchar,sometingdata53","data54","data55"
One assumes that you don't expect your '~' item separators to appear in any item, not even within '"' or any other escaping mechanism. The deeper you go into this kind of thing, the more you find how useful the stuff on CPAN is!
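For comparison, here is a rough sketch of the same conversion using Text::CSV from CPAN, which handles the quoting rules for you. It assumes Text::CSV is installed; the helper name tilde_line_to_csv and the sample lines are made up for illustration, and it still assumes '~' never appears inside an item.

```perl
use strict;
use warnings;
use Text::CSV;   # CPAN module, assumed to be installed

my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Hypothetical helper: turn one '~'-separated line into one CSV line
# (returned without a line ending).
sub tilde_line_to_csv {
    my ($line) = @_;
    chomp $line;
    my @fields = split /~/, $line, -1;           # -1 keeps trailing empty fields
    $csv->combine(@fields)
        or die "combine failed on: " . $csv->error_input;
    return $csv->string;
}

print tilde_line_to_csv($_), "\n" for (
    "col1~col2~col3",
    "data,data41~data42~data43",
);
```

By default the module quotes only the fields that need it, so the result should resemble the first output above rather than the quote-everything variant.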
Here is a simple regex approach that doesn't split the input:
#! /usr/bin/perl
use strict;
use warnings;
open my $fh,">","tmp.csv" or die "Unable to open $! \n";
while(<DATA>){
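# Quote each '~'-delimited field that contains a comma. Matching on [^~\n]
# keeps the quotes inside one field even when the comma is in a later field.
$_ =~ s/([^~\n]*,[^~\n]*)/"$1"/g;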
$_ =~ s/~/,/g;
print $fh "$_";
}
close $fh;
__DATA__
col1~col2~col3~col4~col5
data11~data12~data13~data14~data15
data21~data22~data23~data24~data25
data31~data32~data33~data34~data35
data,data41~data42~data43~data44~data45
data51,data52,data,junk,specialchar,sometingdata53~data54~data55