in reply to splitting cvs file without line breaks

This should work for a variable number of fields provided that:

  1. Quoted fields don't contain quotes.
  2. The last field in each record is either quoted, or does not contain spaces.
#! perl -slw
use strict;

my $data = do{ local $/; <DATA> };

my $noOfFieldsMinus1 = 5;

my @records = $data =~ m[
    (
        (?:
            (?: "[^"]+"     ##"
              | [^,]+
            )
            ,
        ){$noOfFieldsMinus1}
        (?: "[^"]+"         ##"
          | \S+
        )
    )
    \s+
]gx;

print for @records;

__DATA__
"Business Date","Location Name","Revenue Center Name","Tender Count","Tender Name","Tender Total" 2007-05-14 00:00:00.0,"Aville","x",300,"b",6899 2007-05-14 00:00:00.0,"Aville","x",6,"c",198.50 2007-05-14 00:00:00.0,"Aville","b",290,"Cash",12336.10 2007-05-14 00:00:00.0,"Bville","c",14,"d",958.40

Produces:

C:\test>615576
"Business Date","Location Name","Revenue Center Name","Tender Count","Tender Name","Tender Total"
2007-05-14 00:00:00.0,"Aville","x",300,"b",6899
2007-05-14 00:00:00.0,"Aville","x",6,"c",198.50
2007-05-14 00:00:00.0,"Aville","b",290,"Cash",12336.10
2007-05-14 00:00:00.0,"Bville","c",14,"d",958.40

Each extracted record can then be processed by any of the usual CSV mechanisms.
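
As one possible illustration of that follow-up step, here is a minimal sketch that splits a single extracted record into its fields using parse_line from the core Text::ParseWords module. The helper name split_fields is hypothetical, not part of the code above, and the sketch assumes the same restrictions already listed (no quotes inside quoted fields). Note that parse_line also treats apostrophes as quote characters, so data containing them would probably be better served by a dedicated CSV module.

```perl
use strict;
use warnings;
use Text::ParseWords qw( parse_line );   # ships with perl

# Hypothetical helper: split one extracted record into its fields.
# A false second argument tells parse_line to strip the surrounding quotes.
sub split_fields {
    my( $record ) = @_;
    return parse_line( ',', 0, $record );
}

# One of the records produced by the regex above.
my $record = '2007-05-14 00:00:00.0,"Aville","x",300,"b",6899';

my @fields = split_fields( $record );
print join( '|', @fields ), "\n";
```

The unquoted first field keeps its embedded space because parse_line splits only on the delimiter it is given.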



Re^2: splitting cvs file without line breaks
by rendier (Initiate) on May 15, 2007 at 21:06 UTC
    Hey, great!! This is exactly what I was looking for!! I'll read this slowly with perlre next to me, and then stick it in ;-) Thanks a lot