Hi PMs! I have a comma-delimited CSV file in which fields containing commas (,) or double quotes (") are wrapped in double quotes, as you can see in the rows given below.
6450458,6011,"Urine - Culture & Sensitivity",1658,"Colony Count:","10^3 cfu/ml",,2016-10-26 09:55:34,0,"", ,"",,2016-10-26 09:55:34,SS0000203,6,"All Tests Done and Verified",,SCIN72669,2016-10-24 12:04:58,21,"Max Smart Super, Speciality Hospital "O"",3445,"Bansidhar Tarai ",0,"SAVITRI DEVI",False,"SAVITRI ","DEVI",SKCT,334,20905,"Anjani Kumar Agrawal",1957-01-01 00:00:00,0,NULL,"7838457000","INFO@MAXHEALTHCARE.COM",OP,NO,Verified,2016-10-24 12:04:58,2016-10-24 12:04:58,Lab,243981,0,"",F
6444885,21732,"Blood - Culture & Sensitivity",3147,"Method BacT/ALERT3D & Vitek 2","SubHead",,2016-10-26 09:00:11,1,"min", ,"",0,2016-10-26 09:00:11,PM0004746,6,"All Tests Done and Verified",,PMIN4335,2016-10-21 19:07:36,25,"PMC",3445,"Bansidhar Tarai ",0,"SUSHILA DEVI 94907",False,"SUSHILA DEVI","94907",PMCL,1861,69142,"Parkash Gera",1961-01-01 00:00:00,0,NULL,"0143000000","INFO@MAXHEALTHCARE.COM",OP,NO,Verified,2016-10-21 19:07:36,2016-10-21 19:07:36,Lab,781642,0,"",F
6444891,21732,"Blood - Culture & Sensitivity",3147,"Method BacT/ALERT3D & Vitek 2","SubHead",,2016-10-26 09:00:36,1,"min", ,"",0,2016-10-26 09:00:36,PM0004748,6,"All Tests Done and Verified",,PMIN4337,2016-10-21 19:11:24,25,"PMC",3445,"Bansidhar Tarai ",0,"TUSHAR BHATIA 94916",False,"TUSHAR BHATIA","94916",PMCL,1876,69142,"Parkash Gera",1985-01-01 00:00:00,0,NULL,"0143000000","INFO@MAXHEALTHCARE.COM",OP,NO,Verified,2016-10-21 19:11:24,2016-10-21 19:11:24,Lab,773211,0,"",M

Requirement: sort the file on three columns: primary key is column 31, secondary key is column 1, tertiary key is column 2. The output should be ordered by the 1st key, then the 2nd, then the 3rd, as in this example:

1,2,3
1,2,4
1,3,4
2,1,5
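To make the intended ordering concrete, here is a minimal sketch in core Perl. It assumes the three key columns have already been pulled out of each record; the `@rows` data and the `by_three_keys` name are made up for illustration and are not part of my actual script.

```perl
use strict;
use warnings;

# Hypothetical key triples [primary, secondary, tertiary], deliberately unsorted.
my @rows = ([2, 1, 5], [1, 3, 4], [1, 2, 4], [1, 2, 3]);

# Compare numerically on the primary key; on a tie fall through to the
# secondary key, and on a further tie to the tertiary key.
sub by_three_keys {
    return $a->[0] <=> $b->[0]
        || $a->[1] <=> $b->[1]
        || $a->[2] <=> $b->[2];
}

my @sorted = sort by_three_keys @rows;
print join(',', @$_), "\n" for @sorted;
```

This prints the four example rows in the order shown above. In the real file the same comparison would apply to fields 31, 1 and 2 of each parsed record.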

What I'm doing:

i.) converting the file from comma (,) separated to pipe (|) separated, using the Text::CSV module
ii.) sorting the result using File::Sort

Here is the code snippet:

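    use Text::CSV;
    use File::Sort qw(sort_file);

    # Step 1: rewrite the comma-delimited file as a pipe-delimited file.
    &commaToPipeDelimiter($maxFile, $pipeMaxFile);

    # Step 2: sort the pipe-delimited file numerically on columns 31, 1 and 2.
    sort_file({
        t => '|',
        k => ['31n', '1n', '2n'],
        I => $pipeMaxFile,
        o => $sortedMaxFile,
    });

    sub commaToPipeDelimiter {
        my ($file, $pfile) = @_;
        my $csv = Text::CSV->new({
            binary => 1, decode_utf8 => 1, auto_diag => 1, allow_loose_quotes => 1,
        });
        open(my $data, '<:encoding(utf8)', $file) or die "Could not open '$file' $!\n";
        open(my $out, '>:encoding(utf8)', $pfile) or die "Could not open '$pfile' $!\n";
        while (my $line = <$data>) {
            chomp $line;
            if ($csv->parse($line)) {
                my @fields = $csv->fields();
                print $out join("|", @fields), "\n";
            }
            else {
                # Report the parser's own diagnostic; $! is not set by parse().
                warn "Line could not be parsed: " . $csv->error_diag() . "\n";
            }
        }
        close($out);
        close($data);
    }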

Can someone suggest a more efficient way to do this, one that avoids converting the file into a pipe-separated copy before sorting? The real files could be much larger. FYI: embedded commas need to be taken care of.

-Chetan

In reply to Need to sort comma delimited, double quoted file by CSharma
