in thread The best way to split tab delimited file

Below is the data where the regex is not working:
my $var='474627 asidase ta sidase ala,"lpha-D- ctoside gtohydrolase","razyme","arazyme (enz Corp)","Melie","lagal","idase bta", rug 00103';

Re^3: The best way to split tab delimited file
by gmargo (Hermit) on Nov 23, 2009 at 17:53 UTC

    There are no tab characters in that line, presumably due to a cut/paste issue. Can you try again to get the tabs in there? And can you also show the output you expect vs. the output you are getting?
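    One way to check whether the tabs survived the paste is to print the string with each tab made visible. The following is only a minimal sketch: show_tabs is a hypothetical helper name and the sample string is made up for illustration, not taken from the original data.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string in which every
# real tab character is replaced by the two visible characters \ and t.
sub show_tabs {
    my ($text) = @_;
    (my $visible = $text) =~ s/\t/\\t/g;
    return $visible;
}

# Made-up sample containing two real tab characters (an assumption,
# since the tabs in the posted data were lost).
my $sample = "474627\tasidase\t00103";

print show_tabs($sample), "\n";

# tr/// with an empty replacement list counts matches without changing the string.
printf "tab count: %d\n", ($sample =~ tr/\t//);
```

    If the count comes back as zero for the real data, the tabs were already lost before split ever saw the string.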

      my $var='474627 asidase ta sidase ala,"lpha-D- ctoside gtohydrolase","razyme","arazyme (enz Corp)","Melie","lagal","idase bta", rug 00103';
      my @each=split(/(?<!,)\t/,$var);
      for my $eachvar(@each) {
          print "$eachvar\n";
      }

      This is the output I get:

      474627 asidase ta sidase ala,"lpha-D- ctoside gtohydrolase","razyme","arazyme (enz Corp)","Melie","lagal","idase bta", rug 00103

      This is the output I expect:

      474627 asidase ta sidase ala,"lpha-D- ctoside gtohydrolase","razyme","arazyme (enz Corp)","Melie","lagal","idase bta", rug 00103

        So in addition to ignoring a tab after a comma, you also want to ignore tabs within quoted strings. This revises the original spec just a bit.

        Do you really require both of these things? Or is the real requirement only the latter (ignore tabs within strings), and the original example just happened to have been derived from a string with a comma-tab?
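        If the real requirement is only the latter, a split that skips over double-quoted strings may be enough. The sketch below is one possible approach, not necessarily what the original poster needs: it assumes Perl 5.10 or later (for the (*SKIP)(*FAIL) backtracking verbs), assumes quotes are balanced and never escaped, and uses a hypothetical split_fields helper with made-up sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on tabs, but leave tabs that sit
# inside a double-quoted string alone.
# Assumptions: quotes are balanced and contain no escaped quotes.
sub split_fields {
    my ($line) = @_;
    # First alternative: match a whole quoted string, then (*SKIP)(*FAIL)
    # discards it and resumes scanning after the closing quote.
    # Second alternative: any remaining tab is a real separator.
    # The limit of -1 keeps trailing empty fields.
    return split /"[^"]*"(*SKIP)(*FAIL)|\t/, $line, -1;
}

# Made-up sample: one tab inside the quoted part, two tabs outside it.
my $line = qq{474627\tala,"lpha-D-\tctoside","razyme"\trug 00103};

for my $field (split_fields($line)) {
    print "[$field]\n";
}
```

        With this sample the quoted tab stays inside the second field, so three fields come back. A comma-tab outside quotes is still split on, which matches the comma-tab before 'rug' that should be split.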

        Also, I get the output below, which differs from yours (no tab after 'ta'). Note too that you have a comma-tab, just before 'rug', that you do want to split on.

        474627 asidase ta sidase ala,"lpha-D- ctoside gtohydrolase","razyme","arazyme (enz Corp)","Melie","lagal","idase bta", rug 00103