If you don't mind losing the tabs inside the quoted fields, pre-process the string to remove them.
Here I replace the embedded tabs with spaces, then simply split on tab:
my $var='474627 asidase ta sidase ala,"lpha-D- ctoside gtohydrolase","razyme","arazyme (enz Corp)","Melie","lagal","idase bta", rug 00103';
my $tmp;
$var =~ s{ ("[^"]+") }{ ($tmp = $1) =~ s/\t/ /g; $tmp }xge;
my @each=split(/\t/,$var);
for my $eachvar(@each)
{
print "$eachvar\n";
}
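Because literal tabs are hard to see in a pasted string, here is a minimal sketch of the same approach using explicit \t escapes. The sample record is made up for illustration (it is not the original poster's data); the substitution and the split are the same as above.

```perl
use strict;
use warnings;

# Hypothetical record: three tab-separated fields. The middle field holds
# two quoted strings, the first of which contains an embedded tab.
my $var = qq{id42\t"alpha\tbeta","gamma"\ttail};

# Replace tabs with spaces, but only inside double-quoted substrings.
my $tmp;
$var =~ s{ ("[^"]+") }{ ($tmp = $1) =~ s/\t/ /g; $tmp }xge;

# Only the field-separating tabs remain, so a plain split is now safe.
my @each = split /\t/, $var;
print scalar(@each), " fields\n";
print "[$_]\n" for @each;
```

With this input the split yields three fields: id42, "alpha beta","gamma" and tail.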
Update 1: Oops, I made a mistake in the pattern.
The quotes belong inside the capture.
(Was: "([^"]+)"; now: ("[^"]+").)
Update 2: In response to a private message, here's a somewhat fuller explanation of the pattern:
# Using the s{}{} form of substitute.
# Substitute supports several different delimiter styles, which helps
# one avoid having to escape things (like '/') within the pattern.
# The 'x' option means ignore whitespace in the pattern, so that
# comments can be easily inserted.
# The 'g' option is global, obviously.
# The 'e' option says that the replacement part of the substitution
# is a Perl expression.
$var =~
s{
    ("[^"]+")   # Matches a pair of quotes and the content between them.
                # Captures the match for use in the replacement.
                #
                # Dissection of pattern:
                #   ("[^"]+")  = full pattern
                #   (       )  = capture everything between parentheses.
                #    "     "   = quotes at start and end of pattern.
                #     [^"]+    = one or more non-quote characters
}
{               # The replacement part is a Perl expression.
                # Original: ($tmp = $1) =~ s/\t/ /g;
                # is the same as the next 2 lines:
    $tmp = $1;          # Make a copy of the captured match.
    $tmp =~ s/\t/ /g;   # Replace tabs with spaces throughout the match.
    $tmp;               # Use resultant value for replacement.
}xge;
# x = ignore white space and comments
# g = global
# e = expression
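As a possible shortening, assuming Perl 5.14 or later is available: the /r modifier makes s/// return the modified copy instead of changing its target, so the $tmp variable is not needed. This is only a sketch on made-up sample data, not part of the original answer.

```perl
use strict;
use warnings;
use 5.014;   # the /r (non-destructive) modifier needs Perl 5.14 or later

# Hypothetical record with one tab embedded inside a quoted string.
my $var = qq{id42\t"alpha\tbeta","gamma"\ttail};

# The inner s///r returns a tab-free copy of $1, which becomes the replacement.
$var =~ s{ ("[^"]+") }{ $1 =~ s/\t/ /gr }xge;

# Show the resulting fields separated by '|'.
print join('|', split /\t/, $var), "\n";
```

One caveat applies to both versions: because the pattern uses +, an empty quoted string ("") is not matched on its own, which can throw off the pairing of the quotes that follow. Using [^"]* instead should avoid that if empty fields can occur in the data.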