I would like to find a good logic to parse the data

greatshots has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

My input file contains three different types of lines, shown below. If I split the lines on ',' the results are weird for the first line, because it contains a lot of commas inside double quotes. How can I make my parsing logic handle this correctly? I am only looking for the logic to parse this file; I can write the program myself. Thanks a lot,

__DATA__
Submitted,"696,028","50,946","810,590","836,505","13,923,241","13,776,443","14,179,619","14,614,558","14,704,885","14,634,911","15,055,774","15,127,534","14,458,899","14,403,378","14,566,425","14,644,406","14,524,069"
Expired,245,275,273,248,240,295,353,316,371,398,387,352,310,288,405,274,270
Less in,90.12%,90.49%,90.04%,89.55%,90.09%,90.63%,90.37%,90.48%,90.73%,90.59%,90.83%,90.40%,88.82%,90.71%,90.72%,90.69%,91.04%

The output should look like this:

Field1     Field2   Field3   Field4   Field5   ..... Fieldn
Submitted  696,028  50,946   810,590  836,505  ..... blahblah
Expired    245      275      273      248      ..... blahblah
Less       90.12%   90.49%   90.04%   89.55%   ..... blahblah


Replies are listed 'Best First'.
Re: I would like to find a good logic to parse the data by friedo (Prior) on May 25, 2007 at 02:55 UTC
Text::CSV_XS will handle this correctly.
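For readers who have not used the module, here is a minimal sketch of what that could look like. It assumes Text::CSV_XS is installed (it is not a core module), so the load is wrapped in `eval` to give a clear message when it is missing. `split_csv_line` is a hypothetical helper name made up for this example, not part of the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Load Text::CSV_XS at run time so that a missing module is reported
# cleanly instead of aborting compilation.
my $have_csv_xs = eval { require Text::CSV_XS; 1 };

# Hypothetical helper: parse one CSV line and return its fields.
sub split_csv_line {
    my ($line) = @_;
    my $csv = Text::CSV_XS->new({ binary => 1 })
        or die "Cannot create parser: " . Text::CSV_XS->error_diag;
    $csv->parse($line)
        or die "Cannot parse line: " . $csv->error_diag;
    return $csv->fields;    # quotes are removed, embedded commas are kept
}

if ($have_csv_xs) {
    my @fields = split_csv_line('Submitted,"696,028","50,946","810,590"');
    print join('|', @fields), "\n";
}
else {
    print "Text::CSV_XS is not installed\n";
}
```

The same helper can be called once per line inside the usual `while (<$fh>)` loop.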
Re: I would like to find a good logic to parse the data by naikonta (Curate) on May 25, 2007 at 13:32 UTC
I appreciate your intent to roll your own, but I suggest using Text::ParseWords instead; it's part of the standard Perl distribution. You can learn the logic from there, or from the other modules suggested by other monks in this thread.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::ParseWords;

    while (<DATA>) {
        chomp;
        # second argument (keep) is 0, so the surrounding quotes are stripped
        my @parts = parse_line(',', 0, $_);
        print join(' ', map { "[$_]" } @parts), "\n";
    }

    __DATA__
    Submitted,"696,028","50,946","810,590","836,505","13,923,241","13,776,443","14,179,619","14,614,558","14,704,885","14,634,911","15,055,774","15,127,534","14,458,899","14,403,378","14,566,425","14,644,406","14,524,069"
    Expired,245,275,273,248,240,295,353,316,371,398,387,352,310,288,405,274,270
    Less in,90.12%,90.49%,90.04%,89.55%,90.09%,90.63%,90.37%,90.48%,90.73%,90.59%,90.83%,90.40%,88.82%,90.71%,90.72%,90.69%,91.04%

Output:

    [Submitted] [696,028] [50,946] [810,590] [836,505] [13,923,241] [13,776,443] [14,179,619] [14,614,558] [14,704,885] [14,634,911] [15,055,774] [15,127,534] [14,458,899] [14,403,378] [14,566,425] [14,644,406] [14,524,069]
    [Expired] [245] [275] [273] [248] [240] [295] [353] [316] [371] [398] [387] [352] [310] [288] [405] [274] [270]
    [Less in] [90.12%] [90.49%] [90.04%] [89.55%] [90.09%] [90.63%] [90.37%] [90.48%] [90.73%] [90.59%] [90.83%] [90.40%] [88.82%] [90.71%] [90.72%] [90.69%] [91.04%]

Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!
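To get from the parsed fields to the column layout asked for in the question, something along these lines might do. This is only a sketch: `format_row` is a hypothetical helper, and the column widths (10 characters) are arbitrary choices, not anything the original poster specified. Note that `parse_line` treats single quotes as quoting characters too, so data containing apostrophes would need a different approach.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper: turn one CSV line into one padded output row.
sub format_row {
    my ($line) = @_;
    my ($label, @values) = parse_line(',', 0, $line);
    # Left-justify the label, right-justify every value.
    return join ' ', sprintf('%-10s', $label),
                     map { sprintf('%10s', $_) } @values;
}

print format_row($_), "\n" for (
    'Submitted,"696,028","50,946","810,590"',
    'Expired,245,275,273',
    'Less in,90.12%,90.49%,90.04%',
);
```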
Re: I would like to find a good logic to parse the data by perleager (Pilgrim) on May 25, 2007 at 04:15 UTC
If you can't find a way to get a module to do this, perhaps you can just use this logic: read the input file line by line, and if it's the first line, parse it in a different way. If not, parse it normally by splitting on the commas.

So if it detects the first line, which holds the "Submitted" values, use a separate parsing method to read and print the values. Perhaps use this logic:

    my ($junk, $submit_values) = split(/Submitted,"/, $first_line);

Then you'll be left with:

    696,028","50,946","810,590","836,505","13,923,241","13,776,443","14,179,619","14,614,558","14,704,885","14,634,911","15,055,774","15,127,534","14,458,899","14,403,378","14,566,425","14,644,406","14,524,069"

Then you can parse the above by removing the trailing quote and splitting on `","`:

    $submit_values =~ s/"\s*$//;    # drop the closing quote of the last value
    my @values = split(/","/, $submit_values);
    foreach my $v (@values) {
        print "$v\n";
    }

perleager
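Put together as one routine, the two-case logic could look roughly like the sketch below. It is a variation rather than a literal transcription: instead of checking for the first line it checks whether the line contains a double quote, and it assumes that every value on such a line is quoted. `split_record` is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper implementing the "special-case the quoted line" idea.
sub split_record {
    my ($line) = @_;

    # Lines without any double quote can be split on commas directly.
    return split /,/, $line unless $line =~ /"/;

    # Quoted case: a label, then values that are ALL wrapped in quotes.
    my ($label, $rest) = split /,/, $line, 2;
    $rest =~ s/^"//;       # drop the opening quote of the first value
    $rest =~ s/"\s*$//;    # drop the closing quote of the last value
    return ($label, split /","/, $rest);
}

print join('|', split_record('Submitted,"696,028","50,946"')), "\n";
print join('|', split_record('Expired,245,275,273')), "\n";
```

A line that mixes quoted and unquoted values would break this, which is where a real CSV parser earns its keep.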
Re: I would like to find a good logic to parse the data by greatshots (Pilgrim) on May 25, 2007 at 03:02 UTC
Oops, on our production server I am not allowed to load any modules, and this parsing script has to run there. Thanks for the idea. I will look into the Text::CSV_XS module and try my best.
Re^2: I would like to find a good logic to parse the data by GrandFather (Saint) on May 25, 2007 at 03:14 UTC
You may be able to get around the letter of the law by copying the important parts of Text::CSV into your code rather than installing the whole thing.

DWIM is Perl's answer to Gödel
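If copying module code is not an option either, a small hand-rolled parser with no modules at all might be enough for data this simple. The sketch below is not taken from Text::CSV; it is a plain regex tokenizer written for this example, and `parse_csv_line` is a hypothetical name. It assumes the line has already been chomped, that fields are separated by single commas, and that a quote inside a quoted field is written as two quotes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one chomped CSV line into fields.
sub parse_csv_line {
    my ($line) = @_;
    my @fields;

    # Each match consumes one field plus its separator (a comma or end of line).
    while ($line =~ /\G(?:"((?:[^"]|"")*)"|([^,]*))(,|$)/g) {
        my ($quoted, $plain, $sep) = ($1, $2, $3);
        if (defined $quoted) {
            $quoted =~ s/""/"/g;    # un-double embedded quotes
            push @fields, $quoted;
        }
        else {
            push @fields, $plain;
        }
        last if $sep eq '';         # reached the end of the line
    }
    return @fields;
}

print join('|', parse_csv_line('Submitted,"696,028","50,946"')), "\n";
print join('|', parse_csv_line('Less in,90.12%,90.49%')), "\n";
```

Capturing the separator is what lets the loop tell a genuine trailing empty field (`a,`) apart from simply having reached the end of the line.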
Re^2: I would like to find a good logic to parse the data by varian (Chaplain) on May 25, 2007 at 08:38 UTC
"I am not allowed to load any modules"

Core modules do not have to be installed, so you may benefit from Text::Balanced. The code below is not bulletproof but should be sufficient to process your data:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::Balanced qw(extract_quotelike);

    sub getfields {
        my ($str) = @_;
        my @fields;
        my $field = '';
        while (defined $str && length $str) {
            # carry leading whitespace over into the current field
            $field .= $str =~ s/^(\s*)// ? $1 : '';
            my $extracted;
            if ($str =~ /^["']/) {
                # quoted text: take it whole, embedded commas included
                ($extracted, $str) = extract_quotelike($str);
                $field .= $extracted;
            }
            else {
                # unquoted text: everything up to the next comma ends the field
                ($extracted, $str) = split(',', $str, 2);
                push @fields, $field . $extracted;
                $field = '';
            }
        }
        # a quoted field at the very end of the line is still pending here
        push @fields, $field if length $field;
        return @fields;
    }

    while (my $line = <DATA>) {
        chomp($line);
        print "$_\t" foreach ( getfields($line) );
        print "\n";
    }

    __DATA__
    Submitted,"696,028","50,946","15,127,534","14,458,899"
    Expired,245,275,273,248
    Less in,90.12%,90.49%,90.04%,89.55%

Output is:

    Submitted "696,028" "50,946" "15,127,534" "14,458,899"
    Expired 245 275 273 248
    Less in 90.12% 90.49% 90.04% 89.55%
Re^2: I would like to find a good logic to parse the data by Tux (Canon) on Jun 01, 2007 at 09:44 UTC
You don't need to have write access to the main perl installation tree at all:

    # perl Makefile.PL PREFIX=/home/greatshots/perl5
    # make test
    # make install UNINST=1

Do so with all the modules you need, and add the paths to the `$PERL5LIB` environment variable.

Enjoy, Have FUN! H.Merijn
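If changing the environment is awkward, the script itself can point at the private tree with `use lib`. A small sketch follows; the directory shown is an assumption, since the exact subdirectory that `make install` creates under the PREFIX depends on the Perl version and build configuration, so check where the `.pm` files actually landed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical location: adjust to wherever the modules were really installed.
use lib '/home/greatshots/perl5/lib/perl5';

# Show the resulting module search path, private directory first.
print "$_\n" for @INC;
```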

