Parsing a string

kepler has asked for the wisdom of the Perl Monks concerning the following question:

Re: Parsing a string by CountZero (Bishop) on Feb 22, 2011 at 16:34 UTC
Text::CSV, don't even think of anything else.

CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
Re^2: Parsing a string by ikegami (Patriarch) on Feb 22, 2011 at 16:47 UTC
I'd think of Text::CSV_XS :) At best, Text::CSV is a needless intermediary. (Update: To clarify, using Text::CSV might be wiser if you're distributing a non-XS module or application, but for your own code, it can only introduce problems.)
Re^2: Parsing a string by tilly (Archbishop) on Feb 22, 2011 at 17:42 UTC
Why not think of Text::xSV?
Re^3: Parsing a string by Anonymous Monk on Feb 22, 2011 at 18:00 UTC
One reason could be that it has no support for blank_is_undef / empty_is_undef :)
Re^4: Parsing a string by tilly (Archbishop) on Feb 22, 2011 at 18:46 UTC
Re^3: Parsing a string by CountZero (Bishop) on Feb 23, 2011 at 07:07 UTC
My bad tilly, I have never used your Text::xSV before, but if the opportunity presents itself, I will give it a spin.

CountZero
Re: Parsing a string by BrowserUk (Patriarch) on Feb 22, 2011 at 16:41 UTC
Like this?

```
$s = q[ci,14938340,2,"Monday, February 21, 2011 19:58:06 UTC",34.6953,-118.5350,2.2,17.40, 9,"Southern California"];;
print for $s =~ m[("[^"]+"|[^,]+)(?:,|$)]g;;

ci
14938340
2
"Monday, February 21, 2011 19:58:06 UTC"
34.6953
-118.5350
2.2
17.40
 9
"Southern California"
```

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Parsing a string by kepler (Scribe) on Feb 22, 2011 at 17:26 UTC
Really nice :) Thanks - it's a wonderful solution and it's module-independent; I don't have the mentioned module on my web server (can you believe that???)

Kind regards,
Kepler
Re^3: Parsing a string by samarzone (Pilgrim) on Feb 23, 2011 at 07:17 UTC
Be aware that the given regular expression can break in some circumstances, for instance when a quoted field contains escaped quotes. Example:

```perl
$s = q[ci,14938340,2,"Monday, February 21, 2011 19:58:06 UTC",34.6953,-118.5350,2.2,17.40, 9,"Southern California, \"US\""];
print for $s =~ m[("[^"]+"|[^,]+)(?:,|$)]g;
```

Output:

```
ci
14938340
2
"Monday, February 21, 2011 19:58:06 UTC"
34.6953
-118.5350
2.2
17.40
 9
"Southern California
 \"US\""
```

-- Regards - Samar
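Editor's sketch: one way to make a regex-only approach tolerate the escaped-quote input from the post above is to let a quoted field contain backslash escapes. The helper name `split_fields` is hypothetical and not from the thread, and the pattern assumes quotes are escaped with a backslash (not doubled, as in standard CSV). Treat it as a minimal illustration, not a replacement for a real CSV parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split one CSV-like line into
# fields, treating \" inside a double-quoted field as an escaped quote.
sub split_fields {
    my ($line) = @_;
    my @fields;
    # Each pass matches one field: either a quoted run (backslash escapes
    # allowed) or a run containing no commas or quotes, followed by a comma
    # or the end of the string.
    while ($line =~ m/\G("(?:[^"\\]|\\.)*"|[^,"]*)(,|\z)/gc) {
        push @fields, $1;
        last if $2 eq '';    # reached the end of the string
    }
    return @fields;
}

my $s = q[ci,14938340,2,"Monday, February 21, 2011 19:58:06 UTC",34.6953,-118.5350,2.2,17.40, 9,"Southern California, \"US\""];
print "$_\n" for split_fields($s);
```

The quotes and backslashes are kept in the returned fields, as in the original regex. If the line is malformed (for example an unclosed quote), the loop simply stops and returns the fields matched so far.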
Re^2: Parsing a string by Monkomatic (Sexton) on Feb 22, 2011 at 18:27 UTC
Yeah, this was a really clean solution. Very nicely done. I was going to suggest something more complicated, like the code below, for pulling out unique values. But you would have had to do it for each data type.

```perl
# CODE for finding a number field of a certain length
# $datainstring holds the input text
my (@match2, @unique, %seen);
if (my @matches = $datainstring =~ m{ ([0-9]{12}) }xmsg) {
    print qq{matched @matches};
    push(@match2, @matches);
    ### GET UNIQUES #####
    foreach my $elem ( @match2 ) {
        next if $seen{ $elem }++;
        push @unique, $elem;
    }
} # IF #
```
Re: Parsing a string by kennethk (Abbot) on Feb 22, 2011 at 16:46 UTC
The easiest answer to your issue is to use one of the many CSV (comma-separated values) modules on CPAN. My preference is for Text::CSV. A sample which does what you request:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new()
    or die "Cannot use CSV: " . Text::CSV->error_diag();

my @rows;
open my $fh, "<&", *DATA or die "Clone failed";
while ( my $row = $csv->getline( $fh ) ) {
    push @rows, $row;
}
$csv->eof or $csv->error_diag();

print join "\n\n", map {join "\n", @$_} @rows;

__DATA__
ci,14938340,2,"Monday, February 21, 2011 19:58:06 UTC",34.6953,-118.5350,2.2,17.40, 9,"Southern California"
```
Re: Parsing a string by johna (Monk) on Feb 22, 2011 at 18:38 UTC
I think Text::ParseWords (core module) should work as well:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;

my $s = q[ci,14938340,2,"Monday, February 21, 2011 19:58:06 UTC",34.6953,-118.5350,2.2,17.40, 9,"Southern California"];
print join "\n", parse_line(",", 1, $s);
print "\n";
```

Outputs:

```
ci
14938340
2
"Monday, February 21, 2011 19:58:06 UTC"
34.6953
-118.5350
2.2
17.40
 9
"Southern California"
```

-John
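Editor's sketch: `parse_line` also copes with the backslash-escaped quotes raised earlier in the thread. Passing a false value as the second argument (`$keep`) strips the enclosing quotes and un-escapes backslashed characters. The input below is the escaped-quote example from the thread with the line-wrap markers removed. Note that `parse_line` also treats single quotes as quoting characters, so data containing apostrophes may need a proper CSV module instead.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

my $line = q[ci,14938340,2,"Monday, February 21, 2011 19:58:06 UTC",34.6953,-118.5350,2.2,17.40, 9,"Southern California, \"US\""];

# A false $keep argument drops the surrounding quotes and turns \" into ".
my @fields = parse_line(',', 0, $line);

print "$_\n" for @fields;
```

With this input the last field comes back as a single value, `Southern California, "US"`, instead of being split at the embedded comma.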
Re: Parsing a string by Monkomatic (Sexton) on Feb 22, 2011 at 18:39 UTC
To BrowserUk: do you have something similar that would fix commas for people who like to stick commas in their addresses?

```
Input:
Mark Williams 6/246 400 Albock road , Apt 2 West, Junction, 6 Alton, NM 60555

Output:
Mark Williams 6/246 400 Albock road Apt 2 West Junction 6 Alton NM 60555
```
Re^2: Parsing a string by aartist (Pilgrim) on Feb 22, 2011 at 20:27 UTC
s/,//g; s/\s+/ /g;
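Editor's sketch: the two substitutions above, applied to the sample address from the question. The wrapper name `strip_commas` is hypothetical, and the address is assumed to arrive as a single string.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not from the thread) around the two substitutions.
sub strip_commas {
    my ($text) = @_;
    $text =~ s/,//g;       # drop every comma
    $text =~ s/\s+/ /g;    # collapse each run of whitespace to one space
    return $text;
}

my $address = 'Mark Williams 6/246 400 Albock road , Apt 2 West, Junction, 6 Alton, NM 60555';
print strip_commas($address), "\n";
# prints: Mark Williams 6/246 400 Albock road Apt 2 West Junction 6 Alton NM 60555
```

Because `\s+` also matches newlines, a multi-line address is flattened onto one line. If line breaks should survive, `s/[ \t]+/ /g` would be one alternative.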