search and delete comma delimited file

sea has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: search and delete comma delimited file by Limbic~Region (Chancellor) on Dec 08, 2004 at 23:25 UTC
Simone, Checking the CPAN is never a bad idea when trying to solve a new problem. Text::xSV is one of a handful of modules that make this easy. #!/usr/bin/perl use strict; use warnings; use Text::xSV; my $csv = Text::xSV->new( fh => *DATA ); $csv->bind_header; while ( $csv->get_row() ) { my $row = $csv->extract_hash(); print $row->{Domain}, "\n"; } __DATA__ "Column 1","Company Name","Column 2","Domain","Status","Start Date","E +xpiration Date","Renew Date" "1495107228781574","SomeCompany, Inc.","8335788590132972","kerpow.org" +,"Active", "Jan 27 2000 9:02PM","May 1 2004 12:34PM","May 28 2005 2:02AM" "1495107228781574","SomeCompany, Inc.","1802975834647670","squish.com" +,"Active", "Jul 15 2004 1:49PM","May 2 2004 11:05PM","May 19 2005 11:05PM" [download] It should not be hard to take this as an example and use the documentation to get what you want, but I will leave that up to you. Cheers - L~R	[reply] [d/l]
Re: search and delete comma delimited file by diotalevi (Canon) on Dec 08, 2004 at 23:32 UTC
Try this on for size. It reads your file using the standard Text::CSV_XS module and prints just the domain field. `use strict; use warnings; use constant { FIELD_Unknown_1 => 0, FIELD_CompanyName => 1, FIELD_Unknown_2 => 2, FIELD_Domain => 3, FIELD_Status => 4, FIELD_Unknown_3 => 5, FIELD_Unknown_4 => 6, FIELD_Unknown_5 => 7, }; my $parser = Text::CSV_XS->new(); while ( my $line = <> ) { $parser->parse( $line ); my @columns = $parser->fields; my @new_columns = @columns[ FIELD_Domain, ]; $parser->combine( @new_columns ); print $parser->string; }` [download]	[reply] [d/l]
Re: search and delete comma delimited file by jZed (Prior) on Dec 08, 2004 at 23:26 UTC
If you want to access the file as if it were a database, you can use DBD::CSV which will allow DBI/SQL access to multiple CSV files in a variety of formats, including the one you show (embedded commas in quote-delimited fields). If you aren't a SQL person, then Text::CSV_XS will parse the file for you.	[reply]
Re: search and delete comma delimited file by tall_man (Parson) on Dec 08, 2004 at 23:53 UTC
In this case (grabbing one inner field), you could avoid the fancy modules: #!/usr/bin/perl use strict; use warnings; while (<DATA>) { my $domain = (split /","/,$_,5)[3]; print $domain,"\n"; } __DATA__ "1495107228781574","SomeCompany, Inc.","8335788590132972","kerpow.org" +,"Active","Jan 27 2000 9:02PM","May 1 2004 12:34PM","May 28 2005 2:02 +AM" "1495107228781574","SomeCompany, Inc.","1802975834647670","squish.com" +,"Active","Jul 15 2004 1:49PM","May 2 2004 11:05PM","May 19 2005 11:0 +5PM" "1495107228781574","SomeCompany, Inc.","4430459098697337","wanka.com", +"Active","Nov 9 2004 9:29AM","May 3 2004 2:14PM","May 16 2005 2:14PM" "1495107228781574","SomeCompany, Inc.","2113292909703608","blah.com"," +Active","Jan 16 2003 10:48AM","May 4 2004 12:34PM","May 16 2005 5:48A +M" [download] Note: this one works even if some company names have commas and the others don't.	[reply] [d/l]
Re: search and delete comma delimited file by McMahon (Chaplain) on Dec 08, 2004 at 23:27 UTC
There are many ways to do this, but I really like DBD::CSV. The SQL interface is really really powerful for this sort of data wrangling.	[reply]
Re: search and delete comma delimited file by Popcorn Dave (Abbot) on Dec 08, 2004 at 23:52 UTC
Another thing you might try - and this is only if your data holds as your examples - is something I did for a stock tracking program. The trick is though that your data is always in the same format. Just split it and grab what you need. In my case, I'm getting static stocks data from Yahoo Finance and I expect it to be in a certain order, so I just split it, and only take what I'm after. If your data is going to change, then this isn't going to do it for you. HTH #!/usr/bin/perl use strict; use warnings; my @results; my $relevant; while (<DATA>){ (undef, undef, undef, $relevant) = split ('","',$_); push(@results, $relevant); } print join("\n", @results); __DATA__ "1495107228781574","SomeCompany, Inc.","8335788590132972","kerpow.org" +,"Active","Jan 27 2000 9:02PM","May 1 2004 12:34PM","May 28 2005 2:02 +AM" "1495107228781574","SomeCompany, Inc.","1802975834647670","squish.com" +,"Active","Jul 15 2004 1:49PM","May 2 2004 11:05PM","May 19 2005 11:0 +5PM" "1495107228781574","SomeCompany, Inc.","4430459098697337","wanka.com", +"Active","Nov 9 2004 9:29AM","May 3 2004 2:14PM","May 16 2005 2:14PM" "1495107228781574","SomeCompany, Inc.","2113292909703608","blah.com"," +Active","Jan 16 2003 10:48AM","May 4 2004 12:34PM","May 16 2005 5:48A +M" [download] Output: kerpow.org squish.com wanka.com blah.com Useless trivia: In the 2004 Las Vegas phone book there are approximately 28 pages of ads for massage, but almost 200 for lawyers.	[reply] [d/l]
Re^2: search and delete comma delimited file by sea (Initiate) on Dec 09, 2004 at 14:13 UTC
Popcorn Dave, Okay, this absolutely sits side by side with the sunrise, that I so was lucky to be able to witness this morning... I'm going to walk through the other responses as well so I can see the different ways to approach and visualize this data! THANK YOU! :) Oh yeah, I also wanted to give you a blurb of, 'this is was what I did to make this work on my end.' I took this line of code: while (<DATA>){ and changed it to this: while (<>){ so I could make the module read from STDIN and direct it to the new file: ./wicked.pl thefile.txt > thedataIneed.txt and like I said above...WOW!! What a beautiful sunrise! Simone~	[reply]
Re^2: search and delete comma delimited file by jZed (Prior) on Dec 09, 2004 at 14:40 UTC
split ('","',$_); While this split works on the data set presented, it is definitely not a good general solution for CSV parsing. There are at least two kinds of valid CSV that will break it. In the first place, many CSV generators omit quotation marks around fields that don't contain embedded special characters so this is a valid three-field CSV record in braces: {foo,"a,b",7} in which the quote-comma-quote split would fail because there are not quotes around some of the field-separating commas. Also, if there are embedded quote marks in the data, those are usually escaped by doubling the quote marks so this is also a valid three-field CSV record: {7,"a"",""b",8}, in which the split would fail because it would find a non-field-separating quote-comma-quote pattern.	[reply]
Re^3: search and delete comma delimited file by Popcorn Dave (Abbot) on Dec 10, 2004 at 06:12 UTC
No, you're right, I see that now, and that's a very good point. I was just trying to show something that I had used, with the caveat that it may or may not work for the OP's data. In my particular case, the data is just CSV so just a split on ',' with the undefs works. Useless trivia: In the 2004 Las Vegas phone book there are approximately 28 pages of ads for massage, but almost 200 for lawyers.	[reply]