Re: merging .csv text input based on matching column field values in multiple rows

Replies are listed 'Best First'.
Re^2: merging .csv text input based on matching column field values in multiple rows by abrg (Initiate) on Feb 25, 2009 at 18:40 UTC
beth, countzero, wol thanks for your help/encouragement. before i go much further, i should post the actual working code i currently have which has been modified to fit the input/output scenario above. #!/usr/bin/perl use strict; use warnings; use Text::CSV_XS; my $csv = Text::CSV_XS->new(); my $file = ''; if (defined $ARGV[0]) { $file = $ARGV[0]; } open(my $data, '<', $file) or die "Could not open '$file'\n"; while (my $line = <$data>) { if ($csv->parse($line)) { my @columns = $csv->fields(); ## set vars my $orderNumber = $columns[0]; my $name = $columns[1]; my $itemId = $columns[2]; my $itemDesc = $columns[3]; my $price = $columns[4]; open (OUTFILE, "> output/$orderNumber.xml") or die $! . " can' +t open the file\n"; #print OUTFILE <<XML; print OUTFILE "<order>\n"; print OUTFILE " <order_id>$orderNumber</order_id>\n"; print OUTFILE " <name>$name</name>\n"; print OUTFILE " <item>\n"; print OUTFILE " <item_id>$itemId</item_id>\n"; print OUTFILE " <item_desc>$itemDesc</item_desc>\n"; print OUTFILE " <price>$price</price>\n"; print OUTFILE " </item>\n"; print OUTFILE "</order>\n"; #XML close (OUTFILE); } else { warn "Line could not be parsed:\n"; } } [download] i've read the sort documentation but i'm still having trouble wrapping my brain around how to actually execute it with the Text::CSV module (forgive me, i realize this is a pretty basic concept). the File::Sort module seems interesting, though, i feel i should be able to solve this natively in PERL (also encountering some strange issues when trying to run this module). essentially, how do i get at the file array and then look inside it to the first column's value on each row? it would appear that my input file is being defined in scalar context where it should be an array. am i getting warmer? ;) thanks again	[reply] [d/l]
Re^3: merging .csv text input based on matching column field values in multiple rows by ELISHEVA (Prior) on Feb 25, 2009 at 19:28 UTC
Showing your code was definitely a good move. Now I have a better idea of why you are finding how to use sort so confusing. To "sort and group" you'll need three passes through the file lines: one to read in the lines, one to sort, and one to print. You'll also need to store the data from the "reading in" phase in an an array of array references. I've included some code demonstrating the algorithm. You'll have to make adaptions to fit your situation, but it should at least show you how to use the sort-n-group algorithm. I don't have the CVS module you are using so I've done a simple chomp and split to emulate it. You'll want to substitute that part with your code to read in CVS. I've also read in sample lines from <DATA> and printed to STDOUT for convenience. It also should make it easy to just download the code and see if it does indeed do what you want. Read more... (3 kB) Best, beth	[reply] [d/l]
Re^3: merging .csv text input based on matching column field values in multiple rows by wol (Hermit) on Feb 26, 2009 at 13:31 UTC
The "sort and group" principle shown in other nodes is sound, but it makes most sense when you're only looking at arrays. Here's an implementation that uses both arrays and hashes, to make an internal perl representation of the data that corresponds more closely (compared to a just arrays) to what you need to output. The output loop therefore becomes a bit simpler. Read more... (4 kB) -- use JAPH; print JAPH::asString();	[reply] [d/l] [select]