Re: Comma Delimited File

Having had bad experience with Text::CSV_XS (I'm guessing it was probably me, not the module ;o) I tend to use hand-rolled solutions if it's nice'n'simple (I mean *really* simple), otherwise it's best to rely on the tried-and-tested APIs available. So anyway here's my meagre effort -

use strict;

$_ = <DATA>;
my $cnt = tr/,/,/ + 1;

my @fields;
push @fields, split /,/ while <DATA>;
chomp(@fields);

my %data;
push @{$data{$_ % $cnt}}, $fields[$_] for 0..@fields-1;

my @row;
for (keys %data) {
        @row = sort @{$data{$_}};
        print "$_ - @row\n";
}

__END__
num1,num2,num3
123,345,578
345,349,340
12,348,023
534,90283,230
239,394,283
[download]

What this does is

counts the amount of fields (which in turn assumes theres a header line)
stores all the data in a flat array
get's rid of those danged newlines
creates a hash with the columns as the keys
then just loops through the data and sorts each column

Sorry if this is a little direct in approach, but if you're rolling your own it might nudge you in the right direction (and I was looking for a challenge, simple though it is :o)
HTH

broquaint

Update: it's probably best to go with davorg's approach if you actually want to implement this, but I'll leave this here in the spirit of TMTOWTDI.

Comment on Re: Comma Delimited File Download Code

Replies are listed 'Best First'.
Re: Re: Comma Delimited File by davorg (Chancellor) on Nov 07, 2001 at 21:36 UTC
Your code has a pretty fundemental flaw where it assumes that all commas in the data are there to separate fields. CSV files allow commas to exist in quoted fields. That's why `Text::CSV_XS` (or even `Text::ParseWords`) are far better solutions. But, if I knew that this wasn't going to be an issue, I'd do it something like this: `$_ = <DATA>; chomp; my @hdrs = split /,/; my @recs; while (<DATA>) { chomp; my %rec; @rec{@hdrs} = split /,/; push @recs, \%rec; }` [download] That seems, to me, to create a far more useful data structure - an array of records, with each record being a reference to a hash where the keys are the header fields and the values are the associated data values. Here's an example that then sorts the data on the second value in each record: `print "num1\tnum2\tnum3\n"; foreach (sort { $a->{num2} <=> $b->{num2} } @recs) { print "$_->{num1}\t$_->{num2}\t$_->{num3}\n"; }` [download] -- <http://www.dave.org.uk> "The first rule of Perl club is you don't talk about Perl club."	[reply] [d/l] [select]
Re: Comma Delimited File by mikeB (Friar) on Nov 07, 2001 at 21:19 UTC
Could you fill us in on the problems you've had with Text::CSV_XS? I've had no trouble so far using it with files generated by Excel.	[reply]
Re: Re: Comma Delimited File by broquaint (Abbot) on Nov 07, 2001 at 23:16 UTC
As I said, the problem was probably me and not Text::CSV_XS. But so as not to be elusive I'll fill you in as to why I gave up on it. Firstly I was working on something that had to done as fast as humanly (or inhumanly as the case sometimes is) possible, so when I encountered the problems with Text::CSV_XS I didn't spend too much time on figuring out what wrong. I wasn't doing anything particularly fancy or neat, just formatting some data from a TDF. But for some reason it would only find 4 of 9 records, and I can't see why. There wasn't anything too gruesome in the data (as it was created to be as friendly as possible) and splitting it on "\t" seemed to work a treat so I stuck with that. If you're interested in the code it was something like this - `use strict; use Text::CSV_XS; my $tdf = Text::CSV_XS->new({'sep_char'=>"\t"}); open(TDF, "somedata.tdf") or die("b0rk - $!"); my @fields = (); while($tdf->parse(scalar <TDF>)) { @fields = $tdf->fields(); print "@fields\n"; }` [download] That seems to work fine on some simple data, but it broke on the data I was feeding it, so I'm guessing that the data was the root of the problem (and I would've dug further but I was under a wee bit of pressure ;o). I should imagine I'll use it if the need ever arises again, and hopefully the data behaves itself this time (assuming it was the data ...) HTH broquaint	[reply] [d/l]