Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

how can I manipulate data from csv file as below

week test test_end_date_time

201120 1 05-10-2011_16:47

201120 0 05-10-2011_16:48

201120 1 05-10-2011_16:49

201121 0 05-10-2011_15:47

201121 1 05-10-2011_15:48

201121 1 05-10-2011_15:49

201121 1 05-10-2011_15:50

note : when test is 1 mean pass, 0 mean fail

I waant use a perl script to calculate it so it become

week volume yield

201120 3 66.7%

201121 4 75.0%


Thank you very much

Replies are listed 'Best First'.
Re: how do I manipulate data in csv file
by Tanktalus (Canon) on Jul 26, 2011 at 03:36 UTC

    What have you tried? Does it have to be the same file (not sure if that's a requirement or not)?

    Personally, I'd use DBD::CSV. And then the whole thing becomes a trivial SQL query from your source file and a simple SQL insert to your output file. It is a bit more to set up, but since I set up DBD::CSV from time to time (I use it in a few projects), it becomes a bit more mundane (normal or ordinary). And I avoid a whole lot of work by letting SQL::Statement and friends do all my work for me. And then it's also an easier transition to a real database (MySQL, PostgreSQL, DB2, Oracle, SQLServer, or a more-real database like SQLite) as your needs scale.

      #!/usr/bin/perl use Text::ParseWords; + use Data::Dumper; + use IO::File; #my $file = test.csv; + my $file = $ARGV[0]; my ($output_file) = ($file =~ /(.*)\./); + open(my $fh, '<', $file) or die "Can't read csv file '$file' [$!]\n"; + my $countlines = 0; my $countlines1 = 0; while (my $line = <$fh>) {while (my $line = <$fh>) { chomp $line; my @fields = Text::ParseWords::parse_line(',', 0,$line); + if($countlines >= 0 && @fields[0]==201120) {if ( @fields[1]==1){ $countlines1 = $countlines1 + 1;} $countlines++; } } print "$countlines\n"; print "$countlines1\n"; $yield = ($countlines1 / $countlines)*100 ; print $yield; }

      Thank for your reply,it does not have to be the same csv file. Currently , my code able read one week only and print it volume and yield. I need some guides from you all.

        First, use Text::CSV_XS instead of Text::ParseWords - it reads more like the spec that way (having your code match your spec is always good in my books). Second, I don't see any commas in your original data, so there is an obvious mismatch.

        Next, you are hardcoding your week number in your code. Don't do that. Accumulate it in a hash. I would even go so far as use a hash of hashes with separate fields for the sum and count.

        Also, learn to use proper indentation. Your code is quite simply a mess. It's very difficult to follow nesting, and would get impossible to follow fairly quickly.

        I also see a while inside a while, both reading from the same file. That's nonsensical. Instead, just read and discard a single line: scalar <$fh> should do it.

        my %data; my $csv = Text::CSV_XS->new(); scalar <$fh>; while (my $row = $csv->getline($fh)) { $data{$row->[0]}{sum} += $row->[1]; $data{$row->[0]}{count}++; } for my $weeknum (sort keys %data) { printf "%s %d %.1f\n", $weeknum, $data{$weeknum}{sum} / $data{$weeknum}{count} +, $data{$weeknum}{count}; }
        That should give you enough to go on.

        Update: s/comments/comma/ in the first paragraph

Re: how do I manipulate data in csv file
by jwkrahn (Abbot) on Jul 26, 2011 at 04:02 UTC
    $ echo " 201120 1 05-10-2011_16:47 201120 0 05-10-2011_16:48 201120 1 05-10-2011_16:49 201121 0 05-10-2011_15:47 201121 1 05-10-2011_15:48 201121 1 05-10-2011_15:49 201121 1 05-10-2011_15:50 " | perl -ane'$c{$F[0]}{T}++;$c{$F[0]}{C}+=$ +F[1]}{printf"%s %d %.1f%%\n",$_,$c{$_}{T},$c{$_}{C}/$c{$_}{T}*100 for + sort keys%c' 201120 3 66.7% 201121 4 75.0%

      hi thank for your reply, actually I am not really understand your code, I am just a beginner in learning perl. can i know what if i have a csv file like below?

      week power test Test_end_date_time 201120 5 1 05-10-2011_16:47 201120 699 17 05-10-2011_16:48 201120 454 1 05-10-2011_16:49 201121 5687 26 12-10-2011_15:47 201121 211 1 12-10-2011_15:48 201121 1231 36 12-10-2011_15:49 201121 171 1 12-10-2011_15:50 201121 677 57 12-10-2011_15:51 201121 178 1 12-10-2011_15:52 if test = 1 mean pass, other than 1 is fail I wish the output in another csv is like below, any guides for me? week volume yield 201120 3 66.7% 201121 6 50.0%
        $ echo " week power test Test_end_date_time 201120 5 1 05-10-2011_16:47 201120 699 17 05-10-2011_16:48 201120 454 1 05-10-2011_16:49 201121 5687 26 12-10-2011_15:47 201121 211 1 12-10-2011_15:48 201121 1231 36 12-10-2011_15:49 201121 171 1 12-10-2011_15:50 201121 677 57 12-10-2011_15:51 201121 178 1 12-10-2011_15:52 " | perl -ane'/^\D/&&next;$c{$F[0]}{T}++;$c{$F[0]}{C}+=$F[2]==1}{print +f"%s %d %.1f%%\n",$_,$c{$_}{T},$c{$_}{C}/$c{$_}{T}*100 for sort keys% +c' 201120 3 66.7% 201121 6 50.0%
Re: how do I manipulate data in csv file
by Anonymous Monk on Jul 26, 2011 at 03:04 UTC
Re: how do I manipulate data in csv file
by onelesd (Pilgrim) on Jul 26, 2011 at 06:09 UTC