in reply to Descriptive Stats from .csv file

I have found Data::Table to be a nice module for this type of task. It borrows some ideas from R that are helpful for handling tabular data (it implements melt and cast functions inspired by R's reshape package). You can accomplish your task with the following code, assuming that your data is saved in a file named data.csv. The author of Data::Table has also made additional documentation available here: https://sites.google.com/site/easydatabase/

Data in data.csv:

Month,Zone,Replicate,SpeciesA,SpeciesB,SpeciesC
Sept,1,1,5,10,15
Sept,1,2,0,5,10
Sept,1,3,5,0,5
Sept,2,1,5,5,5
Sept,2,2,10,15,10
Sept,2,3,0,0,5

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Table;
use Statistics::Lite qw(mean stddev);

# Load the CSV file; the first line is used as the header.
my $dt = Data::Table::fromCSV("data.csv");

print "Original Data Table\n";
print "===================\n";
print $dt->tsv;
print "\n\n";

# Reshape to long format: one row per (Month, Zone, Replicate, species).
my $melt = $dt->melt(['Month', 'Zone', 'Replicate']);

print "Melt Table\n";
print "==========\n";
print $melt->tsv;
print "\n\n";

# Group by Month and Zone, one column per species, aggregated with mean.
my $cast_mean = $melt->cast(
    ['Month', 'Zone'], 'variable', Data::Table::STRING, 'value', \&mean
);

print "Cast (mean)\n";
print "===========\n";
print $cast_mean->tsv;
print "\n\n";

# Same grouping, aggregated with the standard deviation.
my $cast_stddev = $melt->cast(
    ['Month', 'Zone'], 'variable', Data::Table::STRING, 'value', \&stddev
);

print "Cast (stddev)\n";
print "=============\n";
print $cast_stddev->tsv;

exit;

This code will give the following output:

Original Data Table
===================
Month  Zone  Replicate  SpeciesA  SpeciesB  SpeciesC
Sept   1     1          5         10        15
Sept   1     2          0         5         10
Sept   1     3          5         0         5
Sept   2     1          5         5         5
Sept   2     2          10        15        10
Sept   2     3          0         0         5


Melt Table
==========
Month  Zone  Replicate  variable  value
Sept   1     1          SpeciesA  5
Sept   1     1          SpeciesB  10
Sept   1     1          SpeciesC  15
Sept   1     2          SpeciesA  0
Sept   1     2          SpeciesB  5
Sept   1     2          SpeciesC  10
Sept   1     3          SpeciesA  5
Sept   1     3          SpeciesB  0
Sept   1     3          SpeciesC  5
Sept   2     1          SpeciesA  5
Sept   2     1          SpeciesB  5
Sept   2     1          SpeciesC  5
Sept   2     2          SpeciesA  10
Sept   2     2          SpeciesB  15
Sept   2     2          SpeciesC  10
Sept   2     3          SpeciesA  0
Sept   2     3          SpeciesB  0
Sept   2     3          SpeciesC  5


Cast (mean)
===========
Month  Zone  SpeciesA          SpeciesB          SpeciesC
Sept   1     3.33333333333333  5                 10
Sept   2     5                 6.66666666666667  6.66666666666667


Cast (stddev)
=============
Month  Zone  SpeciesA          SpeciesB          SpeciesC
Sept   1     2.88675134594813  5                 5
Sept   2     5                 7.63762615825973  2.88675134594813
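
If you want to check what the cast step computes for a single cell, here is a minimal sketch using only core Perl. It recomputes the SpeciesA values for Sept, Zone 1 by hand. The helper names sample_mean and sample_stddev are my own, not part of Data::Table or Statistics::Lite, and I am assuming the sample (n - 1) form of the standard deviation, which is what the numbers in the output above are consistent with.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: arithmetic mean of a list of numbers.
sub sample_mean {
    my @x = @_;
    return sum(@x) / scalar(@x);
}

# Hypothetical helper: sample standard deviation (divides by n - 1).
sub sample_stddev {
    my @x = @_;
    return 0 if @x < 2;    # not enough values to estimate a spread
    my $m  = sample_mean(@x);
    my $ss = sum(map { ($_ - $m) ** 2 } @x);
    return sqrt($ss / (scalar(@x) - 1));
}

# SpeciesA counts for Sept, Zone 1 (Replicates 1 to 3) from data.csv.
my @species_a_zone1 = (5, 0, 5);

printf "mean   = %.6f\n", sample_mean(@species_a_zone1);
printf "stddev = %.6f\n", sample_stddev(@species_a_zone1);
```

This prints mean = 3.333333 and stddev = 2.886751, which agree with the SpeciesA entries for Zone 1 in the two cast tables.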

Re^2: Descriptive Stats from .csv file
by korsmo (Initiate) on Feb 03, 2014 at 18:14 UTC
    Hi,
    That seemed to work well for obtaining the values I need - thanks! I have loaded the module Excel::Writer::XLSX to try to bring those values into a .xlsx file. Do you know if I need to put the cast_means and cast_stddev values into an array before I can write the .xlsx file?
    Thanks again,
    BK

      You can use the Data::Table::Excel module to put the contents of a Data::Table object into an Excel file. From the documentation,

      This perl package provide utility methods to convert between an Excel file and Data::Table objects. It then enables you to take advantage of the Data::Table methods to further manipulate the data and/or export it into other formats such as CSV/TSV/HTML, etc.

      For example, this code will create an xlsx file that contains both the mean and stddev tables:

      #!/usr/bin/env perl
      use strict;
      use warnings;
      use Data::Table;
      use Data::Table::Excel qw(tables2xlsx);
      use Statistics::Lite qw(mean stddev);

      my $dt = Data::Table::fromCSV("data.csv");

      my $melt = $dt->melt(['Month', 'Zone', 'Replicate']);

      my $cast_mean = $melt->cast(
          ['Month', 'Zone'], 'variable', Data::Table::STRING, 'value', \&mean
      );

      my $cast_stddev = $melt->cast(
          ['Month', 'Zone'], 'variable', Data::Table::STRING, 'value', \&stddev
      );

      tables2xlsx("descriptive_stats.xlsx", [ $cast_mean, $cast_stddev ]);

      exit;
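
On the original question of whether the values need to go into an array first: if you would rather drive Excel::Writer::XLSX yourself, its write_row method takes an array reference, and a Data::Table object can hand you one per row. The following is only a sketch of that approach. The write_table helper and the output file name are hypothetical, and the small table is a stand-in for $cast_mean, typed in from the mean values shown earlier so the example runs without data.csv.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Table;
use Excel::Writer::XLSX;

# Hypothetical helper: copy one Data::Table into a new worksheet.
sub write_table {
    my ($workbook, $sheet_name, $table) = @_;
    my $sheet = $workbook->add_worksheet($sheet_name);

    # Header row first; write_row expects an array reference.
    $sheet->write_row(0, 0, [ $table->header ]);

    # Then each data row, shifted down by one to leave room for the header.
    for my $i (0 .. $table->nofRow() - 1) {
        $sheet->write_row($i + 1, 0, $table->rowRef($i));
    }
    return $sheet;
}

# Stand-in for $cast_mean, built from the mean values shown earlier.
my $cast_mean = Data::Table->new(
    [
        [ 'Sept', 1, 3.33333333333333, 5,                10 ],
        [ 'Sept', 2, 5,                6.66666666666667, 6.66666666666667 ],
    ],
    [ 'Month', 'Zone', 'SpeciesA', 'SpeciesB', 'SpeciesC' ],
    0,    # 0 = data is given row by row
);

# Assumed output file name, chosen for this example only.
my $workbook = Excel::Writer::XLSX->new('descriptive_stats_manual.xlsx')
    or die "Cannot create workbook: $!";
write_table($workbook, 'mean', $cast_mean);
$workbook->close();
```

In the real script you would pass $cast_mean and $cast_stddev from the cast calls instead of the stand-in table, calling write_table once per sheet. Going through tables2xlsx is less code; writing the rows yourself is mainly useful if you want to apply Excel::Writer::XLSX formats to particular cells.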