techjohnny has asked for the wisdom of the Perl Monks concerning the following question:

I have an order report that is generated in Crystal in this format:


CUSTA
1/2015 100
10/2015 1,000
12/20015 1,000
CUSTB
2/2015 100
10/2015 1,000
12/2015 100

I usually take this report and manually copy and paste transpose into a new worksheet so it will display in columns with all the customers in rows and the months across the top. I have used Perl in the paste to parse and reformat data before, but without help from the monks. I was hoping to get some help again?

Since some months do not have sales, that month should report either blank or 0. So far I've simply created an array with:

@months = (jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec); for $m (@months) { print $m,"\t"; } print "\n";

This simply only creates the columns across the top, but the numbers is where it's tricky. I'm guessing it would look for MM/YYYY and data and enter that information or skip and fill in MM/YYYY 0 Thanks in advance.

Replies are listed 'Best First'.
Re: Reformat RAW Excel Data
by GotToBTru (Prior) on Jan 11, 2016 at 20:15 UTC

    Good start. You will need to be able to match month names (jan, feb, etc) to month numbers in your dates. You can simplify that by filling the 0th position in the array with a dummy value. Then the month number will equal the index into the array.

    Next you need to design the data structure to hold your report. Here's an idea: a hash keyed by the customer name containing a reference to an array of sales by month.

    use strict; use warnings; use Data::Dumper; my @months = qw/dummy jan feb mar apr may jun jul aug sep oct nov dec/ +; my (%report,$cust,$value,$date); while (<DATA>) { chomp; if (/^[A-Z]/) { $cust = $_; next; } ($date, $value) = split ' ',$_; my ($mon, $year) = split '/',$date; $report{$cust}[$mon] = $value; } print Dumper(\%report); __DATA__ CUSTA 1/2015 100 10/2015 1,000 12/2015 1,000 CUSTB 2/2015 100 10/2015 1,000 12/2015 100

    This is incomplete, of course. For instance, I'm ignoring the year values in your data.

    But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)

      This only populates the array up to the index corresponding to the highest value of $mon.

      __DATA__ CUSTA 1/2015 100 12/2015 1,000 CUSTB 2/2015 100 3/2015 100
      Output:
      $VAR1 = { 'CUSTA' => [ undef, '100', undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, '1,000' ], 'CUSTB' => [ undef, undef, '100', '100' ] };

      edit: provided working code

      You can pre-populate the array of month values for each customer as you find it:

      # note: # # @{ ... } turns the hash entry for this customer into an array. # # map creates a new list consisting of one element, undef, for eac +h # element from the list it is given, i.e. @months
      #!/usr/bin/perl use strict; use warnings; use Data::Dumper; $Data::Dumper::Sortkeys = 1; my @months = qw/dummy jan feb mar apr may jun jul aug sep oct nov dec/ +; my %report; my $cust; while (<DATA>) { chomp; if ( /^[A-Z]/ ) { $cust = $_; @{ $report{ $cust } } = map { undef } @months; next; } my ($date, $value) = split ' ', $_; my ($mon, $year) = split '/', $date; $report{$cust}[$mon] = $value; } print Dumper(\%report); __DATA__ CUSTA 1/2015 100 12/2015 1,000 CUSTB 2/2015 100 3/2015 100
      Output:
      $VAR1 = { 'CUSTA' => [ undef, '100', undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, '1,000' ], 'CUSTB' => [ undef, undef, '100', '100', undef, undef, undef, undef, undef, undef, undef, undef, undef ] };

      Having said all that, however, this problem seems to be ill-suited for using an array. I would store the values in a hash and have the code that reads the hash (probably looping through @months) print 'undef' or a blank or whatever, if there's no value for the given months.


      The way forward always starts with a minimal test.
        First, thank you so much for the response and assistance. Perlmonks never fails to amaze me!
        Here's the code I finally come up with. Those who use Crystal reports will know of the limitations of data printing in rows and not columns.
        #!/bin/perl my @months=qw/"" jan feb mar apr may jun jul aug sep oct nov dec/; open(DATAS,"<","sales.txt") ; my (%report,$cust,$value,$date); for (@months) { print "$_","\t"; } print "\n"; while (<DATAS>) { chomp; if (/^[A-Z]/) { $cust = $_; @{$report{$cust}} = map { 0 } @months; next; } ($date,$value) = split ' ', $_; my ($mon,$year) = split '/',$date; $report{$cust}[$mon]=$value; } foreach my $keys (keys %report) { print $keys,"\t"; for (0..12) { print $report{$keys}[$_],"\t"; } print "\n"; } print "\n";
Re: Crystal Reports to CSV
by FreeBeerReekingMonk (Deacon) on Jan 12, 2016 at 01:40 UTC
    #!/usr/bin/perl #use warnings; use strict; my @months = ( '','jan','feb','mar','apr','may','jun','jul','aug','sep +','oct','nov','dec' ); my @CUSTOMERS; # retain order of appearance in Crystal reports file. my %DATA; # hash, as suggested by fellow monks my $customer = ""; # holds current customer my ($month, $day, $value); while(<DATA>){ chomp; next unless $_; # skip empty lines # sorry, others did a better job with a simple split. if( m{^\s*(\d+)/(\d+)\s+([\d,\.]+)} ){ # matches a line with date and value ($month,$day,$value) = ($1, $2, $3); # 0+$month transforms string 02 into value 2 $DATA{$customer}{0+$month} = $value; }else{ $customer = $_; # next customer push @CUSTOMERS, $customer; # map{ $DATA{$customer}{$_}=0 } (1..12) # uncomment for zero's } } print join(";", @months). "\n"; for my $customer (@CUSTOMERS){ print $customer .";" . join( ";", map{ $DATA{$customer}{$_} } (1..12) ) . "\n"; } __DATA__ CUSTA 1/2015 100 10/2015 1,000 12/2015 1,000 CUSTB 02/2015 200 10/2015 1,000 12/2015 100

    output:

    ;jan;feb;mar;apr;may;jun;jul;aug;sep;oct;nov;dec CUSTA;100;;;;;;;;;1,000;;1,000 CUSTB;;200;;;;;;;;1,000;;100
Re: Reformat RAW Excel Data
by FreeBeerReekingMonk (Deacon) on Jan 12, 2016 at 02:07 UTC

    BOFH version

    #!/usr/bin/perl print join(";", qw(2015 jan feb mar apr may jun jul aug sep oct nov de +c)); while(<DATA>){ chomp; print ( s/^(\d+)\/\d+\s+// ? (";" x ($1-$c)).";$_" : ($c=0) || "\n$_" ) && ($c=$c?$1+1:1); } print "\n"; __DATA__ CUSTA 1/2015 100 10/2015 1,000 12/2015 1,000 CUSTB 02/2015 200 10/2015 1,000 12/2015 100

    Output:

    2015;jan;feb;mar;apr;may;jun;jul;aug;sep;oct;nov;dec CUSTA;100;;;;;;;;;1,000;;1,000 CUSTB;;200;;;;;;;;1,000;;100

    edit: small typo