john.tm has asked for the wisdom of the Perl Monks concerning the following question:

Hi I have a script that coverts an xlsx file to a csv file, and then creates individual files based on column [0]. I then run another script that then coverts the new csv files back to xlsx and deletes the csv files. run a seperate scripts they produce what i want.

However i cannot get them to work if i combine them into a single script. The xlsx files are empty and the files are not deleted.

#!/usr/bin/perl use strict; use warnings; use Spreadsheet::BasicRead; use Excel::Writer::XLSX; local $" = "'\n'"; my @classes; our $Header_row; #Set-up Files my $xlsx_WSD = ( "C:\\Temp\\Data\\data.xlsx"),, 1; my @csvtemp; if ( -e $xlsx_WSD ) { my $ss = new Spreadsheet::BasicRead($xlsx_WSD) or die; my $col = ''; my $row = 0; while ( my $data = $ss->getNextRow() ) { $row++; $col= join( "\t", @$data ); push @csvtemp, $col . "\n" if ( $col ne "" ); } } my @arraynew; my %seen; $Header_row = shift (@csvtemp); # this is where the headers a +re stored. foreach (@csvtemp){ chomp; $_ = uc $_ ; $_ =~ s/,//g; +#removes commas $_ =~ s/\t/,/g; #ch +ange from tab to csv push @arraynew, $_ . "\n" if !$seen{$_}++ ; #remove any + dupes } $Header_row =~ s/\t/,/g ; my (%seena, %clss_out); my @class_file; foreach (@arraynew) { chomp; my ($clss_col, $data) = $_ =~ m{^([A-z]*),(.*)}; next if $seen{$clss_col}{$data}++; append_data($clss_col, $_, \%clss_out); } csv_xlsx (); #------------------------------------------------------------------- sub append_data { my ($clss_col, $line, $clss_out) = @_; if (! $clss_out->{$clss_col}) { open($clss_out->{$clss_col}, '>', "C:\\Temp\\Data\\$clss_col.csv +") or die "Unable to open '$clss_col.csv' for appending: $!\n"; print { $clss_out->{$clss_col} } $Header_row; push @class_file, "C:\\Temp\\Data\\$clss_col.csv"; } print { $clss_out->{$clss_col} } $line . "\n"; } sub csv_xlsx{ my $dir = "C:\\Temp\\Data"; foreach my $fp (glob("$dir/*.csv")) { open my $fh, "<", $fp or die "can't read open '$fp'"; my $xlsx = $fp =~ s\csv\xlsx\; my $workbook = Excel::Writer::XLSX->new("$fp"); my $worksheet = $workbook->add_worksheet(); my ( $x, $y ) = ( 0, 0 ); while (<$fh>) { my @list = split /,/; foreach my $c (@list) { $worksheet->write( $x, $y++, $c ); } $x++; $y = 0; } $workbook->close(); close $fh or die "can't read close '$fp'"; } unlink glob "C:\\Temp\\Data\\*.csv" or warn $! ; }
data
CLASS,PUPIL,M/F,Year ZEBRAS,JAMES,M,3 LIONS,AMIE,M,1 GIRAFFES,AMES,M,R MEERKATS,JOHN,M,4 LEOPARDS,JIM,M,2 HIPPOS,PETER,M6 MONKEYS,PELE,M,5 ZEBRAS,DAN,M,3 MONKEYS,MINNIE,F,6 HIPPOS,REG,M,6

Replies are listed 'Best First'.
Re: Scripts work when run individually but not when run as one compiled script
by Laurent_R (Canon) on Jan 19, 2016 at 23:13 UTC
    Your program would be much easier to read if you used consistent indentation. Sorry, but I just don't feel like taking the pain of trying to understand a program that is poorly formatted.

    Update (Jan 19, 2016 at 23:25) to the monk who downvoted this post: do you really think that consistent indentation is optional? I don't, and I would simply not hire a programmer who would not make the effort of indenting his or her code properly. Period. I am not complaining about indentation style, I don't really care about that (even if I have my own preferences), but I am complaining about inconsistent indentation. Indentation is not about making the script pretty, it is about making the intent of the programmer clear.

Re: Scripts work when run individually but not when run as one compiled script
by shmem (Chancellor) on Jan 20, 2016 at 10:00 UTC

    Apart from the hint given in Re: Scripts work when run individually but not when run as one compiled script by GotToBTru - three points:

    1. You mix path delimiters.
      sub csv_xlsx{ my $dir = "C:\\Temp\\Data"; foreach my $fp (glob("$dir/*.csv")) { ... } unlink glob "C:\\Temp\\Data\\*.csv" or warn $! ; }
      To avoid that, always use File::Spec for path operations.
    2. Did you check what those glob() expressions do return?
    3. What are you doing with @class_files ?

    And a last hint - to format your indenting before posting, you can use Perl::Tidy, which expands tabs and sets the shift width to 4 spaces, given the proper rules for that. ;-)

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: Scripts work when run individually but not when run as one compiled script
by soonix (Chancellor) on Jan 20, 2016 at 09:15 UTC

    I wouldn't phrase it as harsh as Laurent_R did, but this:

    Indentation is not about making the script pretty, it is about making the intent of the programmer clear.
    is definitely true.

    After applying perltidy, I don't see much suspicious (except perhaps that you push the file names to an array, and later get them from glob).
    Since you already applied the first bullet point of Basic debugging checklist, you should go on there.
Re: Scripts work when run individually but not when run as one compiled script
by poj (Abbot) on Jan 20, 2016 at 09:38 UTC

    Close the files you created before reading them

    close $clss_out{$_} for keys %clss_out; csv_xlsx();

    Also, consider deleting the csv file individually rather that using glob

    sub csv_xlsx{ my $dir = "C:\\Temp\\Data"; my @csv_files = glob "$dir/*.csv"; foreach my $csv (@csv_files) { (my $xlsx = $csv) =~ s/csv/xlsx/i; print "Creating $xlsx "; my $workbook = Excel::Writer::XLSX->new($xlsx); my $worksheet = $workbook->add_worksheet(); my $row = 0; open my $fh, '<', $csv or die "Can't open '$csv' : $!"; while (<$fh>) { my @list = split /,/; $worksheet->write_row( $row, 0, \@list ); ++$row; } print "$row rows written\n"; $workbook->close(); close $fh or die "Can't close '$csv' ; $!"; unlink $csv or warn $! ; } }

    Update : Why are you using this ?

    my $xlsx_WSD = ( "C:\\Temp\\Data\\data.xlsx"),, 1;

    instead of simply

    my $xlsx_WSD = "C:\\Temp\\Data\\data.xlsx";
    poj
      I use this line.. my $xlsx_WSD = ( "C:\\Temp\\Data\\data.xlsx"),, 1; because in windows the file can be read if it is open.
        I use this line.. my $xlsx_WSD = ( "C:\\Temp\\Data\\data.xlsx"),, 1; because in windows the file can be read if it is open.

        Well, the ,, 1 from above is gratuitous:

        qwurx [shmem] ~> perl -MO=Deparse -e '$xlsx_WSD = ( "C:\\Temp\\Data\\d +ata.xlsx"),, 1;' $xlsx_WSD = 'C:\\Temp\\Data\\data.xlsx', '???'; -e syntax OK

        See the '???' ? This is what's optimized away: the commas, and the 1.

        Then, of course the file can be read if it is open - but only if it is opened for reading, right?

        perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
        in windows the file can be read if it is open.

        As I said here, that maybe true for Excel using Visual Basic for Applications but not in Perl.

        poj
Re: Scripts work when run individually but not when run as one compiled script
by GotToBTru (Prior) on Jan 19, 2016 at 22:00 UTC

    Not sure if this is an actual problem, but it is an inconsistency in your logic:

    ... my $xlsx = $fp =~ s\csv\xlsx\; ... my $workbook = Excel::Writer::XLSX->new("$fp");

    Should this not actually be

    ... my $xlsx = $fp; $xlsx =~ s/csv/xlsx/; ... my $workbook = Excel::Writer::XLSX->new("$xlsx");

    In the s/// they do need to be forward slashes, not backslashes, which have a different meaning.

    Updated to reflect better information in discussion following.

    But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)

      In the s/// they do need to be forward slashes, not backslashes, which have a different meaning.

      Of course the canonical form is s/pattern/replacement/, but the delimiters can be any other character except whitespace, even control characters(but not ^L or ^M := \r, ^J := \n which count as whitespace, too; maybe there are other exceptions), as well as balanced parens, braces and brackets:

      qwurx [shmem] ~> perl -le '$_="foe"; s\oe\riend\; print' friend qwurx [shmem] ~> perl -le '$_="foe"; s^Toe^Triend^T; print' friend qwurx [shmem] ~> perl -le '$_="foe"; s^Woe^Wriend^W; print' friend # even this works qwurx [shmem] ~> perl -le '$_="foe"; s$oe$riend$; print' friend

      Using a meta-character or a sigil as delimiter makes it unusable as such in the regex, which is why we generally don't do that.

      Note that e.g. the ^T is the terminal's representation of "\ct" entered via <Ctrl>-v<Ctrl>-t in the shell on a Linux box.

      perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
        but the delimiters can be any other character except whitespace
        You probably meant it implicitly, but delimiters cannot be word characters (or alphanumerical characters).