in reply to writing after parsing

Hello Ashwitha,

Answer to your question!

I do not know if you are still following this question, but I decided to spend some time becoming more familiar with *.csv files and to answer your question.

Warning

My proposed solution is based on the output I see from the test *.xls files that I created. I hope that is exactly what you require; if not, the structure is the same and you only need to check the syntax and modify it accordingly.

Structure

I have found a dynamic solution to your question; I hope this is what you are looking for. By dynamic I mean that the input is not restricted to the 2 *.xls files that you requested, but handles any number of *.xls files as input. I think this is better, since you might need 2 today but 10 tomorrow, etc.

The majority of the code in my script was taken from Spreadsheet::ParseExcel, where you can find more information about it; credit should also go to the author, Douglas Wilson.

So the answer to your question is this sample of code:

Update: (11/08/2014)

Based on the corrections suggested by Anonymous Monk (see the comments below for more information).

#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;
use List::Compare;
use Spreadsheet::ParseExcel;

my @array = ();

sub check {
    foreach my $arg (@_) {
        # Check each $ARGV input for xls and csv extensions
        my $suffix = ( split /\./, $arg )[-1];
        if ( $suffix =~ /xls$/ || $suffix =~ /csv$/ ) {
            # For demonstration purposes
            # print "I found it ".$_."!\n";
            next;
        }
        elsif ( $suffix !~ /xls$/ || $suffix !~ /csv$/ ) {
            print "\nPlease enter two type inputs '*.xls' to read from and '*csv' to write at.\n";
            print "Any other file format (e.g. '*.txt') is not acceptable.\n\n";
            exit();
        }
    } # End of foreach(@ARGV)
    return (@_);
}

my @arguments = check(@ARGV);

sub extra {
    foreach my $file (@_) {
        # Check each $ARGV input for the csv extension
        my $suffix = ( split /\./, $file )[-1];
        if ( $suffix =~ /csv$/ ) {
            # For demonstration purposes
            # print "I found it: ".$file."!\n";
            my @result = sub_write($file);
            return @result;
        }

        my $parser   = Spreadsheet::ParseExcel->new();
        my $workbook = $parser->parse($file);
        if ( !defined $workbook ) {
            die $parser->error(), ".\n";
        }

        foreach my $worksheet ( $workbook->worksheets() ) {
            my ( $row_min, $row_max ) = $worksheet->row_range();
            my ( $col_min, $col_max ) = $worksheet->col_range();
            my ( $final, $value );

            foreach my $row ( $row_min .. $row_max ) {
                foreach my $col ( $col_min .. $col_max ) {
                    my $cell = $worksheet->get_cell( $row, $col );
                    next unless $cell;
                    $final .= "," . $cell->value();

=test
                    # Based on the CPAN tutorial output: Spreadsheet::ParseExcel
                    print "This is the \$final: ".$final."\n";
                    print "Row, Col = ($row, $col)\n";
                    print "Value = ", $cell->value(), "\n";
                    print "Unformatted = ", $cell->unformatted(), "\n";
                    print "\n";
=cut

                }
                # remove the first character of the string ($final)
                if ( defined $final ) {
                    substr( $final, 0, 1 ) = "";
                    # remove all the not necessary following characters ($final)
                    chop($final);
                    # print "This is the \$final before: ".$final."\n";
                    push( @array, $final );
                    $final = ();
                }
                else {
                    next;
                }
            }
        } # end of Worksheets
    } # End of foreach (@_)
    return 0;
} # End of sub (extra)

sub sub_write {
    my $csv = shift;
    open( my $write, ">", $csv ) or die "Can not open " . $csv . ": $!\n"; # > write mode
    foreach my $line (@array) {
        print $write $line . "\n";
    }
    close($write) or die( "Could not close: " . $csv . " - $!\n" );
    return ( @array, $csv );
}

my @output = extra(@arguments);
print Dumper( \@output );

The code is able to read all sheets from each *.xls file, but it cannot put them back into separate sheets in their original positions. I have not looked into that yet, because I do not want to spend more time on it; I do not know whether it can be solved in 5 minutes or whether it is extremely difficult, so I leave this up to you. I noticed, though, that when I export an *.xls file with multiple sheets (e.g. more than 2), the resulting *.csv file only contains the sheet that is being exported, not all of them, so I am not sure how to do it or how to script it. I am sure there is a way to put all sheets back together, but as I said, I do not want to spend more time on it right now; a rough sketch of one possible direction is below.
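
If you do need one *.csv file per sheet, here is a rough and untested sketch of one possible approach (the sub name split_sheets and the per-sheet file naming are just my own choices for illustration):

#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::ParseExcel;

# Hypothetical sketch: write one CSV per worksheet, named after the sheet.
sub split_sheets {
    my ($xls) = @_;
    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse($xls)
        or die $parser->error(), ".\n";
    foreach my $worksheet ( $workbook->worksheets() ) {
        my $name = $worksheet->get_name();
        open( my $out, ">", "$name.csv" ) or die "Can not open $name.csv: $!\n";
        my ( $row_min, $row_max ) = $worksheet->row_range();
        my ( $col_min, $col_max ) = $worksheet->col_range();
        foreach my $row ( $row_min .. $row_max ) {
            my @cells;
            foreach my $col ( $col_min .. $col_max ) {
                my $cell = $worksheet->get_cell( $row, $col );
                push @cells, defined $cell ? $cell->value() : "";
            }
            print $out join( ",", @cells ), "\n";
        }
        close($out) or die "Could not close $name.csv: $!\n";
    }
    return 0;
}

foreach my $xls_file (@ARGV) {
    split_sheets($xls_file);
}

The idea is simply to open a new output file inside the worksheets() loop, instead of pushing every row of every sheet into the same @array.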

I have chosen to use a specific output format for the *.csv file, the same one that I get if I apply "Save As" from an Excel file.

Sample of an Excel file in *.csv format:

Item_1,Cost_1,Sold_1,Profit_1
Keyboard_1,$10.00,$16.00,$6.00
Monitor_1,$80.00,$120.00,$40.00
Mouse_1,$5.00,$7.00,$2.00
,,Total_1,$48.00

I have tested that my *.csv output file can be regenerated as an Excel file, and I also found a script that can do the transformation for me.

I am not posting that script here, since it is not relevant to the question.

Sample of the sample.xls file that I used to test my code:

Item_1,Cost_1,Sold_1,Profit_1
Keyboard_1,$10.00,$16.00,$6.00
Monitor_1,$80.00,$120.00,$40.00
Mouse_1,$5.00,$7.00,$2.00
,,Total_1,$48.00

Sample of the test.xls file; this one contains two sheets, numbered as Sheet 1 and Sheet 2. I cannot demonstrate here what it looks like; just import them into your Excel file as two separate sheets, or copy and paste them, to test your code.

Item_2,Cost_2,Sold_2,Profit_2
Keyboard_2,$10.00,$16.00,$6.00
Monitor_2,$80.00,$120.00,$40.00
Mouse_2,$5.00,$7.00,$2.00
,,Total_2,$48.00

Item_3,Cost_3,Sold_3,Profit_3
Keyboard_3,$10.00,$16.00,$6.00
Monitor_3,$80.00,$120.00,$40.00
Mouse_3,$5.00,$7.00,$2.00
,,Total_3,$48.00

The output of the code for these particular files:

Item_1,Cost_1,Sold_1,Profit_1
Keyboard_1,$10.00,$16.00,$6.00
Monitor_1,$80.00,$120.00,$40.00
Mouse_1,$5.00,$7.00,$2.00
,,Total_1,$48.00
Item_2,Cost_2,Sold_2,Profit_2
Keyboard_2,$10.00,$16.00,$6.00
Monitor_2,$80.00,$120.00,$40.00
Mouse_2,$5.00,$7.00,$2.00
,,Total_2,$48.00
Item_3,Cost_3,Sold_3,Profit_3
Keyboard_3,$10.00,$16.00,$6.00
Monitor_3,$80.00,$120.00,$40.00
Mouse_3,$5.00,$7.00,$2.00
,,Total_3,$48.00

Sorry for the long answer and post, but I think it was necessary to explain everything. In case you have any problems understanding how the code operates, do not hesitate to ask me. I would also strongly suggest that you go through the CPAN link that I provided earlier; it has a lot of useful information, and since you are planning to work with *.xls files I would advise you to read it again and again.

Hope this solves your question, or at least comes as close as possible to your solution.

Seeking for Perl wisdom...on the process...not there...yet!

Replies are listed 'Best First'.
Re^2: writing after parsing
by Anonymous Monk on Aug 08, 2014 at 14:10 UTC

    Hello thanos1983, since you've indicated that you're still learning, here are a few more suggestions:

    1. Although it might work, I'd again suggest not to use $_ as much as you are. For example, inside your sub extra, you're using it for at least two different things; it'd be much easier to keep track of this if you used named lexical variables instead, e.g. my $filename and my $line. $_ is fine for short pieces of code, but using it over long spans can cause problems later on when you or someone else attempts to read and modify the code.
    2. Style: Intermixing sub declarations with code makes the program flow a little hard to follow. Usually, one would put all the subs at the bottom of the file, with all the other code above.
    3. Design: Perhaps the program structure would be clearer if you split extra into two subs, one for reading the XLS and one for writing the CSV? Also, writing CSV is much better with Text::CSV (see the sketch after this list).
    4. Style: Proper indentation helps the reader. See also perltidy.
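
    To illustrate point 3, here is a minimal, untested sketch of what writing one row with Text::CSV could look like (the output file name and the field values below are only placeholders):

    use strict;
    use warnings;
    use Text::CSV;

    # Text::CSV handles quoting, embedded commas and line endings for you
    my $csv = Text::CSV->new( { binary => 1, eol => "\n" } )
        or die "Cannot use CSV: " . Text::CSV->error_diag();

    open( my $fh, ">", "out.csv" ) or die "Can not open out.csv: $!\n";
    $csv->print( $fh, [ "Keyboard_1", '$10.00', '$16.00', '$6.00' ] );
    close($fh) or die "Could not close out.csv: $!\n";

    That way you don't have to concatenate commas by hand and then strip the leading one, as the current script does.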

    Keep up the good work learning, and the effort you are putting into your responses is admirable. (Just please make sure they stay correct, and/or are marked as "I'm still learning", so other people learning can benefit from them as much as you are.)

      Hello Anonymous Monk,

      That is true, I am overusing the variables ($_, @_) because I found them so useful, and to be 100% honest, I thought they operate faster than defining a new variable (e.g. my $line).

      But as I said, I am still learning, and to be honest I am trying to read and comment on as many questions as I can. There are two reasons: firstly, people more experienced than me can comment on top of my comments so I can learn from them; secondly, I think that more experienced people tend to forget how it was back when "they were new to programming" and usually provide only a straight answer to the problem. I feel a bit uneasy about that, because apart from a solution to the problem, it would be nice for the user to actually know and understand why this is happening and why the code should be written like this.

      Well, to be even more honest, I have only been coding for almost 2 years, and I have been working with a variety of programming languages, so I am not specialized in any of them. When I started working more with Perl, I was impressed with its abilities. So now I am learning and trying to get my hands "dirty" as much as possible, to be able to assist people with my long comments and answers. :D

      I took your comments from the question into consideration, but I had finished this code 2 days earlier. I was looking forward to improving the user input handling (see question-2); I am still working on that, trying to take it through an OO approach, so I did not spend time modifying this code. I wanted to provide a sample solution to the user so he can start working with it, because I have a few ideas to push the code further before I stop working on it.

      But because that might take more time, I did not modify it.

      Anyway, I have already written too much. Thank you again for your assistance with my questions; I hope that I will get better soon. ;)

      I am a recent graduate and looking forward to getting a job in telecommunications, where I will get new challenges to learn and develop my skills even further.

      Seeking for Perl wisdom...on the process...not there...yet!

        Yes, writing a lot of (well documented) code and continually improving it is a great way to learn!

        I am overusing the variables ($_, @_) because ... I thought they operate faster

        My suggestion would be: don't worry about speed until it becomes an issue. Saving a few milliseconds execution time is only sometimes important, but a lot of the time code readability and maintainability is more important. When writing, one should of course take care not to brute-force everything (like for example knowing to use a hash instead of grepping over an array where appropriate, or not reading a second input file once for every line of the main input, like two recent posts here), but as they say, premature optimization is the root of all evil. If the execution speed of a program does become an issue, then one can profile the code and optimize the places where it's actually slow.
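
        And if you ever do want to test a hunch like "$_ is faster than a named lexical", the core Benchmark module makes the comparison easy. Here is a small made-up sketch (the data and the two labels are only for illustration):

        use strict;
        use warnings;
        use Benchmark qw(cmpthese);

        my @lines = ('Keyboard_1,$10.00,$16.00,$6.00') x 10_000;

        cmpthese( -1, {
            # implicit $_
            default_var => sub { my @c; for (@lines) { push @c, ( split /,/ )[0] } },
            # named lexical
            named_var   => sub { my @c; for my $line (@lines) { push @c, ( split /,/, $line )[0] } },
        } );

        Running it prints a comparison table, which is far more informative than guessing, and any difference here is unlikely to matter for code like this.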

        I am a recent graduate and looking forward to getting a job in telecommunications

        http://jobs.perl.org ;-)