in reply to Writing Spreadsheet parse to Database

It would help if you could explain your task more clearly. In the meantime, let's clean up some code. Your first section of "for" loops is written C-style and can be cleaned up:

    # Labels sit directly on the loops so that "next SHEET" and
    # "next ROW" advance to the next iteration of the right loop.
    SHEET:
    for my $sheet ( 0 .. $book->{SheetCount} - 1 ) {
        my $work_sheet = $book->{Worksheet}[$sheet];
        print "--------- SHEET:", $work_sheet->{Name}, "\n";
        next SHEET if ! defined $work_sheet->{MaxRow};

        ROW:
        for my $row ( $work_sheet->{MinRow} .. $work_sheet->{MaxRow} ) {
            next ROW if ! defined $work_sheet->{MaxCol};
            for my $column ( $work_sheet->{MinCol} .. $work_sheet->{MaxCol} ) {
                my $cell = $work_sheet->{Cells}[$row][$column];
                print "( $row , $column ) =>", $cell->Value, "\n" if $cell;
            } # next column
        } # next row
    } # next sheet

Note that all of the variable names have been made much more descriptive. Isn't the first line of the following easier to read than the second?

    my $cell = $work_sheet->{Cells}[$row][$column];
    $oWkC = $oWkS->{Cells}[$iR][$iC];

Which would you want to maintain?

From here, it's a simple matter of creating a data structure to hold your data instead of writing it to disk or a database. Here's a rough hack of some VERY untested code that should point you in the right direction.

    my %book_data;

    SHEET:
    for my $sheet ( 0 .. $book->{SheetCount} - 1 ) {
        my $work_sheet = $book->{Worksheet}[$sheet];
        my $sheet_name = $work_sheet->{Name};
        $book_data{ $sheet_name } = [];
        next SHEET if ! defined $work_sheet->{MaxRow};

        ROW:
        for my $row ( $work_sheet->{MinRow} .. $work_sheet->{MaxRow} ) {
            next ROW if ! defined $work_sheet->{MaxCol};
            my @vals;    # one entry per column; undef where a cell is empty
            for my $column ( $work_sheet->{MinCol} .. $work_sheet->{MaxCol} ) {
                my $cell = $work_sheet->{Cells}[$row][$column];
                push @vals, $cell;
            } # next column
            $book_data{ $sheet_name }[ $row ] = \@vals;
        } # next row
    } # next sheet

As I mentioned, this code is not "cut-n-paste", as it's untested (and I'm pretty sure that the final %book_data assignment is not going to match your needs), but hopefully it's a good start. Once you have the data structure in place, it's easy to pass around and access (assuming you don't have so much data that you run into memory problems).
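As a rough illustration of that "pass around and access" step, here is a minimal sketch. The sample data and the rows_as_text helper are made up for the example, and plain strings stand in for the cell objects, so treat it as a starting point rather than working code for your spreadsheet:

```perl
use strict;
use warnings;

# Hypothetical sample data shaped like %book_data: a hash of sheet
# names, each holding an array of rows, each row an array of values.
my %book_data = (
    Sheet1 => [ [ 'id', 'name' ], [ 1, 'Ovid' ] ],
    Empty  => [],
);

# Hypothetical helper: takes a reference to the structure and returns
# one "sheet: a, b" line per row that was actually filled in.
sub rows_as_text {
    my ($data) = @_;
    my @lines;
    for my $sheet_name ( sort keys %$data ) {
        for my $row ( @{ $data->{$sheet_name} } ) {
            next unless defined $row;    # rows before MinRow are undef
            my @vals = map { defined $_ ? $_ : '' } @$row;
            push @lines, "$sheet_name: " . join( ', ', @vals );
        }
    }
    return @lines;
}

print "$_\n" for rows_as_text( \%book_data );
```

Passing a reference ( \%book_data ) means the sub works on the original structure instead of a copy, which matters once the spreadsheet gets large.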

Side note: I don't like Hungarian notation (stuff like $oWkC or $iR, where the data type is coded into the variable name), because you are creating a maintenance obstacle. What if you define a particular variable as an int but, at some point in the future, you need to change it to a float? If you reference said variable 200 times, it can be a pain to change all references to it. Of course, in well-designed systems these issues tend to be minimized, but I routinely work on systems where previous programmers wrote a library with variables like $main::somevar. Because they had the bad habit of using a lot of globals, I often have to keep several programs in sync when I change something. Guess what would happen if they had used Hungarian notation? It would simply magnify the problem. Some poor fool who comes along later and has to maintain the code base doesn't realize that $intSomeVar is now a str. Also, Perl's typing is based on scalars, arrays, and hashes, not int, char, or float.
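To make that last point concrete, here is a small sketch (the variable names are invented for the example) showing one scalar holding an integer, a float, and a string in turn, while ref only ever reports the container type:

```perl
use strict;
use warnings;

my $some_var = 42;             # holds an integer
$some_var    = 3.14;           # now a float, same variable
$some_var    = 'forty-two';    # now a string

# ref reports the kind of container, never int/float/str.
my @kinds = ( ref \$some_var, ref [], ref {} );
print "@kinds\n";              # prints "SCALAR ARRAY HASH"
```

A name like $intSomeVar would have been wrong after the second line, and nothing in Perl would have complained.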

Okay, enough rambling. Back to work :)

Cheers,
Ovid


(tye)Re: Writing Spreadsheet parse to Database
by tye (Sage) on May 01, 2001 at 22:02 UTC

    Regarding Hungarian notation: This is Perl; variables aren't integers nor floats, they are scalars. Reading the original code, I see "o" being used as a prefix for "object" and see "i" being used as a prefix. In Perl, I wouldn't think "integer" when I see "i". In this code I would think "index", which is how it is being used. I can't say for sure what the original author meant "i" to mean, however. (Nor can I fathom how you would end up changing that to a float.)

    I find prefixing variable names with type information can be very useful but I don't mean type as in "int" vs. "short" vs. "double". I mean "type" as in "count of items", "index into list", "input-only", "output-only", "input and output", etc.

    Some people I work with actually suggested that, if you have a comment describing a variable, you should consider just changing the name of the variable to that entire comment. I think they forget that even English has pronouns, and that avoiding all shorthand just makes things long and probably harder to understand rather than easier. Of course, using variable names that are too vague is a much more common problem than using ones that are too long. (:

    Taking the particular case of "o", which I feel confident in guessing the meaning of, if we change our implementation such that the variable is no longer an object, then we have to change a lot more than the variable's name. There might be a few places that we could avoid changing if the name didn't require changing....

    ...then again, what I find a much worse maintenance problem is variable names that are a textual match to other things. I hate going through a program looking for all uses of the variable "column" and finding lots of comments about columns. At least for Perl scalars, $column is less of a problem (for aggregate types, I prefer to be able to search for just the variable name rather than /[@\$]column/).

    OK, that wandered all over the place. I'll just stop and leave that as-is in hopes that someone will find something useful in it. Sorry for the mess. /:

            - tye (but my friends call me "Tye")