in reply to Create Perl script
I realize you're just trying to get your homework assignment turned in on time so you can get a passing grade in a course and keep your parents off your back, but let's have a closer look at the question. It's actually mildly interesting. I hope you spend as much time thinking over and working on this problem as I'm going to spend explaining a reasonable approach to tackling it. I have my doubts, but I should give you the benefit of the doubt.
First you need to parse an Excel file, right? One of the simplest solutions for that is Spreadsheet::ParseExcel. That module lets you iterate over the rows of the spreadsheet, with each row available as an array of that row's cell values. In that format it will be pretty easy for you to build up an array of arrays, or some other suitable data structure.
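Here's a minimal sketch of that parsing step, assuming the data lives in the first worksheet of a file called students.xls (both the filename and the single-sheet layout are assumptions, so adjust to taste):

    use strict;
    use warnings;
    use Spreadsheet::ParseExcel;

    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse('students.xls')
        or die $parser->error(), "\n";

    my ($worksheet) = $workbook->worksheets();    # first sheet only
    my ($row_min, $row_max) = $worksheet->row_range();
    my ($col_min, $col_max) = $worksheet->col_range();

    my @rows;
    for my $row ($row_min .. $row_max) {
        my @cells;
        for my $col ($col_min .. $col_max) {
            my $cell = $worksheet->get_cell($row, $col);
            push @cells, defined $cell ? $cell->value() : '';
        }
        push @rows, \@cells;    # array of arrays: one arrayref per row
    }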
The next problem is that you want to be able to search your data structure by ID, name, or address. You probably ought to decide which search type will be most common. Maybe that's a name search. If so, you may as well build your data structure as a hash of arrays. The hash keys would be students' names; each value would be an array containing ID and address. If you think an ID search would be more common, build it as a hash of arrays keyed on IDs instead. That assumes IDs can come and go as students come and go, which would render an array indexed on IDs impractical.
Whatever you choose as your hash key, that only solves one third of the search problem. You'll also have to build two more index hashes. Let's say your primary data hash is keyed on student names. Your first index hash would use IDs as keys, and your second index hash would use addresses as keys. OK, you don't have to do that, but it will make your searches much more efficient if you do.
So now we've figured out a data structure to use, and built indexes for O(1) searches. But you also asked how to deal with duplicates. It would be pretty lame for this imaginary database to have duplicate IDs. Can we assume that's not going to happen? If we can't, you'll just have to check for them as you're building up your ID index hash, and when one occurs, let that index entry point to two elements in your primary database hash. Duplicate names should be rare, but they're certainly not impossible. For those, you definitely need your name-keyed hash to allow more than one entry per key. The way to accomplish that is to push another record onto the hash element's value whenever a duplicate occurs. The same technique works for your address index hash.
The description is much more complicated than the implementation. You'll end up with data that looks like this:
    %name_keyed = (
        christine => [ 1, "New York"  ],
        murray    => [ 2, "Redmond"   ],
        kumar     => [ 3, "Sunnyvale" ],
    );
And if 'christine' is duplicated....
    %name_keyed = (
        christine => [
            [ 1, "New York"       ],
            [ 4, "Salt Lake City" ],
        ],
        ...
    );
Your index hashes shouldn't be too hard to figure out from there.
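To make it concrete, here's a minimal sketch of the build, assuming each row from the parsing step holds ( name, id, address ) in that order (the column order is an assumption). One simplification over the shapes above: every key always maps to an array of records, so duplicates need no special case:

    my (%name_keyed, %id_index, %address_index);
    for my $row (@rows) {
        my ($name, $id, $address) = @$row;    # assumed column order
        push @{ $name_keyed{$name} },       [ $id, $address ];
        push @{ $id_index{$id} },           $name;    # index: ID -> name(s)
        push @{ $address_index{$address} }, $name;    # index: address -> name(s)
    }

    # An ID search is then two O(1) hash lookups:
    my @matches = @{ $id_index{42} || [] };              # names holding ID 42
    my @records = map { @{ $name_keyed{$_} } } @matches;

The uniform always-an-array layout costs a level of indirection on every lookup, but it buys you one code path instead of two.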
Oh, that leaves the question about GetOptions. I assume that means your instructor wants you to use the module Getopt::Long (or Getopt::Std, which is discussed in getopt). The documentation speaks for itself there. I would guess he wants you to set up some command-line switches such as -n for name searches, -i for ID searches, and -a for address searches.
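A sketch of those switches with Getopt::Long (the exact switch names are just my guess at what the instructor wants):

    use strict;
    use warnings;
    use Getopt::Long;

    my %opt;
    GetOptions(
        'n|name=s'    => \$opt{name},
        'i|id=i'      => \$opt{id},
        'a|address=s' => \$opt{address},
    ) or die "Usage: $0 [-n name] [-i id] [-a address]\n";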
Now, let's be serious about this for a moment. If I were actually approaching this as a real problem (instead of some homework assignment for High School Comp. Sci. I), I would assume that the data set could become large. That being the case, I would probably pull the spreadsheet into a database. If it won't grow too huge, I would use DBI with DBD::SQLite. If it could really grow, I would use a heavier-duty database such as MySQL. I would build indexes on the searchable fields, and write a nice simple command-line tool or a quick-n-dirty CGI interface as a front end for the searching. Maybe you could provide a full-fledged database solution for extra credit.
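For flavor, a minimal DBI/SQLite sketch of that approach, assuming a students.db file and the same three fields (the file, table, and index names are all made up):

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:SQLite:dbname=students.db', '', '',
                           { RaiseError => 1 });

    $dbh->do('CREATE TABLE IF NOT EXISTS students (
                  id      INTEGER PRIMARY KEY,
                  name    TEXT,
                  address TEXT
              )');
    $dbh->do('CREATE INDEX IF NOT EXISTS name_idx    ON students (name)');
    $dbh->do('CREATE INDEX IF NOT EXISTS address_idx ON students (address)');

    # The database does the searching; indexes keep it fast.
    my $sth = $dbh->prepare('SELECT id, name, address FROM students WHERE name = ?');
    $sth->execute('christine');
    while (my $row = $sth->fetchrow_hashref) {
        print "$row->{id}\t$row->{name}\t$row->{address}\n";
    }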
Dave