by TStanley (Canon)
on May 14, 2001 at 18:33 UTC

Item Description: Read and separate character separated data

Review Synopsis:

This module is used for reading character separated data. At the
suggestion of its creator, I used it in a project of mine, and I am
quite glad that I did so.

Here are the available methods within this module:

new() This is the constructor. It takes a hash of three arguments, all of which are optional: the file name, the file handle, and the single character separator. If a filename is passed and a file handle isn't, it opens a filehandle on that file and sets the filehandle accordingly. The separator defaults to a comma.

set_filename(), set_fh(), set_sep() These are the set methods for the optional arguments for the new() constructor.

open_file() This method takes the name of a file and opens it. It will also set the filename
and file handle.

bind_fields() Takes an array of field names and memorizes the positions for later use.

bind_headers() Reads a row from the file as a header line, and memorizes the field positions for
later use. This method is preferred over the bind_fields method.

get_row() Reads a row from the file and returns an array or a reference to an array, depending
upon context. It also stores the row in the row property for later use.

extract() Extracts a list of fields out of the last row read.
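
To show how these methods fit together, here is a minimal sketch. It assumes the installed Text::xSV provides the methods as described above; the sample data and the column names (id, question, answer) are made up for illustration and are not from my program.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Text::xSV;    # CPAN module, not part of core Perl

# Write a small pipe-separated sample file (hypothetical data).
# The quoted field in the first record contains an embedded newline.
my ($out, $filename) = tempfile(UNLINK => 1);
print {$out} qq{1|"What is Perl?\nPick one."|a language\n};
print {$out} qq{2|True or false?|true\n};
close $out or die "Can't close $filename: $!\n";

# The separator is passed to the constructor; it must be a single character.
my $xsv = Text::xSV->new(sep => "|");
$xsv->open_file($filename);

# This sample has no header line, so the columns are named by hand.
$xsv->bind_fields(qw(id question answer));

my @questions;
while ($xsv->get_row()) {
    # extract() pulls the named fields out of the row just read.
    my ($id, $question) = $xsv->extract(qw(id question));
    push @questions, [$id, $question];
}

printf "read %d records\n", scalar @questions;
```

If the file carried its own header line, the header-binding method would take the place of the bind_fields() call.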

The Good
As tilly points out in the POD that accompanies the module, most people try to use
split to separate value-separated lines, or they read the line and try to parse it. This
makes it impossible to handle returns that are embedded in a field.
The xSV object solves this problem by holding the filehandle itself: if, while parsing,
it notices that it needs another line, it can read one from the file at will.
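
To see the problem concretely, here is a small sketch in core Perl only (no Text::xSV involved). The sample data is made up; the point is simply that splitting line by line breaks a quoted field that contains a return.

```perl
use strict;
use warnings;

# Two logical records separated by "|". The second field of the first
# record is quoted and contains an embedded newline.
my $data = qq{1|"What is Perl?\nPick one."|a language\n2|True or false?|true\n};

# Naive approach: treat every physical line as a record, then split it.
my @records = map { [ split /\|/, $_ ] } split /\n/, $data;

# The two logical records arrive as three fragments.
printf "naive parsing found %d records\n", scalar @records;
print join(" / ", @$_), "\n" for @records;
```

The first record is cut in half at the embedded return, so the fragments no longer line up with the fields.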

The Bad (and Ugly)
The module is very unforgiving about the separator: it only works with a single
character. The overall speed isn't too bad, but as in all things, there is always
room for improvement.

The section of code below is taken from my program. This function
reads all of the questions into a hash reference, from which I later pull
questions at random.

    use Text::xSV;

    sub Loader{
        my $Data = shift;
        my $file = shift;
        ## I declare the Text::xSV object
        my $XSV = Text::xSV->new();
        my $question_number;
        my $length;
        my $f;
        my @Sorter=();
        ## I set my separator here, otherwise it
        ## defaults to a comma.
        $XSV->set_sep("|");
        open(my $fh, "<", $file) or die "Can't open $file: $!\n";
        ## The xSV object does the reading itself, so it
        ## needs the filehandle.
        $XSV->set_fh($fh);
        ## Here I use the get_row method to retrieve
        ## the row, which also parses it.
        while(@Sorter = $XSV->get_row()){
            $question_number = shift @Sorter;
            $length = @Sorter;
            for($f = 0;$f < $length;$f++){
                ## $Data is a hash reference, hence the arrow
                $Data->{$question_number}[$f] = $Sorter[$f];
            }#End of for loop
        }#End of while loop
        close $fh;
        return $Data;
    }#End of sub

One thing I did notice is that if the lines were of different lengths (say, one question was multiple choice and the next was true/false), the module noticed the difference and spat out a warning to that effect. The row was still stored in the row property. This, however, did not affect the overall performance of the program itself.
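
For the random draw mentioned earlier, something along these lines would do. This is only a sketch in core Perl: the sample questions are made up, and it assumes the first stored field of each row is the question text, which may not match every data file.

```perl
use strict;
use warnings;

# Hypothetical data in the shape the loader builds:
# question number => array of the remaining fields.
# Rows may have different lengths (multiple choice vs. true/false).
my $Data = {
    1 => ["What is Perl?", "a language", "a gem", "a snake"],
    2 => ["Perl has hashes.", "true"],
};

# Pick one question number at random, then unpack its row.
my @numbers = keys %$Data;
my $pick    = $numbers[ rand @numbers ];
my ($text, @answers) = @{ $Data->{$pick} };

print "Question $pick: $text\n";
print "  - $_\n" for @answers;
```

Because each row is stored as its own array, rows of different lengths sit side by side in the same hash without any trouble.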