PerlMonks  

Re: Comparison of the parsing features of CSV (and xSV) modules

by dragonchild (Archbishop)
on Jun 14, 2004 at 16:41 UTC


in reply to Comparison of the parsing features of CSV (and xSV) modules

You missed a few features of Text::xSV:
  • It reads records from the file based on $/, so setting $/ to something else changes the record separator.
  • It lets you check whether each record has the correct number of fields.
  • It lets you retrieve the data by column heading in a hash, instead of always in an array.
  • It lets you write rows from a hash keyed by column heading, instead of always from an array.
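
To make the $/ and hash-by-heading points concrete, here is a rough sketch in plain core Perl. It deliberately does not call Text::xSV at all, so nothing here is the module's actual API; the sample data, the ";" record separator, and the naive split (no quoting support) are all just assumptions for illustration.

```perl
use strict;
use warnings;

# Assumed sample data: records end in ";" rather than a newline.
my $data = "name,lang;tilly,Perl;jZed,SQL;";

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @rows;
{
    local $/ = ';';                    # record separator used by readline and chomp
    chomp(my $header = <$fh>);         # first record holds the column headings
    my @cols = split /,/, $header;
    while (my $line = <$fh>) {
        chomp $line;                   # strips the trailing $/
        my %row;
        @row{@cols} = split /,/, $line, -1;   # naive split, no quoting support
        push @rows, \%row;             # each row is a hash keyed by heading
    }
}
close $fh;

print "$_->{name} uses $_->{lang}\n" for @rows;
```

The point is only that the caller picks fields by name rather than by position, and that the record boundary follows whatever $/ happens to be.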

Adding a few features shouldn't be difficult. Specifically:

  • reject newlines
  • user-defined delimiter
  • user-defined escape
  • forced-delimiting

tilly?

------
We are the carpenters and bricklayers of the Information Age.

Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re^2: Comparison of the parsing features of CSV (and xSV) modules
by jZed (Prior) on Jun 14, 2004 at 17:02 UTC
    You missed a few features of Text::xSV
    It wasn't my intention to make a full comparison of the features, only of the CSV parsing. Like Text::xSV, AnyData and the DBDs also have many options for handling the resulting data structures; it would take a much lengthier meditation to compare all of those.
    Adding a few features shouldn't be difficult.
    Those features may or may not be worthwhile in Text::xSV. I didn't intend the minus signs in the chart to necessarily indicate something the modules should support. Then again, if tilly does add the user-defined features, I can use Text::xSV as a backend for DBD::AnyData so that there would be a DBI/SQL interface to his excellent module.
      The problem is not adding the features, it is figuring out the API for offering them, and seeing the need.

      Which of the above user-defined features does DBD::AnyData need? All of them? Some? We probably should offline a discussion of how to provide them in a sane manner.

      Some are already present in some form. For instance Text::xSV offers the ability to pre-filter input as it comes in. There is nothing to stop such a filter from saying, "These characters are not allowed." That won't work for eliminating embedded tabs in tab-delimited data. But it would for NUL characters. And if you want to disallow tabs embedded in a tab-delimited file, you could just wrap Text::xSV with the necessary validation logic.
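
      A rough sketch of the kind of filter meant above, in plain Perl. The helper name make_reject_filter is made up for illustration, and how such a filter would be handed to Text::xSV is not shown here (check the module's documentation for that); this only shows the "these characters are not allowed" logic on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: build a filter that dies if a line matches a
# forbidden pattern, and otherwise passes the line through unchanged.
sub make_reject_filter {
    my ($forbidden_re, $label) = @_;
    return sub {
        my ($line) = @_;
        die "Input contains $label\n" if $line =~ $forbidden_re;
        return $line;
    };
}

my $no_nul = make_reject_filter(qr/\0/, 'a NUL character');

my @results;
for my $line ("a,b,c\n", "a,\0b,c\n") {
    # eval traps the die so one bad line does not end the whole run.
    if (eval { $no_nul->($line); 1 }) {
        push @results, 'accepted';
    }
    else {
        my $err = $@;
        chomp $err;
        push @results, "rejected: $err";
    }
}

print "$_\n" for @results;
```

      The same closure approach would not help with tabs inside tab-delimited data, since the filter sees the raw line before any fields are split out; that case needs validation wrapped around the parser, as noted above.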

      Also, as dragonchild points out, it reads based on $/, which effectively lets you change the record separator. Furthermore because I got tired of bug reports about invalid csv files, by default Text::xSV will treat either \n or \r\n as a newline, so files produced on Windows can be read on Unix and vice versa.
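
      For illustration only, accepting either line ending can be sketched in a few lines of plain Perl. This is not Text::xSV's actual implementation, and strip_newline is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: strip one trailing LF or CRLF from a line.
sub strip_newline {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# The same record as written on Unix and on Windows.
my @fields_unix    = split /,/, strip_newline("a,b,c\n");
my @fields_windows = split /,/, strip_newline("a,b,c\r\n");

print "Same fields either way\n" if "@fields_unix" eq "@fields_windows";
```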

      Some of the other missing features can't presently be worked around. For me, the barrier isn't adding them (I would if I thought people would use them); it is figuring out what a reasonable API should be.
        T> Which of the above user-defined features does DBD::AnyData need? All of them? Some? We probably should offline a discussion of how to provide them in a sane manner.

        JZ> I'll think it over and msg you. I'm not sure it's worth your trouble if you don't have other uses for them. As I said, the minuses in the chart don't necessarily indicate a lack. Your module covers most of the formats people need; the other modules fill in the gaps for some odd formats that cover the edge cases.

        T> Some are already present in some form.

        JZ> I'll take a closer look and revise the chart.

        T> because I got tired of bug reports about invalid csv files, by default Text::xSV will treat either \n or \r\n as a newline, so files produced on Windows can be read on Unix and vice versa.

        JZ> I hear you. :-)
