
Re: best way to change xml record using XML::Simple?

by CountZero (Bishop)
on Feb 24, 2003 at 19:40 UTC


in reply to best way to change xml record using XML::Simple?

XML is of course very modern, hip and funky to boot, but I wonder whether it is a bit of overkill here.

As your data seems to be very regular, I would go either for a simple CSV file (easy to extract from an Excel file) and DBD::CSV, or go directly to the Excel file itself (if it is in an acceptable format, i.e. worksheet = TABLE and the first row contains the column headings) and use DBD::Excel.

CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Re: best way to change xml record using XML::Simple?
by seattlejohn (Deacon) on Feb 24, 2003 at 20:33 UTC
    While I agree with some of your thoughts, I personally try to avoid CSV files because the definition of "CSV" is so elastic in practice. How do you quote or escape fields containing commas and quotes? How do you handle trailing empty fields? I've seen a lot of variation in stuff that is ostensibly "CSV", and that always makes me nervous from a long-term maintainability and interoperability standpoint.
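
To make the quoting concern concrete, here is a minimal sketch in core Perl (not from the original post). The sample line and the sub names naive_fields and quoted_fields are made up for illustration, and the quote-aware parser assumes just one common convention: fields wrapped in double quotes, with "" standing for a literal quote. Files that follow other rules would need different code, which is exactly the problem described above.

```perl
use strict;
use warnings;

# Naive parsing: split on every comma, ignoring quotes.
# The -1 limit keeps trailing empty fields.
sub naive_fields {
    my ($text) = @_;
    return split /,/, $text, -1;
}

# Quote-aware parsing under ONE assumed convention:
# double-quoted fields, "" inside quotes means a literal quote.
sub quoted_fields {
    my ($text) = @_;
    my @fields    = ('');
    my $in_quotes = 0;
    my @chars     = split //, $text;

    for (my $i = 0; $i < @chars; $i++) {
        my $c = $chars[$i];
        if ($in_quotes) {
            if ($c eq '"') {
                if ($i + 1 < @chars && $chars[$i + 1] eq '"') {
                    $fields[-1] .= '"';    # escaped quote
                    $i++;
                }
                else {
                    $in_quotes = 0;        # closing quote
                }
            }
            else {
                $fields[-1] .= $c;         # commas are plain data here
            }
        }
        elsif ($c eq '"') { $in_quotes = 1 }
        elsif ($c eq ',') { push @fields, '' }
        else              { $fields[-1] .= $c }
    }
    return @fields;
}

# Hypothetical record whose second field contains a comma.
my $line  = 'Smith,"Acme, Inc.",555-1234';
my @naive = naive_fields($line);
my @aware = quoted_fields($line);
printf "naive: %d fields, quote-aware: %d fields\n",
    scalar(@naive), scalar(@aware);
```

The naive version cuts "Acme, Inc." in two, while the quote-aware version keeps it as one field. In real code a CPAN CSV module is a better choice than a hand-rolled parser like this one.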

    That said, I do agree delimited files can make a lot of sense for lightweight data storage. For information that normally wouldn't contain internal whitespace other than "regular" (ASCII 32) spaces -- and this application might qualify -- I often choose tab-delimited.
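
A tab-delimited reader and writer along these lines might look like the sketch below. It uses only core Perl; the sub names parse_tsv_line and format_tsv_line are hypothetical, and it assumes that fields never legitimately contain tabs or newlines, as described above.

```perl
use strict;
use warnings;

# Split one tab-delimited line into fields.
# The -1 limit makes split keep trailing empty fields, which it
# would otherwise drop silently.
sub parse_tsv_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # strip LF or CRLF line ending
    return split /\t/, $line, -1;
}

# Join fields into a line, refusing data that would corrupt the format.
sub format_tsv_line {
    my (@fields) = @_;
    for my $field (@fields) {
        die "field contains a tab or newline: '$field'\n"
            if $field =~ /[\t\r\n]/;
    }
    return join("\t", @fields) . "\n";
}

# Hypothetical record with two empty trailing fields.
my @fields = parse_tsv_line("John Clyman\t123-4567\t\t\n");
print scalar(@fields), " fields\n";
print format_tsv_line(@fields);
```

Rejecting embedded tabs on output is a deliberate choice: the format has no escaping rule, so the writer is the only place to catch data that would break it.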

    One potential benefit I do see to using XML in this application is that you can easily store data that isn't quite so regular. For example, if you wanted to support multiple contact phone numbers, it would be fairly easy to expand the data structure like this:

    <record>
      <!--existing stuff-->
      <contact_phone_number note="business">123-4567</contact_phone_number>
      <contact_phone_number note="pager">555-6789</contact_phone_number>
      <contact_phone_number note="vacation home in Bermuda">+1-99-20-55-6789</contact_phone_number>
    </record>

    Dealing with irregularities like that would be a bit more work in a rigid db-like or non-hierarchical file format.
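
As a rough illustration of how such a record could be read with XML::Simple, here is a sketch. It needs the XML::Simple module from CPAN; the sub name phone_numbers and the contact_name element are made up for the example. ForceArray is set so that contact_phone_number is an array even when a record holds a single number, and KeyAttr => [] switches off array folding.

```perl
use strict;
use warnings;
use XML::Simple;    # CPAN module, not part of core Perl

# Return a list of [note, number] pairs for one <record> document.
sub phone_numbers {
    my ($xml) = @_;
    my $record = XMLin(
        $xml,
        ForceArray => ['contact_phone_number'],    # always an array ref
        KeyAttr    => [],                          # no folding by attribute
    );
    # An element with an attribute and text becomes a hash whose
    # text is stored under the 'content' key.
    return map { [ $_->{note}, $_->{content} ] }
        @{ $record->{contact_phone_number} || [] };
}

# Hypothetical record; contact_name stands in for the existing fields.
my $xml = <<'END_XML';
<record>
  <contact_name>John</contact_name>
  <contact_phone_number note="business">123-4567</contact_phone_number>
  <contact_phone_number note="pager">555-6789</contact_phone_number>
</record>
END_XML

for my $pair (phone_numbers($xml)) {
    my ($note, $number) = @$pair;
    print "$note: $number\n";
}
```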

            $perlmonks{seattlejohn} = 'John Clyman';

      I entirely agree with your comments and that is why DBD::CSV has all the usual parameters for setting the "elastic" properties of CSV.
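
For readers who have not used DBD::CSV, the sketch below shows the kind of parameters meant here. It assumes DBI and DBD::CSV are installed from CPAN. The file contacts.csv, its columns and the sub name update_phone are hypothetical, and the semicolon separator is chosen only to show that the dialect is configurable.

```perl
use strict;
use warnings;
use DBI;                        # CPAN: DBI plus DBD::CSV
use File::Temp qw(tempdir);

# Update one record in a semicolon-separated file and return the
# stored value afterwards. Table "contacts" maps to contacts.csv.
sub update_phone {
    my ($dir, $name, $phone) = @_;
    my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
        f_dir           => $dir,        # directory holding the files
        f_ext           => '.csv/r',    # require the .csv extension
        csv_sep_char    => ';',         # field separator
        csv_quote_char  => '"',         # quoting character
        csv_escape_char => '"',         # "" escapes a quote
        csv_eol         => "\n",        # record terminator
        RaiseError      => 1,
        PrintError      => 0,
    });
    $dbh->do('UPDATE contacts SET phone = ? WHERE contact_name = ?',
        undef, $phone, $name);
    my ($stored) = $dbh->selectrow_array(
        'SELECT phone FROM contacts WHERE contact_name = ?', undef, $name);
    $dbh->disconnect;
    return $stored;
}

# Build a small sample file in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/contacts.csv" or die "cannot write sample: $!";
print {$fh} "contact_name;phone\nSmith;123-4567\nJones;555-6789\n";
close $fh or die "cannot close sample: $!";

print update_phone($dir, 'Smith', '555-0000'), "\n";
```

The first line of the file is taken as the column names, which matches the "first row contains the column headings" layout suggested earlier in the thread.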

      Of course XML cannot be beaten for storage of irregular records. Anyone would be hard pressed to equal such a flexible system for irregular data and still maintain a sense of order.

      CountZero


      There's an anti-buzzword backlash against XML floating around, which is a natural reaction to something that's been hyped so much by marketers.

      But after all the smoke clears, it's pretty hard to beat the simplicity of using XML::Simple's XMLin() and XMLout() functions.
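
A read, change and write cycle with those two functions could be sketched roughly as follows. XML::Simple comes from CPAN; the records/record/name/phone layout and the sub name change_phone are assumptions made for the example, not the original poster's file format. NoAttr => 1 asks XMLout to write values as nested elements rather than attributes.

```perl
use strict;
use warnings;
use XML::Simple;    # CPAN module, not part of core Perl

# Change the phone number of the record with the given name and
# return the updated document as a string.
sub change_phone {
    my ($xml, $name, $new_phone) = @_;
    my $data = XMLin($xml, ForceArray => ['record'], KeyAttr => []);

    for my $record (@{ $data->{record} || [] }) {
        $record->{phone} = $new_phone if $record->{name} eq $name;
    }

    # XMLin drops the root element, so name it again on output.
    return XMLout($data, RootName => 'records', NoAttr => 1, KeyAttr => []);
}

# Hypothetical input document.
my $xml = <<'END_XML';
<records>
  <record><name>Smith</name><phone>123-4567</phone></record>
  <record><name>Jones</name><phone>555-6789</phone></record>
</records>
END_XML

print change_phone($xml, 'Smith', '555-0000');
```

Note that XML::Simple does not promise to keep element order, comments or formatting, so the output is equivalent data rather than a minimally edited copy of the input.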

      CSV files combined with while (<IN>) {...} logic have old-school appeal, but in the end the approach is irritating and tedious. There are so many exceptions to handle, such as data that splits over lines or data that contains the delimiter, and it's not as flexible as XML when you have to add new variables.

        Don't get me wrong: There are lots and lots of things that are appealing about XML; I've used it numerous times to develop production apps, including tools that have to munge XML documents hundreds of thousands of lines long. And I love XML::Simple, because it makes using XML almost as easy as using native Perl data structures via something like Data::Dumper.

        At the same time, though, I think it can be worth pointing out that, like any technology, XML is not necessarily the ideal solution for every problem. XML doesn't permit random access to records (though neither does CSV), it tends to be verbose, and you have to be careful with your document structure. XML::Simple hides most of that tedium, but then the various CSV modules on CPAN take care of a lot of that for CSV as well.

                $perlmonks{seattlejohn} = 'John Clyman';
