Re: Parsing XML into a Hash
by Ovid (Cardinal) on Nov 03, 2003 at 17:42 UTC
|
Why on earth would your client trust you to install your code that others have not seen or had the opportunity to evaluate, but will not trust code that many thousands have seen, trust, and use every day?
If possible, just bundle the XML::Simple code with your code and move on to a real project with clients who have realistic expectations. Since you are asking the question that you've asked, it's highly unlikely that you will be able to meet the XML::Simple functionality without a ridiculous cost overhead that the client would be foolish to ignore.
If there are reasonable grounds for refusing XML::Simple, I'm all ears.
| [reply] |
Re: Parsing XML into a Hash
by Khansultant (Sexton) on Nov 03, 2003 at 17:35 UTC
|
Seems to me like your client is asking you to *recreate* XML::Simple!
When you say that $client does not allow, do you mean you can't install and/or use the external module? If so, would it be possible to simply copy the module and paste it into your code? | [reply] |
|
Yeah, reinventing the wheel (or in this case, XML::Simple).
Problem is, they get very antsy about anything outside being brought in here. It's not even really an option for me to copy the code over, unless I want to manually type each and every line by switching monitors back and forth (via KVM switch).
Yes, I know this sounds weird, but that's the way this shop works. I can try to see if I can convince them to install the external module, but that would require finding someone here I can talk to on a technical level who also has the authority to do something like that.
| [reply] |
|
I've had some experience acting as a liaison between the business side (read "non-tech") of a company and the IT side. I spent the better part of two years doing precisely that for a major online music equipment retailer.
IMHO, anybody with a touch of common sense ought to be able to present a proposal (like including XML::Simple) in a way that demonstrates both the time and financial savings it brings to the project as a whole. On top of that, pointing out the complexity of writing an XML parser on your own (simply saying "I can't do that" works!) is a plus.
In my experience, I found my best ally to be someone on the business side with some technical competency that had close ties to people in IT *and* the financial side of the company. Using them can benefit you in the sense that they can back your ideas from "inside" enemy lines.
In my case, it was the director of operations - she worked closely with the CFO & controller and had a husband who was a computer programmer. I typically would take my ideas to her and demonstrate how they would help her. She'd then take the idea to her husband overnight, then typically sponsor my proposal in meetings with the higher-ups as a cost-saving idea.
It's amazing how they'll say "no" to you, but when someone inside suggests it they are more open to the idea.
| [reply] |
Re: Parsing XML into a Hash
by jeffa (Bishop) on Nov 03, 2003 at 19:21 UTC
|
Here's how i imagine the conversation might occur if i were the consultant:
[Potential Client] We need you to write an application that parses XML documents.
[jeffa] Sure, i'll just install XML::Simple and ...
[Potential Client] You can't do that.
[jeffa] Excuse me?
[Potential Client] You can't bring in any outside software for this project. It's a security violation.
[jeffa] But i could spend months just to write a parser. Do you have any idea the complexity involved in writing a 100% robust parser? The amount of time it takes just to test?
[Potential Client] You can't bring in any outside software for this project. It's a security violation.
[jeffa] Well, it's been good talking to you. I wish you and your company the best of luck ... /runs like hell
Money is good, but, as eduardo just told me today, sometimes you have to pick and choose your clients. However, for the amount of time you will spend writing a good parser, maybe you should reinvent a wheel - as long as you are getting paid by the hour. ;)
| [reply] |
|
It's not quite like that.
You see, I work for $consulting_company that has a contract with $client. I don't get to pick any fights here. That's up to the Tech Lead (not me) and others. I just get to sit here and make executive decisions about the code like, "Well, since I am controlling all the input and output of this program, and I am controlling all the changes to the XML file, I'll just write a simple handler to accommodate what I know is going to be in the XML file," and be done with it.
Life is much easier when you get enough control over the code you write that you can make decisions like this. Now all I have to do is write the changes to the data back into the XML file. More like an overwrite than an addition to the file. Whee!!! FUN!
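The write-back step could look something like the following. This is only a rough sketch: the `<items>` root element and the sub names (`escape_xml`, `items_to_xml`, `write_items`) are made up for illustration, and the four tag names are assumed to be the only fields in each record. It escapes the characters that would break a downstream reader, ends every record with "</item>\n", and overwrites the file by writing a temporary copy and renaming it.

```perl
use strict;
use warnings;

# Characters that must be escaped in element content.
my %escape = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );

sub escape_xml {
    my ($text) = @_;
    $text =~ s/([&<>])/$escape{$1}/g;
    return $text;
}

# Build the whole document as a string from a list of hash refs.
# The <items> root element is an assumption, not taken from the real file.
sub items_to_xml {
    my (@items) = @_;
    my $xml = "<items>\n";
    for my $item (@items) {
        $xml .= "<item>\n";
        for my $tag (qw/name working uptime downtime/) {
            my $value = defined $item->{$tag} ? $item->{$tag} : '';
            $xml .= "<$tag>" . escape_xml($value) . "</$tag>\n";
        }
        $xml .= "</item>\n";    # one record per "</item>\n" terminator
    }
    return $xml . "</items>\n";
}

# Overwrite $path by writing a temporary file first, then renaming it,
# so a crash halfway through does not leave a truncated XML file.
sub write_items {
    my ( $path, @items ) = @_;
    my $tmp = "$path.tmp";
    open( my $out, '>', $tmp ) or die "Cannot write $tmp: $!";
    print {$out} items_to_xml(@items);
    close($out) or die "Cannot close $tmp: $!";
    rename( $tmp, $path ) or die "Cannot rename $tmp to $path: $!";
}

print items_to_xml(
    { name => 'R&D box', working => 'yes', uptime => 5, downtime => 0 } );
```

Calling `write_items( 'source_of_xml.data', @items )` would then replace the file in one step; the file name here is just a placeholder.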
| [reply] |
|
Given the general consensus shown in the other replies, I'm probably putting up flame bait (or at least a downvote magnet), but here goes... You say:
I just get to sit here and make executive decisions about the code like, "Well, since I am controlling all the input and output of this program, and I am controlling all the changes to the XML file, I'll just write a simple handler to accommodate what I know is going to be in the XML file," and be done with it.
To which I say "Amen, Brother!" Based on the very tidy and fairly simple XML sample in your original post, I don't see a problem with writing a "tightly-bound" (i.e. ad-hoc) "parser" in a dozen or so lines of perl -- the point being to get the job done with minimal fuss (including, mainly, minimal fuss with the folks who are paying for this job). What this really means is that you just need to be very careful about testing the script that creates this XML stream, to make sure its output always meets the constraints assumed by the downstream "parser" script.
Assuming that you can manage the quality of the XML stream as it's being created, then something like the following would probably fit the bill for reading that stream:
open( my $xml, '<', "source_of_xml.data" ) or die "I died 'cuz $!";
{
    local $/ = "</item>\n";    # read one whole <item>...</item> into $_
    while (<$xml>) {
        my %item;              # fresh hash for each record
        for my $tag (qw/name working uptime downtime/) {
            # (leave off "s", if tags are always fully contained on one line)
            ($item{$tag}) = m{<$tag>(.*?)</$tag>}s;
        }
        # now, do what you want with %item...
    }
}
close $xml;
So what's wrong with that? If you really are creating the XML stream as well as processing it -- and if the data structure is really as flat as your example makes it out to be -- then you really don't need an XML parsing module.
In essence, you seem to be using XML simply as a means of "embellishing" (reformatting) a flat table, and there's no need for a hefty, C-compiled module to handle that. | [reply] [d/l] |
Re: Parsing XML into a Hash
by hardburn (Abbot) on Nov 03, 2003 at 17:36 UTC
|
Could you bundle up a proper XML parser as part of your application? You wouldn't need to install it as a regular module, just have it sit alongside the rest of your program.
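As a rough sketch of what "sit alongside" could mean: put the module's .pm file in a directory next to your script and point Perl at it. The `lib` directory name and the location `lib/XML/Simple.pm` are assumptions for illustration. Keep in mind that XML::Simple itself needs XML::Parser or XML::SAX underneath it, so those would have to be available or bundled as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use FindBin qw($Bin);    # $Bin is the directory containing this script
use lib "$Bin/lib";      # search ./lib (next to the script) before the system paths

# Assumption: a copy of the module was placed at lib/XML/Simple.pm.
# require inside eval lets the script report a missing bundle politely.
if ( eval { require XML::Simple; 1 } ) {
    print "XML::Simple loaded from $INC{'XML/Simple.pm'}\n";
}
else {
    print "XML::Simple not found; expected it under $Bin/lib\n";
}
```

Nothing gets installed system-wide this way; the module travels with the application and is removed when the application is removed.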
You do not want to do this with regexen. With Perl's extended regex system, it is possible to parse HTML/XML, but it's very ugly and probably quite a bit slower than the XS-based parser modules.
---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer
: () { :|:& };:
Note: All code is untested, unless otherwise stated
| [reply] [d/l] |
|
Would love to, if I knew how.
I'm new to XML, so I know little about things like parsing it out and such.
| [reply] |
|
Start with the assumption that it is very non-trivial. You've got to simultaneously accommodate the right character set, quoting, conditional escaping, balancing, um, other ugly stuff I can't think of just now. That is, to be a proper XML parser anyway. It's worth the time to get the right software installed instead of faking it.
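To make the escaping point concrete, here is a small sketch of what a naive regex extraction hands back. The sample record and the `unescape_xml` helper are invented for illustration. The helper covers only the five entities predefined by XML; it does nothing for numeric character references, CDATA sections, attributes, or encodings.

```perl
use strict;
use warnings;

# The five entities predefined by the XML specification.
my %entity = ( amp => '&', lt => '<', gt => '>', quot => '"', apos => "'" );

sub unescape_xml {
    my ($text) = @_;
    # Single left-to-right pass, so "&amp;lt;" becomes "&lt;", not "<".
    $text =~ s/&(amp|lt|gt|quot|apos);/$entity{$1}/g;
    return $text;
}

# A made-up record: one escaped ampersand, one CDATA section.
my $record = "<item>\n<name>R&amp;D server</name>\n"
           . "<uptime><![CDATA[5 < 7]]></uptime>\n</item>\n";

my ($name)   = $record =~ m{<name>(.*?)</name>}s;
my ($uptime) = $record =~ m{<uptime>(.*?)</uptime>}s;

print "raw name:   $name\n";                        # still contains &amp;
print "unescaped:  ", unescape_xml($name), "\n";
print "raw uptime: $uptime\n";                      # CDATA wrapper is still there
```

Each of those leftovers is one more special case a hand-rolled reader has to learn about, which is roughly how a small handler grows toward the size of a real parser.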
| [reply] |
Re: Parsing XML into a Hash
by ysth (Canon) on Nov 03, 2003 at 20:09 UTC
|
Hmm. XML::Simple is 1600 lines (up to the __END__ mark, anyway). That says to me that this is a non-trivial problem in the general case. If you really are stuck with wheel-rewriting, you need to think about what limitations (including doing different things in your code based solely on tag name or whitespace) you can impose on your particular format to make life easier for you. Ideally these would be documented up front. Then you can plan an algorithm and write the code.
Two other thoughts: can you email the XML/Simple.pm file to someone to demonstrate the non-triviality here?
And: even XML::Simple will have a hard time if it really has to match <downdime> with </downtime> :) | [reply] |
Re: Parsing XML into a Hash
by BrowserUk (Patriarch) on Nov 04, 2003 at 03:31 UTC
|
One thing that struck me. If you're not allowed to bring in external source, then what are you going to do if someone here offers you a solution? Re-type it, line-by-line, switching screens as described below?
If that's the case, then I would strongly recommend you go look at XML::Parser::Lite and borrow a copy typist :)
It's a pure perl module, running on regexes, that emulates the working style of the 'real' XML::Parser. It's less than 150 lines, 200 if you include the pod -- don't go omitting the author's copyright, will you? Be warned! It contains some long and hairy regexes that will be easy to screw up whilst copying.
Personally, I would ask management if they'll accept a pure perl module from outside. It's hard to see how they would be able to justify rejecting a pure perl module. It is literally no different, no greater or lesser risk, than if you had typed it in from your head.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
| [reply] |
Re: Parsing XML into a Hash
by BUU (Prior) on Nov 03, 2003 at 19:27 UTC
|
Just to second a few other responses: (and brutally steal from jeffa =] )
client: I need you to write an xml parser that parses it into a hash.
buu: Sure, I'll just install XML::Simple
client: You can't do that
buu: Sure thing, I'll just need an additional 6 months to write a secure parser. You don't mind the project taking 6 months longer and costing that much more? Oh good. | [reply] |
|
Would that it could be that way, but it isn't. *sigh*
However, I do get to make decisions about how it's being worked the 'long way'. And if they come back in a week or so and say "we want to make this change", I get to tell them, "Well, I could do that, but you see, I had to do this to accommodate the last thing you changed. This will push the timeframe back (2*x)2", where "x" is the time it will actually take me to do it, and the expression is the "estimated" cost to the timeline: double "x", expressed in the next larger unit of time. Therefore, if it would take me 2 hours, I would "estimate" the impact on the work schedule at 4 days.
Feel free to use this formula anywhere useful. Yes, it's a "ShareWare Contractor Estimator". :)
| [reply] |
|
Wow, that's even more conservative than the estimator I usually use (that being the Scotty Method of doubling your estimates, knowing that the captain will always cut off a third anyway).
Better life through Star Trek!
| [reply] [d/l] |
Re: Parsing XML into a Hash
by jZed (Prior) on Nov 03, 2003 at 23:00 UTC
|
You didn't mention if XML::Parser is available. If it's available and XML::Simple isn't, you'll only need to reinvent a few spokes rather than the whole wheel. :-) | [reply] |
Re: Parsing XML into a Hash
by signal9 (Pilgrim) on Nov 03, 2003 at 23:28 UTC
|
How is it you ended up talking about how you were going to do the job anyway? I don't think it's a generally good idea to talk about code libraries with non-tech people. Good thing you aren't using C. Imagine having to write your own standard library. | [reply] |
Re: Parsing XML into a Hash
by maverick (Curate) on Nov 04, 2003 at 17:18 UTC
|
Here's what I suggest that you do.
use strict;
use warnings;
use Clue::ByFour;
use Existing::Client;
my $clueX4 = Clue::ByFour->new;
my $client = Existing::Client->new;
$clueX4->add("rusty nail");
$clueX4->target($client);
for (0..1000) {
    $clueX4->swing();
}
Seriously. You need to at least make an attempt to enlighten the client about the Right Way (TM) to do this. You may THINK you have control over the format of the file, but some day down the line (sooner rather than later) you, or whoever works on this after you, will be in for a world of hurt. Security people are NOTORIOUSLY stupid about this sort of thing...they have policies that they have sat around and dreamed up and never even really considered the intelligence of.
You should have enough pride in your work to be SERIOUSLY po-ed about being asked to write shoddy 2nd rate code because of PHBs in the first place.
| [reply] [d/l] |