in reply to Regex et XML

I don't want to use SAX or DOM to parse (overkill)

So instead of using something that parses XML and that makes life easy for you, you'd rather spend time writing something that parses some format that looks like XML (see On XML parsing to see why you probably won't cover the entire spec)? Does this seem like sound software engineering practice to you?

Some very knowledgeable people do indeed use regexps to parse XML. These are people who _really_ know what they are doing: why they are using regexps (they need the speed) and when to use them (the XML is in a known format). Until you are at that level, and actually need the speed, I really think it's much safer, not to mention easier, to use a parser.

You would have to show us an example of the data if you want help here. XML::Simple might or might not be what you need, BTW: its output is often quite different from the input.

This might help you, should you choose to go the overkill way ;--):

#!/usr/bin/perl -w
use strict;
use XML::Twig;
use YAML;

my %hash;

XML::Twig->new( twig_handlers =>
                  { elt => sub { $hash{$_->field('key')}= $_->field('value');
                                 $_->insert_new_elt( first_child => ts => scalar localtime);
                                 $_[0]->flush; # flushes the doc to use less memory (might be overkill ;--)
                               }
                  },
                pretty_print => 'record_c',
              )
         ->parse( \*DATA)
         ->flush; # to flush the closing tag for doc

print "\n\nHASH:\n", Dump \%hash;

__DATA__
<doc>
<elt><key>key1</key><value>value1</value></elt>
<elt><key>key2</key><value>value2</value></elt>
<elt><key>key3</key><value>value3</value></elt>
<elt><key>key4</key><value>value4</value></elt>
</doc>

Re: Re: Regex et XML
by butlerdi (Initiate) on Feb 27, 2004 at 18:16 UTC
    Thanks for your response. The reason for wanting to use a regex is that the XML being parsed is of no interest to the program (a message handler); I am merely passing it from one system to another (occasionally connected P2P devices). The format and content never change, and the footprint and memory capabilities of the device are very limited. Even nanoxml is a bit heavy here. All I am really looking to do is to replace a value "Update" with the current time or, in some cases, a session id.
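For a fixed format like the one described, the substitution could be sketched as below. This is only a rough sketch under stated assumptions: the sample message, the element names (msg, id, stamp) and the helper name fill_placeholder are all made up for illustration, and "Update" is assumed to appear as the entire text content of an element. It does not parse XML, so it is only reasonable while the format really never changes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): replaces the literal
# placeholder text "Update", when it is the whole content of an element,
# with $value. No real XML parsing is done.
sub fill_placeholder {
    my ($xml, $value) = @_;

    # Escape the characters that would break the markup if they
    # appeared in the replacement value.
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );
    $value =~ s/([&<>])/$entity{$1}/g;

    # Match "Update" only between the end of one tag and the start of
    # the next, so text such as "Updated" is left alone.
    $xml =~ s/>\s*Update\s*</>$value</g;

    return $xml;
}

# Made-up sample message; the real format was not shown in the thread.
my $message = '<msg><id>42</id><stamp>Update</stamp></msg>';

print fill_placeholder($message, scalar localtime), "\n";
```

Passing a session id instead of scalar localtime works the same way. If "Update" can also appear in attributes, comments or CDATA sections, this pattern is no longer enough and a parser becomes the safer choice again.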