I don't want to use SAX or DOM to parse (overkill)

So instead of using something that parses XML and makes life easy for you, you'd rather spend time writing something that parses some format that merely looks like XML (see On XML parsing for why you probably won't cover the entire spec)? Does this seem like sound software engineering practice to you?

Some very knowledgeable people do indeed use regexps to parse XML. These are people who _really_ know what they are doing, why they are using regexps (they need the speed) and when to use them (the XML is in a known format). Until you reach that level, and actually need the speed, I really think it's much safer, not to mention easier, to use a parser.

You would have to show us an example of the data if you want help here. XML::Simple might or might not be what you need, BTW: its output is often quite different from the input.

This might help you, should you choose to go the overkill way ;--):

#!/usr/bin/perl -w
use strict;
use XML::Twig;
use YAML;

my %hash;

XML::Twig->new(
    twig_handlers => {
        elt => sub {
            $hash{ $_->field('key') } = $_->field('value');
            $_->insert_new_elt( first_child => ts => scalar localtime );
            $_[0]->flush;    # flushes the doc to use less memory (might be overkill ;--)
        },
    },
    pretty_print => 'record_c',
)
  ->parse( \*DATA )
  ->flush;    # to flush the closing tag for doc

print "\n\nHASH:\n", Dump \%hash;

__DATA__
<doc>
  <elt><key>key1</key><value>value1</value></elt>
  <elt><key>key2</key><value>value2</value></elt>
  <elt><key>key3</key><value>value3</value></elt>
  <elt><key>key4</key><value>value4</value></elt>
</doc>

In reply to Re: Regex et XML by mirod
in thread Regex et XML by butlerdi
