in reply to Regex for XML attributes...

First off, your XML is not well-formed - don't forget to place quotes around the level values as well.

Second off, use XML::Simple instead . . .

use strict;
use XML::Simple;

local $/;  # just so i can slurp __DATA__
my $old_xml = XMLin(<DATA>, forcearray => 1);
my $new_xml;

foreach my $heading (@{ $old_xml->{'heading'} }) {
    my $level = delete $heading->{'level'};
    $new_xml->{"heading$level"} = $heading;
}

print XMLout($new_xml);

__DATA__
<xml>
  <heading level="2">Introduction to Arguments</heading>
  <heading level="3">
    <index primary-key="procedures" secondary-key="definition, rest arguments in"/>
    <index primary-key="rest arguments" secondary-key="specifying, in procedure definition"/>
    Specifying Rest Arguments in a Procedure Definition
  </heading>
</xml>

The idea is to turn the XML into an anonymous data structure and mangle that instead - all you have to do is iterate thru the 'heading' hashes and remove the level attribute/key - then you create a new data structure whose keys are the concatenation of 'heading' with the level value.
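
To make the re-keying step concrete without needing XML::Simple installed, here is a minimal sketch that runs it on a hand-built structure. The shape of $old is an assumption about what XMLin returns with forcearray => 1 for the sample data (the index children are left out for brevity), and rekey_headings is a name made up for this illustration.

```perl
use strict;
use warnings;

# Hand-built stand-in for the structure XMLin(..., forcearray => 1)
# is assumed to return for the sample XML (index elements omitted).
my $old = {
    heading => [
        { level => '2', content => 'Introduction to Arguments' },
        { level => '3', content => 'Specifying Rest Arguments in a Procedure Definition' },
    ],
};

# Hypothetical helper: move each heading hash under a "heading<level>" key,
# removing the level attribute from the hash along the way.
sub rekey_headings {
    my ($tree) = @_;
    my %new;
    for my $heading (@{ $tree->{heading} }) {
        my $level = delete $heading->{level};
        $new{"heading$level"} = $heading;
    }
    return \%new;
}

my $new = rekey_headings($old);
print join(', ', sort keys %$new), "\n";   # prints: heading2, heading3
```

One thing this makes visible: because the new structure is a hash keyed on "heading" plus the level, two headings with the same level would overwrite each other. Pushing onto an array per key would be one way around that.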

This works, but i only tested it on the XML that i provided, your mileage may vary. ;)

output:

<opt>
  <heading2>Introduction to Arguments</heading2>
  <heading3>
    Specifying Rest Arguments in a Procedure Definition
    <index primary-key="procedures" secondary-key="definition, rest arguments in" />
    <index primary-key="rest arguments" secondary-key="specifying, in procedure definition" />
  </heading3>
</opt>

jeffa

    A flute with no holes is not a flute . . .
a doughnut with no holes is a danish.
                                - Basho,
                                  famous philosopher

Re: (jeffa) Re: Regex for XML attributes...
by tshabet (Beadle) on Aug 23, 2001 at 23:54 UTC
    When you say that it isn't well formed XML, are you referring only to the unquoted 2 in the tag, or something else? Anyway, this is a fantastic solution, and I will implement it right away...talk about service ;-P
    OK, so certainly you've illuminated the best way to do it, but....just for my own learning....What's failing with the regex? Anybody? Anybody? Bueller?
    Thanks jeffa, you're a life saver!
      Yes, the unquoted 2 and 3 level attributes need to be quoted, else XML::Simple complains:
      not well-formed (invalid token) at line . . .
      Also, i really don't think this will do what you think:
      <heading> level="3", blah blah blah </heading>
      You really should change that to:
      <heading level="3">blah blah blah</heading>
      You are welcome for the solution! . . . i would like to see a regex that solved the problem as well, but i imagine it would be an unruly beast . . . anybody? bueller? japhy?

      Hmmmm, on second thought - this is something probably best not done!

        Heh heh, I should explain myself further..... I was using "XML" loosely since, as you point out, this is not valid XML syntax. My example code is actually the output from a script I wrote for another application, which converts a language specification written in another markup language into XML. The original language can give several attributes to its tags, so the ultimate original for my examples would have been something on the order of
        {heading bob=foo, super=duper, level=3, blah blah blah}
        which my original script converts to the
        <heading> bob=foo, super=duper, level=3, blah blah blah </heading>
        format. Now I'm tacking on some code to handle these attributes (which don't occur in the syntax I originally wrote the script for) so that I have
        <heading bob=foo super=duper level=3>blah blah blah </heading>
        So I was thinking that I would save myself some trouble by using a regex. Anyway, the example code is the product of my script thus far and was not supposed to be well formed quite yet, which I should have said in the first place. :-)
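
For the conversion described above, a rough sketch of going straight from the brace form to a tag with quoted attributes might look like the following. It is not code from the thread: brace_to_xml is a made-up name, and it assumes one tag per string, attributes written as name=value followed by a comma, and body text that needs no XML escaping.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "{tag a=b, c=d, text}" into
# '<tag a="b" c="d">text</tag>'. Input that does not look like
# "{tag ...}" is returned unchanged.
sub brace_to_xml {
    my ($line) = @_;

    # Split off the tag name from everything else inside the braces.
    my ($tag, $rest) = $line =~ /^\{(\w+)\s+(.*)\}\s*$/s
        or return $line;

    # Peel leading "name=value," pairs off the front, quoting each value.
    my @attrs;
    while ($rest =~ s/^\s*([\w-]+)=([^\s,"'<>&]+),\s*//) {
        push @attrs, qq{$1="$2"};
    }

    # Whatever is left is treated as the element's text (not escaped here).
    my $open = join ' ', $tag, @attrs;
    return "<$open>$rest</$tag>";
}

print brace_to_xml('{heading bob=foo, super=duper, level=3, blah blah blah}'), "\n";
# prints: <heading bob="foo" super="duper" level="3">blah blah blah</heading>
```

Quoting the values at this stage means the result is already something XML::Simple should be able to parse, so the heading re-keying can then be done on the data structure rather than with a second regex.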