in reply to Regex problem while parsing tagged, hierarchical data
Having said that, you could approach your regex from the other end and work backwards. I think this does what you want.
I feel dirty just posting that though, as it's so flaky that a single stray whitespace character would break it. Think of it as an example of why you should fix your file format :)

    while ( $content =~ s/(.*<level1 id="([^"]*).*?)<level2>/$1<level2 id=\"$2\">/gsi ) { }
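For anyone who wants to watch the loop work (and watch it break), here is a minimal, self-contained sketch. The sample input is made up for illustration, since the real data file isn't shown in this post; the last `<level2 >` carries a stray space on purpose.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input (an assumption, not the original poster's file).
my $content = <<'END';
<level1 id="A">
<level2>
</level2>
<level2>
</level2>
</level1>
<level1 id="B">
<level2>
</level2>
<level2 >
</level2>
</level1>
END

# The greedy .* grabs the LAST <level1 id="..."> that still has a bare
# <level2> somewhere after it, so each pass tags one <level2> with the id
# of its nearest preceding <level1>. The empty loop body simply repeats
# the substitution until nothing is left to match.
while ( $content =~ s/(.*<level1 id="([^"]*).*?)<level2>/$1<level2 id="$2">/gsi ) { }

print $content;
```

Both `<level2>` tags under A and the first one under B pick up their parent's id, but `<level2 >` is left untouched because the pattern only matches the exact string `<level2>`. That is the kind of fragility I meant.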
UPDATE
I couldn't shake the feeling I inflicted on myself with the above post. I seek to redeem myself with a full XML version :)
    #!/usr/bin/perl
    use XML::DOM;
    use warnings;
    use strict;

    my $xml = q|<root>
      <level1 id="L1_0001">
        <level2>
          <level3>
            <level4/>
          </level3>
          <level3>
            <level4/>
          </level3>
        </level2>
        <level2/>
      </level1>
      <level1 id="L1_0002">
      </level1>
      <level1 id="L1_0003">
        <level2>
          <level3/>
        </level2>
      </level1>
    </root>|;

    my $parser = XML::DOM::Parser->new;
    my $doc    = $parser->parse($xml);

    foreach my $l1_node ( $doc->getElementsByTagName('level1') ) {
        my $current_id = $l1_node->getAttribute('id');
        foreach my $l2_node ( $l1_node->getElementsByTagName('level2') ) {
            $l2_node->setAttribute( 'id', $current_id );
        }
    }

    print $doc->toString;
    exit();