Good Day Honorable Monks,
I am attempting to use RegEx to match specific XML tags and can't seem to figure out why I can't get the results that I expect.
Here is an example XML document:
<ROOT hostname="bumblebee" tstamp="2011/09/21 22:24:05">
<APPLICATION>
<PORT>7777</PORT>
<APP_HOME>/extra/localcw/opt/APP/sun4</APP_HOME>
<VERSION>V36.11.01</VERSION>
<PERF_HOME>/usr/localcw/opt/APP/Solaris-2-9-sparc-64</PERF_HOME>
<PERF_VERSION>glanceSunOS 5.9 (Solaris 9) (sparc, 64 Bit) 7.3.00.6059 Jul 19 2006</PERF_VERSION>
<STAR_VERSION>3.0</STAR_VERSION>
<DEFAULT_ACCT>root</DEFAULT_ACCT>
<HISTORY_RETENTION>90</HISTORY_RETENTION>
<LAST_FILE_DOWN>StAR-201105090928.tar</LAST_FILE_DOWN>
<LAST_STATUS>No download file found</LAST_STATUS>
<ACL>
<ACCOUNT id="f9a64ef61c">
<MD5>f9a64ef61c</MD5>
<USERNAME>*</USERNAME>
<HOST>flower</HOST>
<PERMISSION>P</PERMISSION>
</ACCOUNT>
</ACL>
</APPLICATION>
</ROOT>
So I basically have five distinct XML tag formats to match against:
<openingTagName attribute="whatever">
<openingTagName>
<openingTagName>value</closingTagName>
<openingTagName></closingTagName>
</closingTagName>
But when I try to match only the bare <openingTagName> form, using something like the following, I don't get what I expect, namely just the tags <APPLICATION> and <ACL> from the XML above. Can someone give me a hint or two on how to grab only <APPLICATION> and <ACL>? Once I understand that, I should be able to work out the other formats myself.
foreach my $line (@xml)
{
    chomp($line);
    if($line =~ /^\s*<(\w+)>[^\w*|\d*|<]/)
    {
        print "$1\n";
    }
}
I would have thought the negated character class would rule out any value after the initial >, but it doesn't. I have spent quite a while on different patterns, but none of them work. Any and all suggestions will be greatly appreciated.
Thanks...
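One possible reading of the behavior described above: a character class, negated or not, must consume exactly one character. After chomp, nothing follows the > in a line like <APPLICATION>, so the class has nothing to match and the whole pattern fails. Inside the brackets, * and | are also literal characters rather than a quantifier and an alternation. The sketch below anchors the end of the line instead. It assumes each tag sits on its own line, as in the sample document; the helper name bare_open_tag and the shortened @xml list are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the tag name
# when the line holds nothing but a bare opening tag such as <ACL>,
# and returns an empty list otherwise.
sub bare_open_tag {
    my ($line) = @_;
    # \s*$ requires that only optional whitespace follows the '>',
    # so tags with attributes, values, or a leading '/' are rejected.
    return $1 if $line =~ /^\s*<(\w+)>\s*$/;
    return;
}

# A shortened copy of the sample document, one tag per line (assumption).
my @xml = (
    '<ROOT hostname="bumblebee" tstamp="2011/09/21 22:24:05">',
    '<APPLICATION>',
    '<PORT>7777</PORT>',
    '<APP_HOME>/extra/localcw/opt/APP/sun4</APP_HOME>',
    '<ACL>',
    '<ACCOUNT id="f9a64ef61c">',
    '<USERNAME>*</USERNAME>',
    '</ACL>',
);

# In list context the helper contributes nothing for non-matching lines.
my @names = map { bare_open_tag($_) } @xml;
print "$_\n" for @names;
```

With this shortened input the loop prints APPLICATION and ACL. For anything beyond line-oriented data like this, a real XML parser is usually a safer choice than regular expressions.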