Wow, thank you so much, that is almost perfect. After uncommenting the initial OPEN line and setting $twig->parse to use $IN rather than \*DATA, I have noticed only three things I can't explain. The first is that if I add a print statement to show the value of $hr, it prints the hour exactly as it appears in the hour field of the string, yet the comparison appears to use the GMT equivalent. In other words, if the string shows timea as 20140623200000 -0400, printing $hr shows 20, but in the comparison I have to specify 0 (four hours later) and the next day!
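For anyone following along, here is a minimal sketch of the arithmetic behind that observation. It is not the code from the earlier reply: the helper name stamp_to_epoch is made up for illustration, and it assumes the stamps always have the form "YYYYMMDDhhmmss +ZZZZ". It only shows how the hour in the raw string and the hour after normalising to GMT can differ, and by a whole day at that.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: turn a "YYYYMMDDhhmmss +ZZZZ" stamp into epoch seconds.
sub stamp_to_epoch {
    my ($stamp) = @_;
    my ($y, $mo, $d, $h, $mi, $s, $sign, $oh, $om) =
        $stamp =~ /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})\s+([+-])(\d{2})(\d{2})$/
        or die "Unrecognised timestamp: $stamp\n";

    # Offset of the local zone from GMT, in seconds (negative west of Greenwich).
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);

    # timegm() reads the fields as if they were GMT; subtracting the
    # offset converts the local wall-clock time to real GMT.
    return timegm($s, $mi, $h, $d, $mo - 1, $y) - $offset;
}

my $stamp    = '20140623200000 -0400';
my $local_hr = substr($stamp, 8, 2);            # hour as written in the string
my @gmt      = gmtime(stamp_to_epoch($stamp));  # same instant, expressed in GMT

printf "hour in string: %s, GMT: %04d-%02d-%02d %02d:00\n",
    $local_hr, $gmt[5] + 1900, $gmt[4] + 1, $gmt[3], $gmt[2];
# prints "hour in string: 20, GMT: 2014-06-24 00:00"
```

If the comparison in the real script is made on a time that has been normalised like this, while $hr is taken from the raw string, that would account for the four-hour (and next-day) difference.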

I can easily live with that, but there are a couple of other bits of weirdness. One is that when the label_a line is written out, the three data elements are not in the original order. Where originally there was timea, timeb, and id, now it is id, timea, timeb. It appears to be putting the data elements in alphabetical order. In theory that shouldn't be an issue, but I'll need to do some experimentation to see whether it is.
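A possible way around the reordering, assuming timea, timeb and id are attributes on the label_a start tag: XML::Twig has a keep_atts_order option that keeps attributes in their original order, but it needs the Tie::IxHash module to be installed. The sample XML below is invented for illustration (only the timea value comes from my file), so treat this as a sketch rather than a drop-in fix.

```perl
use strict;
use warnings;
use XML::Twig;

# Invented sample data; the timeb and id values are placeholders.
my $xml = '<root><label_a timea="20140623200000 -0400" '
        . 'timeb="20140623210000 -0400" id="x1">text</label_a></root>';

# keep_atts_order ties the attribute hash to Tie::IxHash so the
# original order survives; without it the output order is sorted.
my $twig = XML::Twig->new( keep_atts_order => 1 );
$twig->parse($xml);

my $out = $twig->sprint;
print "$out\n";
```

Attribute order carries no meaning in XML itself, so this only matters if whatever reads the file is stricter than a real XML parser.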

The final thing is that it is writing a LOT of extra blank lines to the output file. The original file contains no blank lines except for one near the top of the file and one near the bottom, but it appears that whenever XML::Twig outputs a label_a block it adds a blank line at the beginning, and (here is what I REALLY can't understand) whenever it skips a label_a block it also leaves a blank line in the new file. It's very strange and I really don't understand why it happens, but I will need to see if it makes any difference. If anyone can explain it, I would really like to know how to eliminate the excess blank lines.
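In the meantime, a blunt workaround is to post-process the output and drop every whitespace-only line. This does not explain or fix what XML::Twig is doing; it is just a sketch, and the helper name strip_blank_lines is made up. Note that it would also remove the two blank lines that were in the original file, and any blank lines inside element text.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text with whitespace-only lines removed.
sub strip_blank_lines {
    my ($text) = @_;
    # split /^/m keeps each line's trailing newline; grep keeps only
    # lines that contain at least one non-whitespace character.
    return join '', grep { /\S/ } split /^/m, $text;
}

my $sample  = "<root>\n\n  <label_a id=\"x1\"/>\n   \n</root>\n";
my $cleaned = strip_blank_lines($sample);
print $cleaned;
```

The same filter as a one-liner on a file would be: perl -ne 'print if /\S/' new.xml (the file name here is a placeholder).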

But this is miles ahead of where I was last night and you have introduced me to a couple of handy Perl modules, so thank you very much, this was MUCH appreciated!


In reply to Re^2: How can I keep or discard certain blocks of an XML file based on first line of block? by Anonymous Monk
in thread How can I keep or discard certain blocks of an XML file based on first line of block? by Anonymous Monk
