Well, parsing these nasty text files is quite difficult, so it would be nice to make the simple things easier. It's not a limitation of XML::Writer; it's more a limitation of my input (and my parser).

Most of the important pieces of the parsed news story wind up stored in keys of a hash -- headline, byline, pubdate, etc. These are plain, unmarked text fields, which makes XML::Writer a simple tool for generating the XML file, tags and all.

There's also a hash element which stores the text of the story. I use URI::Find and Email::Find to find and set links in this text field. My parser also has to add tags to denote other important areas of the text: sub-headlines, context graphs, etc.

So my story text field winds up with a few embedded tags. When it comes time to print the paragraphs of my text to the XML file, XML::Writer escapes the < and > characters in my tags. I'm sure I could chunk through my text paragraphs and look for these tags -- then use XML::Writer to toss them in (if they're valid NITF tags). But I'm on a silly deadline to get this thing done, and I have no help (other than the monks!).
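For reference, the "chunk through the paragraphs" idea could look roughly like the sketch below. It is only an illustration under a few assumptions: the whitelist in %allowed and the write_paragraph helper are made-up names, the embedded tags are simple (double-quoted attributes, no nesting tricks), and the installed XML::Writer accepts a string reference for OUTPUT. Anything not on the whitelist goes through characters() and gets escaped as before.

```perl
use strict;
use warnings;
use XML::Writer;

# Hypothetical whitelist; swap in the NITF elements the parser really emits.
my %allowed = map { $_ => 1 } qw(hl2 em a);

# Write one paragraph: whitelisted inline tags become real elements,
# everything else is passed to characters() and therefore escaped.
sub write_paragraph {
    my ($writer, $text) = @_;
    $writer->startTag('p');
    # The capturing group makes split() return the tags as their own chunks.
    for my $chunk (split /(<\/?\w+(?:\s[^<>]*)?>)/, $text) {
        next if $chunk eq '';
        if ($chunk =~ /^<(\/?)(\w+)((?:\s[^<>]*)?)>$/ && $allowed{$2}) {
            my ($close, $name, $attr_text) = ($1, $2, $3);
            if ($close) {
                $writer->endTag($name);
                next;
            }
            # Collect name="value" pairs; values are assumed to be unescaped.
            my @attrs;
            while ($attr_text =~ /([\w:-]+)="([^"]*)"/g) {
                push @attrs, $1, $2;
            }
            $writer->startTag($name, @attrs);
        }
        else {
            $writer->characters($chunk);
        }
    }
    $writer->endTag('p');
}

my $xml = '';
my $writer = XML::Writer->new(OUTPUT => \$xml);
write_paragraph($writer, 'See <a href="http://example.com/">this</a> & <b>that</b>');
$writer->end;
print "$xml\n";
```

Because the writer stays in its normal checking mode here, a mismatched or unclosed tag in the story text should make XML::Writer die rather than produce a broken file, which is most of the reason to bother with this route.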

So, I'd like to trust that the generated tags in my story text are valid and just toss them into the XML file without using XML::Writer's interface. It's a kludge.
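A minimal sketch of that kludge, assuming an XML::Writer release that provides the UNSAFE option and the raw() method (check the documentation of the installed version; older releases may not have raw()). The sample story text is made up for illustration.

```perl
use strict;
use warnings;
use XML::Writer;

my $xml = '';
# UNSAFE => 1 switches off XML::Writer's well-formedness checks
# and is what makes raw() available.
my $writer = XML::Writer->new(OUTPUT => \$xml, UNSAFE => 1);

$writer->startTag('p');
# raw() prints the string untouched, so the text must already be
# well-formed and have its own &, < and > escaped where needed.
$writer->raw('Story text with a <em>trusted</em> tag &amp; an escaped ampersand.');
$writer->endTag('p');
$writer->end;
print "$xml\n";
```

The trade-off is that nothing checks the embedded tags any more: if the parser ever emits a bad tag or a bare ampersand, it lands in the XML file as-is.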

In reply to Re: Re: Using XML::Writer to create NITF files, but some tags exist in my data. by joealba
in thread Using XML::Writer to create NITF files, but some tags exist in my data. by joealba