If your data is tag soup, you won't be able to use XML or SGML tools. Actually, SGML tools are pretty dangerous in that case, because they might try harder to cope with your data, for example by inferring missing tags, which might not be what you want at all. Plus you would need a DTD, which you don't mention having.

In your case, your best bet is a simplistic regexp-based tool. I'd try something like this one-liner:

perl -p -e's{<(.)}{ $ln= $level ? "\n" : ""; if( $1 eq "/") { $level--; } ; $indent= "  " x $level; if( $1 ne "/") { $level ++ }; $ln . $indent . $&; }eg' <filename>

It will work only if you don't have "compact" (self-closing) tags such as <foo/>, and no CDATA sections or other oddities of that sort. Otherwise it should be OK. As I said, you don't have XML data, so you don't get to play with all the cool XML toys ;--(
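
If the one-liner is hard to read, here is roughly the same logic unrolled into a small script. Treat it as a sketch rather than a tested tool: the sub name indent_tags and the sample input are made up for illustration, and the guard that stops $level from going negative on a stray closing tag is my addition, not something the one-liner does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Put each tag on its own line, indented by its nesting depth.
# indent_tags is a hypothetical name; the logic follows the one-liner.
sub indent_tags {
    my ($text) = @_;
    my $level = 0;    # current nesting depth

    $text =~ s{<(.)}{
        my $is_close = $1 eq '/';
        # no line break before the outermost opening tag
        my $newline = $level ? "\n" : '';
        # closing tag: step out before indenting (guard is an addition)
        $level-- if $is_close && $level > 0;
        my $indent = '  ' x $level;
        # opening tag: step in after indenting
        $level++ unless $is_close;
        $newline . $indent . "<$1";
    }eg;

    return $text;
}

print indent_tags('<doc><sec><p>some text</p></sec></doc>'), "\n";
```

For that sample input it prints:

```
<doc>
  <sec>
    <p>some text
    </p>
  </sec>
</doc>
```

Note that the closing tag of a text element ends up on its own line, and a <foo/> tag would still be counted as an opening tag, so the same caveats apply.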


In reply to Re: Beautifying some SGML? by mirod
in thread Beautifying some SGML? by markww
