perlsen has asked for the wisdom of the Perl Monks concerning the following question:

I have the following input in a text file, and I need to convert it to an HTML table. Could anyone suggest how to get the required output for the given input?

input text file
***************
<h1>Heading Level 1
<h2>Heading Level 2
<h2>Heading Level 2
<h2>Heading Level 2
<h3>Heading Level 3
<h3>Heading Level 3
<h2>Heading Level 2
<h1>Heading Level 1
........
........

output html file
*****************
<table><tr><td>Heading Level 1</td></tr>
<table><tr><td>Heading Level 2</td></tr>
<tr><td>Heading Level 2</td></tr>
<tr><td>Heading Level 2</td></tr>
<table><tr><td>Heading Level 3</td></tr>
<tr><td>Heading Level 3</td></tr>
</table>
<tr><td>Heading Level 2</td></tr>
</table>
<tr><td>Heading Level 1</td></tr>
........
........
</table>

Thanks in advance.

Replies are listed 'Best First'.
Re: convert text to HTML Format Table.
by Anonymous Monk on Jan 27, 2005 at 11:51 UTC
    Assuming the text in $_:
    my $level = 0;
    s{<h(\d)>}{
        my $r = "";
        if ($level < $1) {$r .= "<table>"}
        if ($level > $1) {$r .= "</table>"}
        $level = $1;
        "$r<tr><td>"
    }eg;
    s{\n}{</td></tr>\n}g;
    $_ .= "</table>" while $level--;
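The substitution above opens or closes at most one table per heading, which is enough for the sample input, where the level never changes by more than one step. If the input can jump several levels at once (say from <h3> straight back to <h1>), a line-by-line variant along these lines may be easier to adapt. This is only a sketch: the name headings_to_tables is made up for illustration, and it assumes each heading sits on its own line and starts with <hN>. It reproduces the nesting the question asks for, tables placed directly inside tables included.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn "<hN>Title" lines
# into the nested <table> layout requested in the question.
sub headings_to_tables {
    my ($text) = @_;
    my $level = 0;     # depth of the most recent heading; 0 means no table open
    my $out   = '';

    for my $line (split /\n/, $text) {
        # Lines that are not "<hN>..." headings are passed through unchanged.
        unless ($line =~ /^<h(\d+)>(.*)$/) {
            $out .= "$line\n";
            next;
        }
        my ($new, $title) = ($1, $2);

        # Open one table per level gained, close one per level lost.
        $out .= '<table>'    x ($new - $level) if $new > $level;
        $out .= "</table>\n" x ($level - $new) if $new < $level;

        $level = $new;
        $out .= "<tr><td>$title</td></tr>\n";
    }

    # Close every table that is still open at the end of the input.
    $out .= "</table>\n" x $level;
    return $out;
}
```

Calling print headings_to_tables($_) on the slurped file contents should give the layout shown in the question, with every opened table closed again even when a level is skipped.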
Re: convert text to HTML Format Table.
by gopalr (Priest) on Jan 28, 2005 at 05:21 UTC

    Hi Senthil,

    Please try the code below.

    @Data=<DATA>;
    foreach $line(@Data) {
        if ($line=~s#(<h([0-9]+)>(?:.+?))#$1#) {
            $curr=$2;
            $endtag='</table>';
            $line=~s#^(<h[0-9]+>)?(.+?$)#$1<tr><td>$2</td></tr>#;
            if ($prev=~m#^$#) {
                $line=~s#<h[0-9]+>#<table>#g;
            }
            elsif ($prev == $curr) {
                $line=~s#<h[0-9]+>##;
            }
            elsif ($prev > $curr) {
                $x=$prev-$curr;
                for ($i=0; $i<$x; $i++) {
                    $line="$endtag\n".$line;
                }
                $line=~s#<h[0-9]+>##;
            }
            $line=~s#<h[0-9]+>#<table>#g;
        }
        $prev=$curr;
        $Final.=$line;
    }
    $Final=$Final.'</table>';
    print $Final;
    __DATA__
    <h1>Heading Level 1
    <h2>Heading Level 2
    <h2>Heading Level 2
    <h2>Heading Level 2
    <h3>Heading Level 3
    <h3>Heading Level 3
    <h2>Heading Level 2
    <h1>Heading Level 1

    Output:

    <table><tr><td>Heading Level 1</td></tr>
    <table><tr><td>Heading Level 2</td></tr>
    <tr><td>Heading Level 2</td></tr>
    <tr><td>Heading Level 2</td></tr>
    <table><tr><td>Heading Level 3</td></tr>
    <tr><td>Heading Level 3</td></tr>
    </table>
    <tr><td>Heading Level 2</td></tr>
    </table>
    <tr><td>Heading Level 1</td></tr>
    </table>

    Thanks,

    Gopal.R.

Re: convert text to HTML Format Table.
by ww (Archbishop) on Jan 27, 2005 at 17:16 UTC

    If your text file already has (literal) <h1>, etc., why don't you just copy it to a new file and rename that foo.htm? (You'll probably have to add at least minimal html headers and close the heading tags, but that's a different script.) For most in-house purposes that occur to me (insufficient imagination may be at work here), that would be fairly simple.

    If your output is intended for the global population of the web, please reconsider nesting tables.

    Most current browsers don't exact the kind of speed penalties we saw under NS 4 and IE5 (data lacking for Konqueror, Safari, etc), but there is still some penalty for nesting, and another for failing to specify width, preferably with css (tho, personally, because my logs tell me there are still quite a few users of NS4.7, IE 5.0, etc., I stick widths (as %) in the <table ...> and <td ...> elements to cope with those browsers' poor/flaky handling of css and inheritance).

    If your H1, 2, 3 are intended to resemble classic 7th grade outlining, the suggestions above look sensible; if you're not dealing with a simple outline, you probably should consider something like:

    open (test that it's open: use die!!!) and read original file into array
    open a filehandle (OFH) for output (and test)
    print OFH "<html><head><title>$1</title></head><body><table><tr>";
    (where $1 is the orig filename)
    foreach $line(@array) { print OFH "<td>$line</td></tr><tr><td>"; }
    print OFH "</td></tr></table></body></html>";
    close, etc...

    This will keep the <Hn tags from your original, though neither approach I've mentioned will keep the indentation.

    add newlines to taste if you're going to need to read/tweak the html
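The outline above could be fleshed out roughly as follows. Treat it as a sketch under assumptions: lines_to_table is a made-up name, the file names in the usage comment are placeholders, and the row markup is built per line, so it avoids the empty trailing row that the pseudocode would produce.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wrap every input line in its
# own table row, leaving any <hN> tags in the line untouched.
sub lines_to_table {
    my ($title, @lines) = @_;
    my $html = "<html><head><title>$title</title></head><body><table>\n";
    for my $line (@lines) {
        chomp(my $cell = $line);    # strip the newline from a copy of the line
        $html .= "<tr><td>$cell</td></tr>\n";
    }
    return $html . "</table></body></html>\n";
}

# Possible usage; input.txt and foo.htm are placeholder names:
#   open my $in,  '<', 'input.txt' or die "Cannot open input.txt: $!";
#   open my $out, '>', 'foo.htm'   or die "Cannot open foo.htm: $!";
#   print {$out} lines_to_table('input.txt', <$in>);
#   close $out or die "Cannot close foo.htm: $!";
```

Building the page as a string first keeps the file handling separate from the markup, which makes it easier to check the result before anything is written to disk.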

Re: convert text to HTML Format Table.
by Plankton (Vicar) on Jan 27, 2005 at 16:04 UTC

    This may not be helpful at all, but if you have any control over the format of your input file I would suggest you investigate using DocBook.