fair enough, i didn't articulate my problem thoroughly. here's an example of the HTML i'm parsing:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<DIV><FONT color="#0000ff" face="Arial" size="2"><SPAN class="540051301-30062000">Steve,</SPAN></FONT></DIV>
<DIV><FONT color="#0000ff" face="Arial" size="2"><SPAN class="540051301-30062000"></SPAN></FONT>&nbsp;</DIV>
<DIV><FONT color="#0000ff" face="Arial" size="2"><SPAN class="540051301-30062000">The picture was one that you pointed me at in the paper a couple of weeks ago. I don't have any pictures of mine yet.</SPAN></FONT></DIV>
<DIV><FONT color="#0000ff" face="Arial" size="2"><SPAN class="540051301-30062000"></SPAN></FONT>&nbsp;</DIV>
<DIV><FONT color="#0000ff" face="Arial" size="2"><SPAN class="540051301-30062000">Tom</SPAN></FONT></DIV>
<BLOCKQUOTE>
<DIV align="left" class="OutlookMessageHeader" dir="ltr"><FONT face="Tahoma" size="2">-----Message-----<BR><B>From:</B> eat@joes.com [mailto:eat@joes.com]<BR><B>Sent:</B> Monday, June 10, 2005 3:50 PM<BR><B>To:</B> google.com<BR><B>Subject:</B> Re: [Test] another test<BR><BR></DIV></FONT><TT>Tom wrote:<BR>&gt;OK, I finally figured out that you can post online at the website or just<BR>&gt;send an e-mail.<BR><BR>Oh and the pic... it looks like it was shot during an<BR>earthquake.&nbsp; :-)<BR><BR>Steve<BR></TT><TT>To unsubscribe from this group, send an email to:<BR>listmod@google.com<BR><BR></TT><BR></BLOCKQUOTE>
<br><br>
</div>
</td></tr></table>

to pull the text out of it, i use the following line in my script:

my $text = $stream->get_text ("/table");

this returns the following, which i print later in the script:

Steve, The picture was one that you pointed me at in the paper a couple of weeks ago. I don't have any pictures of mine yet. Tom -----Message-----From:eat@joes.com [mailto:eat@joes.com] Sent:Monday, June 10, 2005 3:50 PM To: google.com Subject: Re: [Test] another test Tom wrote: OK, I finally figured out that you can post online at the website or just send an e-mail. Oh and the pic... it looks like it was shot during an earthquake. :-) Steve To unsubscribe from this group, send an email to: listmod@xxxxx.com

all of the HTML is stripped by nature of the operation, and that's great. i'm looking to keep the BR tags, however, so when i reimport the data elsewhere, it retains the formatting of the original (notice how the text at the bottom is all smashed together with no line breaks).

so when i said "to no avail" earlier, i meant i was still getting the text all squashed together as above.

hope that made a little more sense.
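
for concreteness, here's a minimal sketch of the behaviour i'm after. it assumes the documented textify hook of HTML::TokeParser accepts a code ref for the br tag and that get_text inserts whatever the code ref returns. the sample HTML is a shortened, made-up stand-in for the message above, and this is not confirmed as the fix for my real input:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical, shortened stand-in for the message HTML shown above.
my $html = '<table><tr><td><TT>Tom wrote:<BR>&gt;OK, I figured it out.'
         . '<BR><BR>Steve<BR></TT></td></tr></table>';

# Passing a scalar reference parses the string itself instead of a file.
my $stream = HTML::TokeParser->new(\$html)
    or die "could not create parser";

# Textify handlers run for matching start tags; the returned string is
# placed into the text that get_text() builds instead of a plain space.
$stream->{textify}{br} = sub { "<br>" };

# Collect text up to the closing </table>, as in the original script.
my $text = $stream->get_text("/table");
print "$text\n";
```

one thing to watch if this route works: get_text decodes entities, so &gt; comes back as a literal >, and the text may need re-encoding before it is reimported as HTML.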


In reply to Re^4: Tokeparser Textify Command by SpacemanSpiff
in thread Tokeparser Textify Command by SpacemanSpiff
