I have a puzzling problem --

I have a small subroutine that takes an HTML page and extracts the first n characters from it, not counting the HTML tags. It then closes any tags left open by the extraction.
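For readers who want something concrete, here is a minimal sketch of a subroutine of that kind. It is not the code from this post: the name `truncate_html` is hypothetical, and it deliberately uses a plain regex split instead of HTML::TokeParser or HTML::TagParser so that it depends on nothing outside core Perl. It assumes reasonably well-formed markup and no `>` characters inside attribute values.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first $n text characters of $html,
# copying tags through untouched and closing any elements left open by the cut.
sub truncate_html {
    my ($html, $n) = @_;

    # Elements that never take a closing tag.
    my %void = map { $_ => 1 } qw(area base br col hr img input link meta wbr);

    my ($out, $count, @open) = ('', 0);

    # Split into tags and text runs; the capture group keeps the tags.
    for my $tok (grep { length } split /(<[^>]*>)/, $html) {
        if ($tok =~ m{^<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>$}) {
            my ($closing, $name, $selfclosed) = ($1, lc $2, $3);
            if ($closing) {
                # Pop only when the tag closes the innermost open element.
                pop @open if @open && $open[-1] eq $name;
            }
            elsif (!$selfclosed && !$void{$name}) {
                push @open, $name;
            }
            $out .= $tok;
        }
        elsif ($tok =~ /^</) {
            # Comments, doctype and the like: copy through, count nothing.
            $out .= $tok;
        }
        else {
            # Text run: take only as many characters as the budget allows.
            my $piece = substr $tok, 0, $n - $count;
            $out   .= $piece;
            $count += length $piece;
            last if $count >= $n;
        }
    }

    # Close whatever is still open, innermost element first.
    $out .= "</$_>" for reverse @open;
    return $out;
}
```

Called as `truncate_html('<p>Hello <b>brave new</b> world</p>', 9)`, this returns `<p>Hello <b>bra</b></p>`. Entities such as `&amp;` are counted as several characters here, which a real parser would handle better.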

I was using HTML::TokeParser version 2.37, and everything worked fine on my laptop with Perl 5.8.8. On Dreamhost, however, the same code started causing a segmentation fault.

So I replaced TokeParser with HTML::TagParser, a much simpler, less ambitious, but pure-Perl module. Again, it works fine on my laptop but segfaults on Dreamhost. FWIW, Dreamhost is running Perl 5.8.4, which may or may not be the cause (I hope it is not).

TagParser is a pure-Perl module, so as far as I understand it should just work. But it does not. What can I do to solve this?

Update: I just checked, and Dreamhost does have HTML::TokeParser version 2.24 installed. Initially I was using their copy, but when I got segfaults, I installed my own copy of version 2.37. The segfaults persist.
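With two copies of a module on one host, it can help to confirm which one a script actually loads. The following is a small sketch for that, using only `require`, `%INC`, and the `VERSION` method that every package inherits; the helper name `module_info` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: load $module and return its version and the
# path of the file it was loaded from.
sub module_info {
    my ($module) = @_;

    # Turn Foo::Bar into Foo/Bar.pm, the key that require and %INC use.
    (my $file = "$module.pm") =~ s{::}{/}g;
    require $file;

    my $version = $module->VERSION;
    return (defined $version ? $version : 'unknown', $INC{$file});
}
```

Calling `module_info('HTML::TokeParser')` from the failing script would show whether the host's 2.24 or the locally installed 2.37 is the copy being picked up from `@INC`.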

--

when small people start casting long shadows, it is time to go to bed

In reply to segmentation fault on HTML::TokeParser by punkish
