Based on the comment in your code ("The HTML is stripped off the contents and the text is stored in an array of strings"), you're assigning the content incorrectly. Note that I don't have WWW::Mechanize installed, so I can't double-check the docs for that.

The @stripped_html is an array, just like you need. But $stripped_html[$x] is only one element in that array, which means that it's really a scalar [1]. Since the content sub returns an array, you're trying to assign an array to a scalar, and you'll end up with the number of things in the array.
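To see the scalar-versus-list difference in isolation, here is a minimal sketch. It doesn't call WWW::Mechanize at all; @words and @slots are made-up names standing in for whatever your script uses, so treat it as an illustration of the context rule rather than a fix for your exact code.

```perl
use strict;
use warnings;

# Hypothetical stand-in data, not taken from the original script.
my @words = ( "alpha", "beta", "gamma" );

# Assigning an array to a single element puts the array in scalar
# context, so the element receives the number of items, not the items.
my @slots;
$slots[0] = @words;

# Assigning an array to an array copies the elements themselves.
my @copy = @words;

print "scalar slot: $slots[0]\n";   # prints "scalar slot: 3"
print "list copy:   @copy\n";       # prints "list copy:   alpha beta gamma"
```

If a count is showing up where you expected text, a scalar assignment like the first one is the usual suspect.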

You'll need to change your code a bit.

    # Note that the $x isn't needed with this approach,
    # so I took it out.
    my @stripped_html;
    @stripped_html = $webcrawler->content( format => "text" );

    # You can print the array directly, like this:
    print @stripped_html;

    # Or put it in a loop to specify what you want between
    # the array elements:
    for my $item (@stripped_html) {
        print "$item\n";
    }

As is, this code prints out the contents twice, just so you can see the different ways to print an array. That wasn't your question, so I'll stop blathering on about it now.

[1] Yes, it could be another array or a hash or whatever; I'm talking simplest-case scenario here.


In reply to Re: HTML stripper in WWW::Mechanize doesn't seem to work by Nkuvu
in thread HTML stripper in WWW::Mechanize doesn't seem to work by lampros21_7
