in reply to Re^2: HTML::Treebuilder look_down not working with <header>, <article> etc
in thread HTML::Treebuilder look_down not working with <header>, <article> etc

And you're not telling HTML::TreeBuilder to keep unknown tags because...?

Re^4: HTML::Treebuilder look_down not working with <header>, <article> etc
by Anonymous Monk on May 07, 2014 at 14:36 UTC
    Because I didn't know about this feature =). I'm just copying scripts together and tweaking them; I'm not a real programmer. For everyone who is struggling with the same problem, here is the piece of code:
    use HTML::TreeBuilder;

    # new_from_file parses immediately, so create an empty tree first
    my $tree = HTML::TreeBuilder->new();
    # set ignore_unknown to false so tags like <article> are kept
    $tree->ignore_unknown(0);
    # then parse the content ($webcrawler is set up earlier in the script)
    $tree->parse_content($webcrawler->content());
    # CONTENT is a filehandle opened earlier in the script; dump the tree as HTML
    print CONTENT $tree->as_HTML;
    if (my $div = $tree->look_down(_tag => "article", class => "article hentry")) {
        print $div->as_text(), "\n";
    } else {
        print "Not found\n";
    }
    $tree->delete();
    This gives me the expected result. Thanks ^_^ I didn't know there was an ignore_unknown option, which is true by default.