Re: Simplify parsing a file

by valdez (Monsignor)
on Apr 02, 2007 at 17:36 UTC

in reply to Simplify parsing a file

For the parsing part I would use HTML::TokeParser::Simple, written by our brother Ovid; here is an example from the documentation:

    use HTML::TokeParser::Simple;

    my $p = HTML::TokeParser::Simple->new( $somefile );

    while ( my $token = $p->get_token ) {
        # This prints all text in an HTML doc (i.e., it strips the HTML)
        next unless $token->is_text;
        print $token->as_is;
    }

Nice, isn't it? HTML parsing is not as easy as it may seem; relying on a well-written module is not a sin :)
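
If you want to try the same loop without a file on disk, here is a minimal sketch that wraps it in a sub and feeds it an HTML string. The name strip_html is a hypothetical helper of mine, not something the module provides, and I am assuming the parser accepts a reference to a scalar as its document, as HTML::TokeParser does.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper (not part of the module): return the text of an
# HTML string with the markup stripped.
sub strip_html {
    my ($html) = @_;

    # Assumption: a reference to a scalar is parsed as the document itself,
    # whereas a plain scalar would be treated as a file name.
    my $p = HTML::TokeParser::Simple->new( \$html );

    my $text = '';
    while ( my $token = $p->get_token ) {
        next unless $token->is_text;    # skip tags, comments, declarations
        $text .= $token->as_is;         # keep the text exactly as written
    }
    return $text;
}

print strip_html('<p>Hello, <b>world</b>!</p>'), "\n";
```

Collecting the text in a variable instead of printing it straight away makes the sub easier to reuse, for example by calling it once per input.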

Ciao, Valerio

Re^2: Simplify parsing a file
by myrrdyn (Novice) on Apr 02, 2007 at 18:30 UTC
    Wow. That is sweet. I'll post my implementation of it when I get a chance. Still, any thoughts on the multiple-file issue, or do you think this will address that too?