in reply to A grammar for HTML matching
I would use it. I believe what he wants to do is match a
code block and extract the content. I have written several
small scripts that run via cron to collect some
information and file it away. I ended up using two
different methods to grab what I wanted.
The first was just
to scan the HTML looking for a comment line and grab
most everything after it. That was the easy one.
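For anyone curious, that first method boils down to something like the sketch below. It is written in Python for illustration and is not the original script; the function name `text_after_comment` and the marker text `BEGIN DATA` are made up, and it assumes the page carries a recognizable comment just before the interesting part.

```python
def text_after_comment(html, marker):
    """Return everything after the first HTML comment containing `marker`.

    Returns None if no such comment exists. `marker` is a hypothetical
    string; use whatever comment the target page actually contains.
    """
    start = html.find("<!--")
    while start != -1:
        end = html.find("-->", start)
        if end == -1:
            # Unterminated comment: nothing reliable to grab.
            return None
        if marker in html[start + 4:end]:
            # Skip past the closing "-->" and keep the rest of the page.
            return html[end + 3:]
        start = html.find("<!--", end + 3)
    return None


if __name__ == "__main__":
    page = "<html><!-- nav --><p>x</p><!-- BEGIN DATA --><p>42</p></html>"
    print(text_after_comment(page, "BEGIN DATA"))
```

This only works as long as the site keeps that comment in place, which is why it was the easy case.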
The second site was more complicated and the data I was
trying to extract was in a large table that changed size
depending on what they were displaying. I
didn't feel like learning HTML::Parser at the time and I
hadn't found HTML::TableExtract either. I cheated and
piped the page through lynx and grabbed what I wanted from
the parsed text output.
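A rough sketch of the lynx trick, again in Python and only as an illustration of the idea: it assumes lynx is installed and on the PATH, and the helper names `dump_with_lynx` and `rows_between`, along with the marker strings, are hypothetical. The point is that once lynx has flattened the table to text, a table of any length can be read line by line between two known bits of text.

```python
import subprocess


def dump_with_lynx(url):
    """Render a page to plain text with `lynx -dump` (lynx must be installed)."""
    result = subprocess.run(
        ["lynx", "-dump", "-nolist", url],  # -nolist drops the trailing link list
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def rows_between(text, start_marker, stop_marker):
    """Collect non-blank lines after `start_marker` and before `stop_marker`.

    Each kept line is split on whitespace, so the number of rows can vary
    from run to run. Both markers are assumptions about the page layout.
    """
    rows = []
    inside = False
    for line in text.splitlines():
        if not inside:
            # Start collecting on the line after the one holding the marker.
            inside = start_marker in line
            continue
        if stop_marker in line:
            break
        if line.strip():
            rows.append(line.split())
    return rows


if __name__ == "__main__":
    sample = "Header\n\nCity Temp\nOslo 4\n\nBergen 7\nUpdated hourly\n"
    print(rows_between(sample, "City Temp", "Updated"))
```

Splitting on whitespace falls apart if a cell contains spaces, so treat it as a starting point rather than a robust parser.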
So neither of those methods would help you :-) but if you
put something like this together I would use it. I still
have to take a look at HTML::TableExtract, but I'll get
around to it.
I've seen a few packages on freshmeat.net that will snag
comic strips off the web and put them somewhere for you.
They might have some good techniques for extracting that
stuff.
HTH