in thread RFC: HTML::ListScraper
Thanks for the module. I had been looking for something similar for a while. The name did not clearly tell me what the module does.
I installed HTML::ListScraper. The documentation talks about the example script scrape, but that script does not get installed by cpan install; I had to go back to the distribution to get it. This is just a small inconvenience.
When I tried it on my example HTML file, I found that the approximation splits the page into finer blocks than I expected, and I could not figure out a way to tune this. Also, I would have liked approximation to be tried only as a fallback, when exact repetition detection (something like a suffix tree plus a search for the largest repeating string) fails.
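To make the "exact repetition" idea concrete, here is a minimal sketch of what I have in mind. It is not taken from HTML::ListScraper and says nothing about how the module works internally: longest_repeated_run is a hypothetical helper, the input is assumed to be an already-tokenized list of tag names, and I use a simple O(n^2) dynamic program instead of a real suffix tree to keep it short.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of HTML::ListScraper.
# Returns the longest run of tags that occurs at least twice without
# overlapping. A plain O(n^2) dynamic program stands in for a suffix tree.
sub longest_repeated_run {
    my @tags = @_;
    my $n    = @tags;
    my ($best_len, $best_end) = (0, 0);

    # $prev[$j] holds the length of the common run ending at
    # positions $i-1 and $j-1 (1-based) from the previous row.
    my @prev = (0) x ($n + 1);
    for my $i (1 .. $n) {
        my @cur = (0) x ($n + 1);
        for my $j ($i + 1 .. $n) {
            next unless $tags[$i - 1] eq $tags[$j - 1];
            my $len = $prev[$j - 1] + 1;
            # Cap the length so the two occurrences never overlap.
            $len = $j - $i if $len > $j - $i;
            $cur[$j] = $len;
            ($best_len, $best_end) = ($len, $i) if $len > $best_len;
        }
        @prev = @cur;
    }

    # An empty list is returned when nothing repeats.
    return @tags[$best_end - $best_len .. $best_end - 1];
}

my @run = longest_repeated_run(qw(ul li a /a /li li a /a /li /ul));
print "@run\n";    # prints: li a /a /li
```

If a call like this returns a run of reasonable length, that run could be taken as the list item pattern; only when it returns an empty or trivially short run would the approximate matching need to kick in.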
Thanks once again.