in reply to News with LWP::Simple?
The one thing I did find was that I needed to grab a page from each news source I was going to display headlines from and see how the page was laid out. From there I wrote rules using regexes to parse out the relevant bits of information. My original version had 9 rules for 24 different web sites; in the newer version I got it down to 3 rules for about 90 sites. Now it's up to 4 with the RSS feeds I plan to add.
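A rough sketch of the rule idea, in case it helps. Everything named here is hypothetical: the rule names, the patterns, the site names, and the `extract_headlines` helper are made up for illustration and are not the actual rules described above. The sample HTML stands in for whatever `get()` from LWP::Simple would return for a real page.

```perl
use strict;
use warnings;

# Hypothetical rules: each regex captures (link, headline text).
my %rules = (
    anchor_in_heading => qr{<h\d[^>]*>\s*<a\s+href="([^"]+)"[^>]*>([^<]+)</a>}i,
    classed_anchor    => qr{<a\s+class="headline"\s+href="([^"]+)"[^>]*>([^<]+)</a>}i,
);

# Hypothetical site-to-rule table; many sites can share one rule.
my %site_rule = (
    'example.com' => 'anchor_in_heading',
    'example.org' => 'classed_anchor',
);

# Apply the rule assigned to $site to the page source in $html.
# Returns a list of { url => ..., title => ... } hashes,
# or an empty list if the site has no rule.
sub extract_headlines {
    my ($site, $html) = @_;
    my $rule = $rules{ $site_rule{$site} // '' } or return;
    my @headlines;
    while ($html =~ /$rule/g) {
        push @headlines, { url => $1, title => $2 };
    }
    return @headlines;
}

# Sample markup standing in for a fetched page.
my $html = '<h2><a href="/story/1">First story</a></h2>'
         . '<h2><a href="/story/2">Second story</a></h2>';

print "$_->{title} => $_->{url}\n" for extract_headlines('example.com', $html);
```

Keeping the patterns in one table and the site assignments in another is what lets a handful of rules cover a lot of sites: adding a site that uses a familiar layout is one new line in `%site_rule`.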
One thing to be aware of, though, is the possibility that you're violating copyrights by doing what you're doing. Make sure you look into that.
Check my scratchpad for a quick and dirty HTML::TokeParser program that will spit out the layout of a page by tokens.
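If the scratchpad isn't handy, a token dump along these lines is easy to throw together. This is only a minimal sketch, not the scratchpad program itself: it assumes HTML::TokeParser (from the HTML::Parser distribution) is installed, and `dump_tokens` is a hypothetical helper name. It parses an in-memory string here; in practice you would hand it the page source you fetched.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Return one line per start tag, end tag, and non-blank text token,
# in document order, so the page layout can be read top to bottom.
sub dump_tokens {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot parse document";
    my @lines;
    while (my $token = $parser->get_token) {
        my $type = $token->[0];
        if ($type eq 'S') {
            push @lines, "START <$token->[1]>";
        }
        elsif ($type eq 'E') {
            push @lines, "END   </$token->[1]>";
        }
        elsif ($type eq 'T') {
            my $text = $token->[1];
            $text =~ s/\s+/ /g;                       # collapse whitespace runs
            push @lines, "TEXT  $text" if $text =~ /\S/;
        }
        # Comments, declarations, and processing instructions are skipped.
    }
    return @lines;
}

# Sample markup standing in for a fetched page.
print "$_\n" for dump_tokens('<h2><a href="/story/1">First story</a></h2>');
```

Reading the dump for one page from each source shows which tags wrap the headlines, which is what you need to know before writing a regex rule for that source.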
Good luck!
There is no emoticon for what I'm feeling now.