Sounds like a cool project. One potential pitfall is legal. Some sites don't like robots gathering information from their pages automatically, because (1) the ads never get seen and (2) automated collection of content could violate copyright, depending on what the collected data is used for.
The solution is to check the terms of use on the website, or to ask the webmaster for permission.
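
As a complement to reading the terms of use, many sites also publish a robots.txt file that says which paths automated clients may fetch. It is not a substitute for the terms of use or for asking the webmaster, but it is easy to check from code. Here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the sample rules, the user agent `my-bot`, and the `example.com` URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper: it only interprets robots.txt rules and says
    nothing about the site's terms of use or copyright.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration; a real site serves its own at /robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for page in ("https://example.com/index.html",
                 "https://example.com/private/data.html"):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS, "my-bot", page) else "disallowed"
        print(page, "->", verdict)
```

For a live site you would download the robots.txt text first (or use `RobotFileParser.set_url()` followed by `read()`) and pass your crawler's real user agent string.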