in reply to Search for repeating but slightly different patterns

Assuming you (the developer) know before processing which site produces which variant of the presentation, you can write site-specific code that scrapes each site and translates the result into one stable internal representation. From there on, your code only ever works with the internal representation. In effect, this strategy decouples the initial read of the HTML from the rest of the code. The hard part is deciding which internal representation (data structure) will work for all cases and give you consistent access to the data.
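A minimal sketch of the idea, in Python for illustration only. The site names, the two HTML layouts, and the `Record` fields are all made up for this example, and the regex parsing is just the shortest way to show the shape; a real scraper would more likely use a proper HTML parser.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """Internal representation shared by every site (hypothetical fields)."""
    title: str
    price: float


def parse_site_a(html: str) -> List[Record]:
    # Assumed layout for site A: <li><b>Title</b> $9.99</li>
    pattern = re.compile(r"<li><b>(.*?)</b>\s*\$([\d.]+)</li>")
    return [Record(title, float(price)) for title, price in pattern.findall(html)]


def parse_site_b(html: str) -> List[Record]:
    # Assumed layout for site B: <tr><td>9.99</td><td>Title</td></tr>
    pattern = re.compile(r"<tr><td>([\d.]+)</td><td>(.*?)</td></tr>")
    return [Record(title, float(price)) for price, title in pattern.findall(html)]


# One parser per site, chosen ahead of time by the developer.
PARSERS: Dict[str, Callable[[str], List[Record]]] = {
    "site_a": parse_site_a,
    "site_b": parse_site_b,
}


def load(site: str, html: str) -> List[Record]:
    """Only this function knows that the sites differ."""
    return PARSERS[site](html)


def total(records: List[Record]) -> float:
    """Downstream code sees Record objects only, never HTML."""
    return sum(r.price for r in records)
```

Adding a new site then means writing one more parser and registering it in `PARSERS`; nothing downstream of `load` has to change, as long as `Record` really can hold what every site provides.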
the hardest line to type correctly is: stty erase ^H