Sly_G has asked for the wisdom of the Perl Monks concerning the following question:
that have small differences inside; those differences are names, prices, photos, etc. Obviously, one has to change the parsing regexp every time the page design or markup changes. Stripping the HTML doesn't always help, because I need image URLs and other HTML information about the products. The best solution would be an algorithm that could find the repeating chunks and return the differences between them, since those differences are the actual data I'm digging for. For example, if there are 20 products on a page, there are 20 similar chunks of code, and their differences are my data. Maybe there's a CPAN module for this, or something similar. Thanks!
<table> <tr> <td>........</td> </tr> </table>
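To make the request concrete, here is a rough sketch of one way such an algorithm could work. It is written in Python purely for illustration and is not taken from any reply in this thread. It assumes the page has already been cut into one chunk per product (for example, one `<tr>` per product), treats every token that occurs in all chunks as template markup, and reports whatever is left as data. The function names and the sample rows are hypothetical.

```python
import re

# A token is either a whole tag or a run of text between tags.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")


def tokenize(chunk):
    """Split markup into tags and text runs, dropping whitespace-only runs."""
    return [t.strip() for t in TOKEN_RE.findall(chunk) if t.strip()]


def variable_parts(chunks):
    """For each chunk, return the tokens that do not occur in every chunk.

    Tokens shared by all chunks are assumed to be template markup;
    the remaining tokens are assumed to be the per-product data.
    """
    token_lists = [tokenize(c) for c in chunks]
    if not token_lists:
        return []
    template = set(token_lists[0])
    for tokens in token_lists[1:]:
        template &= set(tokens)
    return [[t for t in tokens if t not in template] for tokens in token_lists]


# Hypothetical sample data: three product rows with the same layout.
chunks = [
    '<tr><td><img src="/img/apple.jpg"></td><td>Apple</td><td>1.20</td></tr>',
    '<tr><td><img src="/img/pear.jpg"></td><td>Pear</td><td>0.95</td></tr>',
    '<tr><td><img src="/img/plum.jpg"></td><td>Plum</td><td>1.20</td></tr>',
]

records = variable_parts(chunks)
for record in records:
    print(record)
```

Because the template is rebuilt from the chunks on every run, a change in page design should not require a new regexp, as long as the products still share one layout. The sketch ignores token order, and a value that happens to appear in every single chunk (the same price on all products, say) would be mistaken for template markup, so the output would still need a sanity check.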

Replies are listed 'Best First'.

Re: Search for repeating but slightly different patterns
by CountZero (Bishop) on Oct 24, 2010 at 17:32 UTC

Re: Search for repeating but slightly different patterns
by jethro (Monsignor) on Oct 24, 2010 at 17:08 UTC
by thargas (Deacon) on Oct 25, 2010 at 12:33 UTC
by Sly_G (Novice) on Oct 27, 2010 at 16:31 UTC

Re: Search for repeating but slightly different patterns
by JavaFan (Canon) on Oct 24, 2010 at 17:06 UTC

Re: Search for repeating but slightly different patterns
by planetscape (Chancellor) on Oct 25, 2010 at 06:10 UTC

Re: Search for repeating but slightly different patterns
by aquarium (Curate) on Oct 24, 2010 at 22:43 UTC