Syntactic Confectionery Delight | |
PerlMonks |
Re: Finding largest common subset in lists?by BrowserUk (Patriarch) |
on Jun 05, 2003 at 09:49 UTC ( [id://263263]=note: print w/replies, xml ) | Need Help?? |
Update DO NOT USE! THIS IS A TERMINALLY BROKEN ALGORITHM THAT GET THE RIGHT ANSWER SOMETIMES ALMOST BY LUCK. I DOUBT IT CAN BE EASILY FIXED. Must test more! Must test more! Whether this is 'cleaner' is doubtful, given this is the first time ever I've felt the need to use redo, but if your big set is very big, is should result in less total memory use and less comparisons. Ie. It should be faster, I think. The basic premise to first locate all the subsets of @a that contain only values that exist in @b, and then use join and index to verify if the subset actually exists in @b.
It finds the apppropriate subset of your sample data, and works quite quickly for various datasets I generated randomly. Full testing is left to you. It handles duplicates in either set (collection:), and could probably be more efficient if this isn't a requirement. Caveats: The usual one about elements that could contain your chosen seperator which you are probably aware of. As is, it will only find the first, longest common subset. If two or more, equally long, subsets exist and you want them all, you have a little work left to do:) Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
In Section
Seekers of Perl Wisdom
|
|