I have an array that contains over 1 million strings, ranging from 10 to 250 characters each. Each string is a set of two or more values separated by pipes ("|"). I need to eliminate any string that is a substring of another string in the array. For example, if the array contains ("A|B|C", "A|B|C|D|E"), then "A|B|C" should be dropped and "A|B|C|D|E" should be kept. I tried using "any" from List::MoreUtils, but it either kept everything or removed everything. Nothing I tried within the BLOCK worked: push(@arrCompletedChains, $strChain) if any { index($_, $strChain) < 0 } @arrWorkingCompletedChains;
Here is the code I am currently using, but it takes an extremely long time to run:
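for my $strChain (@arrWorkingCompletedChains) {
    my $found = 0;    # 0/1 instead of the barewords false/true, which are not Perl booleans
    foreach (@arrWorkingCompletedChains) {
        # Skip identical strings; stop at the first longer string that contains this one.
        if ($strChain ne $_ && index($_, $strChain) >= 0) {
            $found = 1;
            last;
        }
    }
    if (!$found) {
        push(@arrCompletedChains, $strChain);
    }
}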
Any suggestions on how to improve the speed of this code would be greatly appreciated.
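One possible direction, offered as a minimal sketch rather than a measured solution: process the strings longest first and keep the survivors in a single buffer, so that each candidate needs only one call to the built-in index instead of an inner Perl loop. The helper name drop_substrings is hypothetical, and the sketch assumes that no string contains a newline, which is used as the separator. Unlike the loop above, it keeps only one copy of exact duplicates and returns the results ordered by length rather than in input order. The worst case is still roughly quadratic, so whether it is fast enough for a million strings would need to be timed on real data.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the strings that are not contained in any
# other string of the input list.
# Assumption: no input string contains "\n", which serves as the separator.
sub drop_substrings {
    my @chains = @_;
    my $buffer = '';    # all kept strings so far, each followed by "\n"
    my @kept;

    # Longest first: a string can only be contained in one at least as long,
    # so every possible container is already in the buffer when we test.
    for my $chain (sort { length($b) <=> length($a) } @chains) {
        # A match cannot span two kept strings, because $chain has no "\n".
        next if index($buffer, $chain) >= 0;
        push @kept, $chain;
        $buffer .= "$chain\n";
    }
    return @kept;
}

my @arrCompletedChains = drop_substrings('A|B|C', 'A|B|C|D|E', 'X|Y', 'B|C|D');
print "$_\n" for @arrCompletedChains;
```

With the four sample strings, "A|B|C" and "B|C|D" are both found inside "A|B|C|D|E" and dropped, leaving "A|B|C|D|E" and "X|Y".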
In reply to Best method to eliminate substrings from array by catemp