
Now you are assuming that keys cannot overlap. The OP never says that. Consider the keys "Two Three" and "Three Four Five" against the string "One Two Three Four Five". Your code will fail to return "Three Four Five": once "Two Three" has been matched, only " Four Five" remains to be searched, and no key matches there.
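To make the overlap problem concrete, here is a minimal sketch using only core Perl. The key list and string come from the example above; the variable names and the lookahead variant are illustrative assumptions of mine, not code from the node being discussed.

```perl
use strict;
use warnings;

# Keys and string taken from the example above.
my @keys   = ('Two Three', 'Three Four Five');
my $string = 'One Two Three Four Five';

# Build a plain alternation, longest key first.
my $alt = join '|',
          map  { quotemeta }
          sort { length($b) <=> length($a) } @keys;

# Ordinary global match: each match consumes its text,
# so a key that overlaps an earlier match is never seen.
my @consumed = $string =~ /\b($alt)\b/gi;

# Zero-width lookahead: the capture is made without consuming anything,
# so the engine tries every start position and overlapping keys show up.
my @overlap = $string =~ /(?=\b($alt)\b)/gi;

print "consuming: ", join(', ', @consumed), "\n";
print "lookahead: ", join(', ', @overlap),  "\n";
```

The consuming match reports only "Two Three", while the lookahead match reports both keys. Note that the lookahead form still yields at most one key per start position (the first alternative that matches there), so it is not a full replacement for an exhaustive matcher.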

Re^5: Matching Many Strings against a Large List of Hash Keys (case insensitively, longest key first)
by repellent (Priest) on May 14, 2010 at 19:04 UTC
    Yes, that assumption comes from using an assembled regex match. Node has been updated again to make the assumption explicit.

    For kicks, let's try to handle overlapping keys using the assembled regex approach:
        use Regexp::Assemble;
        use Regexp::Exhaustive qw(exhaustive);
        use List::Util qw(reduce);

        my @keys   = map { quotemeta } keys %hash;
        my $key_re = Regexp::Assemble->new->add(@keys)->re;

        for my $string (@strings) {
            my $match = reduce { length($a) > length($b) ? $a : $b }
                        exhaustive($string => qr/\b($key_re)\b/i);
            print "Found '$match' in '$string'\n" if defined $match;
        }

    But then, the performance hit from Regexp::Exhaustive removes any justification for using the assembled regex in the first place. Ho hum.