in reply to retain longest multi words units from hash

People have posted other solutions. This is probably no better and no worse…

use strict;
use warnings;
use Data::Dump qw/dd/;

my %phrases = (
    'rendition'                     => '3',
    'automation'                    => '2',
    'saturation'                    => '3',
    'mass creation'                 => 2,
    'automation technology'         => 2,
    'automation technology process' => 3,
    'technology process'            => 5,
    'automation process'            => 2,
);

sub filter_wordlist_thing {
    my %output = %{ +shift };

    # only multi-word keys can shadow shorter entries
    for my $key ( grep / /, keys %output ) {
        my @words = split / /, $key;

        # every contiguous sub-phrase of $key, except $key itself
        my @word_combos =
            grep { $_ ne $key }
            map  { join ' ', @words[ $_->[0] .. $_->[1] ] }
            map  { my $start = $_; map [ $start, $_ ], $start .. $#words } 0 .. $#words;

        delete @output{@word_combos};
    }

    \%output;
}

dd filter_wordlist_thing( \%phrases );
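For what it's worth, with the sample %phrases above every key that is a contiguous sub-phrase of some longer multi-word key ('automation', 'automation technology', 'technology process') should get dropped, so the dump ought to come out roughly like this (sketched by hand rather than pasted from a run, so the exact formatting may differ):

    {
      "automation process"            => 2,
      "automation technology process" => 3,
      "mass creation"                 => 2,
      rendition                       => 3,
      saturation                      => 3,
    }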

Re^2: retain longest multi words units from hash
by LanX (Saint) on Jul 29, 2018 at 21:21 UTC

      Yeah, pretty similar, but mine doesn't impose a predetermined maximum number of words on the hash keys. I also optimize by skipping single-word hash keys.

        > but mine doesn't impose a predetermined maximum number of words on the hash keys.

        That was my intention. I could easily have added a new slice level dynamically if needed, but the check would have cost time. OTOH it's very likely the OP can assume a maximum word count, which is still very cheap if chosen generously.

        > I also optimize by skipping single-word hash keys.

        Again, I thought checking for single words might cost more than simply deleting an empty slice.
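        Roughly this kind of thing, assuming keys of at most three words (just a sketch of the idea, not the code I actually posted; %phrases is the OP's hash):

            my %keep = %phrases;
            for my $key ( keys %phrases ) {
                my @w = split ' ', $key;
                # proper contiguous sub-phrases for one-, two- and three-word keys;
                # for a single word the list is empty, so the delete is a cheap no-op
                my @sub = @w == 2 ? @w
                        : @w == 3 ? ( @w, "$w[0] $w[1]", "$w[1] $w[2]" )
                        :           ();
                delete @keep{@sub};
            }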

        I have to admit I could have put the body of partitions() into the delete slice, but I preferred clarity here and left optimization to the OP.

        Cheers Rolf
        (addicted to the Perl Programming Language :)