http://qs1969.pair.com?node_id=1014599

Dr Manhattan has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I have a frequency list (hash) with words and frequencies, for example:

the 5

a 4

good 3

If a word that starts with a capital letter exists somewhere else in the hash as a lowercase word, I want to delete the uppercase word and add its frequency to the lowercase word's frequency. I tried this:

foreach my $keys (keys %hash) {
    foreach my $z (keys %hash) {
        if ($hash{lc $keys} = $hash{$z}) {
            $hash{$z} = $hash{$z} + $hash{$keys};
            delete $hash{$keys};
        }
    }
}

foreach my $value(sort {$hash{$b} <=> $hash{$a}} keys %hash) #frequency list
{
    print "$value\t\t$hash{$value}\n";
}

The frequency list works fine, but when I add the first part it doesn't give any output. Please help

Replies are listed 'Best First'.
Re: Hash element exists/delete
by tobyink (Canon) on Jan 22, 2013 at 08:52 UTC

    There's no need for two nested loops. That will get really slow when %hash gets big.

    use strict;
    use warnings;
    use utf8::all;

    my %hash = (
        the   => 5,
        good  => 3,
        The   => 2,
        Bad   => 2,
        uGly  => 1,
        uglY  => 1,
        École => 1,
    );

    for my $key (keys %hash) {
        next unless $key =~ /^\p{Uppercase_letter}/;
        my $lckey = lc $key;
        $hash{$lckey} += delete $hash{$key} if exists $hash{$lckey};
    }

    for my $key (sort {$hash{$b} <=> $hash{$a}} keys %hash) {
        print "$key\t$hash{$key}\n";
    }

    Update: replaced /^[A-Z]/ match with a Unicode-aware regexp.

    package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name
Re: Hash element exists/delete
by vinoth.ree (Monsignor) on Jan 22, 2013 at 08:50 UTC

    if ($hash{lc $keys} = $hash{$z})

    You are using "=", the assignment operator, here?
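To make the point about "=" concrete, here is a minimal sketch of how assignment, numeric comparison and an existence test behave inside a condition. The sample hash and the variable names are made up for illustration and are not taken from the original post.

```perl
use strict;
use warnings;

my %hash = ( the => 5, The => 2, Bad => 2 );
my $word = 'The';

# Assignment: copies $copy{Bad} into $copy{the}, then tests the copied
# value for truth. A copy of the hash is used so %hash stays intact.
my %copy     = %hash;
my $assigned = ( $copy{ lc $word } = $copy{Bad} ) ? 1 : 0;

# Numeric comparison: compares two frequencies and changes nothing.
my $equal = ( $hash{ lc $word } == $hash{Bad} ) ? 1 : 0;

# Existence test: asks only whether the lowercase key is present.
my $the_present = exists( $hash{ lc $word } ) ? 1 : 0;
my $bad_present = exists( $hash{ lc 'Bad' } ) ? 1 : 0;

print "assignment is true: $assigned ('the' is now $copy{the})\n";
print "frequencies equal:  $equal\n";
print "'the' exists:       $the_present\n";
print "'bad' exists:       $bad_present\n";
```

For the task in the question, the existence test is probably the closest match, since the question is whether a lowercase form of the key is present rather than whether two frequencies happen to be equal.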

    Update:
    use strict;
    use warnings;
    use Data::Dumper;

    my %final = ();
    my %hash  = ( 1=>'the', 2=>'The', 3=>'word', 4=>"Word", 5=>'ree', 6=>"REE" );

    foreach my $keys (keys %hash) {
        foreach my $keys1 (keys %hash) {
            if (lc($hash{$keys}) eq $hash{$keys1}) {
                $final{lc($hash{$keys})}++;
            }
        }
    }

    print Dumper \%final;

    Is that useful? It is done your way.

      Also, you don't need the second loop: you can make use of the fact that you are using a hash.

      use strict ;
      use warnings ;

      my @data = <DATA> ;
      my %freq_hash ;    # What you call '%hash'

      foreach my $single_line ( @data ) {
          chomp( $single_line ) ;
          my ( $word, $freq ) = split( / /, $single_line ) ;
          $freq_hash{ $word } = $freq ;
      }

      ## So far all we have done is get data ##

      print "\nData Before Process:\n";
      foreach my $word (sort { $freq_hash{ $b } <=> $freq_hash{ $a } } keys %freq_hash ) {
          print "$word\t\t$freq_hash{$word}\n";
      }

      foreach my $word ( keys %freq_hash ) {
          if( ( lc( $word ) ne $word ) and $freq_hash{ lc( $word ) } ) {
              $freq_hash{ lc( $word ) } += $freq_hash{ $word } ;
              delete( $freq_hash{ $word } ) ;
          }
      }

      print "\nData After Process:\n";
      foreach my $word (sort { $freq_hash{ $b } <=> $freq_hash{ $a } } keys %freq_hash ) {
          print "$word\t\t$freq_hash{$word}\n";
      }

      __DATA__
      the 1
      The 1
      hello 15
      hell 19
      Hell 8

      Update: Included code.

Re: Hash element exists/delete
by roboticus (Chancellor) on Jan 22, 2013 at 11:06 UTC

    Dr. Manhattan:

    While there are a few interesting topics, no-one has mentioned that the easiest way to remove the upper-case characters is not to have any in the first place:

    my %H;
    while (<>) {
        my @words = map { lc }        # map all words to lower case
                    split /\s+/, $_;  # break $_ into a list of words
        $H{$_}++ for @words;          # add words to hash
    }

    If you needed the upper-case versions at some point, then this may not be useful. But people frequently miss the opportunity to clean up the data before storing it, so I thought I'd mention it, just in case.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Update:

      roboticus, you have assumed that the input is a list of words, but it is of the format:

      Word Frequency
      Word Frequency
      Word Frequency
      Word Frequency

      Given this, the code should probably be:

      my %H;
      while (<>) {
          my ( $word, $freq ) = split /\s+/, $_;
          $word = lc( $word ) ;
          $H{ $word } += $freq ;
      }

      Although that does not change your primary thesis: people frequently miss the opportunity to clean up the data before storing it

      The way to achieve this with minimum changes to your code is in my original response:

      Original Response:

      I think you meant:

      $H{ $words[0] } += $words[1] ;

        tmharish:

        Yes, I was making an assumption about the input. For your input format, your code looks appropriate.

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

Re: Hash element exists/delete
by rovf (Priest) on Jan 22, 2013 at 08:53 UTC

    Adding to what ree already pointed out: If you had used use warnings; in your code, Perl would have told you about the problem.

    -- 
    Ronald Fischer <ynnor@mm.st>

      Can you tell me what I am doing wrong:

      $ perl -e 'use strict; use warnings; my $one = 1; my $two = 2; if( $one = $two ) { print "True\n"; } else { print "False\n"; } '
      True
      $

      I get no warnings!

        Because it's valid code: $two could be a boolean to be tested, with $one set as a side effect.

        try

        perl -e 'use strict; use warnings; my $one = 1; if( $one = 2 ) { print "True\n"; } else { print "False\n"; } '
        Found = in conditional, should be == at -e line 1.
        True

        or

        perl -we '$a=1; print (($a = 2) ? "True" : "False");'
        Found = in conditional, should be == at -e line 1.

        because now it makes less sense.

        YMMV!

        Cheers Rolf

        UPDATE

        And personally I would be glad if this would cause a warning, too!

        Only warns in more complex conditionals...

        perl -we'if ($z=1 or $z=2) { 1 }'

        ... I'm not sure of the exact set of circumstances that triggers it.

        package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name
Re: Hash element exists/delete
by BillKSmith (Monsignor) on Jan 22, 2013 at 14:43 UTC

    Deleting keys from the hash that you are iterating through may "work", but it is probably safer to create a new hash.
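As a rough sketch of the "new hash" idea: the merged counts can be accumulated into a second hash, so the hash being iterated is never modified. The helper name merge_capitalised and the sample data are hypothetical, and the capital-letter test is borrowed from tobyink's reply rather than being anything Bill specified.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns a reference to a new
# hash and leaves the input hash untouched.
sub merge_capitalised {
    my ($freq) = @_;
    my %merged;
    for my $word ( keys %$freq ) {
        my $lower = lc $word;
        # Fold a word into its lowercase form only when it starts with an
        # uppercase letter and the lowercase form is a key of the input.
        my $target =
            ( $word =~ /^\p{Lu}/ && exists $freq->{$lower} ) ? $lower : $word;
        $merged{$target} += $freq->{$word};
    }
    return \%merged;
}

my %hash   = ( the => 5, good => 3, The => 2, Bad => 2 );
my $merged = merge_capitalised( \%hash );

# Print by descending frequency, breaking ties alphabetically.
for my $word ( sort { $merged->{$b} <=> $merged->{$a} || $a cmp $b }
               keys %$merged ) {
    print "$word\t$merged->{$word}\n";
}
```

The cost is a second hash in memory; in exchange, the original counts remain available for comparison after the merge.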

    Bill