Gavin has asked for the wisdom of the Perl Monks concerning the following question:

Hi Brethren,
I am trying to count the number of times "trigger words" held in an array appear in a text file held in a hash.
The lines of text are split into sentences and held separately in the hash. If a trigger word occurs more than once, I would like to count every occurrence of that word.
Got so far but am now a bit stuck!
    foreach my $key2 ($keys %UnNum){
        foreach my $word(@midWords){
            if $word == $key2 {
                $UnNum{$key}++;
            }
        }
    }
Any pointers on where I am going wrong would be much appreciated. This is the content of the array of trigger words:
bodi forc twice mass acceler law

The content of the %UnNum is in the format
    35 forc two bodi twice strong on bodi sai bodi mass
    39 you expect on think new bodi made two bodi origin
    60 attract bodi b origin force
    46 total forc b twice origin force
    34 sai on bodi twice mass three time mass forc six
    17 on now see why bodi fall same rate bodi twice
I would like to increase the value for each line by 1 for each occurrence of "bodi", "forc", "twice", etc. First line value up by 5 to 40:
    40 forc two bodi twice strong on bodi sai bodi mass
    41 you expect on think new bodi made two bodi origin
Hope that helps to explain better.

Replies are listed 'Best First'.
Re: Counting instances in a hash from an array
by ikegami (Patriarch) on Mar 23, 2006 at 19:50 UTC

    When comparing strings, use eq, not ==. Using the latter will convert the strings into numbers before comparing them.

    The proper syntax for if is if (...) { ... }. The parens are missing in your code.
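
    A minimal sketch of both points, using two of the trigger words from the question. The variable names $numeric_equal and $string_equal are made up for illustration; the numeric warning is silenced locally only so the demonstration runs quietly.

    ```perl
    use strict;
    use warnings;

    my ($word, $key) = ('bodi', 'forc');

    # == compares numerically: both non-numeric strings become 0, so the
    # comparison is true (with warnings on, Perl also complains that the
    # arguments are not numeric).
    my $numeric_equal = do { no warnings 'numeric'; $word == $key ? 1 : 0 };

    # eq compares the strings themselves, so two different words differ.
    my $string_equal = $word eq $key ? 1 : 0;

    # if needs parentheses around its condition.
    if ($word eq 'bodi') {
        print "matched $word\n";
    }

    print "== says $numeric_equal, eq says $string_equal\n";
    ```

    With == every trigger word would appear to match every key, which is why the counts go wrong even once the syntax errors are fixed.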

    Update: The OP updated his node to include sample data. It seems he is still storing the score (a number he wishes to change) as the key to the hash. He was told this was a bad idea, and he was given several solutions to his question in an earlier post.

      Is he? It looks more like he's trying to store the count as the hash-value (reasonable imho), and the word as the key - what have I missed?

      Tom Melly, tom@tomandlu.co.uk
        If you look at the code, he's incrementing the value, but if you look at his data (pasted below), the number to increment is in the key. It's not clear from that data that the left column is a key -- it could be an HoA -- but in the past, he explicitly said the score (the number to increment) is in the key.
            35 forc two bodi twice strong on bodi sai bodi mass
            39 you expect on think new bodi made two bodi origin
            60 attract bodi b origin force
            46 total forc b twice origin force
            34 sai on bodi twice mass three time mass forc six
            17 on now see why bodi fall same rate bodi twice
Re: Counting instances in a hash from an array
by GrandFather (Saint) on Mar 23, 2006 at 19:57 UTC

    Your sample code does not seem to relate to your question very much. You should provide some sample data, the output that is generated, and what you expected to be generated.

    The following follows the sense of your description, not of your code:

    use strict;
    use warnings;

    my @midWords = qw(stop start finish);
    my %sentences = (
        1 => "Start starts the start sentence.",
        2 => "This is a sentence in the middle.",
        3 => "This is not the start or the finish, nor even a good place to stop.",
        4 => "This is the finish and a good place to stop."
    );

    my %testWords;
    @testWords{@midWords} = (0);

    foreach my $sentence (values %sentences){
        for (split /\W/, $sentence) {
            ++$testWords{lc $_} if exists $testWords{lc $_};
        }
    }

    print join "\n", map {"$_ => $testWords{$_}"} sort keys %testWords;

    Prints:

    finish => 2
    start => 3
    stop => 2

    Read I know what I mean. Why don't you? for some tips on posting this sort of question.


    DWIM is Perl's answer to Gödel
Re: Counting instances in a hash from an array
by zer (Deacon) on Mar 23, 2006 at 19:54 UTC
        if $word == $key2 { $UnNum{$key}++; }
    This adds 1 to $UnNum{$key} for every match, but what is the value of $key? It is not set anywhere in the code shown. Also, $key2 is the foreach loop variable, so it is reassigned on every iteration and any change you make to it is lost. Finally, $keys is a scalar variable, so it won't pull the keys out of %UnNum; use the keys function instead.
        foreach my $key2 (keys %UnNum) {
            foreach my $word (@midWords) {
                $UnNum{$key2}++ if ($key2 eq $word);
            }
        }

    This might be what you mean

Re: Counting instances in a hash from an array
by eric256 (Parson) on Mar 23, 2006 at 22:33 UTC

    For hashes you want the key to be something that doesn't change. In fact you don't just want it to be, it has to be. So the code at the end of this post reverses your hash, so that the sentence itself is the key.

    So you say... but I want my hash that way. And to that I say "no you don't". :) If you think about it, you will have problems if you use the keys. First of all, there is no easy way to change the key of a value. You would have to set the new key to the value and then remove the old one. This would be fine except that you can only have one value stored per key. That means when your first sentence counts 4 occurrences of something, its key is now 39, just like your second sentence. At that point one of those sentences is obliterated. This is easily solved by storing the value that you want to change as, you guessed it, the value.
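
    The collision described above can be sketched in a few lines. The two-entry hashes %by_score and %by_sentence are hypothetical cut-down versions of the data, used only for illustration.

    ```perl
    use strict;
    use warnings;

    # Hypothetical mini version of the original layout: score => sentence.
    my %by_score = (
        35 => 'forc two bodi twice',
        39 => 'you expect on think new bodi',
    );

    # "Incrementing" a key means copying to the new key and deleting the old one.
    my $old = 35;
    my $new = $old + 4;    # pretend four trigger words were found
    $by_score{$new} = delete $by_score{$old};

    # Both sentences wanted key 39, so only one of them survives.
    print scalar(keys %by_score), " sentence left: $by_score{39}\n";

    # With the sentence as the key, the score is an ordinary value
    # and nothing collides.
    my %by_sentence = (
        'forc two bodi twice'          => 35,
        'you expect on think new bodi' => 39,
    );
    $by_sentence{'forc two bodi twice'} += 4;
    print scalar(keys %by_sentence), " sentences kept\n";
    ```

    Two sentences can share a score of 39 in the second layout because the score is no longer what identifies the entry.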

    The code below does what you want, I think. First it builds a hash with your sentences and their current counts. Then it builds an array of words to match. I take the array of words you want to match against and join them together with | to make a regex that will match any one of the words (that's what the | means). Then I compile that into a regex with \b's to say that you only want to match those on word boundaries (i.e. your word has to be the whole word and nothing more). From there we just loop over the keys in the hash, matching those keys against the regex and incrementing their value for every match found. I hope this helps you solve your problem.

    use strict;
    use warnings;
    use Data::Dumper;

    my %UnNum = (
        "forc two bodi twice strong on bodi sai bodi mass"  => 35,
        "you expect on think new bodi made two bodi origin" => 39,
        "attract bodi b origin force"                       => 60,
        "total forc b twice origin force"                   => 46,
        "sai on bodi twice mass three time mass forc six"   => 34,
        "on now see why bodi fall same rate bodi twice"     => 17
    );
    my @midWords = qw/bodi forc twice mass acceler law/;
    my $joined = join("|", @midWords);
    # Group the alternation so that \b applies to every word,
    # not just the first and the last one.
    my $regex = qr/\b(?:$joined)\b/;

    print Dumper(\%UnNum);
    for my $sentence (keys %UnNum) {
        $UnNum{$sentence}++ while $sentence =~ /$regex/g;
    }
    print Dumper(\%UnNum);
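
    One detail of the join-with-| approach is worth checking: | has the lowest precedence in a regex, so without grouping the \b anchors attach only to the first and last alternatives. A small sketch of the difference, where the (?:...) grouping, the quotemeta call and the names $ungrouped and $grouped are additions for illustration:

    ```perl
    use strict;
    use warnings;

    my @midWords = qw(bodi forc twice mass acceler law);
    my $joined   = join '|', map { quotemeta } @midWords;

    my $ungrouped = qr/\b$joined\b/;        # \b binds only to "bodi" and "law"
    my $grouped   = qr/\b(?:$joined)\b/;    # \b applies to every alternative

    my $sentence = 'attract bodi b origin force';

    # The empty list assignment forces list context on the match,
    # and assigning that to a scalar yields the number of matches.
    my $loose  = () = $sentence =~ /$ungrouped/g;
    my $strict = () = $sentence =~ /$grouped/g;

    print "ungrouped: $loose, grouped: $strict\n";
    ```

    The ungrouped pattern also counts the "forc" inside "force", which matters for this data because "force" appears in two of the sentences.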

    ___________
    Eric Hodges

      I can't help but feel that the original poster has been misunderstood, but maybe I'm grasping the wrong end of the stick...

      Tom Melly, tom@tomandlu.co.uk
Re: Counting instances in a hash from an array
by ercparker (Hermit) on Mar 24, 2006 at 00:10 UTC
    hope this helps:
    use strict;
    use warnings;

    my @array = qw(bodi forc twice mass acceler law);
    my %UnNum = (
        'forc two bodi twice strong on bodi sai bodi mass'  => 35,
        'you expect on think new bodi made two bodi origin' => 39,
        'attract bodi b origin force'                       => 60,
        'total forc b twice origin force'                   => 46,
        'sai on bodi twice mass three time mass forc six'   => 34,
        'on now see why bodi fall same rate bodi twice'     => 17,
    );

    foreach my $text (keys %UnNum) {
        # One increment per whole-word occurrence of each trigger word.
        foreach my $word (@array) {
            $UnNum{$text}++ while $text =~ /\b$word\b/g;
        }
    }
Re: Counting instances in a hash from an array
by Melly (Chaplain) on Mar 23, 2006 at 23:24 UTC

    Nearly there, as far as I can see

    • 'keys' is a keyword/function in this context, not a variable, so 'keys' not '$keys'
    • '==' is numeric, not string - use 'eq'
    • See revised code below for your hash refs

    foreach my $keyLOOK (keys %UnNum) {
        foreach my $word (@midWords) {
            if ($word eq $keyLOOK) {    # note the parentheses around the condition
                $UnNum{$keyLOOK}++;
            }
        }
    }

      Or, alternatively..

      foreach (keys %UnNum) {
          foreach my $word (@midWords) {
              if ($word eq $_) {
                  $UnNum{$_}++;
              }
          }
      }
      Tom Melly, tom@tomandlu.co.uk
Re: Counting instances in a hash from an array
by GrandFather (Saint) on Mar 24, 2006 at 00:18 UTC

    Looks to me like your data structures are arse backwards. Consider:

    use strict;
    use warnings;

    my %triggers;
    @triggers{qw(bodi forc twice mass acceler law)} = (1);

    my @UnNum = (
        [35, [qw(forc two bodi twice strong on bodi sai bodi mass)]],
        [39, [qw(you expect on think new bodi made two bodi origin)]],
        [60, [qw(attract bodi b origin force)]],
        [46, [qw(total forc b twice origin force)]],
        [34, [qw(sai on bodi twice mass three time mass forc six)]],
        [17, [qw(on now see why bodi fall same rate bodi twice)]],
    );

    for (@UnNum) {
        my $count = \$_->[0];
        exists $triggers{$_} && ++$$count for @{$_->[1]};
    }

    print join "\n", map {"$_->[0]: " . join ' ', @{$_->[1]}} @UnNum;

    Prints:

    41: forc two bodi twice strong on bodi sai bodi mass
    41: you expect on think new bodi made two bodi origin
    61: attract bodi b origin force
    48: total forc b twice origin force
    39: sai on bodi twice mass three time mass forc six
    20: on now see why bodi fall same rate bodi twice

    DWIM is Perl's answer to Gödel
      Thanks to all for their input.
      My text file is in the format
      print OUTLET "$count\t $_";
      Which gives me
      81, galileo measur us newton basi law motion
      45, galileo experi bodi roll slope act same forc weight effect make constantli speed up
      50, show real effect forc chang speed bodi set move previous thought
      45, meant bodi act forc keep move straight line same speed
      When fed into an array it does not give me the input suggested by "Grandfather".
      I'm not sure how to get qw function syntax placed correctly.
      Again, many thanks for all the help given.
      my @UnNum = (
          [35, [qw(forc two bodi twice strong on bodi sai bodi mass)]],
          [39, [qw(you expect on think new bodi made two bodi origin)]],
          [60, [qw(attract bodi b origin force)]],
          [46, [qw(total forc b twice origin force)]],
          [34, [qw(sai on bodi twice mass three time mass forc six)]],
          [17, [qw(on now see why bodi fall same rate bodi twice)]],
      );

        You need to build the array from the input data. Note that this version reads the data from a __DATA__ section appended after the code. To use an external file instead, open it and use the file handle in place of DATA in the while loop.

        Try this version:

        use strict;
        use warnings;

        my %triggers;
        @triggers{qw(bodi forc twice mass acceler law)} = (1);

        my @UnNum;

        while (<DATA>) {
            chomp;
            next if ! /(\d+),/;
            push @UnNum, [$1, [split /\s+/]];
        }

        for (@UnNum) {
            my $count = \$_->[0];
            exists $triggers{$_} && ++$$count for @{$_->[1]};
        }

        print join "\n", map {"$_->[0]: " . join ' ', @{$_->[1]}} @UnNum;

        __DATA__
        81, galileo measur us newton basi law motion
        45, galileo experi bodi roll slope act same forc weight effect make constantli speed up
        50, show real effect forc chang speed bodi set move previous thought
        45, meant bodi act forc keep move straight line same speed

        Prints:

        82: 81, galileo measur us newton basi law motion
        47: 45, galileo experi bodi roll slope act same forc weight effect make constantli speed up
        52: 50, show real effect forc chang speed bodi set move previous thought
        47: 45, meant bodi act forc keep move straight line same speed
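
        Because the whole line is split, the old count ("81,") stays in the word list and is printed again. A possible variation, offered as a sketch only: capture the count and the rest of the line separately, and read from a lexical file handle instead of DATA. The in-memory file handle on $text is an assumption that stands in for the real file, and only two of the sample lines are used.

        ```perl
        use strict;
        use warnings;

        my %triggers = map { $_ => 1 } qw(bodi forc twice mass acceler law);

        # Stand-in for the real file: an in-memory file handle (demo assumption).
        my $text = "81, galileo measur us newton basi law motion\n"
                 . "45, meant bodi act forc keep move straight line same speed\n";
        open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

        my @UnNum;
        while (my $line = <$fh>) {
            chomp $line;
            # Capture the count and the remaining words separately.
            next unless $line =~ /^(\d+),\s*(.*)$/;
            push @UnNum, [$1, [split ' ', $2]];
        }
        close $fh;

        # Add one to the count for every word that is a trigger word.
        for my $record (@UnNum) {
            $record->[0] += scalar grep { $triggers{$_} } @{ $record->[1] };
        }

        print "$_->[0]: @{ $_->[1] }\n" for @UnNum;
        ```

        For a real file, the open line would name the file instead of \$text; the rest of the loop stays the same.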

        DWIM is Perl's answer to Gödel