gayagrad has asked for the wisdom of the Perl Monks concerning the following question:

I want to use a hash to count occurrences of words in a column of a file. How do I use a hash for this? I am new to Perl and am unable to figure it out. The column looks like: SINE LINE LINE LINE SINE Alu Alu SINE... and so on. The words that follow are different but are repeated frequently. I want to display the output like SINE = 3, LINE = 3, Alu = 2 and so on. How do I work it out with %hash?

Replies are listed 'Best First'.
Re: hash to count occurences
by LanX (Saint) on Oct 25, 2013 at 17:24 UTC
      Thank you
Re: hash to count occurences
by davido (Cardinal) on Oct 25, 2013 at 17:37 UTC

    The shortest path (we all love short paths) to getting a good answer to this question would be to follow up in this thread with a brief example of the input data that includes examples of any possible quirks the data might exhibit. Also post 20 lines or fewer of the code you've already written to get started on the project. Then ask a specific question about the portion of the project where you're stuck.

    Having read perlintro first will enable you to better understand the advice that follows. perlintro should take about 30 minutes to get through, and it's a reasonable expectation on our part that you become familiar with it first.


    Dave

Re: hash to count occurences
by Lennotoecom (Pilgrim) on Oct 25, 2013 at 22:14 UTC
    Actual example:
    foreach (split /\s/, <DATA>) {
        $hash{$_}++;
    }
    foreach $key (sort keys %hash) {
        print "$key $hash{$key}\n";
    }
    __DATA__
    SINE LINE LINE LINE SINE Alu Alu SINE
    output will be
    Alu 2
    LINE 3
    SINE 3
    The __DATA__ section at the end of the script
    emulates a real file inside the script, which is handy.
    In the first line we read a single line from <DATA>
    and split it on whitespace (\s); foreach puts each word into $_ in turn.
    Each word is used as a hash key and its value is incremented:
    if the key already exists its count goes up by one, and if not,
    it is created automatically and incremented right after creation.
    The second foreach loop prints the keys and values from the
    newly created hash, sorted by key.
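For the original question, where the words sit in one column of a file, the same idea might be sketched as below. The sub name count_column, the zero-based column index, the whitespace-separated layout, and the sample rows are all assumptions for illustration; none of them come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each word appears in one
# whitespace-separated column (zero-based index) of an open filehandle.
sub count_column {
    my ($fh, $col) = @_;
    my %count;
    while (my $line = <$fh>) {
        my @fields = split ' ', $line;      # split on runs of whitespace
        next unless defined $fields[$col];  # skip blank or short lines
        $count{ $fields[$col] }++;          # created on first sight, then incremented
    }
    return %count;
}

# Made-up sample data; the words are in the second column (index 1).
my $sample = "chr1 SINE\nchr1 LINE\nchr2 LINE\nchr2 Alu\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my %count = count_column($fh, 1);
close $fh;

print "$_ = $count{$_}\n" for sort keys %count;
```

To run this against a real file, open the filehandle on the file name instead of the in-memory string and pass the index of the column that holds the words.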
Re: hash to count occurences
by Anonymous Monk on Oct 25, 2013 at 17:27 UTC
Re: hash to count occurences
by Anonymous Monk on Oct 25, 2013 at 18:14 UTC
    The "secret sauce" is in the line such as $myhash->{'mykey'}++. Welcome to what Perl calls "auto-vivification." If a hash-entry does not yet exist, it will be created with the initial value of 1. If it does exist, it will be incremented. So you can write this one statement to "do what you want."

    After looping through all the data, iterate through the keys in the hash and look for values greater than 15.
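A minimal sketch of those two steps, using a hash reference as in the post. The word list is borrowed from the question, and the threshold of 2 is a made-up value so that the small sample prints something; the post's own cutoff is 15.

```perl
use strict;
use warnings;

my @words = qw(SINE LINE LINE LINE SINE Alu Alu SINE);

my $count = {};               # reference to an empty hash
$count->{$_}++ for @words;    # one statement does the counting

# Hypothetical threshold, chosen to suit the small sample.
my $threshold = 2;
my @frequent = grep { $count->{$_} > $threshold } sort keys %$count;

print "$_ = $count->{$_}\n" for @frequent;
```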
      almost

      > If a hash-entry does not yet exist, it will be created with the initial value of 1.

      the initial value is undef; when incremented, undef is interpreted as 0, so the result is 1.

      Cheers Rolf

      ( addicted to the Perl Programming Language)
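The distinction can be checked with a small sketch (the hash and key names are made up for illustration):

```perl
use strict;
use warnings;

my %count;

# Before the first increment the key is absent, so looking it up gives undef.
my $existed_before = exists $count{SINE} ? 1 : 0;

$count{SINE}++;    # undef is treated as 0 here, so the value becomes 1

print "existed before: $existed_before\n";
print "value after one increment: $count{SINE}\n";
```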