Rewriting everything mentioned above in a form that I think might help someone totally new to Perl:

use strict   ;
use warnings ;


my %count ;
while( my $line = <DATA> ) { # Read lines from DATA; you can
                             #     replace this with a file handle ( FH ).

    # First break down a single line into words -
    #      We assume that words are white space separated.
    #      To include others such as '-' you would replace
    #         /\s/ with /[\s-]/
    #      Note: a run of several spaces yields empty 'words';
    #         split( ' ', $line ) avoids that.
    my @words_in_this_line = split( /\s/, $line ) ;

    # Now we flip through the words within a single line.
    foreach my $word ( @words_in_this_line ) {

        # Lowercase it to ensure that repeats in different
        #     cases are not counted separately.
        $word = lc( $word ) ;

        # Check if there is a number contained in this word;
        #    we move to the 'next' iteration if there is.
        #    Notice that the condition comes after the statement
        #       that is executed if the condition is true.
        next if( $word =~ /\d+/ ) ;

        if( defined( $count{ $word } ) ) {
            # If I have seen the word before then increment my count.
            $count{ $word } ++  ;
        } else {
            # If I have never seen this word, I need to set the count to 1.
            $count{ $word } = 1 ;
        }

    } # End of looping through words.


} # End of looping through lines in file. 



# Your: sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count
#    Let's break it up:

# We stored it so the key is the word and the value the count.
#   This ordering was intentional so as to ensure that we can 'quickly'
#   figure out if we have seen a word before.
my @uniq_words_in_file = keys %count ;

# We use the brilliant sort function that allows you to tell it what the
#   comparison should be.
@uniq_words_in_file = 
    sort(
    { $count{$b} <=> $count{$a} || $a cmp $b } 
    @uniq_words_in_file ) ;
# This one bit brings out the beauty of Perl:
#   We are passing a block of code (an anonymous subroutine) to 'sort'.
#   'sort' will use it to compare elements during the sort.
#   Notice that <=> returns -1, 0 or 1, and when
#   $count{ $b } is equal to $count{ $a }, '<=>' returns 0.
#   Comparing $b against $a (rather than $a against $b) puts the
#   highest counts first.
#
# Every expression evaluates to a value, and Perl uses lazy
#   (short-circuit) evaluation. This means that as it evaluates
#   a boolean 'OR' it stops evaluating expressions as soon as it
#   finds a true value ( because True OR anything is always True ).
#   Here -1 and 1 are true and 0 is false.
#
# We use this to additionally compare $a and $b as strings, but only
#   when the counts are equal.



# And now the printing. 

foreach my $word ( @uniq_words_in_file ) { 
    
    print "'$word'\tOccurred\t$count{ $word }\ttimes\n";

}

__DATA__
This these that the and how who writ this code
1 how now brown cow 1asdf 23 
the fox jumped into 123 the hencoop
the lazy brown 2134 dog was azleep.
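
To see the comparator on its own, here is a minimal sketch. The sample counts and the helper name by_count_then_word are made up for illustration and are not part of the code above; the sort block itself is the same expression, just reading from a hash reference.

```perl
use strict;
use warnings;

# Hypothetical sample counts, not taken from the DATA section above.
my %count = ( the => 4, brown => 2, how => 2, cow => 1 );

# Return the keys of a count hash: highest count first, ties alphabetical.
# (by_count_then_word is an illustrative name only.)
sub by_count_then_word {
    my ($count) = @_;
    return sort {
        $count->{$b} <=> $count->{$a}    # -1 or 1 is true, so || stops here
            || $a cmp $b                 # reached only when the counts tie (0)
    } keys %$count;
}

print join( ' ', by_count_then_word( \%count ) ), "\n";   # the brown how cow
```

'brown' and 'how' both have a count of 2, so '<=>' returns 0 for that pair and the string comparison decides their order.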

And now the code again with no comments:

use strict   ;
use warnings ;


my %count ;
while( my $line = <DATA> ) {

    my @words_in_this_line = split( /\s/, $line ) ;

    foreach my $word ( @words_in_this_line ) {

        $word = lc( $word ) ;
        next if( $word =~ /\d+/ ) ;

        if( defined( $count{ $word } ) ) {
            $count{ $word } ++  ;
        } else {
            $count{ $word } = 1 ;
        }

    } # End of looping through words.


} # End of looping through lines in file. 

my @uniq_words_in_file = keys %count ;

@uniq_words_in_file = 
    sort(
    { $count{$b} <=> $count{$a} || $a cmp $b } 
    @uniq_words_in_file ) ;


foreach my $word ( @uniq_words_in_file ) { 
    
    print "'$word'\tOccurred\t$count{ $word }\ttimes\n";

}

__DATA__
This these that the and how who writ this code
1 how now brown cow 1asdf 23 
the fox jumped into 123 the hencoop
the lazy brown 2134 dog was azleep.
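
Once the explicit if/else makes sense, the counting step can probably be shortened: applying ++ to a hash element that does not exist yet treats it as 0 and does not trigger a warning. What follows is only a sketch of that idea. The helper name count_words is hypothetical, and it uses split ' ' instead of /\s/, so its behaviour differs slightly from the code above when a line contains runs of spaces.

```perl
use strict;
use warnings;

# count_words is a hypothetical helper name, not part of the code above.
sub count_words {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        # split ' ' skips leading whitespace and splits on runs of whitespace.
        for my $word ( split ' ', lc $line ) {
            next if $word =~ /\d/;    # skip anything containing a digit
            $count{$word}++;          # a missing key starts at 0, no warning
        }
    }
    return \%count;
}

my $count = count_words( "the fox jumped into 123 the hencoop\n" );
print "$_ => $count->{$_}\n" for sort keys %$count;
```

The longer if/else form is still the easier one to read when you are starting out; this is just the idiom you will meet in other people's code.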

In reply to Re: compute the occurrence of words by tmharish
in thread compute the occurrence of words by BigGer
