
I'm confused by some of the statements of your requirements.

#1. Sort alphabetically (ignoring capitalization).
#2. Sort alphabetically with upper case words just in front of lower case words with the same initial characters.
[Emphases added.]

These seem like two separate requirements. Do you want to do #1 first and then use the result to do #2, or do you want to do both and save both sets of results?

#3. Sort by frequency, from high to low, (any order for equal frequency).
#4. Sort by frequency, with alphabetical order for words with the same frequency.
[Emphases added.]

Again, these requirements seem at odds. Can you please clarify?

Please see Short, Self-Contained, Correct Example for info on providing example input and desired output and maybe also the actual code you've got so far. Maybe even see How to ask better questions using Test::More and sample data for a way to posit desired input/output examples.

Be that as it may, here's an approach to extracting words from a multi-line block of text and then sorting first alphabetically (upper-case first) and second by word frequency.

c:\@Work\Perl\monks>perl
use strict;
use warnings;

use Data::Dump qw(dd);  # for debug

my $text = <<'EOT';
Now is the time, now is the hour.
The rain in Spain falls mainly in Spain.
The rain in Spain falls mainly in Spain.
Foo foo foo Bar bar bar FOO BAR FOO BAR
EOT
print "[[$text]] \n";  # for debug

my $rx_word = qr{ \S+ }xms;

my @words = $text =~ m{ $rx_word }xmsg;
# dd \@words;  # for debug

my %word_count;
++$word_count{$_} for @words;
# dd \%word_count;  # for debug

my @sorted =
    sort { $a->[0] cmp $b->[0]  # sort first by alpha ascending (upper case sorts ahead of lower case)
                   or
           $a->[1] <=> $b->[1]  # then by frequency ascending (never reached here: hash keys are unique)
         }
    map  [ $_, $word_count{$_} ],
    keys %word_count
    ;

dd \@sorted;  # for debug

print "'$_->[0]' ($_->[1]) \n" for @sorted;

__END__
[[Now is the time, now is the hour.
The rain in Spain falls mainly in Spain.
The rain in Spain falls mainly in Spain.
Foo foo foo Bar bar bar FOO BAR FOO BAR
]]
[
  ["BAR", 2],
  ["Bar", 1],
  ["FOO", 2],
  ["Foo", 1],
  ["Now", 1],
  ["Spain", 2],
  ["Spain.", 2],
  ["The", 2],
  ["bar", 2],
  ["falls", 2],
  ["foo", 2],
  ["hour.", 1],
  ["in", 4],
  ["is", 2],
  ["mainly", 2],
  ["now", 1],
  ["rain", 2],
  ["the", 2],
  ["time,", 1],
]
'BAR' (2)
'Bar' (1)
'FOO' (2)
'Foo' (1)
'Now' (1)
'Spain' (2)
'Spain.' (2)
'The' (2)
'bar' (2)
'falls' (2)
'foo' (2)
'hour.' (1)
'in' (4)
'is' (2)
'mainly' (2)
'now' (1)
'rain' (2)
'the' (2)
'time,' (1)

Note that, e.g., 'Spain' and 'Spain.' are extracted and counted separately because of the period at the end of one of them, and punctuation like , ; : ! ? ... will have a similar effect. This effect is due to the naive definition of the $rx_word regex; a better definition could eliminate such punctuation, but just what constitutes a "word" is tricky to define in general.
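
As a rough illustration of a less naive definition, here is a minimal sketch that treats a "word" as a run of letters, optionally with internal apostrophes. That definition and the sample sentence are assumptions made for the example, not something taken from the original question; hyphenated words, digits and so on would need their own decisions.

```perl
use strict;
use warnings;

my $text = "The rain in Spain falls mainly in Spain. Now, isn't it time?";

# Assumed definition of a "word": a run of letters, optionally followed by
# apostrophe-plus-letters groups (so "isn't" stays in one piece).
my $rx_word = qr{ [[:alpha:]]+ (?: ' [[:alpha:]]+ )* }xms;

my @words = $text =~ m{ $rx_word }xmsg;

my %word_count;
++$word_count{$_} for @words;

print "'$_' ($word_count{$_})\n" for sort keys %word_count;
```

With this pattern 'Spain' and 'Spain.' are no longer counted separately, because the trailing period is never part of a match.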

Note also that the entire content of a file can be read into a single scalar string with the idiom
my $text = do { local $/; <$filehandle>; };
See perlvar for $/ info.
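
To show the idiom in context, here is a small self-contained sketch. It reads from an in-memory filehandle so that it runs without a real file; the $data string is made up for the example, and in practice the open would name your own file.

```perl
use strict;
use warnings;

# Stand-in data; an in-memory filehandle keeps the sketch self-contained.
my $data = "line one\nline two\nline three\n";

open my $filehandle, '<', \$data
    or die "Cannot open in-memory file: $!";

# local $/ undefines the input record separator for this block only,
# so a single <$filehandle> read returns everything up to end of file.
my $text = do { local $/; <$filehandle> };

close $filehandle;

print "[[$text]]\n";
```

Because $/ is localized inside the do block, line-by-line reads elsewhere in the program are unaffected.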

Update: The idiom used to produce the @sorted array

my @sorted =
    sort { $a->[0] cmp $b->[0]  # sort first by alpha ascending (upper case sorts ahead of lower case)
                   or
           $a->[1] <=> $b->[1]  # then by frequency ascending (never reached here: hash keys are unique)
         }
    map  [ $_, $word_count{$_} ],
    keys %word_count
    ;

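is a variation on what is known as a Schwartzian Transform (ST): the map pairs each word with its count before the sort, but there is no final map to strip the pairs off again. Please see A Fresh Look at Efficient Perl Sorting for more info on this and other sorting idioms. Also see "How do I sort an array by (anything)?" in perlfaq4 and sort.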
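
For comparison, a full Schwartzian Transform has three stages: a map that decorates each item with its sort keys, a sort on those keys, and a final map that strips the decoration. The sketch below is one possible reading of requirements #2 and #4 (frequency high to low, then alphabetical ignoring case, then upper case ahead of lower case for ASCII words); the %word_count data is made up for the example.

```perl
use strict;
use warnings;

my %word_count = (Foo => 1, foo => 2, FOO => 2, bar => 2, Bar => 1, in => 4);

my @sorted =
    map  { $_->[0] }                # 3. undecorate: keep only the word
    sort { $b->[2] <=> $a->[2]      # 2. frequency, high to low
                   or
           $a->[1] cmp $b->[1]      #    then alphabetical, ignoring case
                   or
           $a->[0] cmp $b->[0]      #    then upper case ahead of lower case
         }
    map  { [ $_, lc($_), $word_count{$_} ] }  # 1. decorate: [word, lc key, count]
    keys %word_count
    ;

print "'$_' ($word_count{$_})\n" for @sorted;
```

With this data the words come out as in, bar, FOO, foo, Bar, Foo. The lower-cased key is computed once per word in the decorate step rather than on every comparison, which is the point of the transform.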

Give a man a fish: <%-{-{-{-<

In reply to Re: Help sorting contents of an essay by AnomalousMonk
in thread Help sorting contents of an essay by harmattan_
