in reply to tutelage needed

Okay, a couple of things:

1 - Do you expect your file to be one long line? If not, you need to "slurp" the file, rather than doing what you're doing now (reading only the first line). Try $big_string = do {local $/; <TEXTFILE>};.

2 - Your first substitution statement does not require the outside capturing parens. It's just noisy.

3 - I wouldn't sort the array before doing the frequency count, as it just takes time for little gain.

4 - Finally, in response to your last question as to how to actually do the count, I have a few suggestions. A useful idiom is to say $hash{$word}++ for each item. It creates an entry if the word has not been seen before, and increments it if it has. Use a for loop or a map statement to construct the hash. In the end, use a sort with a routine that sorts by the entries in the hash and (optionally) afterwards ASCII-betically by the actual words.
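
Here's a rough sketch of points 1, 2 and 4 put together, in case it helps. It is not your code: the sample text, the in-memory filehandle, and the names $sample, %count and @sorted are all made up for illustration, and the split on non-word characters is only my guess at what you want to call a "word". For point 2 I used a character class, which is one alternative to simply dropping the outer parens from your regex.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the real file contents.
my $sample = "The cat (a tabby) sat.\nThe dog sat too.\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

# Point 1: slurp. With $/ undefined, <$fh> reads to end of file
# instead of stopping at the first newline.
my $big_string = do { local $/; <$fh> };
close $fh;

# Point 2: strip parentheses. A character class needs no capturing group.
$big_string =~ s/[()]//g;

# Point 4: count words. Lowercasing and splitting on non-word characters
# are assumptions about what counts as a "word".
my %count;
$count{$_}++ for grep { length } split /\W+/, lc $big_string;

# Most frequent first; ties are broken by plain string comparison.
my @sorted = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;

printf "%-6s %d\n", $_, $count{$_} for @sorted;
```

With a real file you would open it by name instead of by reference to a string; the slurp and the counting stay the same.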


Hope that helped!



Who is Kayser Söze?
Code is (almost) always untested.

Re: Re: tutelage needed
by ctp (Beadle) on Jan 01, 2004 at 02:54 UTC
    1- I thought the s modifier treated the string as one line. Am I reading the meaning of that wrong?

    2 - oh, cool...thanks. That's one of those cases where I start writing a regex, and tweak it repeatedly until it works...but then since, by some miracle, it does work I am reluctant to tweak it further :)

    3- yeah - I wrote that line to see if I could, knowing I might need it a little later.

    4- I've seen that form before, but I'm gonna try to figure out how to implement it. I have a map statement example here in one of my books that kinda is making sense to me. I'll give it a try.

    thanks!
      I don't see an s modifier anywhere. Am I missing something?


        $big_string =~ s/(\(|\))//g;

        after the tilde, before my too many parentheses