in reply to calculating lines

You don't want to use DATA as a user file handle. It's reserved for inline data after __END__, __DATA__, or Ctrl-Z.

I'll call the file handle $fh. Reading line by line,

while (<$fh>) {
convert all clusters of whitespace to a single space,
s/\s+/ /g;
get rid of initial whitespace to handle "\n\t" and such,
s/^\s//;
and print the lowercased line,
print lc; }
That's it. Each of those operations takes advantage of $_ as the default argument.
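
For reference, here is a minimal runnable sketch of the whole loop. The sample text and the in-memory file handle are my own stand-ins for a real file, and $out is a name I made up so the result can be inspected before it is printed.

```perl
use strict;
use warnings;

# Hypothetical sample input; an in-memory file handle stands in for a real file.
my $text = "  Hello   World\n\tSecond\tLINE  here\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $out = '';
while (<$fh>) {
    s/\s+/ /g;     # collapse each run of whitespace (newline included) to one space
    s/^\s//;       # drop the single leading space that may remain
    $out .= lc;    # lc defaults to $_; collected here instead of printed directly
}
close $fh;

print $out, "\n";
```

Note that the newline at the end of each input line is whitespace too, so it becomes a space and everything ends up on one output line.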

With that method, there's no need to do any special accounting of lines or to store any of your work as you go. If you do need to store the lines for some other purpose, you only need to push the output of lc onto an array where the print occurs.
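
Storing the lines might look something like this sketch. The array name @lines and the sample text are assumptions of mine, not anything from your program.

```perl
use strict;
use warnings;

# Hypothetical sample input read through an in-memory file handle.
my $text = "One  Two\n  THREE\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @lines;
while (<$fh>) {
    s/\s+/ /g;          # collapse whitespace runs
    s/^\s//;            # strip the leading space, if any
    push @lines, lc;    # store the cleaned line where the print used to be
}
close $fh;

print scalar(@lines), " lines stored\n";
```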

If you just want a line count, you can immediately follow the while block with,

my $linecount = $. ;
$. is a running linecount of the most recently accessed read handle. It will be volatile in an application with several read handles, hence the assignment to a user variable.
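
A small sketch of why the copy matters, using two in-memory handles with made-up contents. The names $fh1, $fh2, and $after are only for illustration.

```perl
use strict;
use warnings;

# Two hypothetical inputs: three lines and one line.
my ($first, $second) = ("a\nb\nc\n", "x\n");
open my $fh1, '<', \$first  or die "Cannot open first handle: $!";
open my $fh2, '<', \$second or die "Cannot open second handle: $!";

while (<$fh1>) { }       # read the first handle to the end
my $linecount = $.;      # copy the count right away

my $line  = <$fh2>;      # reading another handle makes $. refer to that one
my $after = $.;          # now the count for $fh2, not $fh1

print "saved: $linecount, current: $after\n";
```

The saved copy keeps the count for the first handle even though $. itself has moved on to the second one.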

After Compline,
Zaxo

Re^2: calculating lines
by Yoda_Oz (Sexton) on Jan 12, 2007 at 00:42 UTC
    I don't understand. All that does is remove all the characters and leave behind all the punctuation. I want to remove all the punctuation and print out all the letters on one line with no spaces and stuff.

      Oops, I meant \s, whitespace, not \w, word characters. Corrected, sorry for the brain fart.

      After Compline,
      Zaxo

Re^2: calculating lines
by Yoda_Oz (Sexton) on Jan 12, 2007 at 01:35 UTC
    my $linecount = $. ;
    is the code for a line count.
    How do I do a word count?

      $wc += split; within the while loop will do it. That's another case of default arguments.
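
      A minimal sketch of that, with made-up sample text. I'm assuming Perl 5.12 or later here, where split in scalar context simply returns the number of fields. Older perls also split into @_ and warn about it.

      ```perl
      use strict;
      use warnings;

      # Hypothetical sample input read through an in-memory file handle.
      my $text = "the quick  brown fox\n  jumps over\n";
      open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

      my $wc = 0;
      while (<$fh>) {
          $wc += split;    # no arguments: splits $_ on whitespace, skipping leading blanks
      }
      close $fh;

      print "words: $wc\n";
      ```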

      After Compline,
      Zaxo

Re^2: calculating lines
by Yoda_Oz (Sexton) on Jan 12, 2007 at 01:02 UTC
    How do I remove the punctuation and whitespace characters too?

      Just use another character class than \s. Non-word, \W, might be what you want.

      If all else fails you can make a custom character class with square brackets. See perlre, the regex doc.
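
      Roughly, the two options compare like this. The sample string and variable names are invented for the example. Keep in mind that \W leaves digits and the underscore alone, since both count as word characters.

      ```perl
      use strict;
      use warnings;

      my $line = "Hello, World_1! It's 2007...\n";   # hypothetical sample

      # Option 1: strip every non-word character.
      (my $nonword = $line) =~ s/\W+//g;

      # Option 2: a custom bracketed class that keeps only ASCII letters.
      (my $letters = $line) =~ s/[^A-Za-z]+//g;

      print lc($nonword), "\n";
      print lc($letters), "\n";
      ```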

      After Compline,
      Zaxo

      Never mind, worked it out:
      s/\s+//g; s/^\s//; s/([\041-\057]|[\072-\100]|[\133-\140]|[\173-\176])+//g; print lc;
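
      As a side note, those four octal ranges appear to cover the same ASCII characters as the POSIX class [[:punct:]]. The sketch below checks that over codes 0 to 127 and then does the whole cleanup in one substitution. The variable names and sample string are mine, and this is only an alternative spelling, not a correction.

      ```perl
      use strict;
      use warnings;

      my $octal = qr/[\041-\057\072-\100\133-\140\173-\176]/;   # the ranges used above
      my $posix = qr/[[:punct:]]/;                              # POSIX punctuation class

      # Count ASCII characters on which the two classes disagree.
      my $mismatches = 0;
      for my $code (0 .. 127) {
          my $ch = chr $code;
          $mismatches++ if ($ch =~ $octal) xor ($ch =~ $posix);
      }

      # One substitution that removes whitespace and punctuation together.
      (my $clean = "Hi, there! OK?\n") =~ s/[\s[:punct:]]+//g;

      print "mismatches: $mismatches\n";
      print lc($clean), "\n";
      ```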