in reply to counting lines in perl

What do you have so far? It's much easier to help you if we can see what you've done wrong.

You are aware that uniq only removes consecutive repeated lines, right? So the trick is to only keep track of the last line, and the count of the last line. If the current line is identical, increment the count, otherwise print it out with the count and set a new last line. The second trick is that when you're done with the file, you'll have a last line that isn't printed out, so you'll have to handle that, too.
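For what it's worth, here is a minimal sketch of the bookkeeping described above, using a hard-coded list in place of file input so it runs on its own. The name count_runs and the sample data are made up for illustration, and the " count line" output format is simply borrowed from the code posted in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_runs is a hypothetical helper name, not something from the thread.
# It collapses each run of consecutive identical lines into one
# " count line" string and returns the list of those strings.
sub count_runs {
    my @lines = @_;
    my @out;
    my ($last, $count);
    for my $line (@lines) {
        if (defined $last && $line eq $last) {
            $count++;                                     # same run: keep counting
        } else {
            push @out, " $count $last" if defined $last;  # previous run has ended
            ($last, $count) = ($line, 1);                 # start a new run
        }
    }
    push @out, " $count $last" if defined $last;          # flush the final run
    return @out;
}

# Sample data standing in for the lines of a file.
print count_runs("a\n", "a\n", "b\n", "a\n");
```

Note that the trailing "a" is reported separately from the first two, because only consecutive repeats are merged.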

Re^2: counting lines in perl
by imhotep (Novice) on Feb 26, 2005 at 20:01 UTC
    I have this,
    #!/usr/bin/perl
    # uniq.pl: remove repeated lines.
    use English;
    use diagnostics;
    $oldline = "";
    $n = 0;
    while ($line = <>) {
        unless ($line eq $oldline) {
            $n = $n + 1;
            print " $n $line";
        }
        $oldline = $line;
    }

    I know that this is not right: it just prints a running count of the output lines. I think I need to make the count stop at the end of each set of repeated lines, which I can do, but I can't work out how to print only a single line along with its count.

    Edit by BazB - add code tags.

      Try some <code> tags.

      #!/usr/bin/perl
      # uniq.pl: remove repeated lines.
      use strict;
      use diagnostics;
      $oldline = "";
      $n = 1;
      while ($line = <>) {
          if ($line eq $oldline) {
              #$n = $n + 1;
              $n++;
          } elsif ($oldline) {
              print " $n $oldline";
              $n = 1;
              $oldline = $line;
          }
      }
      if ($oldline) {
          print " $n $line";
      }
      That should help. I'm not sure why you're using English. You should use strict. You always have a count of at least one - not zero. What we're doing now is checking - if the lines match, increment the count. If they don't match, print out the last match, and then reset. Finally, when we're done, we'll print out the last line.

      Hope that helps.

      (Warning - untested.)

      Update: Of course, being untested, crashtest points out an obvious error... had $line when it should be $oldline.

        Actually, Tanktalus, that won't work. You said, "If they don't match, print out the last match, and then reset." But when you print $line instead of $oldline, you're printing the next unique line you've found, not the line you've been counting. Here's the altered code, also fixed so that it now runs under use strict:
        #!/usr/bin/perl
        # uniq.pl: remove repeated lines.
        use strict;
        use diagnostics;
        my $oldline = <>;    # Priming read
        my $n = 1;
        while (my $line = <>) {
            if ($line eq $oldline) {
                $n++;    #$n = $n + 1;
            } else {
                print " $n $oldline";
                $n = 1;
                $oldline = $line;
            }
        }
        if ($oldline) {
            print " $n $oldline";
        }
        Note: The program will hang if there isn't at least one line of data to read.
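On the empty-input case, one possible guard is to test the priming read with defined rather than for truth. The sketch below assumes a hypothetical helper named uniq_count that reads from whatever filehandle it is given. It uses an in-memory filehandle as stand-in data so it can be run without a file. A side effect of using defined is that a final line consisting of just "0" with no trailing newline, which is false in Perl, still gets printed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# uniq_count is a hypothetical helper name, not from the posts above.
# It reads lines from the given filehandle and returns " count line" strings.
sub uniq_count {
    my ($fh) = @_;
    my @out;
    my $oldline = <$fh>;                  # priming read
    return @out unless defined $oldline;  # empty input: nothing to report
    my $n = 1;
    while (defined(my $line = <$fh>)) {
        if ($line eq $oldline) {
            $n++;                         # still inside the same run
        } else {
            push @out, " $n $oldline";    # run ended, record it
            $n = 1;
            $oldline = $line;
        }
    }
    push @out, " $n $oldline";            # record the final run
    return @out;
}

# In-memory filehandle standing in for a real file (sample data only).
# The last line is a bare "0" with no newline.
my $data = "x\nx\n0";
open my $fh, '<', \$data or die "open: $!";
print uniq_count($fh);
```

To run it against real input, the same sub could be handed \*STDIN or a filehandle opened on a file.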