Re: counting lines in perl
by Tanktalus (Canon) on Feb 26, 2005 at 19:19 UTC
What do you have so far? It's much easier to help you if we can see what you've done wrong.
You are aware that uniq only removes consecutive repeated lines, right? So the trick is to only keep track of the last line, and the count of the last line. If the current line is identical, increment the count, otherwise print it out with the count and set a new last line. The second trick is that when you're done with the file, you'll have a last line that isn't printed out, so you'll have to handle that, too.
#!/usr/bin/perl
# uniq.pl: remove repeated lines.
use English;
use diagnostics;
$oldline = "";
$n = 0;
while ($line = <>) {
    unless ($line eq $oldline) {
        $n = $n + 1;
        print " $n $line";
    }
    $oldline = $line;
}
I know that this is not right: it just prints a running count of the output lines. I think I need to reset the count at the end of each set of repeated lines, which I can do, but I can't work out how to print only a single line along with its count.
Edit by BazB - add code tags.
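#!/usr/bin/perl
# uniq.pl: remove repeated lines, printing a count for each run.
use strict;
use diagnostics;
my $oldline;    # undef until the first line has been read
my $n = 1;
while (my $line = <>) {
    if (defined $oldline && $line eq $oldline) {
        #$n = $n + 1;
        $n++;
    } else {
        # A new run starts: print the previous run (if any), then reset.
        print " $n $oldline" if defined $oldline;
        $n = 1;
        $oldline = $line;
    }
}
# The final run has not been printed yet.
if (defined $oldline)
{
    print " $n $oldline";
}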
That should help. I'm not sure why you're using English. You should use strict. You always have a count of at least one - not zero. What we're doing now is checking - if the lines match, increment the count. If they don't match, print out the last match, and then reset. Finally, when we're done, we'll print out the last line.
Hope that helps.
(Warning - untested.)
Update: Of course, being untested, crashtest points out an obvious error... had $line when it should be $oldline.
Re: counting lines in perl
by sh1tn (Priest) on Feb 26, 2005 at 19:29 UTC
use Data::Dumper;
my $lines;
while (<DATA>) {
    next if /^\s*$/;    # skip blank lines
    chomp;
    $lines->{$_}{count}++;
    push @{ $lines->{$_}{linenum} }, $.;    # $. is the current input line number
}
print Dumper($lines);
__DATA__
one
one
aaa
bbb
ccc
aaa
__END__
'one' => {
'count' => 2,
'linenum' => [
'2',
'3'
]
},
'bbb' => {
'count' => 1,
'linenum' => [
'5'
]
}
...
Re: counting lines in perl
by chas (Priest) on Feb 26, 2005 at 21:13 UTC
If I understood what uniq -c is supposed to do, how about:
while (<>) {
    $i++;
    chomp;
    $lines[$i] = $_;
    $times{$lines[$i]}++ if $lines[$i] ne $lines[$i-1];
}
@keys = keys %times;
@values = values %times;
while (@keys) {
    print pop(@values), ': ', pop(@keys), "\n";
}
(One could likely make this more brief at the expense of readability.)
chas
Re: counting lines in perl
by davidj (Priest) on Feb 27, 2005 at 05:21 UTC
#!/usr/bin/perl
use strict;
use warnings;
my %words;
open(my $fh, '<', 'test.txt') or die "Cannot open test.txt: $!";
while (<$fh>) {
    chomp;
    $words{$_}++;    # tallies every occurrence, consecutive or not
}
close($fh);
foreach my $key (keys %words) {
    print "$words{$key} $key\n";
}
exit;
davidj
But that code doesn't seem to count each group of consecutive repetitions just once, does it? (That is what I thought the original poster wanted.)
chas
(Update: Actually, now that I've gone to a system where I could try out uniq -c, I see that I misunderstood what was desired, so my code doesn't seem to do what the original poster wanted. Your code is closer, but the output isn't the same as that of uniq -c, at least the version I used. Sorry about the confusion...)
You are correct. My code is flawed. For some reason I thought uniq -c sorted the file first, but it doesn't. My mistake.
davidj
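
To make the distinction discussed in this thread concrete, here is a minimal sketch (not taken from any of the posts) that puts the two ways of counting side by side. The sub names count_runs and count_all are made up for illustration, the input reuses the sample data from sh1tn's post, and the %7d field width is an assumption based on how GNU uniq -c formats its counts; other versions may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_runs (hypothetical name): one [count, line] pair per group of
# consecutive identical lines, in input order. This is the behaviour
# Tanktalus describes for uniq -c.
sub count_runs {
    my @runs;
    for my $line (@_) {
        if (@runs && $runs[-1][1] eq $line) {
            $runs[-1][0]++;            # same as the previous line: extend the run
        } else {
            push @runs, [1, $line];    # different line: start a new run
        }
    }
    return @runs;
}

# count_all (hypothetical name): total occurrences of each distinct line,
# wherever it appears. This is what a plain hash tally produces, and is
# comparable to sorting the input before running uniq -c.
sub count_all {
    my %seen;
    $seen{$_}++ for @_;
    return map { [$seen{$_}, $_] } sort keys %seen;
}

my @input = qw(one one aaa bbb ccc aaa);

print "consecutive runs:\n";
printf "%7d %s\n", @$_ for count_runs(@input);

print "overall totals:\n";
printf "%7d %s\n", @$_ for count_all(@input);
```

For this input, count_runs reports aaa twice with a count of 1 each time, because its two occurrences are not adjacent, while count_all reports it once with a count of 2.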