Counting lines by content

willa has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Counting lines by content by broquaint (Abbot) on Oct 09, 2002 at 11:19 UTC
You could handle this with a simple two-liner:

```sh
perl -F: -e 'END { print "$_ $c{$_}\n" for keys %c }' \
     -ane '$c{$F[1]}++'
```

See `perlrun` for more info on perl's command-line options.

HTH

_________
broquaint
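A note on the switches in that one-liner: `-n` wraps the code in a `while (<>)` loop, `-a` autosplits each line into `@F`, and `-F:` makes the split pattern a colon. A rough long-hand equivalent might look like the sketch below. The sub name `count_by_field` and the sample lines are made up for illustration; they are not taken from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how often each value appears in one
# colon-separated field (index 1 is the second field, like $F[1]).
sub count_by_field {
    my ($lines, $index) = @_;
    my %count;
    for my $line (@$lines) {
        # chomp a copy so the last field carries no trailing newline
        # (the one-liner itself does not chomp)
        chomp(my $copy = $line);
        my @F = split /:/, $copy;          # roughly what -a -F: does
        next unless defined $F[$index];    # skip lines that are too short
        $count{ $F[$index] }++;
    }
    return \%count;
}

# Assumed sample input in the field:field:field layout used in this thread.
my @lines = (
    "fruit:apple:cox\n",
    "fruit:apple:pippin\n",
    "fruit:banana:yellow\n",
);

my $count = count_by_field(\@lines, 1);

# Same job as the END block, but sorted so the order is stable.
print "$_ $count->{$_}\n" for sort keys %$count;
```

With the three sample lines this should report two apples and one banana. Sorting the keys is a small addition: plain `keys %c` returns them in no particular order.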
Re: Re: Counting lines by content by willa (Acolyte) on Oct 09, 2002 at 11:35 UTC
Thanks - I also need to make sure there are no duplicates in the third field. I assume this is easy too...
Re: Re: Re: Counting lines by content by broquaint (Abbot) on Oct 09, 2002 at 11:45 UTC
This will count the second field, ignoring duplicates:

```sh
perl -F: -e 'END { print "$_ $c{$_}\n" for keys %c }' \
     -ane '$c{$F[1]}++ unless $d{"@F[1,2]"}++'
```

I assume this is what you meant by no duplicates in the third field.

HTH

_________
broquaint
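Spelled out, the `%d` hash in that one-liner acts as a "seen" set keyed on the second and third fields together, so a repeated pair is only counted the first time. A sketch of the same idea as a script, with hypothetical names (`count_unique_pairs`, the sample data) and assuming colon-separated lines:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count second-field values, ignoring lines whose
# (second, third) pair has already been seen.
sub count_unique_pairs {
    my ($lines) = @_;
    my (%count, %seen);
    for my $line (@$lines) {
        chomp(my $copy = $line);
        my (undef, $second, $third) = split /:/, $copy;
        next unless defined $second && defined $third;
        # $seen{...}++ is false only on the first sighting of a pair
        $count{$second}++ unless $seen{"$second:$third"}++;
    }
    return \%count;
}

# Assumed sample input; the second line repeats the first on purpose.
my @lines = (
    "fruit:apple:cox\n",
    "fruit:apple:cox\n",
    "fruit:apple:pippin\n",
    "fruit:banana:yellow\n",
);

my $count = count_unique_pairs(\@lines);
print "$_ $count->{$_}\n" for sort keys %$count;
```

The chomp matters here more than in the plain count: without it, a final line lacking a newline would produce a different key from an otherwise identical earlier line.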
Re: Re: Re: Re: Counting lines by content by willa (Acolyte) on Oct 09, 2002 at 13:03 UTC
Re: Counting lines by content by sch (Pilgrim) on Oct 09, 2002 at 11:25 UTC
I think the easiest way would be to use a hash - something like this?

Oops - just realised this doesn't handle the case where the 3rd field is a duplicate - in fact it doesn't worry about the 3rd field at all. guha has supplied some modified code, which I've used to replace my slightly dodgy stuff!

```perl
#!perl
use strict;
use warnings;
use diagnostics;

my ($type, $desc, %fruit);

open (FH, "y") || die "Cannot open file: $!";
while (<FH>) {
    chomp;                                # keep the newline out of $desc
    (undef, $type, $desc) = split /:/;    # ignore the first field
    $fruit{$type}{$desc}++;               # one key per distinct 3rd field
}
close FH;

# The number of keys under each type is the count of unique 3rd fields
foreach my $type (keys(%fruit)) {
    print "$type : ", scalar keys %{ $fruit{$type} }, "\n";
}
```
Re: Re: Counting lines by content by l2kashe (Deacon) on Oct 09, 2002 at 13:47 UTC
A hash is the right answer, but I believe that he is looking for a hash of hashes....

```perl
#!/usr/bin/perl
open(IN,"/some/file") || die "Cant open file\nReason: $!\n";
while (<IN>) {
    chomp($line = $_);
    # take the second and third colon-separated fields
    ($first,$second) = (split(/:/, $line))[1,2];
    $fruit{$first}{count}++;      # running total for this second field
    $fruit{$first}{$second}++;    # how often this exact pair was seen
}
close(IN);

foreach $k (sort(keys(%fruit))) {
    print "$k $fruit{$k}{count}\n";
}
```

The reason I used count as well is so that A) I don't have to loop to figure out what my total count for $first is, and B) I can also test for $fruit{$k}{blah} and determine if there were duplicates. You could add the test within the while loop, i.e. test for $fruit{$first}{$second} and if it exists warn or something, else increment it :)... Have fun

/*
 * And the Creator, against his better judgement, wrote man.c
 */
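The in-loop test suggested above could look something like the following sketch. It keeps the totals and the seen pairs in two separate hashes, so a third field that happens to be the literal string `count` cannot collide with the counter. The sub name `tally_with_warnings` and the sample lines are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count each second-field value once per distinct
# third field, warning whenever a pair turns up again.
sub tally_with_warnings {
    my ($lines) = @_;
    my (%total, %seen);
    for my $line (@$lines) {
        chomp(my $copy = $line);
        my ($first, $second) = (split /:/, $copy)[1, 2];
        next unless defined $first && defined $second;
        if ($seen{$first}{$second}++) {
            warn "duplicate entry: $first:$second\n";
            next;                  # already counted, do not count again
        }
        $total{$first}++;
    }
    return \%total;
}

# Assumed sample input; the second line is a deliberate duplicate.
my @lines = (
    "fruit:apple:cox\n",
    "fruit:apple:cox\n",
    "fruit:banana:green\n",
);

my $total = tally_with_warnings(\@lines);
print "$_ $total->{$_}\n" for sort keys %$total;
```

The warning goes to STDERR, so the counts on STDOUT stay clean if the output is piped somewhere else.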
Re: Counting lines by content by Anonymous Monk on Oct 09, 2002 at 15:57 UTC
sniff sniff... reminds me of... homework...
Re: Counting lines by content by hackmare (Pilgrim) on Oct 10, 2002 at 11:05 UTC
I would use hashes of hashes and then count the number of entries. This is just a simple flat-file to tree generation question, akin to a flatfile-to-XML constructor.

```perl
#!/usr/bin/perl
print "Hello, World...\n";
use strict;
use Data::Dumper;

my @in = qw/
    fruit:apple:cox
    fruit:apple:pippin
    fruit:apple:granny
    fruit:banana:yellow
    fruit:banana:green
/;

#make an anonymous hash
my $h = {};
foreach (@in) {
    my @a = split ':',$_;
    my $branch = shift @a;
    my $species = shift @a;
    my $breed = shift @a;
    $h->{$branch}->{$species}->{$breed} = $h->{$branch}->{$species}->{$breed} + 1 || 1;
}

print Dumper($h);
print "there are ".scalar (keys %{$h->{fruit}})." fruit species\n";
print "there are ".scalar (keys %{$h->{fruit}->{apple}})." apple breeds\n";
print "there are ".scalar (keys %{$h->{fruit}->{banana}})." banana breeds\n";
print "Good luck with your homework\n";
```

Returns:

```
C:\>perl test.pl
Hello, World...
$VAR1 = {
          'fruit' => {
                       'apple' => {
                                    'pippin' => 1,
                                    'cox' => 1,
                                    'granny' => 1
                                  },
                       'banana' => {
                                     'green' => 1,
                                     'yellow' => 1
                                   }
                     }
        };
there are 2 fruit species
there are 3 apple breeds
there are 2 banana breeds
Good luck with your homework
```

hackmare.

