Hey all, I have a large file that contains UNIX groups, and under each group are the users in that group. Each user has a number associated with them, so user1 would have a line like this: "user1:user1,CO#". Let's say every group has roughly 300 users. The script below counts all the CO#s in each group and tells me the one that occurs most, or, if there is a tie, all the ones that tie for most. I want to modify it to also report how often the most common one occurs, as x out of y. In other words, if X occurs the most, it should say that X occurs 90 out of 100 times. The other 10 might be Y, but I don't really care about that. The code is below; if you need any clarification, please let me know.
Here's a better example:
user1:user1,CO12345
user2:user2,CO12345
user3:user3,CO12345
user4:user4,CO54321
user5:user5,CO54321
user6:user6,CO12345
So, CO12345 is the most popular number, and CO12345 occurs 4 out of 6 times.
#!/usr/bin/perl -w
unshift(@INC,"$ENV{'PWD'}");
use List::Util qw(max) ; # core module; assuming "Util" was meant to be List::Util
my $file_name;
my $ans;
if (@ARGV == 1) {
chomp ($file_name=$ARGV[0]);
}
else {
print "\n\nPlease enter the file name to certify: ";
chomp ($file_name=<STDIN>);
while(1) {
print "\n\nYou entered $file_name - is this correct? <y or n>:";
chomp ($ans=<STDIN>);
if ($ans =~ /[Nn]/) {
print "\n\nPlease enter the file name: ";
chomp ($file_name=<STDIN>);
next;
}
elsif ($ans =~ /[Yy]/) {
last;
}
else {
next;
}
}
}
open(STUFF, '<', $file_name) or die "Cannot open $file_name: $!";
my %number_ids = () ;
while (my $line = <STUFF>) {
$line =~ s/\s+\z// ;
$line =~ s/\A\s+// ;
my @csv = split(/:/, $line) ;
if (defined($csv[1]) && ($csv[1] =~ m{((CO)\d{5})})) {
my $num = $1;
$number_ids{$num}++;
}
elsif (defined($csv[0]) && ($csv[0] =~ m/^Group.*/i)) {
show_most_popular(\%number_ids);
print "$csv[1],";
%number_ids = ();
}
else {
# what do we do with peculiar lines?
};
};
show_most_popular(\%number_ids);
sub show_most_popular {
my ($r_ids) = @_ ;
return if !%$r_ids ;
my $max = max(values %$r_ids) ;
my @popular = () ;
while (my ($id, $count) = each %$r_ids) {
if ($count == $max) { push @popular, $id ; } ;
} ;
print join(',', sort @popular), "\n";
} ;
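To make the output I'm after concrete, here is a rough sketch of the "x out of y" part on its own. It assumes y is the total number of CO# lines counted in the group, which is just the sum of the hash values. The name popular_with_ratio is made up for illustration, and sum comes from List::Util alongside max.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max sum);

# Hypothetical helper (not part of the script above): given a hash ref of
# ID => count, return one "ID occurs X out of Y times" string for each ID
# that ties for the highest count.
sub popular_with_ratio {
    my ($r_ids) = @_;
    return if !%$r_ids;

    my $max   = max(values %$r_ids);   # highest count in the group
    my $total = sum(values %$r_ids);   # assumption: y = all CO# lines counted

    my @popular = grep { $r_ids->{$_} == $max } keys %$r_ids;
    return map { "$_ occurs $max out of $total times" } sort @popular;
}

# Counts taken from the six sample lines above.
my %number_ids = (CO12345 => 4, CO54321 => 2);
print "$_\n" for popular_with_ratio(\%number_ids);
# prints: CO12345 occurs 4 out of 6 times
```

If that assumption about y is right, the same two lines ($total and the changed print) could go straight into show_most_popular.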