walkingthecow has asked for the wisdom of the Perl Monks concerning the following question:

I have thousands of /etc/group files in a directory and thousands of corresponding /etc/passwd files in another directory. Now, given a group, I am trying to get all the primary and secondary members of that group. I have an example below:


Line from /etc/group:
samplegrp::12345:bobjones,davejones

Lines from /etc/passwd
davana:1111:12345:Dan Vana:/davana:/usr/bin/ksh
bobjones:2222:54321:Bob Jones:/bobjones:/usr/bin/ksh
davejones:3333:6789:Dave Jones:/davejones:/usr/bin/ksh

So, as you can see, the GID for samplegrp is 12345. It has one primary member (davana) and two secondary members. Now, like I said, there are thousands of group files (for many different servers) and thousands of passwd files. So samplegrp could exist on hundreds of servers. I want all the primary/secondary members for this group on all servers where it exists. I have included a quick sample script that shows how I am doing this, and am just looking for some suggestions. Thank you guys!!

#!/usr/bin/perl -w
use strict;

system("clear");
print "Please enter the group name that you're querying: ";
chomp (my $group_name = <STDIN>);

my @servers = `grep "^$group_name:" group/* | cut -d/ -f2 | cut -d. -f1`;
foreach my $server (@servers) {
    chomp $server;
    my $group_line = `grep "^$group_name:" group/$server.grp`;
    my ($groupName, undef, $gid, $secondaryMembers) = split(/:/, $group_line);
    my @primary_members = `egrep ".*:[^A-Za-z]+:$gid:.*" passwd/$server.passwd`;
    if (@primary_members) {
        print "<@primary_members>\n";
    }
    if (defined $secondaryMembers) {
        my @users = split(/,/, $secondaryMembers);
        foreach my $user (@users) {
            chomp $user;
            my $userInfo = `grep "^$user:" passwd/$server.passwd`;
            if ($userInfo) {
                print "[$userInfo]\n";
            }
        }
    }
}
Just to let you know, I am not trying to query one at a time like the script above, but rather go through a file with 900 lines, each line containing one group. Also, I could parse through each group file and its corresponding passwd file, but that doesn't make a lot of sense if the group only exists on one server, right? And finally, just for clarification, below I have given you an idea of what the layout of the group/passwd directories looks like:

passwd directory:
server1.passwd
server2.passwd
server3.passwd

group directory:
server1.grp
server2.grp
server3.grp

So basically, group samplegrp could exist only on server150 out of server4500 (that being the total number of servers), so it does not make sense to parse every file for every query.
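One way around the per-group grep entirely: since a 900-group batch is being run anyway, every group file can be parsed exactly once up front into an in-memory index of group name to servers, making each of the 900 lookups a hash access. The sketch below assumes the group/server*.grp layout shown above; the helper name index_group_lines is my own invention.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# index_group_lines: given an index hashref, a server name, and the lines
# of that server's group file, record the server under each group name.
sub index_group_lines {
    my ($index, $server, @lines) = @_;
    for my $line (@lines) {
        next if $line =~ /^#/;                 # skip comment lines
        my ($gname) = split /:/, $line;
        push @{ $index->{$gname} }, $server
            if defined $gname && length $gname;
    }
}

# One pass over all the group files builds the whole index.
my %servers_for_group;
for my $grpfile (glob "group/*.grp") {
    my ($server) = $grpfile =~ m{([^/]+)\.grp$};
    open my $fh, '<', $grpfile or die "cannot open $grpfile: $!";
    index_group_lines(\%servers_for_group, $server, <$fh>);
    close $fh;
}

# Each of the 900 queries is now just a hash lookup, e.g.:
# my @hits = @{ $servers_for_group{'samplegrp'} || [] };
```

After that, only the passwd files for the servers in the hit list need to be opened at all.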

Replies are listed 'Best First'.
Re: Any way to do this without system calls?
by Perlbotics (Archbishop) on Feb 24, 2009 at 23:13 UTC

    You can replace system("clear") with print "\e[H\e[2J"; or use a Term::.. module from CPAN.

    Replacing the calls to (e)grep is possible but, I'd guess, also slower. See glob or File::Find for ways to find all those group and passwd files. A simple grep emulation for a single file could be:

    sub my_grep {
        my ($filename, $re) = @_;
        open my $fh, '<', $filename or die "cannot open $filename - $!";
        my @matching_lines = grep /$re/, <$fh>;
        close $fh or die "cannot close $filename - $!";
        return @matching_lines;  # might need further chomping?
    }

    my $group_re = qr{^samplegrp:};
    foreach my $grpfile (glob "group/server*.grp") {
        print "$grpfile: ", my_grep($grpfile, $group_re), "\n";
    }
    That's for the replacement of system calls - as requested.

    Alternatively, if those files fit into memory, you could read them once and then run multiple searches across them - but ultimately, a database might serve better in the long term...
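The read-once, search-many idea could look roughly like this - a hedged sketch using the same assumed group/*.grp layout as above, with a made-up helper name search_groups:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Slurp each group file into memory once, keyed by server name.
my %group_lines;   # server name => arrayref of lines
for my $grpfile (glob "group/*.grp") {
    my ($server) = $grpfile =~ m{([^/]+)\.grp$};
    open my $fh, '<', $grpfile or die "cannot open $grpfile: $!";
    $group_lines{$server} = [ <$fh> ];
    close $fh;
}

# Every subsequent search is now a grep over in-memory arrays,
# with no further disk access.
sub search_groups {
    my ($re) = @_;
    my %hits;
    while (my ($server, $lines) = each %group_lines) {
        my @m = grep { /$re/ } @$lines;
        $hits{$server} = \@m if @m;
    }
    return %hits;
}

my %hits = search_groups(qr{^samplegrp:});
print "$_: @{ $hits{$_} }" for sort keys %hits;
```

With 4500 small files this should fit comfortably in memory, and repeated queries pay only the regex cost.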

Re: Any way to do this without system calls?
by graff (Chancellor) on Feb 25, 2009 at 05:02 UTC
    I really like Perlbotics' suggestion of using a relational database, and with SQLite, there's really no reason not to. Here's an initial stab at something you might want to try -- just plug in the actual paths for your group and passwd files.

    The main point of this script is to load three tables (groups per server, users per server, users per group per server), so if you have a static set of files to read, you would just run this once, and then create a separate script to allow users to run queries on the tables (for users in particular groups, groups for a particular user, across all servers or on particular servers, etc).

    I tacked on a global query to dump out the table contents after they're loaded, but if your data set is really huge, you might want to comment that out (or maybe add "limit 20" at the end of the select statement). You might want your tables to be defined differently as well.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    my $db = DBI->connect( "dbi:SQLite:dbname=SrvrGrpData", "", "" );
    $db->do( "DROP TABLE IF EXISTS srvr_grp" );
    $db->do( "DROP TABLE IF EXISTS srvr_usr" );
    $db->do( "DROP TABLE IF EXISTS grp_usr" );
    $db->do( "CREATE TABLE srvr_grp (srvrid varchar(40), grpname varchar(40), grpnum int)" );
    $db->do( "CREATE TABLE srvr_usr (srvrid varchar(40), usrname varchar(40), usrnum int, defgrp int)" );
    $db->do( "CREATE TABLE grp_usr (srvrid varchar(40), grpid int, usrname varchar(40))" );

    my $ins_grp = $db->prepare( "insert into srvr_grp (srvrid,grpname,grpnum) values (?,?,?)" );
    my $ins_gu  = $db->prepare( "insert into grp_usr (srvrid,grpid,usrname) values (?,?,?)" );

    my $grp_path = "/path/to/group_files";
    my $usr_path = "/path/to/passwd_files";

    chdir $grp_path or die "chdir $grp_path: $!\n";
    opendir( D, "." ) or die "opendir $grp_path: $!\n";
    while ( my $f = readdir( D )) {
        open( F, "<", $f ) or do { warn " open failed on $f: $!\n"; next };
        my ( $srvr ) = ( $f =~ /^(\w+)/ );
        while (<F>) {
            next if ( /^#/ );
            chomp;
            my ( $gname, $skip, $gnum, $gmembers ) = split m{:};
            $ins_grp->execute( $srvr, $gname, $gnum );
            for my $u ( split m{,}, $gmembers ) {
                $ins_gu->execute( $srvr, $gnum, $u );
            }
        }
        close F;
    }
    closedir D;
    $ins_grp->finish;
    $ins_gu->finish;

    my $ins_usr = $db->prepare( "insert into srvr_usr (srvrid,usrname,usrnum,defgrp) values (?,?,?,?)" );

    chdir $usr_path or die "chdir $usr_path: $!\n";
    opendir( D, "." ) or die "opendir $usr_path: $!\n";
    while ( my $f = readdir( D )) {
        open( F, "<", $f ) or do { warn " open failed on $f: $!\n"; next };
        my ( $srvr ) = ( $f =~ /^(\w+)/ );
        while (<F>) {
            next if ( /^#/ );
            chomp;
            my ( $uname, $unum, $dgrp ) = split m{:};
            $ins_usr->execute( $srvr, $uname, $unum, $dgrp );
        }
        close F;
    }
    closedir D;
    $ins_usr->finish;

    for my $table ( qw/srvr_grp srvr_usr grp_usr/ ) {
        print "\n== selecting all from $table: ==\n";
        my $qsth = $db->prepare( "select * from $table" );
        $qsth->execute;
        my $rows = $qsth->fetchall_arrayref;
        for my $row ( @$rows ) {
            print join( "\t", @$row ), "\n";
        }
    }
    No system calls at all, and setting up a nice set of parameterized queries (where a user just supplies a string that they're looking for in a given field) would be pretty quick and easy. Turn-around time on answering queries will be a lot faster with SQLite than what you'd get by scanning through all the group and passwd files for every query.
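For instance, a parameterized member lookup over those tables might look roughly like the sketch below. To keep the demo self-contained it seeds an in-memory SQLite database with the thread's sample data; against the real SrvrGrpData file, only the prepare/execute/fetch part would be needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# In-memory database seeded with the sample data from the thread
# (in practice, connect to the SrvrGrpData file instead).
my $db = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1 } );
$db->do( "CREATE TABLE srvr_grp (srvrid varchar(40), grpname varchar(40), grpnum int)" );
$db->do( "CREATE TABLE grp_usr (srvrid varchar(40), grpid int, usrname varchar(40))" );
$db->do( "INSERT INTO srvr_grp VALUES ('server1','samplegrp',12345)" );
$db->do( "INSERT INTO grp_usr  VALUES ('server1',12345,'bobjones')" );
$db->do( "INSERT INTO grp_usr  VALUES ('server1',12345,'davejones')" );

# Parameterized query: the user supplies only a group name, and gets
# back every (server, secondary member) pair for that group.
my $sth = $db->prepare(
    "select g.srvrid, u.usrname
       from srvr_grp g join grp_usr u
         on g.srvrid = u.srvrid and g.grpnum = u.grpid
      where g.grpname = ?
      order by g.srvrid, u.usrname" );
$sth->execute( 'samplegrp' );

my @members;
while ( my ( $srvr, $user ) = $sth->fetchrow_array ) {
    push @members, "$srvr:$user";
}
print "$_\n" for @members;
```

A similar query joining srvr_usr on defgrp would return the primary members; both stay fast because SQLite does the scanning once at load time rather than on every lookup.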