mikejones has asked for the wisdom of the Perl Monks concerning the following question:

Greetings respected Monks, I have a formatted file from an awk script that parsed multiple password files. The single file looks like this and I am trying to decide on what data structure to use:
/var/tmp/passwd.hostname1.platform
nguyenhe 1929 20 Henry Nguyen,555-555-555
bjose 1990 20 Bobby Jose,x3338
....
....
....
/var/tmp/passwd.hostname2.platform
vjain 2098 20 Vineet Kumar Jain, offshore
llai 2122 20 Levius Lai
bjose 1995 20 Bobby Jose,x3338
...
...
...
to achieve my goal, which is to find un-alike uids, so in the above file I would print out bjose and his respective 4 fields. I was thinking of multiple hashes with multiple keys. My keys would be hostname.platform and the values would be the fields above (name, uid, gid, gecos), because I need to identify on what host each user-id with an un-alike uid lives. Here is my code thus far, thank you!
use strict;
use warnings;
use diagnostics;

my $awksh = qq(/home/awk_parse_passwd.ksh);
my $log   = qq(/tmp/uid_ck.log);
my ($k,$v,$element,$line) = 0;
my (@keys,@values,@glob) = ();
my (%hosts,%mcg_hosts) = ();

open (LOG, ">>$log") or warn "file: '$log' did not open $!";
open (AWKSH, "$awksh -|") or die "unable to spawn '$awksh' $!";
{
    local $/ = 'undef';
    foreach $line (<AWKSH>) {
        (@glob) = glob("/home/passwd.*");
    }
}

foreach $element (@glob) {
    ($keys[$k++]) = $element =~ m|\.(\w+\.\w+)\z|ig;
}

Re: trying to decide best data structure for problem at hand.
by rodion (Chaplain) on Dec 30, 2006 at 05:24 UTC
    You've got some problems in the code which may be tripping you up. The loop
    {
        local $/ = 'undef';
        foreach $line (<AWKSH>) {
            (@glob) = glob("/home/passwd.*");
        }
    }
    looks like it has several problems.
    • $/ isn't set to undef; it's set to a string containing the word 'undef'. So the code is looking for records separated by the word 'undef'. As it turns out, there is probably no word 'undef' in the file, so it does slurp the whole file anyway while looking for the 'undef' separator, but you should fix it regardless.
    • You would set it to undef (or just write "local $/;" with nothing assigned, which is the same thing) if you wanted to slurp the whole file into a single scalar variable. However, you're reading it with <AWKSH> in the list part of a "for" loop, assigning each record to $line on each cycle. The whole file gets slurped into the first element of the list, so the for loop executes just once, with the whole output of the pipe in $line.
    • It's good that the for loop only executes once, since assigning the same thing to @glob each time around the loop would just do the same thing over again.
    • The whole pipe output has been put in $line, but then we exit the for loop and it disappears, since its scope is limited to the for loop. (It's equivalent to "foreach my $line (<AWKSH>)".)
    If you're using this code and are stuck, fixing the items above may get you moving again, along with the advice given by others on data structures.
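To make the $/ point concrete, here is a minimal sketch of how the three separator settings behave. It reads from an in-memory filehandle rather than the real pipe; the sample text and the read_records helper are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the pipe output.
my $text = "alpha\nundef beta\ngamma\n";

# Read every record from an in-memory handle using the given separator.
sub read_records {
    my ($separator) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/ = $separator;    # an undefined value means slurp mode
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @by_word = read_records('undef');    # splits after the literal word 'undef'
my @slurped = read_records(undef);      # one record holding everything
my @by_line = read_records("\n");       # the usual line-at-a-time reading

printf "%d %d %d\n", scalar @by_word, scalar @slurped, scalar @by_line;
# prints "2 1 3"
```

For the password data in this thread, line-at-a-time reading (the default separator) is probably what you want, since each user entry sits on its own line.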
Re: trying to decide best data structure for problem at hand.
by GrandFather (Saint) on Dec 30, 2006 at 05:07 UTC

    I'd key by name then uid:

    use strict;
    use warnings;

    my %entries;
    my $key;
    my $host;

    while (<DATA>) {
        chomp;
        next if ! length;
        if (m|^/|) {
            $host = $_;
        } elsif (/(\w+)\s+(\d+)\s+(\d+)\s+(.*)/) {
            my ($name, $uid, $gid, $gecos) = ($1, $2, $3, $4);
            $entries{$name}{$uid} = { # User data keyed by uid
                gid   => $gid,
                gecos => $gecos,
                host  => $host,
            };
        } else {
            print "Can't parse >$_<\n";
        }
    }

    my @unlike = grep {keys (%{$entries{$_}}) > 1} keys %entries;
    print "@unlike";

    __DATA__
    /var/tmp/passwd.hostname1.platform
    nguyenhe 1929 20 Henry Nguyen,555-555-555
    bjose 1990 20 Bobby Jose,x3338
    /var/tmp/passwd.hostname2.platform
    vjain 2098 20 Vineet Kumar Jain, offshore
    llai 2122 20 Levius Lai
    bjose 1995 20 Bobby Jose,x3338

    Prints:

    bjose

    DWIM is Perl's answer to Gödel
      Wow, I cannot believe I did not see how simple it really was. Anyway, thank you, but in your code you do not use
      $entries{$name}{$uid} = { # User data keyed by uid
          gid => $gid, gecos => $gecos, host => $host,
      };
      that is, gid => $gid, gecos => $gecos, host => $host ???

        It's there because in your OP you say "... I need to identify on what host each user-id ...". It's not used because I didn't need it to illustrate the main issue. However if you change the final print to:

        for my $name (@unlike) {
            my @hosts = map {$entries{$name}{$_}{host}} keys %{$entries{$name}};
            print "$name found on \n\t", join ("\n\t", @hosts), "\n\n";
        }

        then the output is:

        bjose found on
                /var/tmp/passwd.hostname2.platform
                /var/tmp/passwd.hostname1.platform
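Since the original question asked for the user and all four fields, here is a rough sketch of a fuller report built on the same name-then-uid structure. The @records list is hand-built from the thread's sample data instead of being parsed from a file, and the @report array is a name chosen just for this example.

```perl
use strict;
use warnings;

# Sample records as (host, name, uid, gid, gecos), copied from the thread's data.
my @records = (
    ['/var/tmp/passwd.hostname1.platform', 'nguyenhe', 1929, 20, 'Henry Nguyen,555-555-555'],
    ['/var/tmp/passwd.hostname1.platform', 'bjose',    1990, 20, 'Bobby Jose,x3338'],
    ['/var/tmp/passwd.hostname2.platform', 'vjain',    2098, 20, 'Vineet Kumar Jain, offshore'],
    ['/var/tmp/passwd.hostname2.platform', 'llai',     2122, 20, 'Levius Lai'],
    ['/var/tmp/passwd.hostname2.platform', 'bjose',    1995, 20, 'Bobby Jose,x3338'],
);

# Key by name, then uid, keeping the remaining fields alongside the host.
my %entries;
for my $record (@records) {
    my ($host, $name, $uid, $gid, $gecos) = @$record;
    $entries{$name}{$uid} = { gid => $gid, gecos => $gecos, host => $host };
}

# Build one report line per (name, uid) pair for names that have more than one uid.
my @report;
for my $name (sort keys %entries) {
    my $by_uid = $entries{$name};
    next unless scalar(keys %$by_uid) > 1;
    for my $uid (sort { $a <=> $b } keys %$by_uid) {
        my $entry = $by_uid->{$uid};
        push @report, join ' ', $entry->{host}, $name, $uid, $entry->{gid}, $entry->{gecos};
    }
}

print "$_\n" for @report;
# prints:
# /var/tmp/passwd.hostname1.platform bjose 1990 20 Bobby Jose,x3338
# /var/tmp/passwd.hostname2.platform bjose 1995 20 Bobby Jose,x3338
```

One thing to watch with this shape: if the same name has the same uid on several hosts, each later host overwrites the earlier one under that uid. Storing a list of hosts per uid would avoid that, if you need every host listed.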

Re: trying to decide best data structure for problem at hand.
by sauoq (Abbot) on Dec 30, 2006 at 05:01 UTC
    I am trying to decide on what data structure to use

    I think mine would look something like:

    {
        'nguyenhe' => {
            'hostname1' => '1929 20 Henry Nguyen,555-555-555',
        },
        'bjose' => {
            'hostname1' => '1990 20 Bobby Jose,x3338',
            'hostname2' => '1995 20 Bobby Jose,x3338'
        },
        'vjain' => {
            'hostname2' => '2098 20 Vineet Kumar Jain, offshore'
        },
        'llai' => {
            'hostname2' => '2122 20 Levius Lai'
        },
    }
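A structure of that shape could be built and checked roughly as follows. This is only a sketch: the @lines array stands in for the real file, the regex assumes host lines always look like /path/passwd.HOST.PLATFORM, and the names %users and @mismatched are invented for the example.

```perl
use strict;
use warnings;

# Sample input in the same layout as the question's file.
my @lines = (
    '/var/tmp/passwd.hostname1.platform',
    'nguyenhe 1929 20 Henry Nguyen,555-555-555',
    'bjose 1990 20 Bobby Jose,x3338',
    '/var/tmp/passwd.hostname2.platform',
    'vjain 2098 20 Vineet Kumar Jain, offshore',
    'llai 2122 20 Levius Lai',
    'bjose 1995 20 Bobby Jose,x3338',
);

my (%users, $host);
for my $line (@lines) {
    if ($line =~ m{^/.*passwd\.(\w+)\.\w+$}) {
        $host = $1;                    # e.g. "hostname1"
    } elsif ($line =~ /^(\w+)\s+(.*)$/) {
        $users{$1}{$host} = $2;        # name => host => "uid gid gecos"
    }
}

# A name is a mismatch when its hosts do not all report the same uid.
my @mismatched;
for my $name (sort keys %users) {
    my %uids;
    for my $rest (values %{ $users{$name} }) {
        my ($uid) = split ' ', $rest;  # uid is the first whitespace-separated field
        $uids{$uid} = 1;
    }
    push @mismatched, $name if keys(%uids) > 1;
}

print "@mismatched\n";
# prints "bjose"
```

Keying by host under each name keeps every host's entry, at the cost of having to split the uid back out of the stored string when comparing.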

    -sauoq
    "My two cents aren't worth a dime.";
OT: you could use a directory too
by f00li5h (Chaplain) on Jan 04, 2007 at 11:49 UTC

    Yet another not-answering-your-Perl-question reply, but you may be interested in totally ripping the guts out of your existing infrastructure and changing the uid -> user mappings to be done using LDAP or Yellow Pages/NIS instead of bundles of /etc/passwds.

    Of course, you may have already thought of a centralised directory and thrown the idea out because you're on many sites, don't have that many common users across all boxes, your machines run vastly different OSes, your boss won't pay you to do it, etc.

    But I did the LDAP thing once, and it was neat: slap in an export of /home and some ssh keys, and you're away laughing. The place I'm at now uses NIS and exported homes, but same deal.

    @_=qw; ask f00li5h to appear and remain for a moment of pretend better than a lifetime;;s;;@_[map hex,split'',B204316D8C2A4516DE];;y/05/os/&print;