tweetiepooh has asked for the wisdom of the Perl Monks concerning the following question:

I have a task to take a netgroup file from a unix server and to create a directory tree from the data. This tree is how our admins then recreate the netgroup for NIS on Solaris.

I've just about figured out how to create the tree; it's probably not the best solution, but I think it works. The problem now is taking the data from the file and writing it out to a set of flat files in this tree.

The data looks like

(host,,) (host-1,,) (host-1.domain,,) (host_234,,) or (,user,) (,user,)
There may be one or more data entries on each line.

I need to convert this to

data
data
data
...
Using Regex Coach on a PC, I came up with
s/\W*([\w._-]+)\W+/$1\n/g
But in Perl this only gets the first data item.

A pointer would be jolly helpful. I've searched a bit but can't quite see what I need.

Re: Process netgroup file with a regex
by Velaki (Chaplain) on Aug 23, 2006 at 16:57 UTC

    The trick is taking into account the possibility of a circular reference in the source data, since netgroups may overlap and nest in many, many ways.

    Pulling the data into a structure with a regex isn't as hard as deciding what kind of tree to use and, even more importantly, how you wish to represent the tree in a flat file format. Remember, you have a netgroup alias that maps to a collection of triples, each of which represents a host, a user, and a domain. Also, you may nest netgroups within netgroup definitions.
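
    To illustrate the circular-reference concern, here is a minimal sketch. It assumes the file has already been read into a hash whose values mix literal triples with the names of nested netgroups. The %netgroup sample data and the expand_group name are made up for this example; a "seen" hash keeps the recursion from looping.

```perl
use strict;
use warnings;

# Hypothetical input: each netgroup maps to a list of members, where a
# member is either a "(host,user,domain)" triple or another netgroup's name.
my %netgroup = (
    'all-hosts' => [ 'web-hosts', 'db-hosts' ],
    'web-hosts' => [ '(web1,,)', '(web2,,)' ],
    'db-hosts'  => [ '(db1,,)', 'all-hosts' ],   # deliberate cycle
);

# Return every triple reachable from $name, visiting each group only once.
sub expand_group {
    my ($groups, $name, $seen) = @_;
    $seen ||= {};
    return () if $seen->{$name}++;    # already visited: break the cycle

    my @triples;
    for my $member (@{ $groups->{$name} || [] }) {
        if ($member =~ /^\(/) {
            push @triples, $member;   # a literal triple
        }
        else {
            push @triples, expand_group($groups, $member, $seen);
        }
    }
    return @triples;
}

print "$_\n" for expand_group(\%netgroup, 'all-hosts');
```

    Because the "seen" hash is shared across the whole walk, a group that is nested in several places contributes its triples only once.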

    I don't know what data structure you want, but maybe this snippet will help with starting to parse the file.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $curr_grp;
    my (%host, %user, %domain);

    while (my $line = <DATA>) {
        # A line that starts with a bare word (not a parenthesised triple)
        # begins a new netgroup; any other line continues the current one.
        if ($line =~ /^\s*([\w-]+)/) {
            $curr_grp = $1;
        }
        if ($curr_grp) {
            # Collect the contents of every (host,user,domain) triple on the line.
            my @entries;
            push @entries, $1 while $line =~ /\((.*?)\)/g;
            for my $entry (@entries) {
                my ($host, $user, $domain) = split /,/, $entry;
                push @{$host{$curr_grp}},   $host   if $host;
                push @{$user{$curr_grp}},   $user   if $user;
                push @{$domain{$curr_grp}}, $domain if $domain;
            }
        }
    }

    print Dumper \%host;
    print Dumper \%user;
    print Dumper \%domain;

    __DATA__
    nifty-group (host,,zanzibar.org) (host-1,,) (host-1.domain,,) (host_234,,)
        (foo,,) (host-1,phil,) (host-1.domain,,) (host_234,,)
    other-group (h2,,) (h6,,)

    Hope this helped a little,
    -v.

    Update: Changed the code snippet to extract the data in a form closer to what you want. Here's the output.

    $VAR1 = {
        'nifty-group' => [
            'host',
            'host-1',
            'host-1.domain',
            'host_234',
            'foo',
            'host-1',
            'host-1.domain',
            'host_234'
        ],
        'other-group' => [
            'h2',
            'h6'
        ]
    };
    $VAR1 = {
        'nifty-group' => [
            'phil'
        ]
    };
    $VAR1 = {
        'nifty-group' => [
            'zanzibar.org'
        ]
    };

    "Perl. There is no substitute."
      Thanks for the pointers.

      I know about the circular refs, the file format, etc. I am just trying to convert the data (triplets) into a CR-delimited list. So

      (host,,) (host-1,,) (host-1.domain,,) (host_234,,)
      becomes
      host
      host-1
      host-1.domain
      host_234
      Ideally I would like to handle user entries (,user,) in the same code, since by then I would already be pointing at a user file. We don't use domain entries, and we don't mix the data.
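
      One way to sketch that conversion, assuming each triple carries exactly one name of interest (a host or a user, never both): take the first non-empty field of every triple on the line. The names_from_line helper is a hypothetical name used only for this illustration.

```perl
use strict;
use warnings;

# Return the first non-empty field of each "(host,user,domain)" triple
# found in $line, in order of appearance.
sub names_from_line {
    my ($line) = @_;
    my @names;
    while ($line =~ /\(([^()]*)\)/g) {
        # split drops trailing empty fields, so "host,," yields ("host")
        # and ",user," yields ("", "user").
        my ($name) = grep { defined($_) && length($_) } split /,/, $1;
        push @names, $name if defined $name;
    }
    return @names;
}

# One name per line, for host triples and user triples alike.
print "$_\n" for names_from_line('(host,,) (host-1,,) (host-1.domain,,) (host_234,,)');
print "$_\n" for names_from_line('(,user,) (,user,)');
```

      A triple that mixed a host and a user would yield only the host here, which is why the sketch relies on the data not being mixed.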

        I updated my code snippet, above. I think it's more in keeping with what you want. You'll still have to output the hashes, but that should be easy enough. Their keys are the alias names for the netgroup being scanned, and the values are refs to an array of names (host, user, or domain).
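
        As a rough sketch of that output step, assuming a hash shaped like %host from the snippet (alias name mapped to an array ref of names): the sample data below is a subset of the output shown earlier, and names_as_lines is a hypothetical helper name. It prints to STDOUT; a real script would presumably print to a file handle opened inside the directory tree instead.

```perl
use strict;
use warnings;

# Sample data in the shape described above: netgroup alias => array ref of names.
my %host = (
    'nifty-group' => [ 'host', 'host-1', 'host-1.domain', 'host_234' ],
    'other-group' => [ 'h2', 'h6' ],
);

# Format one netgroup's names as a newline-delimited string.
sub names_as_lines {
    my ($names) = @_;
    return join '', map { "$_\n" } @$names;
}

# Print each group in sorted order, preceded by a comment line naming it.
for my $group (sort keys %host) {
    print "# $group\n", names_as_lines($host{$group});
}
```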

        Hope this helped
        -v.