ewhitt has asked for the wisdom of the Perl Monks concerning the following question:

I wrote this function and it does what I need. I was wondering if there was a more Perl'ish way to approach it? Basically, the function looks for certain headings (e.g. Router, Network, and Extern) and adds the following lines into a specific array, until a new heading is found.
sub ospf2_split_database { my ($dl, $line) = ""; my @ospf2database = @_; my (@router, @network, @external) = (); for $line (@ospf2database) { chomp($line); if ($line =~ /^Router/) { $dl = 1; } if ($line =~ /^Network/) { $dl = 2; } if ($line =~ /^Extern/) { $dl = 5; } switch ($dl) { case 1 { push(@router, "$line\n"); } case 2 { push(@network, "$line\n"); } case 5 { push(@external, "$line\n"); } } } return(\@router,\@network,\@external); }

Replies are listed 'Best First'.
Re: Is there a better way to approach this?
by FunkyMonk (Bishop) on Mar 29, 2008 at 09:40 UTC
    Use a hash of arrays
    my %matches; my $section; ... if ( $line =~ m/^(Router|Network|Extern)/ ) { $section = $1; } push @{ $matches{$section} }, $line;

    And I'd stay clear of Switch if I were you. It can lead to some difficult to trace bugs.

    update

    Here's a reimplementation of your subroutine, along with example usage

    Update^2: Now meets the clarified spec

      Thanks for the response. I forgot to mention one thing - not all the lines have what I am matching for. So once I match, I need the following lines to dump into a specific array until another match occurs, when switching to the appropriate array. Thanks!

        FunkyMonk's solution is still the right thing to do for the data structure. The only thing this new requirement adds is that you need to save every line you see somewhere; the last match determines the "where".

        Sample script:

        #!/usr/bin/perl -w use strict; my ($state, %data); for (<DATA>){ $state = $1 if /^(Router|Network|Extern)/; push @{$data{$state}}, $_; } # Display the saved data for my $st (sort keys %data){ print "$st:\n"; for (@{$data{$st}}){ print "\t$_"; } } __DATA__ Router abc def ghi jkl Network Extern foo bar qux zotz Router blah Network

        Update: I had originally thought to number the lines, so I had the innermost loop in the 'display' section traverse the indexes. Since I'm just displaying the lines, I've gone back to walking the list itself - so rather than 'for my $line (0 .. $#{$data{$st}}){ print "\t$data{$st}->[$line]"; }', that loop becomes simply 'for (@{$data{$st}}){ print "\t$_"; }'.

        Update^2: Whoops, I missed the fact that the keys were supposed to be at the beginning of the line. Fixed the regex.

        
        -- 
        Human history becomes more and more a race between education and catastrophe. -- HG Wells
        
Re: Is there a better way to approach this?
by jwkrahn (Abbot) on Mar 29, 2008 at 18:21 UTC
    sub ospf2_split_database {

        my ($dl, $line) = "";

    You shouldn't declare those variables here but inside the for loop.

        my @ospf2database = @_;
        my (@router, @network, @external) = ();

        for $line (@ospf2database) {

    You should declare $line and $dl here:

        for my $line (@ospf2database) {
            my $dl;

            chomp($line);

    Why chomp $line when you are just appending a newline again anyway?

            if ($line =~ /^Router/)  { $dl = 1; }
            if ($line =~ /^Network/) { $dl = 2; }
            if ($line =~ /^Extern/)  { $dl = 5; }
            switch ($dl) {
                case 1 { push(@router, "$line\n");   }
                case 2 { push(@network, "$line\n");  }
                case 5 { push(@external, "$line\n"); }

    Do you need two steps?   Can't you just do the push in the if block?   You are testing $line three times although if it matches /^Router/ it will never match /^Network/ or /^Extern/.

    if ( $line =~ /^Router/ ) { push @router, "$line\n" } elsif ( $line =~ /^Network/ ) { push @network, "$line\n" } elsif ( $line =~ /^Extern/ ) { push @external, "$line\n" }
            }
        }
        return(\@router,\@network,\@external);
    }

Re: Is there a better way to approach this?
by GrandFather (Saint) on Mar 30, 2008 at 03:46 UTC

    For a slightly different way to do it that is maybe more maintainable you could:

    use strict; use warnings; my @lines = <DATA>; my @records = ospf2_split_database (@lines); print @$_ for @records; sub ospf2_split_database { my @ospf2database = @_; my %data = (Router => [], Network => [], Extern => []); my $keys = join '|', keys %data; my $array; for my $line (@ospf2database) { $array = $data{$1} if $line =~ /^($keys)/; next unless defined $array; push @$array, $line; } return values %data; } __DATA__ Heading stuff that should be ignored Router: 1st router line 2nd router line this router line includes keys Network and Extern Network: 1st network record Extern: 1st extern record Extern: 2nd extern record Router: 2nd router record

    Prints:

    Network: 1st network record Router: 1st router line 2nd router line this router line includes keys Network and Extern Router: 2nd router record Extern: 1st extern record Extern: 2nd extern record

    Perl is environmentally friendly - it saves trees
      Thanks. I am able to understand your code the best, yet hashes are still confusing to me. With this said, I have a couple of questions:

      1) If you are returning "%data" (A hash), how is this being stored in "@records" (an array?)

      2) The "print @$_ for @records" prints everything fine, but how would I only print one of the three?

      Here is the real data I using.
      OSPF AS SCOPE link state database Type ID Adv Rtr Seq Age Opt Cksu +m Len Router 1.1.252.17 1.1.252.17 0x80002db7 2418 0x22 0xffa2 84 bits 0x0, link count 5 id 172.16.22.0, data 255.255.255.0, Type Stub (3) TOS count 0, TOS 0 metric 1 id 1.1.254.4, data 255.255.255.255, Type Stub (3) TOS count 0, TOS 0 metric 0 id 1.1.254.1, data 1.1.254.42, Type PointToPoint (1) TOS count 0, TOS 0 metric 65 id 1.1.254.41, data 255.255.255.255, Type Stub (3) TOS count 0, TOS 0 metric 65 id 1.1.254.42, data 255.255.255.255, Type Stub (3) TOS count 0, TOS 0 metric 0 Aging timer 00:19:42 Installed 00:40:15 ago, expires in 00:19:42, sent 00:40:13 ago Last changed 1d 00:00:21 ago, Change count: 420 Router *1.1.254.1 1.1.254.1 0x80000723 2421 0x22 0x7d4 108 bits 0x0, link count 7 id 1.1.254.59, data 1.1.254.59, Type Transit (2) TOS count 0, TOS 0 metric 1 id 1.1.254.44, data 255.255.255.252, Type Stub (3) TOS count 0, TOS 0 metric 1 id 1.1.254.1, data 255.255.255.255, Type Stub (3) TOS count 0, TOS 0 metric 0 id 1.1.254.3, data 1.1.254.33, Type PointToPoint (1) TOS count 0, TOS 0 metric 32 id 1.1.254.32, data 255.255.255.252, Type Stub (3) TOS count 0, TOS 0 metric 32 id 1.1.252.17, data 1.1.254.41, Type PointToPoint (1) TOS count 0, TOS 0 metric 65 id 1.1.254.40, data 255.255.255.252, Type Stub (3) TOS count 0, TOS 0 metric 65 Gen timer 00:09:38 Aging timer 00:19:38 Installed 00:40:21 ago, expires in 00:19:39, sent 00:40:19 ago Last changed 1d 00:00:21 ago, Change count: 609, Ours Router 1.1.254.3 1.1.254.3 0x80002dea 692 0x22 0xe24d 72 bits 0x0, link count 4 id 172.16.16.0, data 255.255.255.0, Type Stub (3) TOS count 0, TOS 0 metric 1 id 1.1.254.3, data 255.255.255.255, Type Stub (3) TOS count 0, TOS 0 metric 0 id 1.1.254.1, data 1.1.254.34, Type PointToPoint (1) TOS count 0, TOS 0 metric 32 id 1.1.254.32, data 255.255.255.252, Type Stub (3) TOS count 0, TOS 0 metric 32 Aging timer 00:48:28 Installed 00:11:29 ago, expires in 00:48:28, sent 00:11:27 ago Last changed 22:41:31 ago, Change count: 28 Router 1.1.254.57 1.1.254.57 0x80001040 23 0x22 0xe8a7 72 bits 0x2, link count 4 id 1.1.254.59, data 1.1.254.57, Type Transit (2) TOS count 0, TOS 0 metric 10 id 1.1.252.192, data 255.255.255.192, Type Stub (3) TOS count 0, TOS 0 metric 10 id 10.2.20.0, data 255.255.255.0, Type Stub (3) TOS count 0, TOS 0 metric 10 id 10.2.18.0, data 255.255.255.0, Type Stub (3) TOS count 0, TOS 0 metric 10 Aging timer 00:59:36 Installed 00:00:22 ago, expires in 00:59:37, sent 00:00:20 ago Last changed 6w3d 08:01:42 ago, Change count: 3 Extern 0.0.0.0 1.1.254.57 0x8000101d 23 0x20 0xc0b2 + 36 mask 0.0.0.0 Type 2, TOS 0x0, metric 1, fwd addr 0.0.0.0, tag 0.0.126.75 Aging timer 00:59:36 Installed 00:00:22 ago, expires in 00:59:37, sent 00:00:20 ago Last changed 6w3d 08:02:28 ago, Change count: 1

        The return line is actually:

        return values %data;

        which returns just the values (not key => value pairs) of the hash. @records contains up with a number of array references, one for each record type.

        print @{$records[1]};

        prints the second of the three records. Note that the @{...} 'casts' the array reference returned by $records[1] into an array.

        You may find perlref and perllol helpful.


        Perl is environmentally friendly - it saves trees