Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a loop that goes trough a list of records but it jumps every two records

This the file:
Relay access denied (total: 13) 1 46.183.217.174 1 46.183.220.137 1 46.183.220.138 1 46.183.220.139 1 46.183.223.239 1 218.38.243.204 1 185.29.8.198 1 185.29.9.133 1 185.29.9.135 1 46.183.217.162 1 46.183.217.165 1 46.183.217.169 1 91.236.75.169 Sender address rejected: Access denied (total: 50)
And this is my code
use strict; use warnings; my $ip=""; while (<>) { if (/Relay.access.denied/) { do { print "$2\n" if <> =~ /(\d+)\s+(\S+)/; $ip = $2; } until (<> !~ /(\d+)\s+(\S+)/); } }
The output I get is :
46.183.217.174 46.183.220.138 46.183.223.239 185.29.8.198 185.29.9.135 46.183.217.165 91.236.75.169

Replies are listed 'Best First'.
Re: Loop problem: jumps one record
by hippo (Archbishop) on Jan 31, 2017 at 10:48 UTC

    Each of the 2 lines where you test against the regex is reading another record. Don't do that. Instead, you could store the record in a variable and then do the two tests on that one variable. eg:

    use strict; use warnings; my $ip = ""; while (<>) { if (/Relay.access.denied/) { my $rec; do { $rec = <>; print "$2\n" if $rec =~ /(\d+)\s+(\S+)/; $ip = $2; } until ($rec !~ /(\d+)\s+(\S+)/); } }

    However, there are so many other ways to approach this too. It would be better to avoid the regex duplication in the first place, for example.

    Addendum: As you appear to be trying to parse the output of pflogsumm, perhaps this is an XY problem?

      Yes indeed I'm trying to read from pflogsumm's output. The purpose is to read addresses from it and add them to a local database.

        In that case there is a limited number of approaches which you could take to this. I can immediately think of these three.

        1. Use your current strategy to parse the pflogsumm output. This means that you can leverage the options to pflogsumm to help you structure the data in the most useful way to you. Be aware that it is prone to breakage if the output format ever changes.
        2. Modify (a copy of) pflogsumm so that it either outputs precisely what you want or even performs the whole task for you including the database insertion. pflogsumm is written in Perl which makes this option less daunting that it would otherwise be.
        3. Ignore pflogsumm and read the mail log directly. There is Mail::Log::Parse::Postfix which might help with this (I've not used it so YMMV).

        Good luck with your project.

Re: Loop problem: jumps one record (updated)
by haukex (Archbishop) on Jan 31, 2017 at 11:03 UTC

    Hi Anonymous,

    The problem is that every time you write <>, a new record is read from the file. while (<>) { ... } will assign the current record to the $_ variable, so inside the loop you should just use that instead of <>. Note that when you write a regex without the =~ operator, it will implicitly be tested against the $_ variable.

    One way to do this is a state machine type approach, this has the advantages that you only read from <> once, it has no nested loops, and it's very extensible.

    use warnings; use strict; use constant { IDLE => 0, IP_ADDRS => 1, }; my $state = IDLE; while (<>) { if ($state == IDLE) { if (/Relay.access.denied/) { $state = IP_ADDRS } } elsif ($state == IP_ADDRS) { if (/(\d+)\s+(\S+)/) { print "$2\n" } else { $state = IDLE } } }

    Of course, the nested loops approach is much shorter in this case, I just personally wouldn't do this because if your code gets longer, it will get more complicated and much harder to maintain than the above approach.

    use warnings; use strict; while (<>) { if (/Relay.access.denied/) { while (<>) { /(\d+)\s+(\S+)/ or last; print "$2\n"; } } }

    Update: For completeness, note that it's also possible to invert the structure of the conditions in the state machine approach, which allows for a bit more error checking. Or, if you don't care about those conditions, they can be removed from the following code to make it shorter (about as long as the above state machine code). You can choose whether you want to use the above approach or the following one based on which one makes your logic easier to express. In this case it's roughly equivalent, but if your code gets longer that might change.

    use warnings; use strict; use constant { IDLE => 0, IP_ADDRS => 1, }; my $state = IDLE; while (<>) { chomp; if (/Relay.access.denied/) { if ($state == IDLE) { $state = IP_ADDRS } elsif ($state == IP_ADDRS) { warn "unexpected: $_" } } elsif (/(\d+)\s+(\S+)/) { if ($state == IDLE) { warn "unexpected: $_" } elsif ($state == IP_ADDRS) { print "$2\n" } } else { if ($state == IDLE) { } elsif ($state == IP_ADDRS) { $state = IDLE } } }

    Hope this helps,
    -- Hauke D

      Thank you for your input ! I just started learning Perl yesterday, so the "state machine type approach" is still new to me. I just want to add that I need to store the $2 in a variable because I then add it to a database. I also run into another problem : the loop skips the second round of entries as there are a bunch of "/Relay.access.denied/" in the file. Which I'm guessing is because the "<>" already goes through it at the end of the nested loop. So my question is, is there a way to tell <> to go back one line at the end of the loop.
        is there a way to tell <> to go back one line at the end of the loop is the wrong question agent Spooner..

        Why you want to force an iterator to go back?

        Another approach is better: something is true after Relay access denied is encountered and become false when Sender address rejected is reached (hoping thi is your case).

        Perl offer you a funny named flip-flop operator (see my recent post about it for links and explaination).

        The following short program it is nothing more that: if we are between a START and STOP sentence, print the IP if you find an IP.

        use strict; use warnings; while (<DATA>){ if (/^Relay access denied/ .. /Sender address rejected/){ print "$2\n" if $_ =~ /(\d+)\s+(\S+)/; } } __DATA__ Relay access denied (total: 2) 1 111.111.111.111 1 222.222.222.222 1 333.333.333.333 Sender address rejected: Access denied (total: 50) 1 200.200.200.200 Relay access denied (total: 1) 1 255.255.255.255 Sender address rejected: Access denied (total: 50) ##OUTPUT 111.111.111.111 222.222.222.222 333.333.333.333 255.255.255.255

        As you posted as anonymous your first post i missed the opportunity to express you a warm welcome to the monastery and to the wonderful world of Perl math&ing001 !

        In addition the special token DATA is very useful to embed some data example to your program: it is described in perldata in the section Special Literals

        Regexp::Common::net is a convenient module to match IPs: you see above the regex match 333.333.333.333 as valid IP. As side note do not put real IP data on the Net: use always fake one as iI did.

        L*

        There are no rules, there are no thumbs..
        Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

        Hi math&ing001,

        I also run into another problem : the loop skips the second round of entries as there are a bunch of "/Relay.access.denied/" in the file. ... So my question is, is there a way to tell <> to go back one line at the end of the loop.

        There isn't a good way to do that, however, in the code examples I showed, it is of course possible to do multiple things on each line of input, which is how I would approach the solution. However, it's hard for me to fully understand the problems you described, it would be helpful if you could provide more short sample input that illustrates the problem along with the output you are expecting for that sample input. See also Short, Self-Contained, Correct Example and How do I post a question effectively?

        Regards,
        -- Hauke D

Re: Loop problem: jumps one record
by Discipulus (Canon) on Jan 31, 2017 at 10:48 UTC
    I think you are overcomplicating a simple task: infact i suppose (at first glance) that your double use of the diamond <> operator is the cause of your output (PS infact every call of <> consume one line of the filehandle).

    Why do not simply (untested):

    while (<>){ if (/^Relay access denied/){ foreach (<>){ print "$2\n" if $_ =~ /(\d+)\s+(\S+)/; } } }

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Loop problem: jumps one record
by johngg (Canon) on Jan 31, 2017 at 12:51 UTC

    Use an array reference as a switch to point to whichever array is appropriate depending on the section title. Then there is no need to back up a line.

    use strict; use warnings; use feature qw{ say }; open my $inFH, q{<}, \ <<__EOD__ or die $!; Relay access denied (total: 5) 1 46.183.217.174 1 46.183.220.137 1 46.183.220.138 1 46.183.220.139 1 46.183.223.239 Sender address rejected: Access denied (total: 5) 1 47.183.223.239 1 219.38.243.204 1 183.29.8.198 1 183.29.9.133 1 183.29.9.135 Relay access denied (total: 8) 1 218.38.243.204 1 185.29.8.198 1 185.29.9.133 1 185.29.9.135 1 46.183.217.162 1 46.183.217.165 1 46.183.217.169 1 91.236.75.169 __EOD__ my @rad; my @sad; my $raSwitch; while ( <$inFH> ) { if ( m{\s+\d+\s+(\S+)} ) { push @{ $raSwitch }, $1; } elsif ( m{Relay access denied} ) { $raSwitch = \ @rad; } elsif ( m{Sender address rejected} ) { $raSwitch = \ @sad; } else { warn qq{Line not recognised: $_}; } } say q{Relay access denied:}; say qq{ $_} for @rad; say q{Sender address rejected:}; say qq{ $_} for @sad;

    The output.

    Relay access denied: 46.183.217.174 46.183.220.137 46.183.220.138 46.183.220.139 46.183.223.239 218.38.243.204 185.29.8.198 185.29.9.133 185.29.9.135 46.183.217.162 46.183.217.165 46.183.217.169 91.236.75.169 Sender address rejected: 47.183.223.239 219.38.243.204 183.29.8.198 183.29.9.133 183.29.9.135

    I hope this is helpful.

    Cheers,

    JohnGG

Re: Loop problem: jumps one record
by math&ing001 (Novice) on Jan 31, 2017 at 13:29 UTC
    Wow, I didn't expect such feedback from everyone.
    Thank you all for the warm welcome and for your constructive responses !
Re: Loop problem: jumps one record
by Anonymous Monk on Jan 31, 2017 at 10:54 UTC
    It was indeed the double use of <> Thank you both !
      You are welcome 1!