in reply to Loop problem: jumps one record

Hi Anonymous,

The problem is that every time you write <>, a new record is read from the file. while (<>) { ... } will assign the current record to the $_ variable, so inside the loop you should just use that instead of <>. Note that when you write a regex without the =~ operator, it will implicitly be tested against the $_ variable.
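
To make the difference concrete, here is a minimal sketch. It reads from an in-memory filehandle ($fh, opened on a string) instead of <> so that it runs without an input file; the sample records and the sub names read_twice and read_once are made up for illustration.

```perl
use warnings;
use strict;

# Hypothetical sample input: one header line followed by three records.
my $input = "Relay access denied\n1 10.0.0.1\n1 10.0.0.2\n1 10.0.0.3\n";

# Buggy: the while condition reads one record into $_, then the body
# reads a second record, so the record held in $_ is never looked at.
sub read_twice {
    open my $fh, '<', \$input or die "open: $!";
    my @seen;
    while (<$fh>) {
        my $line = <$fh>;              # consumes another record
        last unless defined $line;
        chomp $line;
        push @seen, $line;
    }
    return @seen;
}

# Fixed: use the record that the while condition already put in $_.
sub read_once {
    open my $fh, '<', \$input or die "open: $!";
    my @seen;
    while (<$fh>) {
        chomp;
        push @seen, $_ if /^\d+\s+\S+/;   # no =~, so this tests $_
    }
    return @seen;
}

print "read_twice: ", join(' | ', read_twice()), "\n";
print "read_once:  ", join(' | ', read_once()),  "\n";
```

With this input, read_twice misses the middle record, while read_once sees all three.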

One way to do this is a state-machine approach. It has the advantages that you only read from <> in one place, that it has no nested loops, and that it's very extensible.

    use warnings;
    use strict;

    use constant {
        IDLE     => 0,
        IP_ADDRS => 1,
    };

    my $state = IDLE;
    while (<>) {
        if ($state == IDLE) {
            if (/Relay.access.denied/) { $state = IP_ADDRS }
        }
        elsif ($state == IP_ADDRS) {
            if (/(\d+)\s+(\S+)/) { print "$2\n" }
            else                 { $state = IDLE }
        }
    }

Of course, the nested-loops approach is much shorter in this case. I personally wouldn't use it, though, because as your code gets longer it will become more complicated and much harder to maintain than the approach above.

    use warnings;
    use strict;

    while (<>) {
        if (/Relay.access.denied/) {
            while (<>) {
                /(\d+)\s+(\S+)/ or last;
                print "$2\n";
            }
        }
    }

Update: For completeness, note that it's also possible to invert the structure of the conditions in the state machine approach, which allows for a bit more error checking. Or, if you don't care about those conditions, they can be removed from the following code to make it shorter (about as long as the above state machine code). You can choose whether you want to use the above approach or the following one based on which one makes your logic easier to express. In this case it's roughly equivalent, but if your code gets longer that might change.

    use warnings;
    use strict;

    use constant {
        IDLE     => 0,
        IP_ADDRS => 1,
    };

    my $state = IDLE;
    while (<>) {
        chomp;
        if (/Relay.access.denied/) {
            if    ($state == IDLE)     { $state = IP_ADDRS }
            elsif ($state == IP_ADDRS) { warn "unexpected: $_" }
        }
        elsif (/(\d+)\s+(\S+)/) {
            if    ($state == IDLE)     { warn "unexpected: $_" }
            elsif ($state == IP_ADDRS) { print "$2\n" }
        }
        else {
            if    ($state == IDLE)     { }
            elsif ($state == IP_ADDRS) { $state = IDLE }
        }
    }

Hope this helps,
-- Hauke D

Re^2: Loop problem: jumps one record (updated)
by math&ing001 (Novice) on Jan 31, 2017 at 11:59 UTC
    Thank you for your input! I just started learning Perl yesterday, so the "state machine type approach" is still new to me. I just want to add that I need to store the $2 in a variable, because I then add it to a database. I also ran into another problem: the loop skips the second round of entries, as there are a bunch of "/Relay.access.denied/" in the file. I'm guessing that is because the "<>" already goes past it at the end of the nested loop. So my question is: is there a way to tell <> to go back one line at the end of the loop?
      "is there a way to tell <> to go back one line at the end of the loop" is the wrong question, agent Spooner..

      Why would you want to force an iterator to go back?

      Another approach is better: some condition becomes true when "Relay access denied" is encountered and becomes false when "Sender address rejected" is reached (hoping this is your case).

      Perl offers you the funnily named flip-flop operator (see my recent post about it for links and an explanation).

      The following short program does nothing more than this: if we are between a START and a STOP line, print the IP if you find one.

          use strict;
          use warnings;

          while (<DATA>){
              if (/^Relay access denied/ .. /Sender address rejected/){
                  print "$2\n" if $_ =~ /(\d+)\s+(\S+)/;
              }
          }
          __DATA__
          Relay access denied (total: 2)
          1 111.111.111.111
          1 222.222.222.222
          1 333.333.333.333
          Sender address rejected: Access denied (total: 50)
          1 200.200.200.200
          Relay access denied (total: 1)
          1 255.255.255.255
          Sender address rejected: Access denied (total: 50)
          ##OUTPUT
          111.111.111.111
          222.222.222.222
          333.333.333.333
          255.255.255.255

      As you made your first post anonymously, I missed the opportunity to give you a warm welcome to the monastery and to the wonderful world of Perl, math&ing001!

      In addition, the special token __DATA__ (read through the DATA filehandle) is very useful for embedding some example data in your program: it is described in perldata, in the section Special Literals.

      Regexp::Common::net is a convenient module for matching IPs: as you can see above, the simple regex accepts 333.333.333.333 as a valid IP. As a side note, do not put real IP data on the Net: always use fake addresses, as I did.
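
      A minimal sketch of such a check, assuming the CPAN module Regexp::Common is installed (it is not part of core Perl). The helper name is_ipv4 and the sample addresses are made up for illustration; 192.0.2.17 comes from a range reserved for documentation. The pattern is anchored with \A and \z so that the whole string has to be a dotted quad:

      ```perl
      use strict;
      use warnings;
      use Regexp::Common qw(net);   # provides $RE{net}{IPv4}

      # Hypothetical helper: true only if the whole string is an IPv4 address.
      sub is_ipv4 {
          my ($candidate) = @_;
          return $candidate =~ /\A$RE{net}{IPv4}\z/ ? 1 : 0;
      }

      for my $candidate ('192.0.2.17', '333.333.333.333') {
          printf "%-16s %s\n", $candidate,
              is_ipv4($candidate) ? 'valid' : 'invalid';
      }
      ```

      Without the anchors the pattern could match a shorter address hidden inside a longer string, so anchoring is probably what you want when validating a field.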

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

      Hi math&ing001,

      I also run into another problem : the loop skips the second round of entries as there are a bunch of "/Relay.access.denied/" in the file. ... So my question is, is there a way to tell <> to go back one line at the end of the loop.

      There isn't a good way to do that. However, in the code examples I showed it is of course possible to do multiple things on each line of input, which is how I would approach the solution. It's hard for me to fully understand the problems you described, though, so it would be helpful if you could provide a short sample input that illustrates the problem, along with the output you are expecting for that sample input. See also Short, Self-Contained, Correct Example and How do I post a question effectively?
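
      As a rough sketch of "doing multiple things on each line": the loop below checks every line first for a section header and then for a count/host record, so a header that ends one section can also start the next one, with no need to step back. The header patterns, the sub name collect_hosts, the %hosts_by_section hash and the sample lines are all assumptions made for illustration, not taken from your file. The hosts are collected in a variable so that they could be handed to your database code afterwards.

      ```perl
      use strict;
      use warnings;

      # Hypothetical sample input, loosely modeled on the format in this thread.
      my @lines = (
          "blocked using dummy1 (total: 2)\n",
          "   1 198.51.100.7\n",
          "   1 mail.example.com\n",
          "Relay access denied (total: 2)\n",
          "   1 192.0.2.10\n",
          "   1 192.0.2.11\n",
          "Sender address rejected: Access denied (total: 1)\n",
          "   1 203.0.113.5\n",
      );

      sub collect_hosts {
          my @input = @_;
          my %hosts_by_section;
          my $section;    # undef while we are in a section we don't care about
          for (@input) {
              if (/^(blocked using \S+|Relay access denied)/) {
                  $section = $1;      # wanted header: start (or switch) section
              }
              elsif (/^\s*\d+\s+(\S+)/) {
                  # count/host record: keep it only inside a wanted section
                  push @{ $hosts_by_section{$section} }, $1 if defined $section;
              }
              else {
                  undef $section;     # any other line ends the current section
              }
          }
          return \%hosts_by_section;
      }

      my $hosts = collect_hosts(@lines);
      for my $name (sort keys %$hosts) {
          print "$name: @{ $hosts->{$name} }\n";
      }
      ```

      Because the header test comes first on every line, two wanted sections that follow each other directly are both picked up.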

      Regards,
      -- Hauke D

        Hi Hauke, This is an example of the file format:
            blocked using dummy1 (total: 6)
            3 rzxwrvxk.com
            1 correio.biz
            1 facebook.com
            1 213.183.58.6
            blocked using dummy2 (total: 330)
            2 118.125.110.201
            1 61.2.46.20
            blocked using dummy3 (total: 5)
            2 hinet.net
            1 219.140.15.195
            1 219.139.16.134
            1 222.189.112.23
            blocked using dummy4 (total: 53)
            5 66.23.212.67
            5 66.23.212.70
            4 66.23.212.68
            4 66.23.212.69
            Relay access denied (total: 13)
            1 46.183.217.174
            1 46.183.220.137
            1 46.183.220.138
        Code:
            use strict;
            use warnings;

            my $ip = "";
            my $io;
            while (<>) {
                if (/blocked.using/) {
                    do {
                        $io = <>;
                        $ip = $2 if $io =~ /(\d+)\s+(\S+)/;
                        print $ip;
                        #adding to database;
                    } until ($io !~ /(\d+)\s+(\S+)/);
                }
                if (/Relay.access.denied/) {
                    do {
                        $io = <>;
                        $ip = $2 if $io =~ /(\d+)\s+(\S+)/;
                        print $ip;
                        # adding to database;
                    } until ($io !~ /(\d+)\s+(\S+)/);
                }
            }
        Output I get :
            rzxwrvxk.com
            correio.biz
            facebook.com
            213.183.58.6
            213.183.58.6
            hinet.net
            219.140.15.195
            219.139.16.134
            222.189.112.23
            222.189.112.23
            46.183.217.174
            46.183.220.137
            46.183.220.138