sampson has asked for the wisdom of the Perl Monks concerning the following question:

Following an example I found online, I am trying to extract IP addresses from a networkctl report.
#!/usr/bin/perl
use strict;
use warnings;
use Regexp::Common qw(net);
use String::Util qw(trim);

while (<>) {
    /$RE{net}{IPv4}{-keep}/ and printf "%s\n", trim($1);
    /$RE{net}{IPv6}{-sep => ':'}{-style => 'HeX'}{-keep}/ and printf "%s\n", trim($1);
    /$RE{net}{MAC}{-keep}/ and printf "%s\n", trim($1);
}
by running
cat report.txt | ./ips.pl > ips.txt
Then I used vi's dd to remove some lines and got this file (real IPs and MAC addresses were removed before posting here):
192.168.1.188
fdba:7b43:1916::d34
fe80::ec78:4ff:fec0:a17b
192.168.1.1
fe80::eade:27ff:feb6:fa8c
192.168.1.1
fdba:7b43:1916::1
fdba:7b43:1916::d34
fdba:7b43:1916::d34
fdba:7b43:1916::d34
fdba:7b43:1916::d34
fdba:7b43:1916::d34
When I filter further with GNU coreutils' uniq, I get different results from "uniq" vs. "uniq -u". I read online that this is mainly due to spaces in the data file, but I already trim() each match before printing it in the Perl script. What might have gone wrong in the above script?
cat perlout.txt | sort | uniq
192.168.1.1
192.168.1.188
fdba:7b43:1916::1
fdba:7b43:1916::d34
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b

cat perlout.txt | sort | uniq -u
192.168.1.188
fdba:7b43:1916::1
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b

When I pre-filter one more time with uniq, both give the same result:

cat perlout.txt | sort | uniq | uniq
192.168.1.1
192.168.1.188
fdba:7b43:1916::1
fdba:7b43:1916::d34
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b

cat perlout.txt | sort | uniq | uniq -u
192.168.1.1
192.168.1.188
fdba:7b43:1916::1
fdba:7b43:1916::d34
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b

Replies are listed 'Best First'.
Re: Perl output and uniq
by Fletch (Bishop) on May 01, 2020 at 19:30 UTC

    Presuming your input data's of a reasonable size you probably could skip the postprocessing and instead sort and uniq'ify your data yourself.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Regexp::Common qw(net);
    use String::Util qw(trim);

    my %seen_addrs;

    while (<>) {
        /$RE{net}{IPv4}{-keep}/ and $seen_addrs{ trim($1) }++;
        /$RE{net}{IPv6}{-sep => ':'}{-style => 'HeX'}{-keep}/ and $seen_addrs{ trim($1) }++;
        /$RE{net}{MAC}{-keep}/ and $seen_addrs{ trim($1) }++;
    }

    for my $addr ( sort keys %seen_addrs ) {
        print $addr, qq{\n};
    }

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

      Thanks for your code! I know all kinds of string processing can be done in Perl. It's just that, after putting it down (at entry level) for over 30 years, I need to relearn all the constructs and syntax for common operations (file open, while loop, formatted print, special variables, not to mention omitted variables) and, of course, all kinds of REs. Hopefully, I can get more done in Perl over time.
Re: Perl output and uniq
by jo37 (Curate) on May 01, 2020 at 20:04 UTC

    In general, the results from "uniq" and "uniq -u" are different. "uniq | uniq" and "uniq | uniq -u" will always give the same as "uniq".

    All I can see from your examples is that "uniq" and "uniq -u" give different results. No surprise.

    I just don't see a problem with your examples. To me everything looks like it should.
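
The difference between the two commands can be reproduced with a small sketch like the one below. It reuses the sanitized addresses from the question; the variable names are made up for illustration. Plain `uniq` keeps one copy of every line, `uniq -u` keeps only the lines that occur exactly once, and `uniq -d` keeps one copy of only the lines that are repeated.

```shell
#!/bin/sh
# Sketch only: the addresses are the sanitized ones from the question.
LC_ALL=C; export LC_ALL   # byte-wise collation, so the sort order is reproducible

# uniq only compares adjacent lines, so the input is sorted first.
sorted=$(printf '%s\n' \
    192.168.1.188 fdba:7b43:1916::d34 fe80::ec78:4ff:fec0:a17b \
    192.168.1.1 fe80::eade:27ff:feb6:fa8c 192.168.1.1 \
    fdba:7b43:1916::1 fdba:7b43:1916::d34 | sort)

one_of_each=$(printf '%s\n' "$sorted" | uniq)       # every distinct line, once
never_repeated=$(printf '%s\n' "$sorted" | uniq -u) # lines that occur exactly once
repeated_only=$(printf '%s\n' "$sorted" | uniq -d)  # lines that occur more than once

echo "uniq:";    echo "$one_of_each"
echo "uniq -u:"; echo "$never_repeated"
echo "uniq -d:"; echo "$repeated_only"
```

After a first pass through `uniq` every line occurs exactly once, which is why `uniq | uniq -u` prints the same lines as `uniq` alone.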

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
      I am really glad to know it is due to "uniq -u" vs. "uniq". Thank you!
Re: Perl output and uniq
by afoken (Chancellor) on May 02, 2020 at 20:49 UTC
    cat report.txt | ./ips.pl > ips.txt

    Did you know there is a Useless Use of Cat Award?

    (Just trolling.)

    But you really don't need cat here. Allow your computer to relax a bit and just use one process where one process is sufficient:

    ./ips.pl < report.txt > ips.txt
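
The same idea applies to the later `cat perlout.txt | sort | uniq` pipeline. As a minimal sketch, assuming a throwaway temporary file in place of the real data, `sort -u` reads the file directly and drops duplicates in a single process:

```shell
#!/bin/sh
# Sketch only: a temporary file stands in for perlout.txt.
LC_ALL=C; export LC_ALL   # byte-wise collation, so both pipelines agree on ordering

tmp=$(mktemp)
printf '%s\n' 192.168.1.1 fdba:7b43:1916::d34 192.168.1.1 > "$tmp"

three_processes=$(cat "$tmp" | sort | uniq)  # the pipeline from the question
one_process=$(sort -u "$tmp")                # sort reads the file and de-duplicates itself

rm -f "$tmp"

echo "$one_process"
```

Note that `sort -u` only replaces plain `uniq`; for `uniq -u` or `uniq -d` the separate `uniq` step is still needed.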

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)