Perl output and uniq

sampson has asked for the wisdom of the Perl Monks concerning the following question:

Following an example I found online, I am trying to extract IP addresses from a networkctl report.

#!/usr/bin/perl
use strict;
use warnings;
use Regexp::Common qw(net);
use String::Util qw(trim);

while (<>) {
    /$RE{net}{IPv4}{-keep}/                                and printf "%s\n", trim($1);
    /$RE{net}{IPv6}{-sep => ':'}{-style => 'HeX'}{-keep}/  and printf "%s\n", trim($1);
    /$RE{net}{MAC}{-keep}/                                 and printf "%s\n", trim($1);
}

by running

 
cat report.txt | ./ips.pl > ips.txt

I then used vi's dd to delete some lines and ended up with this file (real IP and MAC addresses were removed before posting here):

192.168.1.188
fdba:7b43:1916::d34
fe80::ec78:4ff:fec0:a17b
192.168.1.1
fe80::eade:27ff:feb6:fa8c
192.168.1.1
fdba:7b43:1916::1
fdba:7b43:1916::d34
fdba:7b43:1916::d34
fdba:7b43:1916::d34
fdba:7b43:1916::d34
fdba:7b43:1916::d34

When I filter further with GNU coreutils' uniq, I get different results from "uniq" vs. "uniq -u". I read online that this is mainly due to whitespace in the data file, but I already call trim() before printing in the Perl script. What might have gone wrong in the above script?

cat perlout.txt | sort | uniq
192.168.1.1
192.168.1.188
fdba:7b43:1916::1
fdba:7b43:1916::d34
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b

cat perlout.txt | sort | uniq -u
192.168.1.188
fdba:7b43:1916::1
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b

When I pre-filter one more time with uniq, both give the same result:
cat perlout.txt | sort | uniq | uniq
192.168.1.1
192.168.1.188
fdba:7b43:1916::1
fdba:7b43:1916::d34
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b

cat perlout.txt  | sort | uniq | uniq -u
192.168.1.1
192.168.1.188
fdba:7b43:1916::1
fdba:7b43:1916::d34
fe80::eade:27ff:feb6:fa8c
fe80::ec78:4ff:fec0:a17b


Re: Perl output and uniq by Fletch (Bishop) on May 01, 2020 at 19:30 UTC
Presuming your input data's of a reasonable size you probably could skip the postprocessing and instead sort and uniq'ify your data yourself.

#!/usr/bin/perl
use strict;
use warnings;
use Regexp::Common qw(net);
use String::Util qw(trim);

my %seen_addrs;

while (<>) {
    /$RE{net}{IPv4}{-keep}/                                and $seen_addrs{ trim($1) }++;
    /$RE{net}{IPv6}{-sep => ':'}{-style => 'HeX'}{-keep}/  and $seen_addrs{ trim($1) }++;
    /$RE{net}{MAC}{-keep}/                                 and $seen_addrs{ trim($1) }++;
}

for my $addr ( sort keys %seen_addrs ) {
    print $addr, qq{\n};
}

The cake is a lie.
Re^2: Perl output and uniq by sampson (Initiate) on May 02, 2020 at 12:06 UTC
Thanks for your code! I know all kinds of string processing can be done in Perl. It's just that after putting it down (at entry level) for over 30 years, I need to relearn all the constructs and syntax for common operations (file open, while loop, formatted print, special variables, not to mention omitted variables). And of course, all kinds of REs. Hopefully, I can get more done in Perl over time.
Re: Perl output and uniq by jo37 (Curate) on May 01, 2020 at 20:04 UTC
In general, the results from "`uniq`" and "`uniq -u`" are different. "`uniq | uniq`" and "`uniq | uniq -u`" will always give the same as "`uniq`". All I can see from your examples is that "`uniq`" and "`uniq -u`" give different results. No surprise. I just don't see a problem with your examples. To me everything looks like it should.

Greetings,
-jo

`$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`
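To make the difference concrete, here is a minimal sketch using a made-up four-line sample built from addresses in the original post. The `sample` function is a hypothetical helper for illustration only, not part of the poster's setup.

```shell
# Hypothetical sample: 192.168.1.1 appears twice, the other two lines once.
sample() {
    printf '%s\n' 192.168.1.1 192.168.1.188 192.168.1.1 fdba:7b43:1916::1
}

# uniq: one copy of every distinct line (three lines here)
sample | sort | uniq

# uniq -u: only lines that are NOT repeated
# prints 192.168.1.188 and fdba:7b43:1916::1
sample | sort | uniq -u

# uniq -d: only lines that ARE repeated, one copy each
# prints 192.168.1.1
sample | sort | uniq -d

# After a first uniq every line occurs exactly once,
# so a following uniq -u has nothing left to drop.
sample | sort | uniq | uniq -u
```

In the poster's data, 192.168.1.1 and fdba:7b43:1916::d34 occur more than once, which is why exactly those two lines are missing from the `uniq -u` output. Whitespace is not involved.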
Re^2: Perl output and uniq by sampson (Initiate) on May 02, 2020 at 12:07 UTC
I am really glad to know it is due to uniq -u vs. uniq. Thank you!
Re: Perl output and uniq by afoken (Chancellor) on May 02, 2020 at 20:49 UTC
`cat report.txt | ./ips.pl > ips.txt`

Did you know there is a Useless Use of Cat Award? (Just trolling.) But you really don't need cat here. Allow your computer to relax a bit and just use one process where one process is sufficient:

`./ips.pl < report.txt > ips.txt`

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
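The same idea applies to the `cat perlout.txt | sort | uniq` pipelines in the question. A small sketch, assuming a throwaway temp file as a stand-in for the poster's perlout.txt (the file and its three lines are made up for illustration):

```shell
# Hypothetical stand-in for perlout.txt, removed again when the shell exits.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
printf '%s\n' 192.168.1.1 fdba:7b43:1916::d34 192.168.1.1 > "$tmp"

# sort can open the file itself, so cat is not needed
sort "$tmp" | uniq

# sort -u sorts and drops duplicate lines in a single process
sort -u "$tmp"
```

For plain input like this, both commands print the same two lines. `uniq -u` has no direct `sort` equivalent, so `sort file | uniq -u` still needs both programs.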