Re^2: Sort/Uniq Help

Thanks for the quick replies. Here is the code that I have at the moment. This is currently working and I get the desired results (except for the fact that I get duplicates). I have also updated the code with some of your recommendations.

#!/usr/bin/perl
use strict;
use warnings;

use File::Find;
my $DIRECTORY = "/Users/data/";
find(\&edits, $DIRECTORY);
sub edits()
{
 if ( -f and /.txt$/ ) {  #Find files ending in .txt and drill down all sub dirs
   my $TEXT_FILE = $_; #save the current file name from $_
   open MATCHING_FILE, $TEXT_FILE;
   my @all_lines = <MATCHING_FILE>; #Place everything into an array called all_lines
   close MATCHING_FILE;
   for my $each_line ( @all_lines ) {
     if ( $each_line =~ /[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}|password|(ssn=)/i ) { #Search for IP or password or ssn=
       #print $each_line, "Found in $File::Find::name\n";
       print $each_line; # Print each line that is found
       }
     }
   }
 }

Results from all files in a directory
192.168.1.1
192.168.1.1
192.168.1.1
64.22.34.66
221.245.23.44
PASSWORD=FpnmRjE

What I want to do is stop the duplicate 192.168.1.1 lines from being printed, along with any other duplicates that show up.

Re^3: Sort/Uniq Help by Corion (Patriarch) on Mar 17, 2008 at 18:13 UTC
This is a FAQ. See perlfaq4, near "unique". I also find the following entries when googling for perl unique: Recipe 4.6, "Extracting Unique Elements from a List" (in the Perl Cookbook, no link from here), "How can I extract just the unique elements of an array?", and some more. I wonder whether you've seen these links and which parts of the solutions presented there you had problems with.
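For reference, the approach those FAQ answers usually describe is a hash of values already seen. A minimal sketch, assuming the matching lines are collected into a list first; the `uniq_lines` helper and the sample data are made up for illustration and are not from the original post:

```perl
use strict;
use warnings;

# Hypothetical helper: return the input lines with duplicates dropped,
# keeping the first occurrence of each and preserving the original order.
sub uniq_lines {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Sample data modelled on the output shown in the question.
my @found = (
    "192.168.1.1\n", "192.168.1.1\n", "192.168.1.1\n",
    "64.22.34.66\n", "221.245.23.44\n", "PASSWORD=FpnmRjE\n",
);

# Each distinct line is printed once, in first-seen order.
print uniq_lines(@found);
```

The same idea should also work line by line inside `edits`, with a `%seen` hash declared outside the sub and `print $each_line unless $seen{$each_line}++;` in place of the plain `print`.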
Re^3: Sort/Uniq Help by halfcountplus (Hermit) on Mar 17, 2008 at 23:53 UTC
One quick note from glancing at this: use /\.txt$/ (not /.txt$/) to be exact. Although the latter will probably work in this context, consider the difference between \. and . in a regex.
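A small sketch of that difference, using invented file names (none of them come from the original post). The unescaped dot matches any character, so names that merely end in "txt" after some other character slip through:

```perl
use strict;
use warnings;

# Unescaped dot: any single character followed by "txt" at the end.
sub loose_match  { return $_[0] =~ /.txt$/  ? 1 : 0 }

# Escaped dot: a literal "." followed by "txt" at the end.
sub strict_match { return $_[0] =~ /\.txt$/ ? 1 : 0 }

# Hypothetical file names, chosen to show where the two patterns disagree.
my @names = ('notes.txt', 'report_txt', 'contxt', 'readme.txt.bak');

for my $name (@names) {
    printf "%-16s loose: %d  strict: %d\n",
        $name, loose_match($name), strict_match($name);
}
```

Only `report_txt` and `contxt` are treated differently by the two patterns, which is why the looser form tends to go unnoticed in a directory of ordinary file names.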
Re^3: Sort/Uniq Help by poolpi (Hermit) on Mar 18, 2008 at 09:48 UTC
You can simplify your long regexp with:

use Regexp::Common qw /net/;
/\A $RE{net}{IPv4} \| password \| (ssn=) \z/xmi;

hth,
PooLpi

'Ebry haffa hoe hab im tik a bush'. Jamaican proverb

Update: Oops, if it's a succession of alternatives: moritz++ ;) I also forgot the /m
Re^4: Sort/Uniq Help by moritz (Cardinal) on Mar 18, 2008 at 10:00 UTC
I know that TheDamian recommends character classes to escape chars in regexes (in PBP), but it's generally a bad idea because it will disable some optimizations (at least in older versions of perl, don't know about current ones). Also `\|` is shorter than `[\|]`, and thus less noise that your brain has to parse. But in the original post the `|` isn't escaped at all, so you're actually modifying the behaviour of the regex.
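To illustrate that last point, here is a rough sketch with a simplified IP pattern (plain `\d{1,3}` groups rather than the exact regex from either post). The sub names and sample lines are invented for the example:

```perl
use strict;
use warnings;

# Unescaped | separates alternatives: any one of the three parts may match.
sub matches_alternation {
    my ($line) = @_;
    return $line =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|password|ssn=/i ? 1 : 0;
}

# Escaped \| (or [|]) is a literal pipe character: the line must contain
# an IP immediately followed by "|password|ssn=" verbatim.
sub matches_literal_pipe {
    my ($line) = @_;
    return $line =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\|password\|ssn=/i ? 1 : 0;
}

# Hypothetical input lines.
for my $line ('192.168.1.1', 'PASSWORD=FpnmRjE', '10.0.0.1|password|ssn=') {
    printf "%-24s alternation: %d  literal pipe: %d\n",
        $line, matches_alternation($line), matches_literal_pipe($line);
}
```

With the pipes escaped, lines such as a bare IP address or `PASSWORD=FpnmRjE` no longer match, so the escaped form is not a drop-in replacement for the original search.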
Re^5: Sort/Uniq Help by poolpi (Hermit) on Mar 18, 2008 at 13:47 UTC
By curiosity:

This is perl, v5.8.8 built for x86_64-linux-gnu-thread-multi

#!/usr/bin/perl
use strict;
use warnings;
use Regexp::Common qw /net/;
use Benchmark qw( cmpthese );

my $line = q{127.0.0.1};

cmpthese -10, {
    RE      => '$line =~ /\A $RE{net}{IPv4} [\|] password [\|] (ssn=) \z/xmi',
    RE_O    => '$line =~ /\A $RE{net}{IPv4} [\|] password [\|] (ssn=) \z/xmio',
    ORIG    => '$line =~ /[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\\|password\\|(ssn=)/i',
    RE_CHAR => 'use charnames qw( :full); $line =~ /\A $RE{net}{IPv4} \N{LINE TABULATION} password \N{LINE TABULATION} (ssn=) \z/xmi'
};

               Rate RE_CHAR     RE   RE_O   ORIG
RE_CHAR     17366/s      --    -2%    -2%  -100%
RE          17704/s      2%     --    -0%  -100%
RE_O        17747/s      2%     0%     --  -100%
ORIG     12717477/s  73132% 71732% 71561%     --

PooLpi

'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
Re^6: Sort/Uniq Help by moritz (Cardinal) on Mar 18, 2008 at 14:37 UTC
Re^7: Sort/Uniq Help by poolpi (Hermit) on Mar 20, 2008 at 09:36 UTC