learningperl01 has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone. I'm hoping someone can shed some light on the following problem. I have the script below, and I would like it to display only unique results when the regex matches. I am not sure what I am doing wrong; I am pretty new to deduplicating results and have tried several approaches, all without success. Thanks in advance for the help.
use File::Find;

$DIRECTORY = "/Users/data";
find(\&edits, $DIRECTORY);

sub edits() {
    if ( -f and /.txt$/ ) {
        $TEXT_FILE = $_;
        open MATCHING_FILE, $TEXT_FILE;
        @all_lines = <MATCHING_FILE>;
        close MATCHING_FILE;
        for $each_line ( @all_lines ) {
            if ( $each_line =~ /[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}|password|(ssn=)/i ) {
                @results = $each_line;
                %hashTemp = map { @results => 1 } @results;
                %array_out = sort keys %hashTemp;
                print @array_out;
            }
        }
    }
}

Replies are listed 'Best First'.
Re: Sort/Uniq Help
by moritz (Cardinal) on Mar 17, 2008 at 16:59 UTC
    use strict;
    use warnings;
    ...
    open MATCHING_FILE, $TEXT_FILE;
    my %seen;
    while (my $file = <MATCHING_FILE>) {
        chomp $file;
        if ( $file =~ /[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}|password|(ssn=)/i ) {
            $seen{$file}++;
        }
    }
    print "$_\n" for keys %seen;
    Your code won't work because you're assigning to %hashTemp afresh on each iteration, which discards all previous entries.

    Update: fixed copy&paste error, Roy Johnson++
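
    A small self-contained demonstration of that difference (the sample lines are hypothetical, not from the OP's files):

    ```perl
    use strict;
    use warnings;

    my @lines = ("192.168.1.1\n", "10.0.0.1\n", "192.168.1.1\n");

    # Reassigning the hash on every iteration keeps only the last line seen:
    my %hashTemp;
    for my $line (@lines) {
        %hashTemp = map { $_ => 1 } ($line);    # clobbers previous keys
    }
    print scalar(keys %hashTemp), "\n";         # 1

    # Incrementing a counter accumulates across iterations instead:
    my %seen;
    for my $line (@lines) {
        $seen{$line}++;
    }
    print scalar(keys %seen), "\n";             # 2
    ```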

Re: Sort/Uniq Help
by kyle (Abbot) on Mar 17, 2008 at 17:04 UTC

    I can't tell what you're trying to do here, so it's hard to say what to suggest. However, a couple of things stand out.

    @results = $each_line;
    %hashTemp = map { @results => 1 } @results;

    The first line there is trying to put a scalar into an array. You'll have an array with one element, which may not be what you want. If you want to add to the array, look at push and unshift.

    The second line seems to be trying to get the unique elements from @results, but the map block is wrong for that.

    This may be what you're shooting for:

    push @results, $each_line;
    %hashTemp = map { $_ => 1 } @results;

    However, I'm guessing that everything after the push should be outside the for loop, and maybe outside sub edits. That depends on what you're ultimately trying to accomplish.

    Also, I think sub edits() could be sub edits. The former creates a sub with a prototype, and you probably don't want that.

    Finally, I think it would be a really good idea to use strict and warnings.
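
    A sketch combining those points, reusing the thread's variable names, with the hash built once after the loop rather than inside it (the sample lines are hypothetical):

    ```perl
    use strict;
    use warnings;

    my @all_lines = ("password=abc\n", "192.168.1.1\n", "password=abc\n", "nothing\n");
    my @results;

    for my $each_line (@all_lines) {
        if ( $each_line =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|password|ssn=/i ) {
            push @results, $each_line;          # collect every match
        }
    }

    # Build the hash once, outside the loop, then print the unique matches:
    my %hashTemp = map { $_ => 1 } @results;
    my @array_out = sort keys %hashTemp;
    print @array_out;
    ```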

Re: Sort/Uniq Help
by Roy Johnson (Monsignor) on Mar 17, 2008 at 17:08 UTC
    There are a lot of weird things in that code, which makes it hard to figure out how to fix it. If you'd put comments in to indicate what you expect each line or tightly-related group of lines to do, it would be easier. It might also cause you to see some things that don't make sense.

    I think, at least, that you want @results = $each_line to be push @results, $each_line. And %array_out should be @array_out. All the hash processing most likely goes after the for loop.

    use strict; use warnings;
    would be your friends here, as well.

    Caution: Contents may have been coded under pressure.
      Thanks for the quick replies. Here is the code that I have at the moment. This is currently working and I get the desired results (except for the fact that I get duplicates). I have also updated the code with some of your recommendations.
      #!/usr/bin/perl
      use strict;
      use warnings;
      use File::Find;

      my $DIRECTORY = "/Users/data/";
      find(\&edits, $DIRECTORY);

      sub edits() {
          if ( -f and /.txt$/ ) {    # Find files ending in .txt and drill down all sub dirs
              my $TEXT_FILE = $_;    # save the results to $_;
              open MATCHING_FILE, $TEXT_FILE;
              my @all_lines = <MATCHING_FILE>;    # Place everything into an array called all_lines
              close MATCHING_FILE;
              for my $each_line ( @all_lines ) {
                  if ( $each_line =~ /[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}|password|(ssn=)/i ) {
                      # Search for IP or password or ssn=
                      #print $each_line, "Found in $File::Find::name\n";
                      print $each_line;    # Print each line that is found
                  }
              }
          }
      }
      Results from all files in a directory:

      192.168.1.1
      192.168.1.1
      192.168.1.1
      64.22.34.66
      221.245.23.44
      PASSWORD=FpnmRjE
      What I want to do is stop the duplicate 192.168.1.1 lines, along with all the other duplicates that show up, from being printed.
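
      One way to get there, as a sketch: keep the same regex, but collect matches in a %seen hash inside the wanted sub and print once after find() returns. The -d guard is added here only so the sketch runs even where /Users/data/ does not exist:

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;
      use File::Find;

      my $DIRECTORY = "/Users/data/";
      my %seen;

      find( \&edits, $DIRECTORY ) if -d $DIRECTORY;

      # Every distinct match prints exactly once, after the whole tree is scanned:
      print for sort keys %seen;

      sub edits {
          return unless -f and /\.txt$/;
          open my $fh, '<', $_ or return;
          while ( my $line = <$fh> ) {
              $seen{$line}++
                  if $line =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|password|ssn=/i;
          }
          close $fh;
      }
      ```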
        one quick note glancing at this:

      • use /\.txt$/ (not /.txt$/) to be exact... although the latter will probably work in this context, consider the difference between \. (a literal dot) and . (any character)
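
      The difference in one quick test (the filenames are hypothetical):

      ```perl
      use strict;
      use warnings;

      # Unescaped '.' matches any character, so a name merely ending in "txt" matches:
      print "notatxt"  =~ /.txt$/  ? "match\n" : "no match\n";    # match
      print "notatxt"  =~ /\.txt$/ ? "match\n" : "no match\n";    # no match
      print "file.txt" =~ /\.txt$/ ? "match\n" : "no match\n";    # match
      ```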

        You can simplify your long regexp with:

        use Regexp::Common qw /net/;
        /\A $RE{net}{IPv4} | password | (ssn=) \z/xmi;

        hth,

        PooLpi

        'Ebry haffa hoe hab im tik a bush'. Jamaican proverb

        Update: Oops, if it's a succession of alternatives: moritz++ ;)
        I also forgot the /m
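
        Taking the update into account, a possible corrected form: the alternatives are left unanchored since the OP matches anywhere in the line. The sample line is hypothetical, and this assumes Regexp::Common is installed:

        ```perl
        use strict;
        use warnings;
        use Regexp::Common qw(net);

        my $line = "host=192.168.1.1\n";

        # $RE{net}{IPv4} replaces the hand-rolled octet pattern; /x allows the spacing.
        if ( $line =~ / $RE{net}{IPv4} | password | ssn= /xi ) {
            print "matched\n";
        }
        ```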