Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need to get unique web page titles for my web sites AFTER doing my File::Find search. Below I get ALL titles, but I just need the unique names. So if I have a page titled "Index Home", I get hundreds of "Index Home" titles from my file find. I really only need one output or match of "Index Home", then move on to the next unique title, and so on. So the script below prints:
Title = Index Home
Title = Index Home
Title = Index Home
Title = Index Home
Title = Home Page
Title = Next Page
Title = Next Page
Title = Next Page
I really need the output to show only the distinct results, such as:
Title = Index Home
Title = Home Page
Title = Next Page
Here is my attempt. Can someone advise how I can do this???
use File::Find;

sub wanted {
    if( $_ =~ /\.html?$/) {
        my $name = $File::Find::name;
        open ( F, $name ) or die "$!: $name\n";
        while($line = <F>) {
            if($line =~ /<title>(.+)<\/title>/i) {
                print "Title = $1\n";
            }
        }
        close F;
    }
}

find( \&wanted, "/unixpath/webfiles" );

Re: Getting unique data from search.
by Wonko the sane (Deacon) on Nov 14, 2002 at 14:43 UTC
    Try declaring a %seen hash outside your sub, and change the print line to this:

        print "Title = $1\n" unless $seen{$1}++;

    That should give you what you are looking for.

    P.S. Doesn't look like you are using strict.

    Wonko

      This still isn't working. Any suggestions?
      use File::Find;

      sub wanted {
          if( $_ =~ /\.html?$/) {
              my $name = $File::Find::name;
              open ( F, $name ) or die "$!: $name\n";
              while($line = <F>) {
                  if($line =~ /<title>(.+)<\/title>/i) {
                      print "Title = $1\n";
                  }
              }
              close F;
          }
      }

      find( \&wanted, "/unixpath/webfiles" );

      print "title = $1\n" unless $seen{$1}++;
        You didn't follow his advice; he said "change the print line", not "add this at the bottom". The code should look like this:
        use strict;
        use File::Find;

        my %seen;

        sub wanted {
            if ( /\.html?$/ ) {
                open F, "< $File::Find::name"
                    or die "read $File::Find::name: $!\n";
                local $_;    # keep the read loop below from clobbering File::Find's $_
                while (<F>) {
                    if ( /<title>(.*?)<\/title>/si ) {
                        print "Title = $1\n" unless $seen{$1}++;
                    }
                }
                close F;
            }
        }

        find( \&wanted, "/unixpath/webfiles" );

        jdporter
        ...because it is hard to be handsome and white.

Re: Getting unique data from search.
by Callum (Chaplain) on Nov 14, 2002 at 14:42 UTC
    Store all the titles you get in a hash then suck them out with keys.
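
    A minimal sketch of that approach, reusing the path and title regex from the question; the hash name %titles and the final sort are just for illustration:

        use strict;
        use warnings;
        use File::Find;

        my %titles;

        sub wanted {
            return unless /\.html?$/;
            my $name = $File::Find::name;
            open( my $fh, '<', $name ) or die "$!: $name\n";
            while ( my $line = <$fh> ) {
                $titles{$1} = 1 if $line =~ /<title>(.+)<\/title>/i;
            }
            close $fh;
        }

        find( \&wanted, "/unixpath/webfiles" );

        # once the find is done, every distinct title is a key
        print "Title = $_\n" for sort keys %titles;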
Re: Getting unique data from search.
by Three (Pilgrim) on Nov 14, 2002 at 16:30 UTC
    Simply put all the titles in a hash as keys, then print out all the keys.
    This will eliminate all duplicates.
    use File::Find;

    my %titles;

    sub wanted {
        if( $_ =~ /\.html?$/) {
            my $name = $File::Find::name;
            open ( F, $name ) or die "$!: $name\n";
            while($line = <F>) {
                if($line =~ /<title>(.+)<\/title>/i) {
                    $titles{$1} = " ";
                }
            }
            close F;
        }
    }

    find( \&wanted, "/unixpath/webfiles" );

    foreach my $key (keys %titles) {
        print "Title = $key\n";
    }