Re: Re: Re: Getting unique data from search.

You didn't follow his advice. He said "change the print statement", not "add this at the bottom". The code should look like this:

  use strict;
  use File::Find;

  my %seen;

  sub wanted
  {
   if ( /\.html$/)
   {
     open F, "< $File::Find::name"
        or die "read $File::Find::name: $!\n";
     local $_;
     while (<F>)
     {
        if ( /<title>(.*?)<\/title>/si )
        {
           print "Title = $1\n" unless $seen{$1}++;
        }
     }
     close F;
   }
  }

  find( \&wanted, "/unixpath/webfiles" );

jdporter
...porque es dificil estar guapo y blanco.


Re: Re: Re: Re: Getting unique data from search. by Anonymous Monk on Nov 14, 2002 at 19:41 UTC
Thanks, but I'm not sure what this does or how it works:

  print "Title = $1\n" unless $seen{$1}++;

Can someone explain it to me?

Re: Re: Re: Re: Re: Getting unique data from search. by jdporter (Paladin) on Nov 14, 2002 at 20:12 UTC
Sure. We have this little section:

  if ( /<title>(.*?)<\/title>/si )
  {
     print "Title = $1\n" unless $seen{$1}++;
  }

The first line uses a regex to find text enclosed by `<title>` tags. Because of the grouping parentheses, whatever they match gets magically assigned to the special `$1` variable. (Note, that's the numeral one, not a lower-case ell. If you had more paren groups, what they matched would be assigned to `$2`, `$3`, etc.)

The statement inside the `if` block is really just a fancy (compact) way of writing

  if ( ! $seen{$1} )
  {
     print "Title = $1\n";
  }
  $seen{$1}++;

The post-increment `++` hands back the old value of `$seen{$1}` (undefined, hence false, the first time a title turns up) and only then adds one, so the print runs once per distinct title. Is it clearer now, I hope?

Footnote: Why the `/s` and `/i` modifiers on the regex? Well, the `/i` is so we can match `<title>`, `<TITLE>`, or any other variant of case. The `/s` is so the dot in the pattern can match linebreak characters, in case someone cleverly wrote something like

  <title>This is a very
  Long Title</title>

jdporter
...porque es dificil estar guapo y blanco.
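
To see the `%seen` idiom and the two regex modifiers at work without touching the filesystem, here is a minimal sketch that runs the same match over a few in-memory strings. The helper name `unique_titles` and the sample pages are made up for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pull the <title> text out of
# each HTML string and return each distinct title once, in first-seen order.
sub unique_titles {
    my @pages = @_;
    my ( %seen, @unique );
    for my $html (@pages) {
        # /i ignores case in the tag name; /s lets "." cross a line break.
        next unless $html =~ /<title>(.*?)<\/title>/si;
        my $title = $1;
        # Post-increment returns the old count (undef, hence false, on the
        # first sighting) and then bumps it, so each title is kept once.
        push @unique, $title unless $seen{$title}++;
    }
    return @unique;
}

# Made-up sample input: one repeat, one upper-case tag, one two-line title.
my @pages = (
    "<title>Home</title>",
    "<TITLE>About</TITLE>",
    "<title>Home</title>",
    "<title>This is a very\nLong Title</title>",
);
print "Title = $_\n" for unique_titles(@pages);
```

The repeated "Home" page is reported only once. Note that `/s` can only help when the whole title is inside the string being matched: a loop that reads a file one line at a time, as `while (<F>)` does in the script above, hands the regex each half of a two-line title separately, so such a title would probably need the file read in one piece first.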