in reply to Getting all emails

I don't know if this is just a copy-and-paste thing, but your examples are all .company.com and your regular expression wants the target to end in .company. If that's really the case, then just make the regex: /\.(?:company?|aa\.company|bb\.company)\.com$/

Your question isn't very clear, by the way. Can you explain what the inaccuracy in the count is? Are you getting some of the addresses, or none at all, or ...?

HTH

Re: Re: Getting all emails
by Anonymous Monk on Jun 19, 2003 at 17:55 UTC
    I rewrote it as suggested, but this regular expression doesn't give me any results.
    use File::Find;

    sub wanted {
        local *F;
        if( $_ =~ /\.html?$/) {
            my $name = $File::Find::name;
            open ( F, $name ) or die "$!: $name\n";
            while($line = <F>) {
                if($line =~ /\.(?:company?|aa\.company|bb\.company)\.com$/i) {
                    print "FILE = $_ email = $1\n";
                }
            }
            close F;
        }
    }

    find( \&wanted, "/dirpath/here" );
    If I put in a regular expression like this: if($line =~ /\@company\.com/) it works, but I really need to search for hits on any of the three listed above.

      The regex I showed before, because of the $ at the end, will only match if the string occurs at the end of the line, just before the newline. So if you want to match things "in the middle of a line", just take the dollar sign out. Also, in your original question the regex started with a period, not an @ sign, and it looks like the @ sign is what you're after, so you should make that correction as well. /\@(?:company?|aa\.company|bb\.company)\.com/ should do the trick. I'm still not sure why you have "company?". Is that just a typo? It will match compan or company, and it doesn't sound like you want the former. That's why I said you need to state your question more clearly.
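      To make the effect of the trailing $ concrete, here is a minimal sketch. The sample line is made up for illustration, and I've written "company" rather than "company?" on the assumption that the question mark was a typo.

```perl
use strict;
use warnings;

# Hypothetical sample line, standing in for one line read from an HTML file.
my $line = qq{<a href="mailto:anyname\@aa.company.com">mail me</a>\n};

# Same alternation with and without the trailing $ anchor.
my $anchored   = qr/\@(?:company|aa\.company|bb\.company)\.com$/;
my $unanchored = qr/\@(?:company|aa\.company|bb\.company)\.com/;

# The address sits mid-line, so the $-anchored pattern cannot match it.
my $anchored_hit   = $line =~ $anchored   ? 1 : 0;
my $unanchored_hit = $line =~ $unanchored ? 1 : 0;

# An address that ends the line does match the anchored pattern,
# because $ also matches just before a trailing newline.
my $end_hit = "anyname\@company.com\n" =~ $anchored ? 1 : 0;

print "anchored: $anchored_hit, unanchored: $unanchored_hit, at end: $end_hit\n";
```

      With this sample the anchored pattern misses the address inside the tag, while the unanchored one finds it.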

      However, if everything I've mentioned so far is correct (you want to find strings in the middle of a line, starting with an @ sign, possibly followed by aa. or bb., then followed by company.com), then here's a simpler way to just state that: /\@(?:aa\.|bb\.)*company\.com/
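      If you want to check that pattern by hand, something like the following sketch may help. The addresses are made-up samples, and the capture group is my addition so the matched domain can be printed; neither comes from your script.

```perl
use strict;
use warnings;

# The pattern from above, wrapped in a capture group (added for
# illustration) so the matched domain ends up in $1.
my $re = qr/\@((?:aa\.|bb\.)*company\.com)/;

# Hypothetical sample addresses.
my @samples = qw(
    anyname@company.com
    anyname@aa.company.com
    anyname@bb.company.com
    anyname@xx.company.com
    anyname@aa.bb.company.com
);

# Record the captured domain for each address, or undef when there is no match.
my %domain_for;
for my $addr (@samples) {
    $domain_for{$addr} = $addr =~ $re ? $1 : undef;
    my $shown = defined $domain_for{$addr} ? $domain_for{$addr} : 'no match';
    printf "%-28s %s\n", $addr, $shown;
}
```

      Note that the * quantifier also accepts stacked prefixes such as aa.bb.company.com. Putting ? in its place would allow at most one aa. or bb. prefix.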

        Thanks for all your help. It now fetches everything as needed. However, I noticed it doesn't fetch just "aa.company.com" and "bb.company.com". It fetches anything that is in front of "company.com". So if there is an email with "myname@xx.company.com", the regular expression will fetch the "xx.company.com". Is there any way to correct it so it fetches only the three I mentioned earlier?
        anyname@aa.company.com anyname@bb.company.com anyname@company.com
        Here is the script that you fixed for me:
        use File::Find;

        sub wanted {
            local *F;
            if( $_ =~ /\.html?$/) {
                my $name = $File::Find::name;
                open ( F, $name ) or die "$!: $name\n";
                while($line = <F>) {
                    if($line =~ /\@(?:aa\.|bb\.)*company\.com/) {
                        print "FILE = $_\n";
                    }
                }
                close F;
            }
        }

        find( \&wanted, "/directory/path" );
        If this is all that can be done then I do thank you for helping me get this far. Thanks again!