PerlMonks  

Re: regex question (new)

by kilinrax (Deacon)
on Nov 10, 2000 at 21:21 UTC ( [id://40975]=note )


in reply to regex question (new)

Go and read Death to Dot Star! >;->
The problem is that your regex, while not greedy, still matches as early as possible, so it ends up matching things like 'stuff</td><td>email@email.com'. If you replace the dots with negated character classes, so that they cannot match the angle brackets of the <td> tags, then it should work:
    #!/usr/bin/perl -w
    use strict;

    my $data = 'any number of td/tds><td>stuff</td><td>email@email.com</td><td>more stfff</td><td>next@next.co.uk</td><td\>r.h@a.com</td>';
    my @emails = ($data =~ /<td>([^>\@]+?\@[^<\@]+)<\/td>/g);
    print join "\n", @emails;
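To see why a non-greedy dot still overshoots, here is a small side-by-side sketch. The original regex is not quoted in this node, so the "lazy" pattern below is only a guess at its general shape (an assumption), and the shortened $data is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened, illustrative input (not the data from the question).
my $data = '<td>stuff</td><td>email@email.com</td>';

# Assumed stand-in for the original pattern: lazy dots.
# The match starts at the FIRST <td>, and .+? simply grows
# until an @ turns up, crossing the tag boundary on the way.
my ($lazy) = $data =~ /<td>(.+?\@.+?)<\/td>/;

# Negated character classes cannot step over a '<', so the attempt
# at the first <td> fails and the engine moves on to the second one.
my ($negated) = $data =~ /<td>([^<\@]+\@[^<\@]+)<\/td>/;

print "lazy:    $lazy\n";
print "negated: $negated\n";
```

The lazy version captures 'stuff</td><td>email@email.com', while the negated-class version captures just 'email@email.com'. Laziness only controls how far a quantifier stretches, not where the match begins.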
However, this is definitely a job for Email::Find:
    #!/usr/bin/perl -w
    use strict;
    use Email::Find;

    my $data = 'any number of td/tds><td>stuff</td><td>email@email.com</td><td>more stfff</td><td>next@next.co.uk</td><td\>r.h@a.com</td>';
    find_emails($data, sub {
        my($email, $orig_email) = @_;
        print $email->format."\n";
        return $orig_email;
    });
