natxo has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am using Email::MIME to parse incoming e-mail messages, and that works great. Now I want to save some fields in a SQLite database and validate some addresses. For that purpose I use Email::Valid. This code works fine as long as $from contains no non-ASCII characters:
my $validator = Email::Valid->new();
my $addr = $validator->address( $from );
print $addr, "\n";
But if $from contains something like this:

José Name <J.name@domain.tld>

because the From: header in the message is

From: =?iso-8859-1?Q?Jos=E9_Name?= <J.name@domain.tld>

then I get

Use of uninitialized value $addr in print ...

Any clue as to how to get around this problem? Thanks!

Re: unicode problem with Email::Valid
by McA (Priest) on Jul 07, 2014 at 13:41 UTC

    Hi,

    The problem is that the From address is part of an email header, and only ASCII is valid in a header. That is why only ASCII-encoded strings can pass validation, and why the name containing the non-ASCII character é is encoded as =?iso-8859-1?Q?Jos=E9_Name?=.

    If you pass this encoded byte string to the validation routine, everything is fine. So, IMHO, the solution is to encode the Unicode string representing the email address into a valid ASCII representation:

    #!/bin/perl
    use strict;
    use warnings;
    use 5.010;
    use Email::Valid;
    use Data::Dumper;
    use Encode qw(encode decode);

    my $utf8_from = decode('UTF-8', 'José <J.name@web.de>');
    my $from = encode('MIME-Header', $utf8_from);
    say "Mail: $from";

    my $validator = Email::Valid->new();
    if (my $addr = $validator->address( $from )) {
        say "OK: ", Dumper($addr);
    }
    else {
        say "Not valid";
    }

    Regards
    McA

      yes! thanks for the explanation, it makes perfect sense.
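
The round trip described in McA's reply can be sketched with the core Encode module alone, without Email::Valid. This is a minimal illustration, assuming the decoded form is what $from held in the original question. The helper is_ascii is a hypothetical name introduced here, not part of any module. The exact encoded-word that encode produces depends on the installed Encode version, so no specific output is claimed for it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# Print decoded (character) strings as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

# The raw header value from the question: pure ASCII on the wire.
my $raw = '=?iso-8859-1?Q?Jos=E9_Name?= <J.name@domain.tld>';

# Decoding the RFC 2047 encoded-word yields a Perl character string
# containing "é". Presumably this is the form that failed validation.
my $decoded = decode('MIME-Header', $raw);

# Re-encoding yields an ASCII-only header value again. The charset and
# encoding chosen here depend on the Encode version.
my $encoded = encode('MIME-Header', $decoded);

# Hypothetical helper: true if the string contains only ASCII characters.
sub is_ascii { return $_[0] !~ /[^\x00-\x7F]/ }

printf "decoded: %s (ASCII only: %s)\n",
    $decoded, is_ascii($decoded) ? 'yes' : 'no';
printf "encoded: %s (ASCII only: %s)\n",
    $encoded, is_ascii($encoded) ? 'yes' : 'no';
```

A check like is_ascii($from) before calling the validator is one way to spot a header value that has already been decoded and needs re-encoding first.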