higginss20 has asked for the wisdom of the Perl Monks concerning the following question:

I'm hoping someone here can help, or at least point me in the right direction. The company I work for has a Barracuda Email Security Appliance that uses Perl regular expressions for custom content filtering. We regularly receive phishing emails that use our CEO's name as the display name but, obviously, not his actual email address. I'm trying to figure out a way to use a RegEx to block any email that has his display name but not his email address in the header info. The line I'm looking at in the email header is the From: displayname <email@domain> line. I've tried something similar to the following and it doesn't work (this is fictitious info, obviously):

From: John Smith <[^j][^o][^h][^n]\.[^s][^m][^i][^t][^h][^@][^g][^m][^a][^i][^l]\.[^c][^o][^m]

I know this likely isn't the best way to do it, since I know next to nothing about RegEx and even less about Perl. As it stands, everything sent from any email address with his display name is still delivered. Yes, I tried contacting Barracuda, but they no longer offer RegEx support.

Replies are listed 'Best First'.
Re: Regex Expression to filter email for Barracuda Email Appliance
by roboticus (Chancellor) on Feb 12, 2021 at 18:27 UTC

    higginss20:

    They may not support regular expressions simply because there are better ways to handle the problem. I'd suggest contacting Barracuda not with a RegEx question, but a "what's the best way to avoid this situation" question, as they're surely familiar with the problem and have well-tested methods for handling this problem.

    If you had to use regular expressions, then recognizing his display name and rejecting all the spammy variations is tougher than it might appear, especially if the appliance doesn't use full Perl regular expressions. Not only do you have to deal with minor formatting variations, but there may also be difficulties with Unicode look-alike characters that could fool you, and other assorted nonsense. I'm nowhere near a regex guru, so perhaps another monk might chime in with some good suggestions--there are several monks here who always impress me with their regex-based solutions to some problems.

    If I had to solve the problem, then some things I might try would be (assuming you can do it with the device in question):

    • Whitelist "John Smith <john.smith@gmail.com>" so you always accept the proper address, then try to recognize a display name of "John Smith" to add a warning to any incoming EMail (as there are certainly other John Smiths in the world that may want to communicate with your company).
    • Sidestep the problem by simply removing any display name so employees will only see the real EMail address. Maybe also add a warning indicator to your EMails on any non-internal domains (as I'm guessing you used gmail.com just as a stand-in).

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      I think this is the right answer. Pay for the service or software that experts in this problem domain provide for solving the problem. The OP's heuristics will eventually fall short of being adequate, or will generate false positives in some esoteric cases that don't matter until something important gets rejected. There are people for whom email safety is a core competency. For the rest of us, there are better problems to spend time solving.

      On the other hand, if a regex solution really is needed, I'd still break it into two; one that triggers if the person's name is detected, and another that validates there is a proper email address, or that flags if there is not.
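      A minimal sketch of that two-check idea, assuming plain Perl rather than the appliance's rule editor. The name "John Doe", the address john@doe.com, and the helper classify_from are placeholders made up for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical values for illustration; substitute the real name and address.
my $name_re    = qr/^From:\s+"?John\s+Doe\b/i;   # check 1: display name present
my $address_re = qr/<john\@doe\.com>\s*$/i;      # check 2: the one valid address

# Returns 'unrelated', 'ok', or 'suspect' for a single From: header line.
sub classify_from {
    my ($header) = @_;
    return 'unrelated' unless $header =~ $name_re;
    return $header =~ $address_re ? 'ok' : 'suspect';
}

for my $header (
    'From: John Doe <peter@piper.com>',
    'From: John Doe <john@doe.com>',
    'From: Jane Roe <jane@roe.com>',
) {
    printf "%-9s %s\n", classify_from($header), $header;
}
```

      Keeping the two patterns separate means each one can be read and adjusted on its own, at the cost of needing a filter that can combine two conditions.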

      If it has to be done from a single regex, you could do a negative lookahead, but it just gets more complicated. Here's a working example with a negative lookahead. But I consider it fragile:

      #!/usr/bin/env perl
      use strict;
      use warnings;
      use feature qw(say);

      my @strings = (
          'From: John Doe <peter@piper.com>',
          'From: John Doe <john@doe.com>',
          'From: John Doe',
      );

      foreach my $string (@strings) {
          if ($string =~ m/^From:\s+John\s+Doe(?!\s+<john\@doe\.com>)/) {
              say "BAD: $string";
          }
          else {
              say "GOOD: $string";
          }
      }
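      For comparison, here is a sketch of why the pattern in the question lets nearly everything through. Each [^x] only says "this one position is not x", so the pattern matches only when every position differs from the real address and the dots happen to line up. The sample addresses and the two helper subs are invented for illustration, and the lookahead pattern is the one above adapted to the question's fictitious name:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The pattern from the question (the @ is escaped only so Perl cannot
# mistake it for an array sigil; the meaning is unchanged).
my $attempt = qr/From: John Smith <[^j][^o][^h][^n]\.[^s][^m][^i][^t][^h][^\@][^g][^m][^a][^i][^l]\.[^c][^o][^m]/;

# Negative lookahead: the name is NOT followed by the one valid address.
my $lookahead = qr/^From:\s+John\s+Smith(?!\s+<john\.smith\@gmail\.com>)/;

sub attempt_blocks   { my ($h) = @_; return $h =~ $attempt   ? 1 : 0 }
sub lookahead_blocks { my ($h) = @_; return $h =~ $lookahead ? 1 : 0 }

for my $header (
    'From: John Smith <evil.hacker@badguys.net>',  # typical spoof
    'From: John Smith <john.smith@gmail.com>',     # the real sender
    'From: John Smith <abcd.efgxyz@qrsu.net>',     # contrived to fit the shape
) {
    printf "attempt=%d lookahead=%d  %s\n",
        attempt_blocks($header), lookahead_blocks($header), $header;
}
```

      Only the contrived third address has the exact 4-dot-11-dot-3 shape the original pattern needs, which would explain why ordinary spoofed mail was still delivered.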

      Dave

Re: Regex Expression to filter email for Barracuda Email Appliance
by stevieb (Canon) on Feb 12, 2021 at 18:11 UTC

    Is this inbound or outbound traffic? It's been 10+ years since I've worked with a Barracuda cluster.

    Either way, you'd think that since it's a spam appliance, it would by default be able to identify this blatant spam.

    Can you show an example email, and say in which direction it's traveling? There may be better approaches than regex. In my experience, using regex for filtering can have serious accidental consequences.

Re: Regex Expression to filter email for Barracuda Email Appliance
by kcott (Archbishop) on Feb 13, 2021 at 00:26 UTC

    G'day higginss20,

    Welcome to the Monastery.

    I think, in the first instance, you should follow ++roboticus' advice and contact Barracuda with a different question.

    As a general rule, using a regular expression to match an exact string is a poor and inefficient choice. It would be much better to just use 'eq', or 'ne' to check for a non-match (see "perlop: Equality Operators"):

    if ($email_address eq 'first.last@example.com') {
        # OK - process normally
    }
    else {
        # Possibly not OK - phishing check
    }

    If $displayname has more than one email address, use a hash and check with exists:

    my %valid_email_for = (
        'Display Name' => {
            'first.last@example.com' => 1,
            'first.last@example.net' => 1,
            'first.last@example.org' => 1,
        },
    );

    ...

    if (exists $valid_email_for{$displayname}{$email_address}) {
        # OK - process normally
    }
    else {
        # Possibly not OK - phishing check
    }

    — Ken

Re: Regex Expression to filter email for Barracuda Email Appliance
by Radiola (Monk) on Feb 13, 2021 at 00:52 UTC

    Can you combine a regex lookup with another rule? That makes it an awful lot simpler. We have rules in Office 365 something like the following:

    1. Address matches "first.?last", AND
    2. Domain IS NOT "example.com"

    (Watch out for inbound mail from users’ personal addresses if you do that, though.)
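    The two conditions can be sketched in Perl roughly as follows. This is only an illustration of the logic, not Office 365 or Barracuda rule syntax; it assumes rule 1 is applied to the part of the address before the @, and looks_like_impersonation is a made-up helper name:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Flags an address whose local part resembles "first.last" (rule 1)
# while its domain is not the company's own "example.com" (rule 2).
sub looks_like_impersonation {
    my ($address) = @_;
    my ($local, $domain) = $address =~ /\A([^\@]+)\@([^\@]+)\z/
        or return 0;    # not a simple local@domain address
    my $name_matches   = $local =~ /first.?last/i;        # rule 1
    my $foreign_domain = lc($domain) ne 'example.com';    # rule 2
    return ($name_matches && $foreign_domain) ? 1 : 0;
}

for my $address (qw(
    first.last@example.com
    first.last@gmail.com
    firstlast@evil.example
    someone@gmail.com
)) {
    printf "%d  %s\n", looks_like_impersonation($address), $address;
}
```

    Note that first.last@gmail.com is flagged here, which is exactly the personal-address case mentioned above.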

    I don’t remember if the Barracudas can do compound rules. (Like stevieb, I used to admin Barracudas, but it’s been years.)

    This really isn’t a Perl question, as such; Barracuda uses Perl-compatible regular expressions (as does Microsoft, for that matter), but unless something’s changed you can’t access the language from the admin interface. You might have better luck on a forum targeted more towards admins.

    Update: Gah, it was stevieb who said he'd worked with Barracudas, not kcott.

    – Aaron
    Preliminary operational tests were inconclusive. (The damn thing blew up.)
Re: Regex Expression to filter email for Barracuda Email Appliance
by LanX (Saint) on Feb 12, 2021 at 23:49 UTC
    Why does it have to be a single regex implementing all the logic?

    You could simply match /From: (.*?) <(.*?)>/ and then check $1 and $2 with Perl

    That's far more maintainable.
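    A rough sketch of that approach, assuming the header is available as a Perl string. The name, the address, and the helper check_from are placeholders for illustration, and the whitespace handling in the pattern is slightly looser than the one-liner above:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Capture display name and address with one simple regex, then compare
# them with ordinary string operators instead of more regex logic.
sub check_from {
    my ($header) = @_;
    my ($name, $address) = $header =~ /^From:\s*(.*?)\s*<(.*?)>/
        or return 'unparsed';
    $name =~ s/^"|"$//g;    # drop optional quotes around the display name
    return 'ok' if lc($name) ne 'john smith';
    return lc($address) eq 'john.smith@gmail.com' ? 'ok' : 'phish';
}

for my $header (
    'From: John Smith <john.smith@gmail.com>',
    'From: "John Smith" <ceo@evil.example>',
    'From: Jane Roe <jane@roe.example>',
) {
    printf "%-8s %s\n", check_from($header), $header;
}
```

    The regex only extracts the two fields; the decision itself is plain eq/ne, so adding more valid addresses later would not require touching the pattern.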

    Unless ... this isn't really a Perl question?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: Regex Expression to filter email for Barracuda Email Appliance
by Anonymous Monk on Feb 12, 2021 at 20:45 UTC
    FYI, this sort of thing might save you some time: Regexp::Common::Email::Address. There are lots of good time-savers in that large area of CPAN. Always very nice if somebody else does the "Regex twiddling" for you.
      And wear sunscreen.

      If I could offer you only one tip for the future, sunscreen would be it.

      The long term benefits of sunscreen have been proved by scientists, whereas the rest of my advice has no basis ...