in reply to Re: New Module Consideration?
in thread New Module Consideration?

Nested comments happen in practice, so if that code doesn't handle them then it's not powerful enough for real-world use. Does anyone know of some code that actually handles addresses fully?

Update: Or maybe that's Aristotle's Email::Valid. I am so tired this morning.


Fun Fun Fun in the Fluffy Chair

Re: Re^2: New Module Consideration?
by Flame (Deacon) on Jan 01, 2003 at 15:58 UTC
    Umm, can you clarify what you meant there? What nested comments?



    My code doesn't have bugs, it just develops random features.

    Flame ~ Lead Programmer: GMS | GMS

      Comments are a structure that may be inserted anywhere within an address, though I only ever see them on the ends. Comments are delimited by matching pairs of parentheses. So (this (is) a) nested comment. Here is a sanitized version of my e-mail address test set. [Added inline: It does not address quoted domains, internal comments or non-ASCII 8-bit characters. A full validator should probably at least allow for Unicode and iso-8859-?. I've never seen quoted domains or internal comments, so that's likely just something that is allowed but no one uses these days.]
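
      To make the "matching parentheses" rule concrete, here is a minimal sketch of comment stripping with a depth counter. It is in Python purely for illustration, the name strip_comments is hypothetical, and it assumes that parentheses inside a double-quoted string are literal text and that a backslash quotes the next character. It is not a full RFC822 parser.

```python
def strip_comments(field: str) -> str:
    """Remove (possibly nested) parenthesised comments from an address field.

    Hypothetical helper for illustration only, not a complete RFC822 parser.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a double-quoted string, outside any comment
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            # A backslash quotes the next character; keep the pair only
            # when we are not inside a comment.
            if depth == 0:
                out.append(field[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes:
            out.append(ch)      # parentheses in quotes are plain text
        elif ch == "(":
            depth += 1          # open a comment, or nest one level deeper
        elif ch == ")" and depth > 0:
            depth -= 1          # close the innermost open comment
        elif depth == 0:
            out.append(ch)      # ordinary text outside any comment
        i += 1
    return "".join(out).strip()
```

      A single regex that removes "(...)" once would stop at the first closing parenthesis, which is why the counter is needed for the nested case.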

      "Cardamom" cardamom@spice.com: This address is not RFC822 compliant. The address@company.com portion should either be enclosed in <> angle brackets, or the double-quote construct should be replaced with a (Joshua Jore) comment structure. A validator should still be able to extract the machine-readable address.

      "Ginger" <ginger@spice.com>: This is the most common format and correctly delimits the machine-readable portion from everything else in the field.

      (Lemon Peel) lpeel@spice.com: This is also correct and occurs in practice. In this case the entire string is taken to be the machine-readable portion after the comment construct is removed.

      (Orange Zest) <ozest@spice.com>: Pretty normal - a comment and a machine readable portion.

      "Red (hot!) pepper" rhpepper@spice.com: Broken and not RFC822 compliant. Your validator should distinguish the extraneous (but not commented) text from the machine parsable address.

      (Black (and white) pepper) <bpepper@spice.com>: A normal address.

      "Fish Oil" (foil@yahoo.com) <foil@spice.com>: Again normal. The machine-parsable portion is explicitly noted, so everything else can just be ignored. This means foil@yahoo.com is not the address and must be correctly distinguished.

      Bug Blatter (beast@trall.com) <gbeans@spice.com>: Ditto. This is an extension of the previous example.

      "Bug Blatter"@trall.com: This is tricky for some validators to handle. In this case the entire machine-readable portion includes the double-quoted region with the space. This is a great demonstration that you can't just split on white space and look for words with @ symbols.

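      For anyone who wants to try the test set, here is a rough sketch of the extraction rules described above, in Python purely for illustration. The names scan and extract_address are hypothetical. It assumes three rules: comments (nested or not) are dropped, an unquoted <...> portion wins if present, and otherwise the address is the whitespace-delimited word containing an unquoted @, where quoted spaces do not split words. It extracts only and does not validate the result.

```python
def scan(field):
    """Yield (char, quoted) for every character that is outside a comment."""
    depth, in_quotes, i = 0, False, 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            # Backslash-quoted pair: kept as quoted text unless in a comment.
            if depth == 0:
                yield ch, True
                yield field[i + 1], True
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            yield ch, True
        elif in_quotes:
            yield ch, True
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            yield ch, False
        i += 1


def extract_address(field):
    """Return the machine-readable address in field, or None if none is found."""
    chars = list(scan(field))
    # Rule 1: an explicit, unquoted <...> portion wins over everything else.
    start = next((i for i, (c, q) in enumerate(chars)
                  if c == "<" and not q), None)
    if start is not None:
        end = next((i for i, (c, q) in enumerate(chars)
                    if c == ">" and not q and i > start), None)
        if end is not None:
            return "".join(c for c, _ in chars[start + 1:end]).strip()
    # Rule 2: split on unquoted whitespace only, then take the first word
    # that contains an unquoted "@".
    tokens, current = [], []
    for c, q in chars:
        if c.isspace() and not q:
            if current:
                tokens.append(current)
            current = []
        else:
            current.append((c, q))
    if current:
        tokens.append(current)
    for token in tokens:
        if any(c == "@" and not q for c, q in token):
            return "".join(c for c, _ in token)
    return None


samples = [
    '"Cardamom" cardamom@spice.com',
    '(Black (and white) pepper) <bpepper@spice.com>',
    '"Fish Oil" (foil@yahoo.com) <foil@spice.com>',
    '"Bug Blatter"@trall.com',
]
for s in samples:
    print(s, "=>", extract_address(s))
```

      Tracking the quoted flag per character is what lets the last sample survive as one word, space and all.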

      Fun Fun Fun in the Fluffy Chair