BUU has asked for the wisdom of the Perl Monks concerning the following question:

Quick, simple question. What would be the most effective way to check if a large number of domains are registered? I'm sure you can guess why I want to do this. The idea of doing a whois query against the online database strikes me as A) slow, and B) bad-mannered. Any ideas?

Re: Massive WHOIS queries?
by ferrency (Deacon) on May 19, 2003 at 14:33 UTC
    If I'm reading you right, you may want to visit a domain-snapping service such as www.snapnames.com.

    For non-registrars, the main ways you can check for a domain's existence are to make a bunch of whois queries, or to make a bunch of DNS queries.

    As you said, whois queries are slow, and may be considered bad-mannered. It's also against many Terms of Service to scrape whois information for spamming/marketing purposes. But in some cases there is no better alternative.

    DNS queries can weed out registered domains, but won't give you information on when they expire. They also won't tell you with 100% certainty that a domain is not registered, since domains which are on HOLD don't show up in DNS even if they're registered. Overall, DNS queries are much friendlier, but give you less information.
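
    A minimal sketch of that DNS pre-filter, in Python and using only the standard library. It is an illustration under assumptions, not a complete checker: `socket.getaddrinfo` looks up address records only, so a registered domain with nameservers but no address record lands in the "unknown" pile, just as a domain on HOLD would. The names `resolves` and `split_by_dns` are made up for this example.

```python
import socket


def resolves(domain, lookup=socket.getaddrinfo):
    """Return True if `domain` has at least one address record.

    `lookup` is injectable so the function can be exercised without
    touching the network.
    """
    try:
        return bool(lookup(domain, None))
    except (socket.gaierror, UnicodeError):
        # Lookup failed or the name is malformed: DNS tells us nothing.
        return False


def split_by_dns(domains, lookup=socket.getaddrinfo):
    """Partition domains into (seen_in_dns, unknown).

    "unknown" does NOT mean unregistered; it only means DNS gave no
    answer, so those names still need another check.
    """
    seen, unknown = [], []
    for domain in domains:
        (seen if resolves(domain, lookup) else unknown).append(domain)
    return seen, unknown
```

    Only the second list returned by `split_by_dns` would need to go on to whois.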

    If you're a registrar, you already know the better ways to do parts of what you're asking; or if not, you can ask the registry in question :)

    Alan

Re: Massive WHOIS queries?
by hardburn (Abbot) on May 19, 2003 at 14:29 UTC

    . . . doing a whois query against the online database . . .

    That is the only way I can think of to do it unless you have a central database of all registered domains that you can check against. If you want to keep the noise down, you could split off the work into some child processes, giving each one a part of the list of names to check against. Each child process would query a different name server, thus spreading the load out.
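
    A rough sketch of that splitting, in Python. It uses a thread pool rather than child processes to keep the example short, and `check` is a hypothetical caller-supplied function that does the actual query against one server; neither detail comes from the post above.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(names, n):
    """Deal names round-robin into n lists."""
    return [names[i::n] for i in range(n)]


def check_all(names, servers, check):
    """Run check(name, server) for every name, one worker per server.

    Each worker gets its own slice of the list and talks to a single
    server, so no one server sees the whole load.
    """
    chunks = partition(list(names), len(servers))

    def work(pair):
        server, chunk = pair
        return {name: check(name, server) for name in chunk}

    results = {}
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        for partial in pool.map(work, zip(servers, chunks)):
            results.update(partial)
    return results
```

    Spreading the names this way lowers the load on each server, but the total number of queries stays the same.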

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

Re: Massive WHOIS queries?
by halley (Prior) on May 19, 2003 at 14:28 UTC

    "I'm sure you can guess why I want to do this."

    It wouldn't have anything to do with VeriSign's newly approved patent on a method and system to invoke a number of parallel queries against domain name services, would it?

    --
    [ e d @ h a l l e y . c c ]

Re: Massive WHOIS queries?
by arthas (Hermit) on May 19, 2003 at 14:46 UTC
    A large number of whois queries would probably be the only reliable way to check whether a domain is registered, and to get the (public) data about that registration.

    Also, the central database (whois.internic.net) returns only some of the data (such as the expiration date and nameservers). If you need other information, such as the registrant, you have to query the referral server that InterNIC gives you (e.g. whois.joker.com, whois.godaddy.com, ...). So you'll actually need two queries for most domains.
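
    The two-step lookup could be sketched like this in Python. The whois protocol itself is simple (open TCP port 43, send the query followed by CRLF, read until the server closes the connection), but the label on the referral line is an assumption here: registries have used variations on "Whois Server:", and the pattern below may need adjusting. The function names are made up for this example.

```python
import re
import socket


def whois_query(server, query, timeout=10):
    """Send one query to a whois server (TCP port 43) and return the reply."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Assumed label format; real registries differ in how they name the referral.
REFERRAL = re.compile(r"^\s*(?:Registrar )?Whois Server:\s*(\S+)", re.I | re.M)


def find_referral(text):
    """Return the referral whois host named in a registry reply, or None."""
    match = REFERRAL.search(text)
    return match.group(1).lower() if match else None


def full_lookup(domain, registry="whois.internic.net", query=whois_query):
    """Query the registry, then the referral server if one is named."""
    first = query(registry, domain)
    referral = find_referral(first)
    second = query(referral, domain) if referral else None
    return first, second
```

    `full_lookup` returns both replies, with the second one set to `None` when the registry reply names no referral server.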

    If you ever thought about using DNS, forget it: it's unreliable.

    Michele.

Re: Massive WHOIS queries?
by robobunny (Friar) on May 19, 2003 at 15:48 UTC
    the most efficient way to do this yourself is a combination of DNS and WHOIS. first, do a DNS query for type ANY. if you get something back, go to the next domain. if you don't, then do a WHOIS query. that way, you are doing the minimum number of WHOIS queries. you can also make the DNS and WHOIS portions separate processes, so that the WHOIS queries are throttled independently.
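
    a rough sketch of that flow in Python, with some simplifications: it runs in a single process rather than two, and `in_dns` and `whois_registered` are hypothetical caller-supplied functions that return true or false for a domain. only the WHOIS side is throttled.

```python
import time


def throttled(func, min_interval, clock=time.monotonic, sleep=time.sleep):
    """Wrap func so successive calls start at least min_interval seconds apart."""
    last = [None]  # start time of the previous call, None before the first

    def wrapper(*args, **kwargs):
        if last[0] is not None:
            wait = min_interval - (clock() - last[0])
            if wait > 0:
                sleep(wait)
        last[0] = clock()
        return func(*args, **kwargs)

    return wrapper


def check_domains(domains, in_dns, whois_registered):
    """Classify each domain, calling WHOIS only when DNS returns nothing."""
    results = {}
    for domain in domains:
        if in_dns(domain):
            results[domain] = "registered (dns)"
        elif whois_registered(domain):
            results[domain] = "registered (whois)"
        else:
            results[domain] = "available"
    return results
```

    passing `throttled(whois_registered, 2.0)` as the third argument would space the WHOIS queries at least two seconds apart while leaving the DNS checks unthrottled.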