Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

With my usual timeliness I've taken up attempting to get Net::XWhois to behave. My local whois server returns output in this format:
Domain: myDomain.int
DNS: myDomain.int
Registered: 2013-05-17
Expires: 2023-05-31
Registration period: 1 year
VID: no
DNSSEC: Unsigned delegation, DNSSEC disabled, no records
Status: Active

Registrant
Handle: ***N/A***
Name: myName Corp
Attention: Reggie Person
Address: SomeStreet 17
Postalcode: numericZip
City: myCity
Country: XX
Phone: +12 34 56 78 90

Nameservers
Hostname: ns1.dom.ext
Hostname: ns2.dom.ext
Hostname: ns3.dom.ext
So, in order to handle this particular format, I've registered a parser (using Net::XWhois::register_parser()) and also an association (using Net::XWhois::register_association()) in the hope that this would work.

According to the source (https://metacpan.org/dist/Net-XWhois/source/lib/Net/XWhois.pm#L660), the regexes in the parser definition are applied with the /sg modifiers.
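
For readers less familiar with those two modifiers, here is a minimal sketch of what they change. The helper apply_sg and the sample text are made up for illustration; this is not code from Net::XWhois, it only assumes a pattern string is matched against the whole response with /sg in list context.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::XWhois): match a pattern string
# against the text with /s and /g, in list context.
sub apply_sg {
    my ($pattern, $text) = @_;
    my @captures = $text =~ /$pattern/sg;
    return @captures;
}

# Sample text built with join so the newlines are explicit.
my $text = join "\n",
    'Nameservers',
    'Hostname: ns1.dom.ext',
    'Hostname: ns2.dom.ext',
    '';

# /g in list context returns the capture from every match, not only the first.
my @hosts = apply_sg('Hostname:\s+(\S+)\n', $text);
print "hosts: @hosts\n";

# /s lets "." match a newline, so ".*?" can cross from one line into the next.
my @spanning = apply_sg('Nameservers(.*?)Hostname', $text);
print "number of spanning captures: ", scalar(@spanning), "\n";
```

Without /s the second pattern could not match here at all, because the only thing between "Nameservers" and the first "Hostname" is a newline.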

So, off to an on-line regex tester to see if my idea would work: https://regex101.com/r/koIpfH/1

Great, so far, so good.

Then I created a small script in order to avoid "hammering" the local whois server:

#!/bin/perl
use re 'debug';

my $regexp='(?:Nameservers[^\n]*\n.*?)*(?>Hostname:\s+([\S]+)\n)';
my $resp = "
Domain: myDomain.int
DNS: myDomain.int
Registered: 2013-05-17
Expires: 2023-05-31
Registration period: 1 year
VID: no
DNSSEC: Unsigned delegation, DNSSEC disabled, no records
Status: Active

Registrant
Handle: ***N/A***
Name: myName Corp
Attention: Reggie Person
Address: SomeStreet 17
Postalcode: numericZip
City: myCity
Country: XX
Phone: +12 34 56 78 90

Nameservers
Hostname: ns1.dom.ext
Hostname: ns2.dom.ext
Hostname: ns3.dom.ext

";
# my $regexp = shift;
if ( $regexp !~ /^CODE/ ) {
    my @caps = $resp =~ /$regexp/sg;
    print "\n\n";
    print "caps\n",join(',',@caps),"\n";
}else{
    print "IS code\n";
    exit;
}

However, I seem to have missed something since I can't get the exact same regex to work in my parser definition :-(

Therefore, I beseech the monastery: I'm in need of elucidation. Here's my parser definition etc.:

my $w = new Net::XWhois( Server => 'whois.int');

$w->register_parser (
    Name   => 'PDEF',
    Retain => 0,
    Parser => {
        name        => 'Domain:\s+(\S+)\n',
        dnsname     => 'DNS\s+(\S+)\n',
        nameservers => '(?:Nameservers[^\n]*\n.*?)*(?>Hostname:\s+([\S]+)\n)',
        registrant  => '(?:[rR]egistrant)[^\n]*\n([\n\s\S]*?)[\n\r\f]{2}',
        status      => 'Status:\s+(.*?)\n',
        remarks     => '(DNSSEC:.*?)\n',
        reg_date    => '[rR]egistered:\s+(.*?)\n',
        reg_period  => '[rR]egist.*?[pP]eriod:\s+(.*?)\n',
        exp_date    => '[eE]xpires\s+(.*?)\n',
    },
);

$w->register_association ( 'whois.int' => [ PDEF, [ qw/int/ ] ] );

$w->lookup ( Domain => '--show-handles --charset=utf-8 '."myDomain.int" );
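
One way to exercise a whole table of patterns without querying the server is sketched below. It rests on an assumption taken from the description above, namely that each pattern is matched against the full response with /sg; the function parse_response is a hypothetical stand-in and not the module's actual code, and the response text is an abbreviated, made-up sample.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for the module's parsing step; NOT Net::XWhois code.
# Assumption: every pattern is applied to the whole response with /sg.
sub parse_response {
    my ($patterns, $response) = @_;
    my %parsed;
    for my $field (sort keys %$patterns) {
        my $re = $patterns->{$field};
        # List-context match: collect the capture from every match.
        $parsed{$field} = [ $response =~ /$re/sg ];
    }
    return \%parsed;
}

# Two of the patterns from the parser definition, copied unchanged.
my %patterns = (
    name        => 'Domain:\s+(\S+)\n',
    nameservers => '(?:Nameservers[^\n]*\n.*?)*(?>Hostname:\s+([\S]+)\n)',
);

# Abbreviated sample response; the real one would be captured from the server.
my $response = join "\n",
    'Domain: myDomain.int',
    'Status: Active',
    '',
    'Nameservers',
    'Hostname: ns1.dom.ext',
    'Hostname: ns2.dom.ext',
    '';

my $parsed = parse_response(\%patterns, $response);
for my $field (sort keys %$parsed) {
    my @values = @{ $parsed->{$field} };
    printf "%-12s %s\n", $field, @values ? join(', ', @values) : '(no match)';
}
```

Pasting a saved copy of the real response into $response, and the remaining patterns into %patterns, shows at a glance which fields come back as "(no match)".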

Re: /re/ issue - hacking Net::XWhois
by kcott (Archbishop) on Sep 05, 2022 at 17:30 UTC
    "... I can't get the exact same regex to work ..."

    That's a very poor problem report. In what way doesn't it work? Where's your expected and actual output? See "Short, Self-Contained, Correct Example".

    With this code:

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my $regexp='(?:Nameservers[^\n]*\n.*?)*(?>Hostname:\s+([\S]+)\n)';
    my $resp = "
    Domain: myDomain.int
    DNS: myDomain.int
    Registered: 2013-05-17
    Expires: 2023-05-31
    Registration period: 1 year
    VID: no
    DNSSEC: Unsigned delegation, DNSSEC disabled, no records
    Status: Active

    Registrant
    Handle: ***N/A***
    Name: myName Corp
    Attention: Reggie Person
    Address: SomeStreet 17
    Postalcode: numericZip
    City: myCity
    Country: XX
    Phone: +12 34 56 78 90

    Nameservers
    Hostname: ns1.dom.ext
    Hostname: ns2.dom.ext
    Hostname: ns3.dom.ext

    ";

    my @caps = $resp =~ /$regexp/sg;
    print "@caps\n";

    I get this output:

    ns1.dom.ext ns2.dom.ext ns3.dom.ext

    The regex itself appears to be doing what you want. Perhaps the problem lies elsewhere — waiting to hear.

    — Ken

      Ahh - I apologize for the omission.

      Yes, my test code works fine.

      However, after calling $w->lookup(...), $w->response() contains the expected response, but $w->nameservers() is empty. $w->registrant() is fine though.

        You show the end of "My local whois server returns output in this format:" as

        Hostname: ns3.dom.ext

        However, the end of your $resp adds two additional \ns. If I remove those from my test code (i.e. my $resp = "... ns3.dom.ext";) the output becomes:

        ns1.dom.ext ns2.dom.ext

        which isn't empty but is different.

        If I change the end of $regexp, in my test code, from \n) to \n?), the output returns to:

        ns1.dom.ext ns2.dom.ext ns3.dom.ext

        So maybe that's something to look at in your code.
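
        The effect of the trailing newline can be checked side by side with a small sketch like the following. The helper hostnames and the shortened sample text are made up for illustration; the two patterns are the one from the question and the same pattern with \n changed to \n?.

        ```perl
        #!/usr/bin/env perl
        use strict;
        use warnings;

        # Pattern from the question, and the variant ending in an optional newline.
        my $strict  = '(?:Nameservers[^\n]*\n.*?)*(?>Hostname:\s+([\S]+)\n)';
        my $lenient = '(?:Nameservers[^\n]*\n.*?)*(?>Hostname:\s+([\S]+)\n?)';

        # Hypothetical helper: return every hostname captured with /sg.
        sub hostnames {
            my ($pattern, $text) = @_;
            my @caps = $text =~ /$pattern/sg;
            return @caps;
        }

        # Shortened sample; join makes it explicit that there is no final newline.
        my $body = join "\n",
            'Nameservers',
            'Hostname: ns1.dom.ext',
            'Hostname: ns2.dom.ext',
            'Hostname: ns3.dom.ext';

        for my $case ( [ 'with trailing newline', "$body\n" ],
                       [ 'without trailing newline', $body ] ) {
            my ($label, $text) = @$case;
            my @s = hostnames($strict,  $text);
            my @l = hostnames($lenient, $text);
            print "$label\n";
            print "  strict : @s\n";
            print "  lenient: @l\n";
        }
        ```

        Only the strict pattern on the text without a trailing newline should lose ns3.dom.ext, which matches the two outputs shown above.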

        In my original response, I wrote 'See "Short, Self-Contained, Correct Example".'; I perhaps should have been more specific and written 'Please supply a "Short, Self-Contained, Correct Example".'.

        If you do provide an "SSCCE" which reproduces your problem, along with expected and actual output (as opposed to vague, prosaic descriptions: "contains the expected response", "is empty" and "is fine"), I'll happily take another look. At the moment, I'm just making guesses about what you're not showing us.

        You should also read, and follow the guidelines in, "How do I post a question effectively?".

        — Ken