Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to provide a "nicer" User-Agent string in my code, one that points to a page explaining what the code is for. The module I'm using is XML::RSS::Tools, which internally uses HTTP::GHTTP if it is installed and LWP otherwise (it claims GHTTP is faster).

All of the magic in XML::RSS::Tools happens in a sub called _http_get(), which looks like:

# Try and use Gnome HTTP, it's faster
eval { require HTTP::GHTTP; };
if ($@) {
    # Otherwise use LWP
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new;
    $ua->agent("iC-XML::RSS::Tools/$VERSION " . $ua->agent . " ($^O)");
    [...]
} else {
    my $r = HTTP::GHTTP->new($uri);
    $r->process_request;
    my $xml = $r->get_body;
    [...]
    } else {
        [...]
    }
}

What I'm trying to do is override the agent string set on LWP::UserAgent there, or the default one that HTTP::GHTTP uses ("libghttp/1.0").

How can I do this?

Re: Overriding module-internal calls
by Aristotle (Chancellor) on Apr 21, 2003 at 19:58 UTC
    Untested.
    {
        my $real_meth = \&LWP::UserAgent::agent;
        local *LWP::UserAgent::agent = sub {
            $_[1] = "Camel Power 3.14.15" if $_[1] =~ /XML::RSS/;
            goto &$real_meth;
        };
        $rss_object->rss_uri('http://foo/bar/index.rss');
    }
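
For readers unfamiliar with the technique above, here is a minimal, self-contained sketch of temporarily replacing a method with `local` on its glob. `Demo::Agent` and `Demo::Fetcher` are hypothetical stand-ins (not LWP::UserAgent or XML::RSS::Tools) so the sketch runs without either module. One point worth checking in the real case: the target class presumably has to be loaded before the override is installed, because a later `require` would compile the real method over the replacement.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LWP::UserAgent, so the sketch runs without LWP.
package Demo::Agent;

sub new { return bless { agent => 'demo-agent/1.0' }, shift }

# Getter/setter that returns the previous value, like LWP::UserAgent::agent.
sub agent {
    my $self = shift;
    my $old  = $self->{agent};
    $self->{agent} = shift if @_;
    return $old;
}

# Hypothetical stand-in for module-internal code that sets its own agent.
package Demo::Fetcher;

sub make_ua {
    my $ua = Demo::Agent->new;
    $ua->agent('iC-XML::RSS::Tools/0.09 ' . $ua->agent);
    return $ua;
}

package main;

my ($overridden, $restored);
{
    # In case perl reports the temporary replacement as a redefinition.
    no warnings 'redefine';

    my $real_meth = \&Demo::Agent::agent;    # class is already loaded here
    local *Demo::Agent::agent = sub {
        # Rewrite setter calls only; getter calls carry no $_[1].
        @_ = ($_[0], 'Camel Power 3.14.15')
            if @_ > 1 && $_[1] =~ /XML::RSS/;
        goto &$real_meth;
    };
    $overridden = Demo::Fetcher::make_ua()->agent;
}

# Outside the block the original method is back in place.
$restored = Demo::Fetcher::make_ua()->agent;

print "$overridden\n$restored\n";
```

Replacing `@_` rather than assigning to `$_[1]` avoids modifying the caller's argument in place, which would die if the caller had passed a literal string.
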
    The much safer alternative is to simply fetch the RSS yourself and feed it to the class as a string.

    Makeshifts last the longest.

      This seems to fix it, though it is a bit slow, since it has to load LWP::UserAgent to do the work:

      use strict;
      use warnings;
      use XML::RSS::Tools;
      use LWP::UserAgent;

      my $rss_feed = XML::RSS::Tools->new;
      my $ua       = LWP::UserAgent->new;
      $ua->agent('Camel Power 3.14.15 [rss]');

      my $rss      = "http://www.perl.com/pace/news.rss";
      my $request  = HTTP::Request->new(GET => $rss);
      my $response = $ua->request($request);
      my $status   = $response->status_line;

      my %errors = (
          '500' => 'Bad hostname supplied',
          '501' => 'Protocol not supported',
          '404' => 'URL not found',
          '403' => 'URL forbidden',
          '401' => 'Authorization failed',
          '400' => 'Bad request found',
          '302' => 'Redirected URL',
      );

      # Keep only the numeric status code from the status line
      ($status) = ($status =~ /(\d+)/);

      if (defined($errors{$status})) {
          die "ERROR: $errors{$status}\n";
      } else {
          my $content = $response->content;
          $rss_feed->rss_string($content);
          $rss_feed->xsl_file('rss.xsl');
          $rss_feed->transform;
          print $rss_feed->as_string;
      }

      I'll hit it with benchmark and see if maybe one of the other fetching modules can do the same thing faster. Any suggestions on alternate modules that can do the same thing? What I require is:

      1. Providing a custom UserAgent string
      2. Being able to do HEAD and GET on the URI
      3. Having a proper status_line come back for error trapping
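
As one possible way to cover the three requirements above, here is a minimal sketch using HTTP::Tiny, which is not mentioned in the thread and is an assumption here (it has shipped with Perl since 5.14). `fetch_with_agent` is a hypothetical helper name. HTTP::Tiny has no `status_line` method, so the sketch builds an equivalent string from the `status` and `reason` fields of the response hash.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: issue a HEAD or GET with a custom User-Agent and
# return a status line plus the body (undef unless the request succeeded).
sub fetch_with_agent {
    my ($method, $uri, $agent) = @_;

    my $http = HTTP::Tiny->new(agent => $agent, timeout => 10);
    my $res  = $http->request($method, $uri);

    # Failures that never reach the server (unparsable URL, refused
    # connection, timeout) are reported by HTTP::Tiny as status 599.
    my $status_line = "$res->{status} $res->{reason}";
    my $body        = $res->{success} ? $res->{content} : undef;

    return ($status_line, $body);
}
```

It could be called as `my ($status, $xml) = fetch_with_agent('GET', $rss, 'Camel Power 3.14.15 [rss]');`, with `'HEAD'` in place of `'GET'` for a header-only check.
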

        I'll hit it with benchmark and see if maybe one of the other fetching modules can do the same thing faster.

        The speed overhead of ANY way to get the file, even if you fork curl for it, is absolutely irrelevant compared to the overhead of actually requesting the document and waiting for and getting its headers and body.

        Juerd
        - http://juerd.nl/
        - spamcollector_perlmonks@juerd.nl (do not use).
        

      Nada, same story, and gives me:
      "iC-XML::RSS::Tools/0.09 libwww-perl/5.69 (linux)"

      There must be some way to do this. Thanks for the suggestion, though.

        As I said (and hacker implemented) - the safe, sane and solid way would be to just fetch the document yourself and feed it to XML::RSS::Tools as a string.

        Makeshifts last the longest.

Re: Overriding module-internal calls
by valdez (Monsignor) on Apr 22, 2003 at 12:57 UTC

    Why not reimplement the _http_get method? You can subclass XML::RSS::Tools like so:

    package XML::RSS::Tools::Mine;
    use base 'XML::RSS::Tools';
    use strict;
    use warnings;
    use LWP::UserAgent;

    #
    # Grab something via HTTP
    #
    sub _http_get {
        my $self = shift;
        my $uri  = shift;

        my $ua = LWP::UserAgent->new;
        $ua->agent('my_user_agent');
        my $response = $ua->request(HTTP::Request->new('GET', $uri));
        return $self->_raise_error("HTTP error: " . $response->status_line)
            if $response->is_error;
        return $response->content();
    }

    1;

    Then change the calling code:

    use XML::RSS::Tools::Mine;

    my $rss_feed = XML::RSS::Tools::Mine->new;
    $rss_feed->rss_uri('http://freshmeat.net/backend/fm-releases.rdf');
    print $rss_feed->as_string('rss'), "\n";
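
The subclassing approach relies on the parent calling its fetcher as a method on `$self`, so that method lookup starts in the subclass. The sketch below illustrates just that dispatch, using hypothetical `Demo::Tools` packages in place of XML::RSS::Tools. Whether the real module calls `_http_get` this way is an assumption and should be checked against its source.

```perl
use strict;
use warnings;

# Hypothetical parent standing in for XML::RSS::Tools.
package Demo::Tools;

sub new { return bless {}, shift }

# Public method that delegates to the "private" fetcher through $self,
# so a subclass can supply its own version.
sub rss_uri {
    my ($self, $uri) = @_;
    $self->{xml} = $self->_http_get($uri);
    return $self;
}

sub _http_get { return "parent fetched $_[1]" }

# Subclass that replaces only the fetcher.
package Demo::Tools::Mine;
our @ISA = ('Demo::Tools');

sub _http_get { return "custom fetcher got $_[1]" }

package main;

my $feed = Demo::Tools::Mine->new->rss_uri('http://example.invalid/feed.rss');
print "$feed->{xml}\n";
```

If the parent instead called the fetcher as a plain function (for example `_http_get($self, $uri)`), the subclass version would never be reached.
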

    HTH, Valerio