Fetch Problem uri

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Fetch Problem uri by Perlbotics (Archbishop) on Jul 04, 2015 at 09:51 UTC
Works for me when I remove the leading whitespace from `$url` ;-)

Update (2nd question, see below): changing the destination filename. It seems the module does not provide a setter method to change `output_file()`. I'll notify the author later... This monkeypatch should correct that:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Fetch;

    my $url="http://www.ekey.net/downloads-475?download=2132cbe2-2fb1-eeff-583c-50a39b6aba6c&name=v2_ITA_12-Seiter_Programm_1207_web.pdf";
    print "Downloading $url\n";
    my $ff = File::Fetch->new(uri => $url);

    {   #-- ugly patch
        package File::Fetch;
        print "PATCHED: ", __PACKAGE__, " version $VERSION!\n";  #-- patched line#0
        sub output_file {
            my $self = shift;
            my $file = $self->file;
            $self->{_out} = $_[0]  if $_[0];                     #-- patched line#1
            return $self->{_out}   if $self->{_out};             #-- patched line#2
            $file =~ s/\?.*$//g;
            $file ||= $self->file_default;
            return $file;
        }
    }

    package main;
    print "Before: ", $ff->output_file,"\n";
    $ff->output_file("i-probably-violate-terms-of-use.pdf");  #-- or extract name from URI using a regex
    print "After : ", $ff->output_file,"\n";
    my $where = $ff->fetch( to => "ekey corpus" ) or die $ff->error;

Result:

    ...
    PATCHED: File::Fetch version 0.48!
    Before: downloads-475
    After : i-probably-violate-terms-of-use.pdf
    ...

Fixing this problem by use of parent is left as an exercise to the AM...
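For the exercise mentioned above, the subclass route might look roughly like the sketch below. It rests on two assumptions: that `File::Fetch->new` blesses the object into the calling class, and that `fetch` takes the target name from `output_file` (the monkeypatch relies on the latter too). The package name `Local::NamedFetch`, the `_out` key, and `renamed.html` are made up for illustration. The example only constructs the object and does not download anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical subclass; the name Local::NamedFetch is illustrative only.
package Local::NamedFetch;
use parent 'File::Fetch';

# With an argument: remember it as the output name.
# Without one: return the remembered name, else File::Fetch's own default.
sub output_file {
    my $self = shift;
    $self->{_out} = shift if @_;
    return defined $self->{_out}
        ? $self->{_out}
        : $self->SUPER::output_file();
}

package main;

# Example URI taken from the File::Fetch documentation.
my $ff = Local::NamedFetch->new(uri => 'http://example.com/index.html?x=y')
    or die "could not create fetch object\n";

print "Before: ", $ff->output_file, "\n";
$ff->output_file('renamed.html');
print "After : ", $ff->output_file, "\n";
```

Compared with the monkeypatch, this leaves `File::Fetch` itself untouched, so other code in the same process that uses the module keeps the stock behaviour.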
Re^2: Fetch Problem uri by Anonymous Monk on Jul 04, 2015 at 10:01 UTC
Oh, what an idiot I am! Thank you. There is still a smaller problem, though. The filename is parsed wrongly, so I get a file called downloads-475 (without .pdf). Any idea why? Or do I just have to parse it correctly myself?
Re: Fetch Problem uri by 1nickt (Canon) on Jul 04, 2015 at 10:08 UTC
Regarding your second problem, the docs for File::Fetch say:

    $ff->output_file

    The name of the output file. This is the same as $ff->file, but any query
    parameters are stripped off. For example:

        http://example.com/index.html?x=y

    would make the output file be index.html rather than index.html?x=y.

However, `output_file()` is an accessor only, so you can't change the value. UPDATE: Better explained and shown with a patch in the reply above.

You probably would like to do something like:

    my $ff = File::Fetch->new(uri => $url);
    my $output_name = $ff->output_file;   # default: name with the query stripped
    $ff->file =~ /name=(.*)$/ and $output_name = $1;
    # $output_name is now 'v2_ITA_12-Seiter_Programm_1207_web.pdf'
    $ff->output_file( $output_name );

... but that doesn't work.

Update: The errors below were caused by the leading space in the URL. That's by (poor) design, but the module seems to have other problems, as the accessor methods don't seem to do what they say:

    use feature 'say';   # needed for say()
    my $ff = File::Fetch->new(uri => $url);
    say "scheme: " . $ff->scheme;
    say "host: " . $ff->host;
    say "path: " . $ff->path;
    say "file: " . $ff->file;
    say "output_file: " . $ff->output_file;

    ## outputs:
    Use of uninitialized value in concatenation (.) or string at ./foo.pl line 12.
    scheme:
    host: http:
    path: //www.ekey.net/
    file: downloads-475?download=2132cbe2-2fb1-eeff-583c-50a39b6aba6c&name=v2_ITA_12-Seiter_Programm_1207_web.pdf
    output_file: downloads-475

Remember: Ne dederis in spiritu molere illegitimi!
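Since a stray leading space is enough to confuse the URI parsing, it may help to normalise the URL before it reaches `File::Fetch->new`. The following is a minimal sketch using only core Perl. The helper name `clean_url` is hypothetical, and the scheme check is a deliberately loose sanity test, not a full URI validation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Strip leading and trailing whitespace. Return nothing unless the result
# starts with something that looks like a URI scheme followed by "://".
sub clean_url {
    my ($url) = @_;
    return unless defined $url;
    $url =~ s/^\s+|\s+$//g;
    return $url =~ m{^[A-Za-z][A-Za-z0-9+.-]*://} ? $url : undef;
}

my $raw = ' http://www.perlmonks.com/foo?bar=baz';   # note the leading space
my $url = clean_url($raw);
print defined $url ? "Clean URL: >$url<\n" : "Rejected: >$raw<\n";
```

The cleaned value can then be passed as the `uri` argument, and a rejected input can be reported to the user before any download is attempted.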
Re^2: Fetch Problem uri by Anonymous Monk on Jul 04, 2015 at 12:10 UTC
Nice patch! Now I just have to figure out how to get the right file name out of the URI. It is not as easy as it seems, because the URI contains query parameters. I tried these two, separately, without success:

    # Attempt 1: last path segment (the query string is not part of the path)
    my $filename = (URI->new($url)->path_segments)[-1];

    # Attempt 2: File::Spec treats the URL as a filesystem path,
    # so the query string stays attached to the last component
    my ($volume,$directories,$filename) = File::Spec->splitpath( $url );

Strange that there is no available module that seems to cope rightly with this URI. Or maybe the URI is "non-standard"?
Re^3: Fetch Problem uri by 1nickt (Canon) on Jul 04, 2015 at 14:45 UTC
(Scroll down for an answer to your latest question ...)

> Strange that there is no available module that seems to cope rightly with this URI. Or maybe the URI is "non-standard"?

Your URI is fine (until you added a space at the start, heh). The module is maybe what is "non-standard," I am afraid. First, there is the problem addressed by Perlbotics' patch: the method `$ff->output_file` cannot be used to set the value, although it looks as if it could. Then there is the ungraceful handling of a problem URI (e.g. with a leading space as in your OP):

    use feature 'say';   # needed for say()
    use File::Fetch;
    my $url = ' http://www.perlmonks.com/foo?bar=baz';
    print "Downloading >$url<\n"; # note use of delimiters to make a stray
                                  # leading space more visible in your debug
    my $ff = File::Fetch->new(uri => $url);
    say "scheme: " . $ff->scheme;
    say "host: " . $ff->host;
    say "path: " . $ff->path;
    say "file: " . $ff->file;
    say "output_file: " . $ff->output_file;

    ## outputs:
    Use of uninitialized value in concatenation (.) or string at ./foo.pl line 10.
    scheme:                        # <- error
    host: http:                    # <- error
    path: //www.perlmonks.com/     # <- error
    file: foo?bar=baz
    output_file: foo

These two things alone would make me consider looking for a different solution on CPAN.

> Now I just have to figure out how to get the right file name out of the URI.

You are on the right track with a path-parsing module. But if all your files are of the format you showed, you might want to use a regexp:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    my $url = 'http://www.ekey.net/downloads-475?download=2132cbe2-2fb1-eeff-583c-50a39b6aba6c&name=v2_ITA_12-Seiter_Programm_1207_web.pdf';
    (my $output_name = $url) =~ s/^.*name=(.*)$/$1/;   # keep only what follows "name="
    print "$output_name\n";
    __END__

    ## outputs:
    v2_ITA_12-Seiter_Programm_1207_web.pdf
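As an alternative to the regexp, the `URI` module (from CPAN, already used in the attempt quoted earlier in the thread) can split the query string into parameters, which avoids depending on `name=` being the last one. This is a minimal sketch: the helper name `name_from_url` is hypothetical, and it assumes the server always passes the file name in a parameter called `name`, as in the URL shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;   # CPAN module, not part of core Perl

# Return the decoded value of the "name" query parameter, or undef if absent.
sub name_from_url {
    my ($url) = @_;
    my %param = URI->new($url)->query_form;   # key/value pairs of the query
    return $param{name};
}

my $url = 'http://www.ekey.net/downloads-475?download=2132cbe2-2fb1-eeff-583c-50a39b6aba6c&name=v2_ITA_12-Seiter_Programm_1207_web.pdf';
my $name = name_from_url($url);
print defined $name ? "$name\n" : "no name parameter\n";
```

Because the value comes straight from the URL, it is probably wise to reduce it to a bare file name (for example with `File::Basename::basename`) before using it as a destination on disk.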
Re^4: Fetch Problem uri by Anonymous Monk on Jul 04, 2015 at 16:00 UTC