Red_King has asked for the wisdom of the Perl Monks concerning the following question:

All right...look at this piece of code, and please ignore the fact that I could use LWP to do the same thing...

#!/usr/bin/perl -w
# collects the daily pic from the nasa web site

$baseurl = "http://antwrp.gsfc.nasa.gov/apod/";
$fullurl = "http://antwrp.gsfc.nasa.gov/apod/astropix.html";
$newurl = "";
$status = "";

# get the html source for the page
@source = `lynx -source $fullurl`;

# scan the source and find the url for the picture
foreach (@source) {
    if (m/IMG/i) {
        s/<IMG SRC=//i;   # strip off the front of the tag
        s/><\/a>//i;      # strip off the back of the tag
        s/\"//g;          # strip off the double quotes
        $newurl = $baseurl.$_;
        print "retrieving $newurl";
    }
}

$status = system("lynx -source $newurl > wallpaper.jpg");

My problem is that the last line creates an empty file every time. I can run the same line from the shell and it retrieves the picture. I wrote another script:

#!/usr/bin/perl

$url = "http://antwrp.gsfc.nasa.gov/apod/image/0102/iss_sts98a.jpg";
print "retrieving $url";
$status = system ("lynx -source $url > wallpaper.jpg");

just to see what's wrong, and it runs fine. What am I doing wrong, and how can I fix it? Thanks in advance.

Replies are listed 'Best First'.
Re: system() frustration
by arturo (Vicar) on Mar 01, 2001 at 00:49 UTC

    I notice your main foreach loop continues on after you've found the match. That's not the best idea (once you've got a hit, so to speak, break out of the loop with last).

    A little more debugging info might help: you could print out the string you send to system before you make the call to system, to make sure it's executing the command you think it is.

    A smaller note: you can use regex memory to get at the image URL, in one swell foop:

    ($newurl) = /img src="([^"]+)/i; # i.e. grab what's in between the quotes after the src= stuff
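
The three suggestions above (stop at the first hit, capture the URL with one regex, print the command before running it) might be combined roughly as follows. This is only a sketch: the @source lines are made-up sample data standing in for the output of lynx, and the system call itself is left out so the snippet runs without a network connection.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $baseurl = "http://antwrp.gsfc.nasa.gov/apod/";

# Hypothetical sample lines standing in for `lynx -source $fullurl`.
# Each one ends in a newline, as lines read from backticks do.
my @source = (
    "<html>\n",
    "<a href=\"image/0102/iss_sts98a_big.jpg\">\n",
    "<IMG SRC=\"image/0102/iss_sts98a.jpg\"></a>\n",
    "<IMG SRC=\"image/other.jpg\">\n",
);

my $newurl = "";
foreach my $line (@source) {
    # Capture only what sits between the quotes after src=
    if ($line =~ /img src="([^"]+)/i) {
        $newurl = $baseurl . $1;
        last;    # first hit is enough, stop scanning
    }
}

# Build the command as a string and show it in brackets,
# so stray whitespace or newlines are easy to spot.
my $command = "lynx -source $newurl > wallpaper.jpg";
print "about to run: [$command]\n";
```

Because the capture stops at the closing quote, the trailing newline of the source line never makes it into $newurl.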

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor

Re: system() frustration
by BlueLines (Hermit) on Mar 01, 2001 at 00:46 UTC
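    Try chomping $newurl. You create $newurl by appending $_ to $baseurl, but $_ will have a newline at the end of it since you never explicitly removed it. That newline lands in the middle of the string you pass to system, so the shell treats "> wallpaper.jpg" as a separate command, and that command just creates an empty file.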
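
A small sketch of the effect, using a hard-coded line in place of real lynx output (the sample path is the one from the question; shell_lines is a hypothetical helper added only for illustration). Nothing is actually executed; the snippet only builds the command strings and shows how many lines the shell would be handed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how many lines a command string contains.
sub shell_lines {
    my ($cmd) = @_;
    my @parts = split /\n/, $cmd;
    return scalar @parts;
}

my $baseurl = "http://antwrp.gsfc.nasa.gov/apod/";

# Stand-in for one element of the list returned by backticks;
# each such element keeps its trailing newline.
my $line = "image/0102/iss_sts98a.jpg\n";

my $newurl = $baseurl . $line;
my $broken = "lynx -source $newurl > wallpaper.jpg";

chomp $newurl;    # remove the trailing newline
my $fixed = "lynx -source $newurl > wallpaper.jpg";

# Brackets make the embedded newline visible.
print "without chomp: [$broken]\n";
print "with chomp:    [$fixed]\n";
print "lines without chomp: ", shell_lines($broken), "\n";
print "lines with chomp:    ", shell_lines($fixed), "\n";
```

Without the chomp, the redirection ends up on a line of its own, which matches the symptom of an empty wallpaper.jpg.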

    BlueLines

    Disclaimer: This post may contain inaccurate information, be habit forming, cause atomic warfare between peaceful countries, speed up male pattern baldness, interfere with your cable reception, exile you from certain third world countries, ruin your marriage, and generally spoil your day. No batteries included, no strings attached, your mileage may vary.
Re: system() frustration
by Red_King (Initiate) on Mar 01, 2001 at 01:10 UTC
    Heh. Chomping $newurl did the trick. And thanks to the other replier for the cooler way to grab the url. :)
Re: system() frustration
by Masem (Monsignor) on Mar 01, 2001 at 01:18 UTC
    s/<IMG SRC=//i; # strip off the front of the tag
    s/><\/a>//i; # strip off the back of the tag

    IMG tags in HTML have no closing tag; that is, IMG is a unary tag. Your second regex should be s/>//i. Otherwise, since not all images are in anchors, you'd be leaving the trailing >, and a request for http://some.com/image.jpg> would come back empty from most servers.

    But as someone else pointed out, it's easier to just grab the image URL with one regex.
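
To compare the two "back of the tag" substitutions on an image inside an anchor and on a bare image, something like the following could be used. The strip_tag helper and the sample tags are hypothetical; the helper just replays the three substitutions from the original script with the second pattern passed in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: apply the original script's three substitutions,
# using $back as the pattern for the "back of the tag".
sub strip_tag {
    my ($line, $back) = @_;
    chomp $line;
    $line =~ s/<IMG SRC=//i;    # strip off the front of the tag
    $line =~ s/$back//i;        # strip off the back of the tag
    $line =~ s/"//g;            # strip off the double quotes
    return $line;
}

# Made-up sample tags: one inside an anchor, one on its own.
my $in_anchor = qq{<IMG SRC="image/a.jpg"></a>\n};
my $bare      = qq{<IMG SRC="image/b.jpg">\n};

for my $back (qr/><\/a>/, qr/>/) {
    print "pattern $back\n";
    print "  in anchor: ", strip_tag($in_anchor, $back), "\n";
    print "  bare:      ", strip_tag($bare, $back), "\n";
}
```

With these samples, each substitution cleans up only one of the two cases, which is one more argument for capturing the URL directly rather than trimming the tag away piece by piece.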

Re: system() frustration
by setantae (Scribe) on Mar 01, 2001 at 02:58 UTC
    As well as the good comments above, it's not beyond belief that one day there'll be something on the line other than just the IMG tag (as it is, you're already having to throw away the </a> that's there).
    I'd check for that too.

    setantae@eidosnet.co.uk|setantae|www.setantae.uklinux.net