OK, below is a copy of the code I have so far. I need to know if I am on the right track, and I need to know one last thing: how to extract the data from the site after opening its links and save it as a .dat file on Linux.
------------------------------------------------------------
#!/usr/bin/perl -w
use strict;
use LWP::Simple;

# Station index (e.g. KMIA), taken from the command line
my $index = shift;

#
## assuring that the site still exists
#
my $base    = "http://vortex.plymouth.edu/cgi-bin/gen_uacalplt-u.html";
my $content = get($base);
die "Couldn't get it!" unless defined $content;
print "Found:\n$content\n";

#
## fetching radiosonde data from web
#
my @urls;
my @hr = qw(00 12);    # strings, so the leading zero in "00" is kept
foreach my $hr (@hr) {
    # The query URL is produced by the form, so build it directly
    # rather than searching $content for it.
    my $url = "http://vortex.plymouth.edu/cgi-bin/gen_uacalplt-u.cgi"
            . "?id=${index}&pl=none&yy=05&mm=08&dd=24&hh=${hr}&pt=parcel&size=640x480";
    push @urls, $url;
}
print "URLs found: @urls\n";
------------------------------------------------------------
As you can tell from the code, the base site is http://vortex.plymouth.edu/uacalplt-u.html. From there you enter the following:

KMIA (index for radiosonde data for Miami)
Sounding data (text) (scroll down)
2005 (year)
Aug (month)
24 (day)
0z (hour)
parcel
640x480 (size)

to get the following link:

http://vortex.plymouth.edu/cgi-bin/gen_uacalplt-u.cgi?id=KMIA&pl=none&yy=05&mm=08&dd=24&hh=00&pt=parcel&size=640x480

My logic was to open the base site first and confirm that it still exists, hence the print. Then I use the foreach (to cover the 00z and 12z hours) and the command-line argument for the index ($index = shift;) to open this link.

Now I would like to save all the data that appears above "Sounding variables and indices" into a text file titled "$index_2005_237_$hr.dat". My question is: how do I do that? I would greatly appreciate your help. Please email me back as soon as possible.
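
For reference, here is a minimal sketch of how the saving step might look. It rests on several assumptions that are not confirmed anywhere in this post: that the phrase "Sounding variables and indices" appears verbatim in the returned page, that the data of interest is everything before it, and that stripping HTML tags with a simple regex is good enough for this page. The helper names (extract_sounding, dat_filename, sounding_url, save_text, fetch_and_save) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed marker: the heading named in the post, taken to appear verbatim.
use constant MARKER => 'Sounding variables and indices';

# Return the text that precedes the marker, with HTML tags stripped crudely.
# Returns undef (in scalar context) when the marker is not in the page.
sub extract_sounding {
    my ($page) = @_;
    my $pos = index($page, MARKER);
    return if $pos < 0;
    my $text = substr($page, 0, $pos);
    $text =~ s/<[^>]*>//g;     # crude tag removal; entities are left as-is
    $text =~ s/\A\s*\n//;      # drop leading blank lines, keep indentation
    $text =~ s/\s*\z/\n/;      # end with exactly one newline
    return $text;
}

# Build the query URL from the post for one station and hour ("00" or "12").
sub sounding_url {
    my ($index, $hr) = @_;
    return "http://vortex.plymouth.edu/cgi-bin/gen_uacalplt-u.cgi"
         . "?id=${index}&pl=none&yy=05&mm=08&dd=24&hh=${hr}&pt=parcel&size=640x480";
}

# File name in the form given in the post, e.g. KMIA_2005_237_00.dat.
# Braces are needed: "$index_2005..." would be read as one variable name.
sub dat_filename {
    my ($index, $hr) = @_;
    return "${index}_2005_237_${hr}.dat";
}

# Write text to a file, dying on any I/O error.
sub save_text {
    my ($file, $text) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $file: $!";
    return;
}

# Fetch one sounding page, extract the part above the marker, and save it.
sub fetch_and_save {
    my ($index, $hr) = @_;
    require LWP::Simple;       # loaded lazily; only needed for the fetch
    my $page = LWP::Simple::get(sounding_url($index, $hr));
    die "Could not fetch data for $index at ${hr}z\n" unless defined $page;
    my $text = extract_sounding($page);
    die "Marker not found for $index at ${hr}z\n" unless defined $text;
    my $file = dat_filename($index, $hr);
    save_text($file, $text);
    return $file;
}

# Runs only when a station index is given on the command line.
if (@ARGV) {
    my $index = shift @ARGV;
    print "Saved ", fetch_and_save($index, $_), "\n" for qw(00 12);
}
```

Saved as, say, sounding.pl (a hypothetical name), this would be run as "perl sounding.pl KMIA" and should leave one .dat file per hour in the current directory. If the page layout differs from what is assumed here, only extract_sounding should need changing.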

In reply to screen scraper help by MKevin
in thread Screen scraper by MKevin
