Help with Header stripping

Traku has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to write a program which will retrieve a file of any format from a website. I can retrieve it, but the file will not open properly (error: Can't determine type).

Apparently this happens because the retrieved file still starts with the HTTP header. I have found the pattern for removing the header, but I am not sure how to write it in code.

For example, when dealing with a GIF, I need to copy everything starting with GIF89a. I am not sure how to tell Perl to ignore everything before it.

btw, if you can't tell, heh, I just started using Perl two days ago.

Thanks for the help in advance!

PS: This is what I have for code so far:

$socket->send("GET $1 HTTP/1.1\r\n");
$socket->send("Host: www.dilbert.com\r\n");
$socket->send("\r\n");

#get response

#$request = <$socket>;

open(FILE1, ">gift_test.gif") or die "Cannot open gift_test.gif: $!";
binmode(FILE1); # the GIF is binary data

# write to file

while (($request = <$socket>) && ($request =~ (whatever reg_ex goes here)))
{
    print FILE1 $request;
}

close (FILE1);

Replies are listed 'Best First'.
Re: Help with Header stripping by matija (Priest) on Apr 01, 2004 at 18:20 UTC
You didn't say what method you are using to retrieve the file so far. Had you used LWP::Simple, you would get exactly the content you require like this:

use LWP::Simple;
my $val = get "http://some/url/somewhere";
open(OUT, ">some_file_name") || die "Could not save to some_file_name: $!\n";
binmode(OUT); # you only really need this sometimes. BStS.
print OUT $val;
close(OUT);

If you need more complex queries you might need to use LWP::UserAgent, and you will find the exact value you require (i.e. content without the headers) in the `content` method of the HTTP::Response object you will get back.
Re: Re: Help with Header stripping by Traku (Initiate) on Apr 01, 2004 at 18:58 UTC
Unfortunately I cannot use LWP, as the professor asked us not to. But I'll keep that in mind for next time!
Re: Re: Re: Help with Header stripping by matija (Priest) on Apr 01, 2004 at 19:19 UTC
Ah, if it's homework, you should have told us that, and the parameters of that homework. Here is a hint for you: When you look at the communication with your webserver, you will first see the echo of your request (maybe that only happens with telnet, you need to check), terminated by a blank line. After that you see the headers sent by the server, terminated by a blank line. You do not need to know how the wanted content starts. All you need to know is how the unwanted content ends. There, that should be enough of a hint :-)
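A minimal sketch of that hint, for readers who want to see it spelled out. It skips lines until the first blank line and then copies the rest unchanged. The helper name `copy_body` and the sample response are made up for illustration, and an in-memory filehandle stands in for the socket. It also assumes the server closes the connection when it is done and does not use chunked transfer encoding, which an HTTP/1.1 server is allowed to do.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copy only the body of an
# HTTP response from $in to $out.
sub copy_body {
    my ($in, $out) = @_;

    # Headers end at the first blank line, so read and discard lines
    # up to and including it. No need to know how the body starts.
    while (defined(my $line = <$in>)) {
        last if $line =~ /^\r?\n$/;
    }

    # Everything after the blank line is body. Copy it in raw chunks
    # so binary data is not treated as text lines.
    binmode $out;
    my $buf;
    while (read($in, $buf, 4096)) {
        print {$out} $buf;
    }
    return;
}

# Made-up response, used only to exercise the helper.
my $raw = "HTTP/1.1 200 OK\r\n"
        . "Content-Type: image/gif\r\n"
        . "\r\n"
        . "GIF89a\x01\x00\x01\x00";

open(my $in, '<', \$raw) or die "Cannot open in-memory input: $!";
my $saved = '';
open(my $out, '>', \$saved) or die "Cannot open in-memory output: $!";
copy_body($in, $out);
close($out);

printf "saved %d bytes\n", length($saved);
```

With a real connection, `$in` would be the socket and `$out` the file opened for writing. The same loop structure applies as long as the socket is read through Perl's buffered I/O rather than `sysread`.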
Re: Re: Re: Re: Help with Header stripping by Traku (Initiate) on Apr 01, 2004 at 20:21 UTC
Re: Re: Re: Re: Re: Help with Header stripping by matija (Priest) on Apr 01, 2004 at 20:38 UTC
Some notes below your chosen depth have not been shown here