nurulnad has asked for the wisdom of the Perl Monks concerning the following question:
I want to download the page http://www.rcsb.org/pdb/explore.do?structureId=2CU3, where "2CU3" can be replaced by any structure ID. In my code this value is held in the variable $input, and all the structure IDs are listed in a text file named 'data.txt'.
After that, I want to follow a link from that webpage. The link URL is http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=FASTA&compression=NO&structureId=2CU3. My problem is that when this link is downloaded, the file is just junk if I open it in TextEdit. If I open it in TextWrangler, the content is fine. Any idea what is causing this and how to fix it?
My code is as follows:

    #!/usr/bin/perl
    use strict;
    use WWW::Mechanize;

    open (FILE, "data.txt");
    my $input;
    while ($input = <FILE>){
        chomp $input;

        #download PDB html page
        my $url = "http://www.rcsb.org/pdb/explore.do?structureId="."$input";
        my $mech = WWW::Mechanize->new( autocheck => 1 );
        $mech->get( $url );

        #write extracted data to an output file (.html)
        my $file = "$input".".html";
        print "$file";
        use Data::Dumper;
        open (OUTFILE, "> $file");
        print OUTFILE Dumper($mech);
        close(OUTFILE);

        #download the link (FASTA sequence)
        my $linkname = "fileFormat=FASTA&compression=NO&structureId="."$input";
        my @links = $mech->find_all_links( url_regex => qr/$linkname/ );
        for my $link ( @links ) {
            my $url = $link->url_abs;
            my $filename = $url;
            $filename =~ s[^.+/][];
            print "Fetching $url";
            $mech->get( $url, ':content_file' => $filename );
            print " ", -s $filename, " bytes\n";
        }
    }
    close (FILE);
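A side note on the link lookup in the code above: the download URL quoted in the question differs from one entry to the next only in its structureId value, so it could also be built directly from each ID instead of being searched for in the fetched page. The following is only a sketch of that idea, not something proposed in the thread. The helper name `fasta_url` is made up for illustration, and the check that an ID is exactly four letters or digits is an assumption based on the "2CU3" example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base of the download URL quoted in the question.
my $base = 'http://www.rcsb.org/pdb/download/downloadFile.do';

# fasta_url is a hypothetical helper: build the FASTA download URL
# for one structure ID read from data.txt.
sub fasta_url {
    my ($id) = @_;
    $id =~ s/^\s+|\s+$//g;    # trim whitespace, including a stray \r

    # Assumption: IDs look like "2CU3", i.e. four letters or digits.
    die "Unexpected structure ID '$id'\n"
        unless $id =~ /^[A-Za-z0-9]{4}$/;

    return "$base?fileFormat=FASTA&compression=NO&structureId=$id";
}

print fasta_url('2CU3'), "\n";
```

For '2CU3' this prints the same URL as the one given in the question. Whether skipping the explore.do page is acceptable depends on whether that page is needed for anything else.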
Thanks for the replies. This isn't a TextEdit question: I can't manipulate the data that I get because it is junk. What I mean by junk is that instead of text, I get (*^%&&^(*&^(* sort of stuff. It perplexes me that this data can be displayed properly in TextWrangler.
I'll try all your suggestions today. Thanks again!
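One possible cause, offered as a guess rather than a diagnosis from this thread: the server may have sent the FASTA response gzip-compressed, and a file saved with ':content_file' holds the body as it was received. An editor that opens gzip data transparently would then show text where another shows junk, although that behaviour of TextWrangler is itself an assumption here. The sketch below checks a saved file for the gzip magic bytes (0x1f 0x8b) and decompresses it if they are present. The helper names `is_gzip` and `slurp_text` and the FASTA record in the demo are made up for illustration; only core modules are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Return true if the file starts with the gzip magic bytes 0x1f 0x8b.
sub is_gzip {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $n = read($fh, my $magic, 2);
    close $fh;
    return defined $n && $n == 2 && $magic eq "\x1f\x8b";
}

# Return the file's text, decompressing first if it is gzip data.
sub slurp_text {
    my ($path) = @_;
    if (is_gzip($path)) {
        my $text;
        gunzip $path => \$text
            or die "gunzip failed for $path: $GunzipError";
        return $text;
    }
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # slurp mode: read the whole file at once
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Demo with a made-up FASTA record; no network access is needed.
my $dir   = tempdir(CLEANUP => 1);
my $fasta = ">EXAMPLE_1 hypothetical record\nMKTAYIAKQR\n";
my $file  = "$dir/example.fasta";
gzip \$fasta => $file or die "gzip failed: $GzipError";

print is_gzip($file) ? "gzip data\n" : "plain text\n";
print slurp_text($file);
```

If `is_gzip` returns false for the downloaded file, this explanation does not apply and the cause lies elsewhere, for example in the file's character encoding.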

Replies are listed 'Best First'.

Re: using WWW::Mechanize to download a link, opened fine in TextWrangler but as junk in TextEdit
by graff (Chancellor) on Aug 18, 2010 at 09:01 UTC
    by nurulnad (Acolyte) on Aug 19, 2010 at 13:50 UTC

Re: using WWW::Mechanize to download a link, opened fine in TextWrangler but as junk in TextEdit
by cdarke (Prior) on Aug 18, 2010 at 08:03 UTC

Re: using WWW::Mechanize to download a link, opened fine in TextWrangler but as junk in TextEdit
by Khen1950fx (Canon) on Aug 18, 2010 at 10:58 UTC
    by nurulnad (Acolyte) on Aug 19, 2010 at 13:44 UTC