To get you started.

...feeding the data to the online tool...
Have a look at LWP and WWW::Mechanize.
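As a rough idea of what the LWP side looks like, here is a minimal sketch that builds the form POST with HTTP::Request::Common (part of the LWP family) and prints it instead of sending it. The action URL is a made-up placeholder and the peptide is a short stand-in; only the field name inseq comes from the form itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Hypothetical form target: the real action URL has to be read from
# the form's HTML source.
my $action = 'http://example.invalid/hypothetical_form_target';

# 'inseq' is the form's sequence field; the value is a short stand-in.
my $request = POST $action, [ inseq => 'EALLKQSWEV' ];

# Inspect the request without sending anything over the network.
print $request->as_string;

# To send it for real, hand the request to an LWP::UserAgent:
#   use LWP::UserAgent;
#   my $response = LWP::UserAgent->new->request($request);
#   die $response->status_line unless $response->is_success;
```

WWW::Mechanize wraps this up: it finds the form and its action for you, which is why it is usually the easier route.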

...getting the results...
Have a look at HTML::TokeParser
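A minimal sketch of the HTML::TokeParser approach, assuming the results arrive as plain table rows. The inline HTML is a tiny stand-in, not the real page markup, so the tag names to look for may need adjusting.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Tiny stand-in for a results page (an assumption, not the real markup).
my $html = <<'HTML';
<table>
<tr><td>1</td><td>2</td><td>ALLKQSWEV</td><td>1930.068</td></tr>
<tr><td>2</td><td>95</td><td>KITDPHFEV</td><td>795.962</td></tr>
</table>
HTML

my $p = HTML::TokeParser->new( \$html )
    or die "can't parse html";

my @rows;
# Walk the start tags: every <tr> opens a new row, every <td> adds a cell.
while ( my $tag = $p->get_tag( 'tr', 'td' ) ) {
    if ( $tag->[0] eq 'tr' ) {
        push @rows, [];
    }
    else {
        push @{ $rows[-1] }, $p->get_trimmed_text('/td');
    }
}

print join( ':', @$_ ), "\n" for @rows;
```

Each element of @rows ends up as an array ref of cell texts, one per table row.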

Update:
Added WWW::Mechanize.

Update 2:
This gets the html:

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# field name: inseq
my $url = q|http://thr.cit.nih.gov/molbio/hla_bind/index.shtml|;
my $seq = 'EALLKQSWEVLKQNIPGHSLCLFALIIEAAPESKYVFSFLKDSNEIPENNPKLKAHAAVIFKTICESATE
LRQKGQAVWDNNTLKRLGSIHLKNKITDPHFEVMKGALLGTIKEAVKENWSDEMCCAWTEAYNQLVATIK
AEMKE';

my $mech = WWW::Mechanize->new()
    or die "couldn't get Mech object: $!";

$mech->get($url)
    or die "couldn't 'get': $!";

$mech->submit_form(
    form_number => 1,
    fields      => {
        inseq => $seq,
    }
) or die "couldn't submit form";

my $html = $mech->content()
    or die "content failed: $!";

{
    my $file = 'bio_output.html';
    open my $fh, '>', $file
        or die "can't open $file: $!";
    print $fh $html;
    close $fh;
}
The error checking may be a <cough> tad excessive :-)

Update 3:
Here's my go at extracting the data:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

my $html;
{
    local $/;
    my $file = 'bio_output.html';
    open my $fh, '<', $file
        or die "can't open $file: $!";
    $html = <$fh>;
    close $fh;
}

my $t = HTML::TableExtract->new();
$t->parse($html);

my $report = $t->tables_report(1);
print $report;
Output:

---------- Capture Output ----------
> "C:\Perl\bin\perl.exe" monk18.pl
TABLE(0, 0):
User Parameters and Scoring Information:
method selected to limit number of results:explicit number
number of results requested:20
HLA molecule type selected:A_0201
length selected for subsequences to be scored:9
echoing mode selected for input sequence:Y
echoing format:numbered lines
length of user's input peptide sequence:145
number of subsequence scores calculated:137
number of top-scoring subsequences reported back in scoring output table:20

TABLE(0, 1):
Scoring Results:::
Rank:Start Position:Subsequence Residue Listing:Score (Estimate of Half Time of Disassociation of a Molecule Containing This Subsequence)
1: 2:ALLKQSWEV:1930.068
2: 95:KITDPHFEV: 795.962
3: 108:LLGTIKEAV: 57.937
4: 107:ALLGTIKEA: 42.278
5: 21:CLFALIIEA: 42.278
6: 19:SLCLFALII: 16.254
7: 63:TICESATEL: 12.043
8: 12:KQNIPGHSL: 7.581
9: 14:NIPGHSLCL: 2.937
10: 101:FEVMKGALL: 1.911
11: 133:NQLVATIKA: 1.864
12: 35:YVFSFLKDS: 0.970
13: 45:EIPENNPKL: 0.903
14: 75:GQAVWDNNT: 0.756
15: 135:LVATIKAEM: 0.739
16: 103:VMKGALLGT: 0.737
17: 52:KLKAHAAVI: 0.524
18: 123:EMCCAWTEA: 0.457
19: 3:LLKQSWEVL: 0.434
20: 89:SIHLKNKIT: 0.420
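If you want the scores as Perl data rather than a printed report, HTML::TableExtract can pick a table by its column headers. This is only a sketch: the inline HTML is a simplified stand-in for the results page (the real markup is more involved, so the header patterns may need tweaking), with three rows copied from the output above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Simplified stand-in for the results page (an assumption).
my $html = <<'HTML';
<table>
<tr><th>Rank</th><th>Start Position</th><th>Subsequence Residue Listing</th><th>Score</th></tr>
<tr><td>1</td><td>2</td><td>ALLKQSWEV</td><td>1930.068</td></tr>
<tr><td>2</td><td>95</td><td>KITDPHFEV</td><td>795.962</td></tr>
<tr><td>3</td><td>108</td><td>LLGTIKEAV</td><td>57.937</td></tr>
</table>
HTML

# Only tables whose header row matches all of these patterns are kept;
# the header row itself is skipped in the returned rows.
my $te = HTML::TableExtract->new(
    headers => [ 'Rank', 'Start Position', 'Subsequence', 'Score' ],
);
$te->parse($html);

my @hits;
for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        my ( $rank, $start, $peptide, $score ) = @$row;
        push @hits, {
            rank    => $rank,
            start   => $start,
            peptide => $peptide,
            score   => $score,
        };
    }
}

printf "%-10s start %3d score %s\n", @{$_}{qw(peptide start score)}
    for @hits;
```

From there @hits can be sorted, filtered by score, or written out in whatever format you need.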

In reply to Re: bioinformatics problem by wfsp
in thread bioinformatics problem by mutatedgene
