alicia:
It's just possible that some of us are unfamiliar with BLAST, BESTHIT, E-Value and uncertain precisely what "the identies" means in this context. Oh, /me fits that category. It's also possible someone else is expert on all the named whatcha'ma'call'its, and thus able to provide assistance without you supplying the prerequisites for a good question here.
But it's a good idea to satisfy those anyway. They can be found at On asking for help and How do I post a question effectively?. Briefly, they include the code you've written, a data sample, and an unambiguous explanation of why your results don't satisfy you.
Your SoPW doesn't tell us whether you know how to open and read a file(s?); how to assign content to variables; or how to write a regex. For the first, see perldoc open and any number of nodes here dealing with read, while, <> which can be found using Super Search. Generally, that same Super Search will offer examples of assignments to $vars. And for regex help, start with perldoc perlre and perldoc perlretut and the splendid tuts here at the Monastery.
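To make those three pieces concrete, here is a minimal sketch that opens a report, reads it line by line, and uses regexes to assign the hit name, E-value and identities to variables. It assumes the default pairwise text layout (">" lines for hits, "Expect =" and "Identities =" lines for each alignment); the report excerpt and the names seq_A and seq_B are made up for illustration, and your real output may well differ.

```perl
use strict;
use warnings;

# Made-up excerpt in the style of a default (pairwise) BLAST report.
# seq_A and seq_B are hypothetical names, not real data.
my $report = <<'END';
>seq_A hypothetical subject
 Score = 87.9 bits (47),  Expect = 1e-19
 Identities = 47/47 (100%), Gaps = 0/47 (0%)
>seq_B hypothetical subject
 Score = 50.1 bits (26),  Expect = 3e-08
 Identities = 40/45 (89%), Gaps = 1/45 (2%)
END

# Read the string through a filehandle; for a real report you would
# write: open my $fh, '<', $path or die ...
open my $fh, '<', \$report or die "Cannot open report: $!";

my (@hits, $current);
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^>(\S+)/) {
        # A ">" line starts a new hit.
        $current = { name => $1 };
        push @hits, $current;
    }
    elsif ($current && $line =~ /Expect\s*=\s*([\d.eE+-]+)/) {
        # Keep only the first alignment's E-value for each hit.
        $current->{evalue} //= $1;
    }
    elsif ($current && $line =~ m{Identities\s*=\s*(\d+)/(\d+)}) {
        $current->{identical} //= $1;
        $current->{aligned}   //= $2;
    }
}
close $fh;

printf "%s\tE=%s\tidentities=%d/%d\n",
    @{$_}{qw(name evalue identical aligned)} for @hits;
```

Only the first alignment of each hit is recorded here, which is a simplification; a hit with several alignments would need a list per hit.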
The fact you found BioPerl is commendable, but you're going to need to do more than that. In any case, your statement that you "need to do the question without BioPerl" raises other questions: first, "Is this homework or the like?" and second, "Why?"
Unless you show what your BLAST report looks like (tabular, line-wise, etc.) by posting a subset of the report here, we are not going to be able to 'guide' you well. While BioPerl can be a way to go, Boulder::Blast can be another option.
Whether or not to use regular expressions cannot be absolutely ruled in or out, because sometimes you need to combine the parsing abilities of the module you use with the prowess of regular expressions when parsing such sequence or BLAST objects. So without data we're just punching in the dark.
NOTE: You seem to be using a library 'BeginPerlBioinfo'. Where is that coming from, and did you read its documentation already?
Also, you need to read Markup in the Monastery and Perl Monks Approved HTML tags.
Excellence is an Endeavor of Persistence.
A Year-Old Monk :D .
Why on earth would you want to do that without BioPerl? Parsing these reports is not trivial for a beginner (although the tabular output is not TOO difficult to handle) and BioPerl does all that for you already, so why re-invent the wheel? I'm afraid there is no "subroutine that will help with this" other than the methods of Bio::SearchIO, which do all you need and are quite straightforward to use. If you are having trouble using those, then please post your code and I'm sure we can help you get started.
There are several versions and implementations of BLAST. Can we assume you use NCBI's BLAST+:
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ ?
There is a user_manual.pdf there, which specifies the possible output formats.
What output do you have to parse? Did you notice that BLAST can output in a table format (option -outfmt)? With that you can pipe the blast-produced alignment data straight into a database table, or, of course, into a perl program. BLAST+'s table-format makes parsing rather trivial.
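As a rough sketch of how little code the table format needs: the following assumes the default 12-column layout of -outfmt 6 (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and keeps, for each query, the row with the lowest E-value, breaking ties on bit score. The rows and the names query1, subjA and so on are invented for illustration; if you pass custom column specifiers to -outfmt, the @cols list has to change to match.

```perl
use strict;
use warnings;

# Invented rows in the default 12-column layout of "-outfmt 6".
my $table = join "\n",
    join("\t", qw(query1 subjA 98.50 200 3 0 1 200 11 210 1e-100 360)),
    join("\t", qw(query1 subjB 85.00 180 27 0 5 184 40 219 2e-40 150)),
    join("\t", qw(query2 subjC 91.20 120 10 1 1 120 300 419 5e-30 120)),
    '';

# Assumed column order: the default for -outfmt 6.
my @cols = qw(qseqid sseqid pident length mismatch gapopen
              qstart qend sstart send evalue bitscore);

# Read the string through a filehandle; use a real path or STDIN in practice.
open my $fh, '<', \$table or die "Cannot open table: $!";

my %best;    # query id => hash ref of its best row so far
while (my $line = <$fh>) {
    chomp $line;
    next if $line eq '' || $line =~ /^#/;    # skip blanks and comment lines

    my %row;
    @row{@cols} = split /\t/, $line;

    my $seen = $best{ $row{qseqid} };
    if (   !$seen
        || $row{evalue} < $seen->{evalue}
        || (   $row{evalue} == $seen->{evalue}
            && $row{bitscore} > $seen->{bitscore}))
    {
        $best{ $row{qseqid} } = \%row;
    }
}
close $fh;

for my $query (sort keys %best) {
    printf "%s\tbest hit: %s\tE-value: %s\tidentity: %s%%\n",
        $query, @{ $best{$query} }{qw(sseqid evalue pident)};
}
```

Whether "best hit" should mean lowest E-value, highest bit score, or simply the first row reported for each query is a choice you would have to settle from your assignment; the comparison in the if block is the only part that needs to change.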
Can you let us know which implementation and version of BLAST you use? Show the actual output? Nobody can guess what your output looks like... Maybe also tell us whether it's homework or not (because of the no-bioperl condition)?
I recommend reading Beginning Perl for Bioinformatics. Along with the book, the author provides a module, BeginPerlBioinfo.pm, which has a subroutine extract_HSP_information showing how to parse a BLAST result. This is a good starting point for your task.
Yes, that is THE book for learning Perl for bioinformatics: well written and definitely a "must have" for the beginner. But I would add that it's only a good starting point if you a) want to follow the lesson for the purpose of learning Perl or b) really, really cannot use BioPerl in your environment. In all other cases, BioPerl is the way to go.
To pull a common thread from the above posts:
I need to do the question without BioPerl.
95% of the time such statements are completely wrong. The rest of the time it is somebody trying to hide the fact that it is a homework problem.
Tell us why you think you can't use it, and we'll tell you how to make it work despite the limitations imposed on you.