in reply to Re: mismatching characters in dna sequence
in thread mismatching characters in dna sequence
Re^3: mismatching characters in dna sequence
by Eliya (Vicar) on Dec 30, 2011 at 16:29 UTC | |
Which method are you talking about, with respect to those 5 seconds? The XOR method outlined above takes just ~0.05 secs on my 4-year-oldish machine, for 10,000 comparisons against a common 40-char target (with ~3-5 deviations per sequence):
Storing away the results somewhere or doing something else with them will presumably take considerably longer than computing them... | [reply] [d/l] |
by prbndr (Acolyte) on Dec 30, 2011 at 17:13 UTC | |
| [reply] |
|
Re^3: mismatching characters in dna sequence
by BrowserUk (Patriarch) on Dec 30, 2011 at 17:33 UTC | |
Character-by-character processing of strings is the single biggest weakness in Perl's arsenal. As you've already used PDL, you probably have a compiler, so dropping into Inline::C should not be a problem for you. The two routines below differ in the way they return the results. dnacmp() returns a list of strings like this: "123:A:C", whereas dnacmp2() concatenates all those into a single string for return:
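As a minimal sketch of the second style (all mismatches in one string), and not the routine originally posted, a plain C comparison might look like the following. The name dna_mismatches, the 0-based positions, the single-space separator and the caller-supplied output buffer are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT the dnacmp2() from the original post.
 * Walks ref and query byte by byte and appends one "pos:R:Q" entry
 * (0-based position, reference char, query char) per mismatch to out,
 * separated by single spaces. Comparison stops at the end of the
 * shorter string. Returns the number of mismatches found; entries that
 * would not fit in out are counted but not written. */
size_t dna_mismatches(const char *ref, const char *query,
                      char *out, size_t outsize)
{
    size_t count = 0;   /* mismatches seen so far */
    size_t used = 0;    /* bytes of out already filled */

    if (outsize > 0)
        out[0] = '\0';

    for (size_t i = 0; ref[i] != '\0' && query[i] != '\0'; i++) {
        if (ref[i] == query[i])
            continue;
        count++;
        if (used < outsize) {
            int n = snprintf(out + used, outsize - used, "%s%zu:%c:%c",
                             used ? " " : "", i, ref[i], query[i]);
            if (n < 0 || (size_t)n >= outsize - used) {
                out[used] = '\0';   /* drop the truncated entry */
                used = outsize;     /* stop writing, keep counting */
            } else {
                used += (size_t)n;
            }
        }
    }
    return count;
}
```

Called with "ACGTACGT" as the reference and "ACCTACGA" as the query, this fills the buffer with "2:G:C 7:T:A" and returns 2. To call something like it from Perl via Inline::C, some extra glue would likely be needed to hand the buffer back as a Perl string; that part is left out here.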
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |
by prbndr (Acolyte) on Dec 30, 2011 at 20:31 UTC | |
ok, i think inline::c is a little beyond me. i'm trying it out with the following code:
first, i extract some information from the .bam file. then, i create the target and the test sequence on the fly in the subsequent for loop. $ref represents the target and $query represents the test (both are just strings). i then try to feed these two variables into dnacmp2 and print a few more things before it ($read->qname represents a sequence identifier), but it throws the following error:
sorry if this is a rookie question, but what's going on with this? | [reply] [d/l] [select] |
by BrowserUk (Patriarch) on Dec 30, 2011 at 20:42 UTC | |
Hm. The first and most obvious problem is that you've forgotten the my declaration here:
A more subtle potential problem is not having the shebang line as the first line of the file:
I don't know what that first line is meant to do, but either delete it or move it below the Inline::C code. Make those two corrections and then see what happens. You do have a C compiler correctly installed, don't you?
| [reply] [d/l] [select] |
by prbndr (Acolyte) on Dec 30, 2011 at 20:58 UTC | |
by BrowserUk (Patriarch) on Dec 30, 2011 at 21:02 UTC | |