jkmiller has asked for the wisdom of the Perl Monks concerning the following question:

Save me from moving back to Mathematica to
do all my PhD work!! I'm not trusting perl at all at this
point! :)

Thanks to all who posted replies to my first post
last Friday, and to all those who chatted with me
about possible reasons why my code wasn't working.
Unfortunately, none of the comments cleared
up the bizarre behaviour. Perl is still telling
me that a "t" appears earlier than it really does.

I would really appreciate someone downloading my code and
running it exactly as written, with the separate
input file (the 80 chars plus a newline that you can see
in the output),
one.seq.one.line.7.19.04,
rather than quoting the sequence and putting it directly in the
code itself. If someone would do that and report back
what they got, I would be grateful. I'd like
someone else to verify that it is not just my machine! I think
someone else may get hooked on this as well: to me at
least it is just truly bizarre (oh yeah, and frustrating
as ____!).

One last note.

The input line consists of 80 chars (and then, I guess,
a newline). When I open it with vi, it says:
"one.seq.one.line.7.19.04" 1L, 81C
so everything seems correct.
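
A minimal sketch of one way to double-check the character count that vi reports is to look at the raw bytes of the line. This is not part of the original script: the helper name unexpected_bytes and the idea of passing the file name on the command line are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return [offset, byte value] pairs for every byte in
# $line that is not A, C, G, T (either case) or a newline.
sub unexpected_bytes {
    my ($line) = @_;
    my @found;
    my @bytes = unpack 'C*', $line;
    for my $i (0 .. $#bytes) {
        my $ch = chr $bytes[$i];
        push @found, [$i, $bytes[$i]] unless $ch =~ /\A[acgtACGT\n]\z/;
    }
    return @found;
}

# Only touch a file when a name is given on the command line.
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "can't open $ARGV[0]: $!";
    binmode $in;    # read raw bytes, no encoding layer
    while (my $line = <$in>) {
        printf "line %d: %d bytes\n", $., length $line;
        printf "  offset %d: 0x%02x\n", @$_ for unexpected_bytes($line);
    }
    close $in;
}
```

Run as, for example, perl check_bytes.pl one.seq.one.line.7.19.04 (the script name is made up). If the file really holds 80 sequence characters and one newline, the byte count should be 81 and no offsets should be listed.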

Here's another bit of weirdness. As I start deleting
characters from the front of the line, things go from weird (different results as
to where the first "t" is found in the two passes through the
input file) to normal (same results as to where the first "t"
is found in the two passes through the input file).

delete first 30 chars         weird
delete additional 5 chars     normal


start anew:
delete first 32 chars         weird
delete additional 1 char      weird
delete additional 1 char      weird
delete additional 1 char      normal

start anew:
delete first 34 chars         weird
delete additional 1 char      normal
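
The deletion experiment above could be automated along these lines. This is only a sketch under stated assumptions: the sub names first_pos_by_scan and compare_after_deleting are hypothetical, and "weird" is taken to mean that two index() calls on the same string disagree with each other or with a character-by-character scan that does not use index() at all.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: find the first position of $char in $str without
# index(), by walking the string one character at a time.
sub first_pos_by_scan {
    my ($str, $char) = @_;
    for my $i (0 .. length($str) - 1) {
        return $i if substr($str, $i, 1) eq $char;
    }
    return -1;
}

# Hypothetical helper: for 0 .. $max_deleted leading characters removed,
# call index() twice and compare both results with the manual scan.
sub compare_after_deleting {
    my ($seq, $char, $max_deleted) = @_;
    my @report;
    for my $n (0 .. $max_deleted) {
        my $rest   = substr $seq, $n;
        my $first  = index $rest, $char;
        my $second = index $rest, $char;
        my $scan   = first_pos_by_scan($rest, $char);
        my $status = ($first == $second && $first == $scan) ? 'normal' : 'weird';
        push @report, [$n, $first, $second, $scan, $status];
    }
    return @report;
}

# The 80-character sequence quoted in the post; pass a file name as the
# first argument to read the line from the input file instead.
my $seq = 'ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAG'
        . 'gcaagagaatcctgagttccagggctgatgaggg';
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "can't open $ARGV[0]: $!";
    $seq = <$in>;
    close $in;
    defined $seq or die "no line read from $ARGV[0]";
    chomp $seq;
}

printf "deleted %2d: index=%d index=%d scan=%d %s\n", @$_
    for compare_after_deleting($seq, 't', 35);
```

Comparing a run with the embedded string against a run with the file name as argument would show whether the odd results depend on the data coming from a file read.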

Thanks much to all who have comments, and
again, I'd love to have someone run my identical
code and see if they get the same results.
I'm running:
$ perl -v

This is perl, v5.8.0 built for i386-linux-thread-multi
(with 1 registered patch, see perl -V for more detail)



Here is my code and output:

#!/usr/bin/perl -w

open (IN, "one.seq.one.line.7.19.04") or die "can't open IN: $!";
open (OUT, ">out.index.7.19.04") or die "can't open OUT: $!";

while(<IN>){

    print OUT $_;
    print OUT scalar(localtime), "\n";
    print OUT "The first occurance of little a is: ", index($_,"a"), "\n";
    print OUT "The first occurance of little c is: ", index($_,"c"), "\n";
    print OUT "The first occurance of little g is: ", index($_,"g"), "\n";
    print OUT "The first occurance of little t is: ", index($_,"t"), "\n";
    print OUT "\n\n";

    print OUT $_;
    print OUT scalar(localtime), "\n";
    print OUT "The first occurance of little a is: ", index($_,"a"), "\n";
    print OUT "The first occurance of little c is: ", index($_,"c"), "\n";
    print OUT "The first occurance of little g is: ", index($_,"g"), "\n";
    print OUT "The first occurance of little t is: ", index($_,"t"), "\n";
    print OUT "\n\n";

    print $_;
    print "The first occurance of little a is: ", index($_,"a"), "\n";
    print "The first occurance of little c is: ", index($_,"c"), "\n";
    print "The first occurance of little g is: ", index($_,"g"), "\n";
    print "The first occurance of little t is: ", index($_,"t"), "\n";
    print "\n\n";

}
Notice the difference in the output regarding where the first occurrence of
little t is (49 in the first block, 55 in the second).
The output to the screen is "normal" as well.
ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
Mon Jul 19 10:39:31 2004
The first occurance of little a is: 48
The first occurance of little c is: 47
The first occurance of little g is: 46
The first occurance of little t is: 49


ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
Mon Jul 19 10:39:31 2004
The first occurance of little a is: 48
The first occurance of little c is: 47
The first occurance of little g is: 46
The first occurance of little t is: 55
Thanks so much!

Edit by castaway - moved code tags around actual code.

Replies are listed 'Best First'.
Re: Continuing problems w/ "index" -- more info
by danielcid (Scribe) on Jul 19, 2004 at 18:09 UTC

    The output from here:
    bash-2.05b$ cat one.seq.one.line.7.19.04
    ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg

    bash-2.05b$ perl tt1.pl
    ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
    The first occurance of little a is: 48
    The first occurance of little c is: 47
    The first occurance of little g is: 46
    The first occurance of little t is: 55

    bash-2.05b$ cat out.index.7.19.04
    ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
    Mon Jul 19 14:08:46 2004
    The first occurance of little a is: 48
    The first occurance of little c is: 47
    The first occurance of little g is: 46
    The first occurance of little t is: 55

    ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
    Mon Jul 19 14:08:46 2004
    The first occurance of little a is: 48
    The first occurance of little c is: 47
    The first occurance of little g is: 46
    The first occurance of little t is: 55


    Everything seems fine...

    -DBC
Re: Continuing problems w/ "index" -- more info
by Joost (Canon) on Jul 19, 2004 at 18:17 UTC
    Please format your code (and your question) better, next time.

    Running your code with just the one line in the input file gives on stdout:

    ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
    The first occurance of little a is: 48
    The first occurance of little c is: 47
    The first occurance of little g is: 46
    The first occurance of little t is: 55

    and in out.index.7.19.04:
    ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
    Mon Jul 19 20:20:27 2004
    The first occurance of little a is: 48
    The first occurance of little c is: 47
    The first occurance of little g is: 46
    The first occurance of little t is: 55

    ATGGACTGCACCTGGAGGATCCTCTTCTTGGTGGCAGCAGCTACAGgcaagagaatcctgagttccagggctgatgaggg
    Mon Jul 19 20:20:27 2004
    The first occurance of little a is: 48
    The first occurance of little c is: 47
    The first occurance of little g is: 46
    The first occurance of little t is: 55

    So it works correctly for me, using perl 5.8.5 RC2 and 5.8.4 on Debian Linux. Your particular perl binary might be broken; you seem to have an "unofficial" release. What does perl -V give after the "Locally applied patches:" line?

Re: Continuing problems w/ "index" -- more info
by Not_a_Number (Prior) on Jul 19, 2004 at 18:35 UTC

    I get the correct output too (on AS v 5.61 for WinXP). Try updating your perl as Joost suggests.

    If you still get strange results, maybe your input file is corrupt in some way? I doubt if this would work, but try putting:

    s/[^acgt]//ig;

    at the top of your while loop.
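
    A minimal sketch of how that substitution might be tried while also recording what it removes, so that a corrupt byte would show up instead of silently disappearing. The sub name clean_sequence and taking the file name from the command line are assumptions for illustration, not part of the original script. Note that the substitution also removes the trailing newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: strip everything except a, c, g, t (any case) from
# $line, as s/[^acgt]//ig does, and also return the removed characters.
sub clean_sequence {
    my ($line) = @_;
    my @removed = $line =~ /([^acgt])/ig;
    (my $clean = $line) =~ s/[^acgt]//ig;
    return ($clean, @removed);
}

# Only touch a file when a name is given on the command line.
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "can't open $ARGV[0]: $!";
    while (my $line = <$in>) {
        my ($clean, @removed) = clean_sequence($line);
        printf "removed %d character(s): %s\n", scalar @removed,
            join ' ', map { sprintf('0x%02x', ord($_)) } @removed;
        print "The first occurrence of little t is: ", index($clean, 't'), "\n";
    }
    close $in;
}
```

    For a clean 80-character line followed by a newline, the only removed character should be 0x0a.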

    dave