How do I backtrack while reading a file line-by-line?

my @array = qw/ foo bar baz blah bar blah baz/;

my $save = 0;
my %done;
for (my $x = 0; $x <= $#array; $x++) {
  $save = $x if ($array[$x] eq 'bar'  );
  print "X:$x SAVE:$save $array[$x]\n";
  if ( $array[$x] eq 'blah' and !defined($done{$x}) ) {
    $done{$x}++;
    $x = $save;
  }
}

grep

One dead unjugged rabbit fish later

Re^2: How do I backtrack while reading a file line-by-line?

by ikegami (Patriarch) on Oct 13, 2006 at 20:42 UTC

That section on memory usage is very misleading. Tie::File keeps the index of every encountered line (i.e. every line up to the highest one read or written) in memory. In other words, if you do $tied[-1] or push @tied, ..., the index of every line in the file is loaded into memory (if it hasn't been loaded already).

Tie::File is still a very useful module.

Re^3: How do I backtrack while reading a file line-by-line?

by grep (Monsignor) on Oct 13, 2006 at 21:26 UTC

POD

memory - This is an upper limit on the amount of memory that Tie::File will consume at any time while managing the file. This is used for two things: managing the read cache and managing the deferred write buffer

I didn't find that misleading. It says to me that only chunks of the file data are loaded into memory. In fact, I assumed that it loaded a full index of the lines at instantiation.

If the OP knows roughly how much data an average (or the largest) backtrack covers, the read cache could be optimized for memory usage/speed. Plus you get a layer of abstraction to hide any nastiness.
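
For reference, a minimal sketch of backtracking through a tied file. The temporary file, the 'line 4' trigger and the 1 MB memory limit are made up for illustration; only tie, the memory option and untie come from Tie::File itself. Note that asking for the array length makes Tie::File index the whole file, which is the cost ikegami describes above.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Build a small throwaway file to read from (illustration only).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_\n" for 1 .. 5;
close $tmp_fh or die "close: $!";

# 'memory' caps the read cache and deferred-write buffer, not the line index.
tie my @lines, 'Tie::File', $path, memory => 1_000_000
    or die "Cannot tie $path: $!";

my (@visited, %done);
# "$i < @lines" asks for the length, which forces indexing of every line.
for (my $i = 0; $i < @lines; $i++) {
    push @visited, $lines[$i];          # records arrive already chomped
    if ($lines[$i] eq 'line 4' and !$done{$i}++) {
        $i -= 3;                        # after the loop's $i++, resumes at 'line 2'
    }
}
untie @lines;

print "$_\n" for @visited;
```

Backtracking here is plain index arithmetic, which is the layer of abstraction mentioned above; the %done hash keeps the loop from backtracking forever on the same line.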

grep

One dead unjugged rabbit fish later

Re^4: How do I backtrack while reading a file line-by-line?

by ikegami (Patriarch) on Oct 14, 2006 at 07:26 UTC

Re: How do I backtrack while reading a file line-by-line?
by madbombX (Hermit) on Oct 13, 2006 at 19:37 UTC

push

That being said, to add onto ikegami's idea, you can use tell to find out where each line starts and push that offset onto an array. Then, when you want to go back X lines, you can always seek to the saved offset ($lines[-1] .. $lines[-4]).
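
A rough sketch of the tell/seek idea, assuming a seekable handle. The helper name read_with_backtrack, the trigger pattern and the in-memory test data are invented for the example; tell, seek and SEEK_SET are the only real pieces.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: read $fh line by line; the first time a line matching
# $trigger is seen, seek back $back lines and re-read from there.
sub read_with_backtrack {
    my ($fh, $trigger, $back) = @_;
    my (@offsets, @seen, %done);
    while (1) {
        my $pos  = tell $fh;            # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;
        chomp $line;
        push @offsets, $pos;
        push @seen, $line;
        if ($line =~ $trigger and !$done{$pos}++) {
            my $target = $#offsets - $back;
            $target = 0 if $target < 0;
            seek($fh, $offsets[$target], SEEK_SET) or die "seek failed: $!";
            splice @offsets, $target;   # these lines are about to be re-read
        }
    }
    return @seen;
}

my $data = "a\nb\nc\nBACK\nd\n";
open my $fh, '<', \$data or die "open: $!";
print join(' ', read_with_backtrack($fh, qr/^BACK$/, 2)), "\n";
# prints: a b c BACK b c BACK d
```

Only the offsets are stored, not the lines themselves, so memory use stays small even when the lines are long.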

Re^2: How do I backtrack while reading a file line-by-line?

by Fletch (Bishop) on Oct 13, 2006 at 20:46 UTC

Keeping your own buffer also has the advantage of working on something that's not seekable (e.g. a network socket, or a pipe from another program).
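
One way such a buffer might look, as a sketch only: make_reader and the two closures it returns are hypothetical names, and the in-memory handle merely stands in for a socket or pipe. Nothing here calls seek or tell.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps any readable handle and remembers the last
# $keep lines so they can be replayed later.
sub make_reader {
    my ($fh, $keep) = @_;
    my (@history, @pending);

    # Return the next line, replayed lines first, then fresh input.
    my $next = sub {
        my $line = @pending ? shift(@pending) : scalar(<$fh>);
        return unless defined $line;
        push @history, $line;
        shift @history while @history > $keep;
        return $line;
    };

    # Queue the last $n remembered lines for re-reading; returns how many
    # lines were actually queued.
    my $rewind = sub {
        my ($n) = @_;
        $n = @history if $n > @history;
        unshift @pending, splice(@history, -$n) if $n > 0;
        return $n;
    };

    return ($next, $rewind);
}

my $input = "l1\nl2\nl3\nl4\n";
open my $in, '<', \$input or die "open: $!";
my ($next, $rewind) = make_reader($in, 3);

$next->() for 1 .. 3;     # consume l1, l2, l3
$rewind->(2);             # go back two lines
while (defined(my $line = $next->())) {
    print $line;          # l2, l3, then l4
}
```

The trade-off is that you can only go back as far as the buffer holds, so $keep has to cover the largest backtrack you expect.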

Re^2: How do I backtrack while reading a file line-by-line?

by mdunnbass (Monk) on Oct 18, 2006 at 20:14 UTC

Thanks anyway tho.
Matt

Re: How do I backtrack while reading a file line-by-line?
by BrowserUk (Patriarch) on Oct 13, 2006 at 21:41 UTC

Sounds very much like you're trying to read a Fasta format sequence file?

You could use Bio::SeqIO, or if that is giving you problems you might try my crude Fasta load routine. It's the last code snippet in Re^5: Memory Usage in Regex On Large Sequence. That post/thread also shows a problem with the CPAN module along with one reason why its performance is not so good. Though that might have been fixed by now.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: How do I backtrack while reading a file line-by-line?

by mdunnbass (Monk) on Oct 18, 2006 at 20:18 UTC

In fact, what I'm doing is creating a search function. Given a variable number of user-input DNA sequences (such as amino acid motifs or transcription factor binding sites), it searches a user-specified Fasta file for all hits, either in total or within $interval bases of each other. It then outputs all the hits in both .html and .fasta format, with all the matches in the .html highlighted in various colors.

So far, I haven't read up at all on modules, so I suppose that's the next step in my Perl learning curve.

Thanks for the pointer. I'll definitely check it out.
Matt

Re: How do I backtrack while reading a file line-by-line?
by holli (Abbot) on Oct 13, 2006 at 19:55 UTC

redo

perlfunc

holli, /regexed monk/

Re: How do I backtrack while reading a file line-by-line?
by blazar (Canon) on Oct 14, 2006 at 10:17 UTC

Nothing to do with your question, but...

while ($newline = <FILEHANDLE>){

use strict;
use warnings;

and then

while (my $newline = <FILEHANDLE>){

if ($newline = /^>/) {

This is most probably not what you want, since you're assigning to $newline. You want

   if ($newline =~ /^>/) {

instead.
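
To make the difference concrete, here is a small sketch. The sample header line and the value put into $_ are invented; the point is only that a bare /^>/ matches against $_ and that = then stores the result of that match.

```perl
use strict;
use warnings;

my $newline = ">seq1 description\n";
local $_ = "not a header";       # stands in for whatever $_ happens to hold

# With '=', the regex is matched against $_ and the result is assigned.
my $copy = $newline;
$copy = /^>/;                    # $copy now holds the (false) match result
my $clobbered = ($copy eq '') ? 1 : 0;

# With '=~', the regex is bound to $newline, which is left untouched.
my $bound = ($newline =~ /^>/) ? 1 : 0;

print "clobbered=$clobbered bound=$bound\n";   # prints: clobbered=1 bound=1
```

So the = version both tests the wrong string and destroys the line that was just read.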

$stuff = $newline; &play_with($stuff);

Unless play_with() modifies its argument, you may want to pass $newline directly to it, without going through an intermediate variable. But more importantly, the &-form of sub call is now obsolete and likely not to do what one may think, so unless you know exactly what it does, don't use it!
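
One surprise with the &-form, shown as a sketch with made-up sub names (count_args and outer are hypothetical): calling &name with no parentheses hands the callee the caller's current @_ instead of an empty argument list.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub count_args { return scalar @_ }

sub outer {
    my $plain = count_args();    # explicit empty argument list
    my $amp   = &count_args;     # no parens: reuses outer()'s @_
    return ($plain, $amp);
}

my ($plain, $amp) = outer('a', 'b', 'c');
print "count_args(): $plain, &count_args: $amp\n";
# prints: count_args(): 0, &count_args: 3
```

With parentheses, as in &play_with($stuff), the arguments are passed as written, but the & still bypasses any prototype the sub declares.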

Re^2: How do I backtrack while reading a file line-by-line?

by mdunnbass (Monk) on Oct 18, 2006 at 20:21 UTC

I didn't know about the & form being deprecated, so I will get rid of that.

And &play_with($stuff) does indeed modify $stuff, so I guess I am doing the right thing there, although I can't take credit for doing it on purpose. ;)

Thanks for the info though.
Matt

Re^3: How do I backtrack while reading a file line-by-line?

by mdunnbass (Monk) on Oct 18, 2006 at 20:24 UTC

When I wrote the code block in the original post, the

if($newline = /^>/) {

if($newline =~ /^>/) {
