PerlMonks  

Re^3: Finding Line numbers in a file

by Rhandom (Curate)
on Apr 04, 2007 at 16:09 UTC


in reply to Re^2: Finding Line numbers in a file
in thread Finding Line numbers in a file

You may be right, but you may just as easily be wrong. There are several things that we don't know, such as:
  • Average line length. Shorter lines mean more low-level iterations.
  • Average file length. Longer files will require more memory, but that is about all.
  • Average hit count. How often is the string found in the file?
  • Average hit placement. How often does the string end up near the beginning or the end of the file?
  • Implementation issues. Is the string passed in already in one chunk, or do we have access to a file handle?
There are just too many unknowns to make blanket statements about which algorithm is best.
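
To make the last point concrete, here is a minimal sketch of the file-handle case, where Perl's built-in $. variable already tracks the line number for you. The helper name lines_matching_fh and the in-memory file handle are assumptions made for illustration only; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line numbers on which $pattern matches,
# reading from an already-open file handle one line at a time.
sub lines_matching_fh {
    my ($fh, $pattern) = @_;
    my @lines;
    while (my $line = <$fh>) {
        # $. holds the current line number of the most recently read handle
        push @lines, $. if $line =~ $pattern;
    }
    return @lines;
}

# An in-memory file handle stands in for a real file here (an assumption).
my $data = "1 one\n2 two\n3 one\n4 four\n5 one\n6 five\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my @found = lines_matching_fh($fh, qr/one/);
close $fh;

print "Found on line $_\n" for @found;
```

Note that this variant reports each matching line once, even if the string occurs several times on it, so it is not a drop-in replacement for a per-match search over a slurped string.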

One thing that is a major issue, though, is that the special regex match variables ($`, $& and $') shouldn't be used. They impose too much of a performance penalty. Instead, you can use the @- and @+ arrays, which carry no such penalty, as in the following:

    my $str = "1 one
    2 two
    3 one
    4 four
    5 one
    6 five";
    my $last_pos = 0;
    my $newlines = 1;
    while ($str =~ /(one)/g) {
        $newlines += substr($str, $last_pos, $-[0] - $last_pos) =~ tr/\n//;
        $last_pos = $-[0];
        print "Found on line $newlines\n";
    }
    # prints
    # Found on line 1
    # Found on line 3
    # Found on line 5


Notice the optimization: each iteration only counts the newlines between the previous match and the current one, rather than rescanning the string from the beginning.
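
For reuse, the same idea can be wrapped in a subroutine. What follows is only a sketch: the name match_line_numbers is hypothetical, and it assumes the whole text is already in one string. The second half shows how the offsets in @- and @+ can stand in for the penalized match variables.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line number of every match of $pattern in $str.
sub match_line_numbers {
    my ($str, $pattern) = @_;
    my ($last_pos, $line) = (0, 1);
    my @lines;
    while ($str =~ /$pattern/g) {
        # $-[0] is the start offset of the whole match; only count the
        # newlines between the previous match start and this one.
        $line += substr($str, $last_pos, $-[0] - $last_pos) =~ tr/\n//;
        $last_pos = $-[0];
        push @lines, $line;
    }
    return @lines;
}

my $text = "1 one\n2 two\n3 one\n4 four\n5 one\n6 five";
print "Found on line $_\n" for match_line_numbers($text, qr/one/);

# Equivalents of the penalized match variables, built from @- and @+:
my ($pre, $match, $post);
if ($text =~ /two/) {
    $pre   = substr($text, 0, $-[0]);               # same text as $`
    $match = substr($text, $-[0], $+[0] - $-[0]);   # same text as $&
    $post  = substr($text, $+[0]);                  # same text as $'
}
```

Unlike a line-by-line scan, this reports one entry per match, so a line containing the string twice appears twice in the result.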

my @a=qw(random brilliant braindead); print $a[rand(@a)];
