Re^2: Finding Line numbers in a file

Nice, but inefficient, and it gets worse the bigger the text file is.

Do not do this; use the other approaches. Their run time grows linearly with the size of the text file, and they do not require the entire file to be loaded into memory.
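To illustrate what a line-by-line scan looks like, here is a minimal sketch. It assumes a literal search string and a readable filehandle; the sub name `find_line_numbers` and the sample text are made up for the example and come from nobody's post in this thread. `$.` is Perl's built-in line counter for the filehandle last read.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the line numbers on which $needle occurs.
# Only one line is held in memory at a time, so memory use stays flat
# however large the input is.
sub find_line_numbers {
    my ($fh, $needle) = @_;
    my @hits;
    while (my $line = <$fh>) {
        # $. holds the number of the line just read from $fh
        push @hits, $. if index($line, $needle) >= 0;
    }
    return @hits;
}

# Demo on an in-memory filehandle so the sketch runs without a real file.
my $text = "1 one\n2 two\n3 one\n4 four\n5 one\n6 five\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print "Found on line $_\n" for find_line_numbers($fh, 'one');
close $fh;
```

With a real file, you would open it by path instead of by reference to a string; the loop itself stays the same.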

---
my name's not Keith, and I'm not reasonable.


Replies are listed 'Best First'.
Re^3: Finding Line numbers in a file by Rhandom (Curate) on Apr 04, 2007 at 16:09 UTC
You are possibly right. You are just as possibly wrong. There are several things that we don't know, such as:

- Average line length. Shorter lines mean more low-level iterations.
- Average file length. Longer files will require more memory, but that is about all.
- Average hit count. How often is the string found in the file?
- Average hit placement. How often does the string end up at the beginning or the end?
- Implementation issues. Is the string passed in already in one chunk, or do we have access to a file handle?

There are just too many unknowns to make blanket statements about which algorithm is best.

But one major issue is that the special regex capture variables shouldn't be used. They impose too much of a penalty. Instead, you can use `@-` and `@+`, which have no penalty, as in the following:

```perl
my $str = "1 one\n2 two\n3 one\n4 four\n5 one\n6 five";
my $last_pos = 0;
my $newlines = 1;
while ($str =~ /(one)/g) {
    # $-[0] is the start offset of the current match; count only the
    # newlines between the previous match and this one
    $newlines += substr($str, $last_pos, $-[0] - $last_pos) =~ tr/\n//;
    $last_pos = $-[0];
    print "Found on line $newlines\n";
}
# prints
# Found on line 1
# Found on line 3
# Found on line 5
```

Notice the optimization that only counts newlines since the previous match.

my @a=qw(random brilliant braindead); print $a[rand(@a)];
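Since the answer depends on so many unknowns, one way to settle it for a particular data set is to measure. The following is only a sketch: the sub names `lines_by_offset` and `lines_by_handle`, the generated sample text, and the pattern are all assumptions made for illustration. It uses the core `Benchmark` module to compare the offset-counting approach with a line-by-line scan over the same string. The numbers it prints will vary with line length, hit count, Perl version, and machine, so treat them as a local data point and nothing more.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Offset-based counting on a string already in memory (the @- approach).
sub lines_by_offset {
    my ($str, $re) = @_;
    my ($last_pos, $line, @hits) = (0, 1);
    while ($str =~ /$re/g) {
        # Count newlines only between the previous match and this one.
        $line += substr($str, $last_pos, $-[0] - $last_pos) =~ tr/\n//;
        $last_pos = $-[0];
        push @hits, $line;
    }
    return @hits;
}

# Line-by-line scan through a filehandle opened on the same string.
sub lines_by_handle {
    my ($str, $re) = @_;
    open my $fh, '<', \$str or die "Cannot open in-memory file: $!";
    my @hits;
    while (my $line = <$fh>) {
        # One entry per match, so both subs report the same list.
        push @hits, $. while $line =~ /$re/g;
    }
    close $fh;
    return @hits;
}

# Made-up sample: 5000 lines, with a hit on every tenth line.
my $sample = join '', map {
    $_ % 10 ? "line $_ filler\n" : "line $_ needle\n"
} 1 .. 5000;
my $pattern = qr/needle/;

# Run each candidate for at least one CPU second and print a comparison.
cmpthese(-1, {
    offset => sub { my @h = lines_by_offset($sample, $pattern) },
    handle => sub { my @h = lines_by_handle($sample, $pattern) },
});
```

Changing the line length, the hit frequency, or the size of `$sample` is a quick way to see how much each of the unknowns listed above matters.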
Re^3: Finding Line numbers in a file by sanPerl (Friar) on Apr 04, 2007 at 15:51 UTC
Dear kyle and reasonablekeith, thanks for the suggestion and the warning. This is making me think in new directions.