
Re^2: Finding Line numbers in a file

by reasonablekeith (Deacon)
on Apr 04, 2007 at 15:29 UTC (#608299)

in reply to Re: Finding Line numbers in a file
in thread Finding Line numbers in a file

Nice, but inefficient, and it gets worse as the text file grows.

Do not do this; use the other approaches. Their running time grows linearly with the size of the text file, and they do not require the entire file to be loaded into memory.
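
For illustration, a minimal sketch of a line-by-line scan, assuming a filehandle is available. The helper name `matching_lines` and the sample text are hypothetical, not taken from the other posts in this thread. It relies on Perl's built-in `$.` variable, which holds the current line number of the last-read filehandle, so only one line is in memory at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line numbers on which $pattern matches,
# reading from an already opened filehandle one line at a time.
sub matching_lines {
    my ($fh, $pattern) = @_;
    my @hits;
    while (my $line = <$fh>) {
        # $. is the current line number of the last-read filehandle
        push @hits, $. if $line =~ $pattern;
    }
    return @hits;
}

# An in-memory filehandle keeps the sketch runnable without a real file
my $text = "1 one\n2 two\n3 one\n4 four\n5 one\n6 five\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @lines = matching_lines($fh, qr/one/);
close $fh;

print "Found on line $_\n" for @lines;
```

Note that this reports each matching line once, even if the string occurs several times on that line.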

my name's not Keith, and I'm not reasonable.

Replies are listed 'Best First'.
Re^3: Finding Line numbers in a file
by Rhandom (Curate) on Apr 04, 2007 at 16:09 UTC
    You are possibly right. You are just as possibly wrong. There are several things that we don't know, such as:
    • Average line length. Shorter lines mean more low-level iterations.
    • Average file length. Longer files will require more memory, but that is about all.
    • Average hit count. How often is the string found in the file?
    • Average hit placement. How often does the string end up at the beginning or the end?
    • Implementation issues. Is the string passed in already in one chunk, or do we have access to a file handle?
    There are just too many unknowns to use blanket statements as to which algorithm is best.

    One thing that is a major issue, though, is that the special regex match variables ($`, $& and $') shouldn't be used. They impose too much of a penalty. Instead, you can use @- and @+, which have no penalty, as in the following:

    my $str = "1 one\n2 two\n3 one\n4 four\n5 one\n6 five";
    my $last_pos = 0;
    my $newlines = 1;
    while ($str =~ /(one)/g) {
        $newlines += substr($str, $last_pos, $-[0] - $last_pos) =~ tr/\n//;
        $last_pos = $-[0];
        print "Found on line $newlines\n";
    }
    # prints
    # Found on line 1
    # Found on line 3
    # Found on line 5

    Notice the optimization: newlines are counted only from the previous match onward, not from the start of the string each time.
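
    The same idea could be wrapped in a reusable function. This is only a sketch: the name `match_line_numbers` is hypothetical, and it assumes the whole text is already in one string.

```perl
use strict;
use warnings;

# Hypothetical helper: line numbers of every match of $pattern in $str,
# counting newlines only in the text between consecutive matches.
sub match_line_numbers {
    my ($str, $pattern) = @_;
    my ($last_pos, $line) = (0, 1);
    my @hits;
    while ($str =~ /$pattern/g) {
        # $-[0] is the start offset of the current match
        my $gap = substr($str, $last_pos, $-[0] - $last_pos);
        $line += ($gap =~ tr/\n//);   # tr with an empty replacement only counts
        $last_pos = $-[0];
        push @hits, $line;
    }
    return @hits;
}

my @hits = match_line_numbers("1 one\n2 two\n3 one\n4 four\n5 one\n6 five", qr/one/);
print "Found on line $_\n" for @hits;
```

    Because each gap is scanned once, the newline counting stays linear in the length of the string no matter how many matches there are.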

    my @a=qw(random brilliant braindead); print $a[rand(@a)];
Re^3: Finding Line numbers in a file
by sanPerl (Friar) on Apr 04, 2007 at 15:51 UTC
    Dear kyle and reasonablekeith,
    Thanks for the suggestion and also for the warning. This is making me think in new directions.
