Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi.

I have a flat file of about 71KB and 500 lines. It keeps growing all the time, since it's a user database. I have some scripts that process it to authenticate users.

Which would eat up fewer system resources? Doing a sequential scan on the file, for example:

open(DB, 'one.txt') or die "Error: $!";
while(<DB>) {
chomp $_;
if($_ eq $whatever) { $info = $_; last; }  # Perl has no 'break'; 'last' exits the loop
else { next; }
}
close(DB);

Or loading it into an array and then processing it, for example:

open(DB, 'one.txt') or die "Error: $!";
@db = <DB>;
close(DB);

foreach $i (@db) {
chomp $i;
if($i eq $whatever) { $info = $i; last; }  # 'last', not 'break'
else { next; }
}

?

Thanks,
Ralph.

Replies are listed 'Best First'.
Re: What eats up less resources?
by Biker (Priest) on Dec 11, 2001 at 19:15 UTC

    Think of scalability. Your file will grow. Slowly or quickly depends on your situation.

    Program for the future.

    I dislike reading a 'small' file into RAM, because one day (when I'm on vacation?) that file will be really big. I dislike doing anything that relies upon assumptions made about input data, because input data is unreliable in its nature.

    f--k the world!!!!
    /dev/world has reached maximal mount count, check forced.

Re: What eats up less resources?
by strat (Canon) on Dec 11, 2001 at 18:50 UTC
    What kind of system resources are you thinking about? RAM? CPU usage? Runtime?

    The first one might need less RAM because it doesn't seem to slurp the whole file into RAM as the second one does.

    To find out about runtime I'd have to do some benchmarking, but I think that for small files the latter example might perform slightly better...

    Best regards,
    perl -e "print a|r,p|d=>b|p=>chr 3**2 .7=>t and t"

Re: What eats up less resources?
by trantor (Chaplain) on Dec 11, 2001 at 18:47 UTC

    Try it yourself with benchmark :-)

    Reading record by record is buffered anyway, so my opinion is that, since your file grows over time, slurping it in a single read would use much more memory in the long run.

    -- TMTOWTDI
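The benchmarking the monks above suggest might look something like the following sketch. It uses the core Benchmark module's timethese to compare the two approaches from the question; the file name bench_users.txt, the 500-line sample data, and the searched-for key are made up for the sake of a self-contained example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Build a hypothetical 500-line user file so the benchmark is self-contained.
my $file = 'bench_users.txt';
open(my $out, '>', $file) or die "Can't create $file: $!";
print $out "user$_\n" for 1 .. 500;
close($out);

my $whatever = 'user250';   # record we pretend to authenticate

timethese(500, {
    # Approach 1: scan line by line, stop as soon as the record is found
    'line_by_line' => sub {
        my $info;
        open(my $db, '<', $file) or die "Can't open $file: $!";
        while (my $line = <$db>) {
            chomp $line;
            if ($line eq $whatever) { $info = $line; last; }
        }
        close($db);
    },
    # Approach 2: slurp the whole file into an array, then scan it
    'slurp' => sub {
        my $info;
        open(my $db, '<', $file) or die "Can't open $file: $!";
        my @db = <$db>;
        close($db);
        chomp(@db);
        foreach my $line (@db) {
            if ($line eq $whatever) { $info = $line; last; }
        }
    },
});
```

Running it prints wall-clock and CPU times for each labelled sub, which answers the runtime half of the question; memory usage you'd have to watch separately (e.g. with top).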

Re: What eats up less resources?
by archen (Pilgrim) on Dec 11, 2001 at 22:53 UTC
    The key thing here is "It keeps growing all the time". Doing it line by line will be far slower, but will probably tax the system less overall. If you don't know how big this thing will get, the system could take a major RAM hit loading it into an array. Personally I only load files into an array when I can be sure that they are small (like config files), or I'm at home being lazy on my own machine.

    On another note: if you load it into an array, you can just chomp the whole array at once, like ' chomp(@db) '. I'm also fairly sure that chop would be faster than chomp (by maybe a cycle or two), and if you're sure that every line ends in a newline, chop is pretty safe.
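A small sketch of the difference (the names are made up): chomp only removes a trailing input separator, while chop unconditionally removes the last character, whatever it is.

```perl
use strict;
use warnings;

my @db = ("alice\n", "bob\n", "carol");

# chomp removes a trailing newline if present; 'carol' is left alone
chomp(@db);

my @copy = ("dave\n", "eve");

# chop removes the last character no matter what: 'eve' becomes 'ev'
chop(@copy);

print "@db\n";    # alice bob carol
print "@copy\n";  # dave ev
```

So chop is only a safe shortcut when every element is guaranteed to end in a newline, as the reply above says.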
Re: What eats up less resources?
by jlongino (Parson) on Dec 11, 2001 at 21:57 UTC
    As others have pointed out, you should try benchmarking some test code if execution speed or efficiency are your concerns. Some benchmarking examples can be found through Super Search. I posted a Benchmarking Quiz not long ago that has some examples.

    However, you should also take into consideration your available resources and their relative strengths and weaknesses (e.g., slow disk, i/o, RAM available, CPU utilization, etc.). If you have scads of RAM and CPU cycles, slurp and burp!

    Most of my "work" perl code is on a Sparc Enterprise 450 with 1GB of RAM that is primarily an E-mail server, so I routinely slurp text files 5MB+ with no noticeable impact on performance or resources. It's relative to your environment.

    --Jim

Re: What eats up less resources?
by strat (Canon) on Dec 11, 2001 at 22:39 UTC
    A sidestep on optimizing: if you really want fast code, try to find the pieces of code that are run most often (maybe with the module Devel::DProf) and optimize the most frequently called functions, maybe even by translating them to C if possible and useful.

    In general, I don't optimize for runtime or memory usage; most of the time it is much cheaper to buy more RAM or a faster CPU. Sometimes I do have to optimize, maybe because of hard limits like the roughly 2 GB per-process memory limit on 32-bit operating systems.

    Best regards,
    perl -e "print a|r,p|d=>b|p=>chr 3**2 .7=>t and t"

Re: What eats up less resources?
by Anonymous Monk on Dec 11, 2001 at 19:19 UTC
    Ok, thanks for your replies. For what I'm doing (user authentication against a database that keeps growing), which one will put less load on the server (RAM, CPU, everything) when processing the db?

    Ralph :)
Re: What eats up less resources?
by Spudnuts (Pilgrim) on Dec 12, 2001 at 02:12 UTC
    Could you use a DBM file instead of a flat text file? Examples of this are in Chapter 1 of the Llama book.
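A DBM file turns the linear scan into a keyed lookup. A minimal sketch using the core SDBM_File module via tie (the file name 'userdb' and the sample key/value are made up; a real setup would convert one.txt to the DBM file once, then have the auth scripts do lookups only):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;        # for O_RDWR, O_CREAT
use SDBM_File;

# Tie a hash to an on-disk DBM database (creates userdb.pag / userdb.dir)
my %users;
tie %users, 'SDBM_File', 'userdb', O_RDWR | O_CREAT, 0640
    or die "Can't tie userdb: $!";

# Insert (or update) a record; with a real user file you'd load it once here
$users{'ralph'} = 'some_info';

# Lookup by key: no file scan, and the whole database never sits in RAM
if (exists $users{'ralph'}) {
    my $info = $users{'ralph'};
    print "found: $info\n";
}

untie %users;
```

Since the file keeps growing, this sidesteps the original question: neither a full scan nor a full slurp is needed per authentication.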