Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi.

Currently, I'm running a flat file user database delimited by pipe chars ('|'). All I do on this file is append new records and authenticate user logins. Still, in many scripts I load the full file into lists (arrays) so as to process it. The file now stands at 71KB. My question is...

When will I reach the point at which I'll have to migrate this database to MySQL?

Thanks,
Ralph.

Note: I know that I should've used MySQL from the start. I just wasn't able to do this due to past limitations on the server.

Replies are listed 'Best First'.
Re: Text Handling Limit
by dws (Chancellor) on Dec 09, 2001 at 00:42 UTC
    When will I reach the limit in which I'll have to migrate this database to MySQL?

    A 71KB flat file database is pretty small by today's standards. If it's a single-user database that you aren't performing complex queries against, you are potentially fine up through several megabytes.

    But there are reasons besides size to consider moving to an RDBMS. For one, you get the benefits of indexing (fast record lookup) and SQL for expressing complex queries. Perhaps not an issue if you're never going to grow past a single table, but something you'll need once you split into multiple tables. Next up is concurrent access control, which you can do by hand (via flock()) for a single-table database, but which is a major nuisance once you get past one table. And then there's the ability to roll back a transaction. Again, not a problem for a single-table flat-file database, but a big win when multiple people are working simultaneously against multiple tables.
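
    Something like this is what the flock() route looks like for the append case (a sketch only; the file name and field layout are made up, not Ralph's actual code):

    use Fcntl qw(:flock);

    # Append one record under an exclusive lock so that two CGI
    # processes can't interleave their writes. 'users.db' and the
    # field order are assumptions for illustration.
    my ($user, $crypted_pass, $email) = ('ralph', 'xyPDQ1abc2def', 'ralph@example.com');
    open my $db, '>>', 'users.db' or die "cannot open users.db for append: $!\n";
    flock $db, LOCK_EX            or die "cannot lock users.db: $!\n";
    print $db join('|', $user, $crypted_pass, $email), "\n";
    close $db;                    # closing the handle releases the lock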

    There are also costs: RDBMSs require a bit of care and feeding that flatfiles don't. This varies by RDBMS; MySQL isn't too bad.

Re: Text Handling Limit
by TomK32 (Monk) on Dec 09, 2001 at 00:53 UTC
    Short answer: When your server gets slow.

    A bit longer: the advantage of flat files is that they might be faster (I never ran a benchmark), depending on the file's size and the server's I/O throughput on the one side and the database on the other.
    If you configure your server so that it keeps the user flat file cached, the flat file should be faster.

    You also said only a few scripts use the database file; try to reduce that number further. If you really don't want to dig into SQL, you can also try splitting up your text file: seldom-used stuff like a user bio can be moved into separate files (one file per user, or maybe one for all users starting with the same letter -> ~36 files; see the sketch below), or better yet into a static file that gets regenerated every hour/day.
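
    A quick sketch of the one-file-per-letter bucketing (the bios/ directory and .txt naming are just for illustration):

    # Pick the bio file by the first character of the username.
    my $user     = 'ralph';
    my $bucket   = lc substr $user, 0, 1;     # 'r'
    my $bio_file = "bios/$bucket.txt";        # all 'r' users share bios/r.txt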

    There are various ways to get around SQL. Personally I prefer SQL (cuz I don't have to pay for the database space), but I also like text files (like in my diary) because I can update them a lot more comfortably.
    It depends on the way you wanna do it and on your server.

    -- package Lizard::King; sub can { do { 'anything'} };
Re: Text Handling Limit
by grinder (Bishop) on Dec 09, 2001 at 03:54 UTC
    All I do on this file is append new records and authenticate user logins. Still, in many scripts I load the full file into lists (arrays) so as to process it.

    My question is...

    Do you really need to pull all the records in at once? Do you need to examine all the records simultaneously before coming to a decision, or do you need to perform two or more passes through the file?

    Honestly, I doubt that is the case. I'd bet a beer or two that you could get by with a sequential scan, which means you wouldn't have to load everything into an array. i.e.:

    open IN, '<', 'db.txt' or die "cannot open db.txt for input: $!\n";
    while (<IN>) {
        chomp;
        my @fields = split /\|/;   # the pipe must be escaped: a bare '|' is regex alternation
        ...
    }
    close IN;

    If you need to sort the "database", use the operating system's sort command (assuming you're on anything but Win32). If you need to perform multiple passes, just seek back to the beginning of the file and read it again, and rely on the fact that the OS will probably have buffered it in RAM.
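
    For the multi-pass case, a small sketch, reusing the IN handle from the loop above:

    use Fcntl qw(SEEK_SET);

    # Rewind to the top of the file for a second pass; the OS buffer
    # cache will usually serve the re-read straight from RAM.
    seek IN, 0, SEEK_SET or die "cannot rewind db.txt: $!\n";
    while (<IN>) {
        chomp;
        # ... second pass over the same records ...
    }

    # And the external sort would be along these lines:
    #   sort -t'|' -k1,1 db.txt > db.sorted
    # (sorts on the first pipe-delimited field)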

    If you can avoid pulling everything into RAM as much as possible, you'll be able to live with a flat file that much longer. Flat files are cool: you can debug them with vi!

    And the day you need to get a real database, don't just assume that means MySQL. Postgres has a lot going for it (and version 7, which has been out for a while now, is a hell of a lot easier to install than previous versions).

    --

    g r i n d e r
    just another bofh

    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u';
Re: Text Handling Limit
by jlongino (Parson) on Dec 09, 2001 at 03:04 UTC
    Both dws and TomK32 give useful suggestions. Usually, the number of records is not the primary concern when deciding whether to change from a flat file to an RDBMS.

    The limitations of a flat-file database are felt more often through having to store a great deal of information per record, particularly if you need to store complex data like arrays or hashes.

    In general, most texts describe flat-file databases as an ideal solution for implementing user authentication schemes, primarily because they are fast and don't require complex processing. Using an RDBMS for this type of task would probably be overkill unless you are experiencing performance problems (which you shouldn't be, if your system is as simple as you describe).
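
    For what it's worth, a minimal sketch of such an authentication scan (the file name, field order, and crypt()-style password storage are all assumptions about Ralph's setup):

    # Return true if $user/$pass match a record. Assumes fields run
    # username|crypted_password|..., which is a guess at the layout.
    sub authenticate {
        my ($user, $pass) = @_;
        open my $fh, '<', 'users.db' or die "cannot open users.db: $!\n";
        while (<$fh>) {
            chomp;
            my ($name, $crypted) = (split /\|/)[0, 1];
            next unless defined $crypted;
            return 1 if $name eq $user and crypt($pass, $crypted) eq $crypted;
        }
        return 0;
    }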

    --Jim