http://qs1969.pair.com?node_id=327415

JayBee has asked for the wisdom of the Perl Monks concerning the following question:

    I've updated the title: from DBI or Flat Files, Which is faster
I am working with user inputs and retrieving them for later use. Hopefully there will be lots of inputs to come, but as you may know, speed will decrease as requests increase. I want to start out by going the fastest route before it gets too late. Any suggestions on this?

Also, is there a limit on how many scripts can access the same SQL or the same File.txt?

Thank you in advance... JayBee

Replies are listed 'Best First'.
Re: DBI or Flat Files, Which is faster
by Zaxo (Archbishop) on Feb 08, 2004 at 03:58 UTC

    DBI is a wrapper for unified access to all sorts of databases, including flat files. The kind of database is determined by the connect string you use for DBI. By writing to DBI's commonest interface, you can switch db types as need arises, with minimal disruption.

    As you suggest, flat files don't scale well for speed. They are easily exported to other db types. DBI always costs a little speed, but repays the investment with easier maintainance and modification.

    After Compline,
    Zaxo

Re: mySQL or Flat Files, Which is faster
by Sol-Invictus (Scribe) on Feb 08, 2004 at 12:10 UTC

    The answer to this depends on what you want to do. If you are talking about a couple of hundred data entries then flat files will be be faster than mSQL or MySQL not because either of these servers aren't fast, but you still have to connect to and negotiate with the server to get the data.

    On larger amounts of data mSQL or MySQL will be faster even with connecting and negotiating because the way they store the data is organised in such a way as to make it fast to search through (provided of course you have set up the database to store the data for efficient retrieval) - flat files have to be first read in then parsed into data structures, which causes a bottleneck.

    Sol-Invictus

    You spend twenty years learning the spell that makes nude virgins appear in your bedroom, and then you're so poisoned by quicksilver fumes and half-blind from reading old grimoires that you can't remember what happens next.

Re: mySQL or Flat Files, Which is faster
by Trimbach (Curate) on Feb 08, 2004 at 14:16 UTC
    Yes there's some overhead for DBI connections, but unless your program can access particular files by name (that is, your program is given a specific file to access without having to do any searching) then MySQL is going to definitely be faster. For example, if you have one file per user, and each file is named after the user, and the only thing your program will ever do for User X is access File X, then flat files will be quicker.

    On the other hand, if there are multiple files per user, or you want to slice and dice the data in different ways (i.e. "Give me all the files for User X, created between January and March, that have to do with 'Foo'") then you should just go ahead and do the relational database tango and use MySQL. Although it's true that MySQL builds its internal data structures on top of whatever file system it's in, it gets HUGE speed advantages over flatfiles from the proper use of indexes, allowing you to get to just the data you need without having to search through every file in the database.

    Think about what you're doing (and what you're likely to want to do with your program in the future) and make the appropriate decision. :-)

    Gary Blackburn
    Trained Killer

Re: mySQL or Flat Files, Which is faster
by Abigail-II (Bishop) on Feb 08, 2004 at 17:32 UTC
    The question "what is faster, MySQL or flat files" is meaningless. What is faster, a car or ship? That would depend on where you are going. Over land, a car is usually faster, but over water, a ship tends to be faster.

    Which of MySQL and flat files is faster depends on what you are going to do. If it's mainly adding data, and there's not much querying going on, flat files might be a lot faster - you'd just add the information to the end. If your data is very stable, and you need to do complex queries, MySQL will win. If you have a lot of concurrent access, MySQL might win, but sometimes, flat files will win. If you do a lot of queries, where most of the time half or more of the data is needed for the result, flat files will probably faster. If you are doing a lot of random deletes, MySQL is probably faster, specially if you have the right indices. If you just have 50 rows, who cares what is faster.

    I want to start out by going the fastest route before it gets too late.
    But before you can answer the question what is the fastest route, you first have to determine what kind of data you are dealing with, how and how often the data changes, and which kinds of queries you have to perform, and what their frequencies are.

    Abigail

Re: mySQL or Flat Files, Which is faster
by zentara (Archbishop) on Feb 08, 2004 at 14:36 UTC
    I don't know exactly what you are doing, but don't forget about modules like Storable for saving db data. I'm using it with a db I'm setting up, and it works very fast. Just put your data into hashes, and Store and Retreive it. The trick is to minmize the disk-read-writes. So you might need to plan ahead so you only read the stored db you need.

    A big plus for MySQl or postgresql is that you have multi-user access built-in, although I've read the latest version of BerkeleyDB has "concurrent access".

    So I would say the main thing to consider is: "Are you planning to allow simultaneous access", otherwise you should think about file locking.

Re: mySQL or Flat Files, Which is faster
by CountZero (Bishop) on Feb 08, 2004 at 15:07 UTC
    Speed is of course a major thing to consider, but equally (if not more) important is the matter of concurrent access to your data-files.

    Flat files not living within a database server allow multiple readers but cannot allow multiple concurrent writers.

    Database servers (such as MySQL or PostgreSQL) allow such multiple read and write access.

    If you take that into account, there is no choice: only a real database server will do.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: mySQL or Flat Files, Which is faster
by blue_cowdawg (Monsignor) on Feb 08, 2004 at 16:18 UTC

        I want to start out by going the fastest route before it gets too late.
        ....
        Also, is there a limit on how many scripts can access the same SQL or the same File.txt?

    Concurrent access of a flat file

    This in my opinion is always problematic. If you have multiple scripts reading the same file then there is no issue but if by concurrent access you are talking about a read-modify-write situation then the issues with flat files are legion. If you have a RDBMS available to you use it.

    Databases concurrent access

    How many concurrent accesses for database XYZ? The answer: it depends. Database scalablity is a mult-variable equation. Some of those variables have to do with the architecture of the database itself and others with the available bandwidth to the database instance, architecture of the hardware and other factors. The kinds of stuff that in my profession (Unix Engineer) cause me worry wrinkles and loss of sleep.

    A rule of thumb answer for you: If you are talking dozens of concurrent users then probably MySQL will work for you just fine. If you're talking thousands then you might consider something more heavyweight. In fact you might even be talking about clustering at that point. Perl Monks works off of PerlCGI and MySQL I'm told and mostly handles the load just fine and you have many more than "dozens" of concurrent users on this site I'm sure.

    From what you've stated in your post there really isn't enough information for me to give you an in-depth professional answer, but if I have a choice between using flat files and a database I'll go with the database every time.


    Peter L. Berghold -- Unix Professional
    Peter at Berghold dot Net
       Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.
Re: mySQL or Flat Files, Which is faster
by gmpassos (Priest) on Feb 08, 2004 at 23:09 UTC
    If you want something between a server DB and flat files, take a look at SQLite (DBD::SQLite).

    Talk about speed and DB is not simple. For example, if you make your own flat system, to store your data, for a specific use, with a goo data structure approach, maybe you can have something faster and light than a true DB.

    But speed depends also in how you access a DB. For example, if I want just the register N of my flat file, this will be fast. But if you want to make a complex search in your data, a DB exists for that.

    But I still recomend SQLite. If you don't want to use a server DB, like MySQL, but still have SQL queries, tables, and speed, SQLite works very well. The only limitation of SQLite is that it won't work in multiple systems, soo, you can't have in the future a DB system that work with multiple servers. With SQLite you only can have one machine working with the file. Unless you make some network to share a HD/directory of the SQLite file.

    Unsing SQLite you also will be still using DBI, soo, if in the future you need to change the DB will be easier.

    Graciliano M. P.
    "Creativity is the expression of the liberty".