in reply to Re: Moving from SQL to flat-files
in thread Moving from SQL to flat-files

Popular alternatives are mySQL and PostgreSQL, both in the free realm. </blockquot>

Choice of rdbms was not the issue... I already had a pretty fine engine in SQLite, and it was doing my job well. The immediate issue was portability. There is a certain level of complexity required with maintaining a database. In the case of SQLite, I had to usually upgrade my <cpan://DBD::SQLite> whenever that was upgraded, because it gave to me whatever security, quality, or performance enhancements the underlying SQLite brought. Then one day, bam!, the SQLite format became binary incompatible with earlier versions. I needed the new version because VACUUM was giving me an error, and that was fixed in the new version. But, my database suddenly became "different." I fixed it all on my computer, but my hands were tied as far as my web host was concerned. They wouldn't upgrade the driver until it was deemed "stable" for them (understandable, but it boofed me).

The other issue often cited is scalability and concurrent use. I am not sure how much of this is just accepted wisdom (folklore?) but somehow it seems folks accept that flat files don't scale well. Concurrent use is definitely an issue, but not for me right now... I am designing a primarily single user application. Even when it becomes multiple user, I am not talking about an Amazon.com here. Can a flat files driven app handle a few thousand hits a day? If yes, I am set. Of course, the "database war" stories on radar.oreilly.com (link in the OP) mention websites with much higher use and content. So, perhaps scalability is myth.

Then there is the issue of accessibility (defined my way). With SQL, all the data are hidden in the SQL engine. They are only accessible through an application, be it my web app, or some other interface. With flat files, I can open them with just a text editor. Push comes to shove, I can ssh into my web host, fire up vim, and edit a file, and the changes will show up in my app.

The last issue is of a "challenge." I have been quite taken by the "simplest thing that could work" bit (did Ward Cunningham postulate that?). So, I've set out to see if I can redo the app with flat files. Yes, redoing the SQL into non-SQL logic is quite some work, but only becacuse I had trained myself to think the SQL way.

SQL engines are very important, and there is a lot dogwork hidden inside of them. But, are there times when they are simply unnecessary? I have actually learned all sorts of neat programming techniques since I started battling converting my app to flat files.

Hope this makes sense. But I am also learning from all the different viewpoints I am hearing here.

--

when small people start casting long shadows, it is time to go to bed

Replies are listed 'Best First'.
Re^3: Moving from SQL to flat-files
by perrin (Chancellor) on May 10, 2006 at 20:22 UTC

    So you had a bad upgrade experience with SQLite and you're solving it by writing your own database? Sounds like you're throwing the baby with the bathwater to me.

    Can a flat files driven app handle a few thousand hits a day?

    Safely? Without corrupting the data or losing updates? Yes, but only if you put lots of work into it. Locking is not easy. You can get something fairly safe if you carefully use exclusive locks, but it won't scale very far before people end up getting blocked when trying to update. Maybe enough for you, maybe not.

    Of course, the "database war" stories on radar.oreilly.com (link in the OP) mention websites with much higher use and content. So, perhaps scalability is myth.

    No! Don't draw the wrong conclusion from that series. For one thing, those people are usually talking about something totally different from what you're doing when they say "flat-files." Google doesn't do anything remotely similar to flat-files. They wrote a filesystem! They wrote a distributed, indexed storage system with locking and logging and most everything else RDBMSes have. One guy was using BerkeleyDB and calling it flat-files. Some of them are just storing data that is compiled off-line and never updated by the application, so they have no concurrency issues. Some of them are talking about data that they just read all of and never query against. You can't pull any general ideas about the scalability of flat-files as an RDBMS replacement from that O'Reilly series.

Re^3: Moving from SQL to flat-files
by Tanktalus (Canon) on May 10, 2006 at 13:33 UTC

    What I kind of meant was that these other databases may provide better migration stories. Migration from one version to the next, or migration from one platform to another (although I suspect MS SQL may not be so good at that ;->).

    As far as SQLite fixing a problem that you couldn't deploy on your server - which is worse: not being able to deploy a known fix, or not being versed enough in ACID transaction compliance and implementation to provide a secure method to ensure no loss of data in your own private modules? Personally, I'd rather stick with SQLite and force my provider to upgrade (or move to a new provider) than try to fight with unknowns which I won't discover until I've already lost data.

    Part of the scalability of "real" database servers is that you can scale your webserver and your database server independantly. At first, they live on the same box. Then you get too many hits, so you move your database to a second box. If you're using flat files, you can't quite do this since you can't offload your data retrieval/saving to a second box - you need to use that second box as a second webserver, but use NFS for sharing the data. Now things get a bit messy - locking across NFS isn't nearly as safe. As things move on, you get more and more webservers just to handle the traffic. At some point, you may have so much going on that you actually need a second *database* server. At this point, you'd probably need to nearly double your webservers for flat files. But, because you've offloaded so much of that work to a machine with lots of RAM (for buffering), and fast disks, you only need to duplicate the single machine. And now you need something with clustering capabilities to manage the data.

    I realise that SQLite isn't a client/server database. But, because you've already built all your logic with SQL, it does provide a great starting point to move to a client/server database system should the need actually arise. Like the rest of the above scalability, you may never need it. But, should you actually need it, you'll be kicking yourself if it's not built in from the ground up.

    Flat files bring with them so many of the concurrency and data integrity issues that databases have solved that I would simply never want to store data in something that hadn't solved those issues for me.