in reply to Which is the better option?

In a flat-file world, I think it depends what you plan to do with the data.

If you want to show all of the grades at the same time I would use a single text file and read each student to build the results. You can easily do comparisons, averages, search for grade ranges, etc. this way. You can also easily manually type up and edit this single text file.
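As a rough illustration of the single-file approach, here is a minimal sketch. The record layout (id|name|comma-separated grades) and the names parse_grades and average are my own assumptions, not something from the original question, so adjust them to your data.

```perl
use strict;
use warnings;

# Assumed (hypothetical) record layout, one student per line:
#   id|name|grade,grade,grade
sub parse_grades {
    my ($fh) = @_;
    my %student;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;          # skip blank and comment lines
        my ($id, $name, $grades) = split /\|/, $line, 3;
        next unless defined $grades;             # skip malformed lines
        $student{$id} = { name => $name, grades => [ split /,/, $grades ] };
    }
    return \%student;
}

# Mean of a list of grades; 0 for an empty list.
sub average {
    my @g = @_;
    return 0 unless @g;
    my $sum = 0;
    $sum += $_ for @g;
    return $sum / @g;
}

# Demo on an in-memory "file" so the sketch runs without touching disk.
my $sample = "1001|Alice Smith|90,80,70\n1002|Bob Jones|60,75,90\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $students = parse_grades($fh);
close $fh;

for my $id (sort keys %$students) {
    my $s = $students->{$id};
    printf "%s %-12s average %.1f\n", $id, $s->{name}, average(@{ $s->{grades} });
}
```

Once everything is in one hash, comparisons and range searches are just a grep or sort over its values.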

If you want each student to access only his own grades, I would use the multiple text files approach. Each student's information can then be updated in its own file, without reading and rewriting everything. Each file must have a unique name, but even if you implemented a retrieve-by-name function on a single text file, you would have the same problem if two students shared the exact same name, so a unique student ID is probably a safer key than the name. You should write a script to generate these files from a master file or, even better, some sort of management script to oversee creating and updating the contents. This may be helpful:
    sub readnames {
        opendir(THISDIR, '../students') or die "Cannot open ../students: $!";
        @allnames = grep { !/^\.\.?$/ } readdir THISDIR;   # skip . and ..
        closedir THISDIR;
    }
This puts all of the file names in the "students" directory (minus the "." and ".." entries) into @allnames, which you can then use to update or delete the individual student files.
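
Generating the individual files from a master file could look something like the sketch below. The master layout (id|name|comma-separated grades), the one-file-per-ID naming, and the name split_master are all assumptions of mine for illustration. Naming files by ID rather than by student name sidesteps the duplicate-name problem.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);

# Assumed (hypothetical) master layout: id|name|grade,grade,grade
# Writes one file per student, named "<id>.txt", and returns the paths.
sub split_master {
    my ($master_fh, $dir) = @_;
    my @written;
    while (my $line = <$master_fh>) {
        chomp $line;
        my ($id, $name, $grades) = split /\|/, $line, 3;
        next unless defined $grades && $id =~ /^\w+$/;   # skip malformed lines
        my $path = "$dir/$id.txt";
        # Open without truncating, lock, and only then empty the file,
        # so we never wipe a file another process is still writing.
        sysopen(my $out, $path, O_WRONLY | O_CREAT) or die "Cannot open $path: $!";
        flock($out, LOCK_EX)                        or die "Cannot lock $path: $!";
        truncate($out, 0)                           or die "Cannot truncate $path: $!";
        print {$out} "$name\n$grades\n";
        close($out)                                 or die "Cannot close $path: $!";
        push @written, $path;
    }
    return @written;
}

# Demo: two students with the same name end up in two different files.
my $master = "1001|Alice Smith|90,80,70\n1002|Alice Smith|60,75,90\n";
open my $mfh, '<', \$master or die "Cannot open in-memory file: $!";
my $dir = tempdir(CLEANUP => 1);
print "$_\n" for split_master($mfh, $dir);
close $mfh;
```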

Either way, don't forget to lock your files with flock.

Re: Re: Which is the better option?
by kiat (Vicar) on May 19, 2002 at 08:53 UTC
    Hi dev2000,

    Am I right to say that, provided I set up the flat-file correctly, I should in theory be able to use the flat-file just as I would a DB_File database file? The differences between the two methods would be in terms of speed and efficiency.

    But in terms of what I intend to do with the flat-file (modifying, deleting or appending entries, or merely reading them), I should have no problems using a flat-file. Am I right? What if the flat-file gets very large? Is a large flat-file inherently more prone to error when entries are appended or modified and written to the file?

    If flock is used, does it mean that when the file is opened for writing, another process that requires the file for writing (but not reading) will not execute and hence the first request is guaranteed to execute successfully?

    I look forward to hearing from you :)

    kiat
Re: Which is the better option?
by dev2000 (Sexton) on May 22, 2002 at 18:12 UTC
    Hi Kiat,

    I think for small applications (few users, small data set) flat files will be an easier, faster and more efficient choice.

    It's when you have a lot of people connected to a large amount of data that a database like MySQL will scale well and shine. mod_perl even allows for persistent, reusable database connections, which further increases performance. I'm doing a lot of research into this area now and converting a flat file system to a MySQL database. My goal is to try to handle thousands of people at the same time (with a small amount of data, though).

    A flat file (large or small) is prone to corruption when two people try to write to it at the exact same time. This is where flock comes in. Simultaneous read access of the same file is not a problem, but consider this scenario:

    I open a file. I read a number from the file. I increment the number. I close the file. I open the file for writing with a lock. I write my number. I close the file.

    You read the same number at nearly the same time. You increment the number. You go to open the file to write the new number... it's in use by me. You wait. You then write your number.

    In the end, the number was only incremented once. You can see how this example would fail as a counter.

    I think the only way around this is to open the file for both reading and writing, lock it before you read, and keep the lock until your new value is written, which can be tricky...
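
    A minimal sketch of that idea for the counter scenario above. The function name increment_counter and the one-number-per-file layout are just my assumptions for the example; it uses sysopen so the file can be created if it is missing without being truncated before the lock is held.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);

# Increment the number stored in $path while holding ONE exclusive lock
# across both the read and the write, so no other cooperating process
# can read the old value in between.
sub increment_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)                      or die "Cannot lock $path: $!";

    my $line = <$fh>;                        # undef if the file is empty
    my $n = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $n++;

    seek($fh, 0, 0)   or die "Cannot rewind $path: $!";
    truncate($fh, 0)  or die "Cannot truncate $path: $!";
    print {$fh} "$n\n";
    close($fh)        or die "Cannot close $path: $!";   # flushes, then unlocks
    return $n;
}

# Demo on a temporary file: three calls leave 3 in the file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;
increment_counter($path) for 1 .. 3;
open my $in, '<', $path or die "Cannot read $path: $!";
print "counter is now ", scalar <$in>;
close $in;
```

    The important part is the order: lock first, then read, then write, then close. Anything that reads the number before taking the lock reintroduces the lost update.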

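    In a database this is handled for you: a single statement such as UPDATE ... SET n = n + 1 is applied atomically by the server, so you don't have to manage the locking yourself.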

    In my understanding, flock works on a system level and acts as a traffic signal for files. For example, if I'm currently writing to a file and you want to update it too, you must wait until I'm finished. Your process gets stalled until I release the lock, then you go. Like a traffic signal, the lock is advisory: it only protects the file if every program that touches it also calls flock. Normally we are talking about milliseconds of waiting, so performance is not a problem.

    Closing a file automatically unlocks it, so you don't have to unlock it yourself. Also, flock comes from UNIX; as far as I know, Perl supports it on the Windows NT family but not on Windows 95/98.

    Good luck - I hope this helps. Any other questions, just ask!

    :DEV2000