bugsbunny has asked for the wisdom of the Perl Monks concerning the following question:

hi,
I have some code that uses flat files with the following format :
key|field1=val1|field2=val2|field3=val3|
I have successfully used Tie::StdHash to put this into:
$hash{key}{fieldN}
structure...up to now everything is ok...
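
For reference, reading that format into the nested structure can be sketched as below. This is only an illustration under assumptions: the helper name parse_line and the sample records are made up, and values are assumed not to contain the '|' character.

```perl
use strict;
use warnings;

# Parse one record of the form key|field1=val1|field2=val2|
# Returns the key and a hash reference of its fields.
# parse_line is a hypothetical helper name, not part of any module.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    # split drops the empty trailing field left by the final '|'
    my ($key, @pairs) = split /\|/, $line;
    my %fields;
    for my $pair (@pairs) {
        # Limit of 2 so that a value may itself contain '='
        my ($name, $value) = split /=/, $pair, 2;
        $fields{$name} = $value;
    }
    return ($key, \%fields);
}

my %hash;
for my $line ("alice|field1=val1|field2=val2|\n", "bob|field1=x=y|\n") {
    my ($key, $fields) = parse_line($line);
    $hash{$key} = $fields;
}

print $hash{alice}{field2}, "\n";   # prints "val2"
print $hash{bob}{field1}, "\n";     # prints "x=y"
```
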
Now, as you may have already guessed, the hard part is making STORE work efficiently.
The simplest way is to rebuild the whole text file as a string and write the result back to the file on every STORE.
What other techniques can I employ to make things better?
Things get worse because I will have not one but several files, and I will want to support simultaneous access, probably via flock().
So if I do some sort of caching, I have to hold the locks for longer.
FYI, I don't expect many simultaneous accesses, but I have to have it as a feature.
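
The "rebuild everything on each STORE" approach combined with flock() might look roughly like this. It is a sketch only: serialize and store_all are hypothetical names, fields are written in sorted order (the original field order is not preserved), and the temporary file stands in for a real data file.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);

# Turn $data->{key}{field} back into "key|field=val|...|" lines.
sub serialize {
    my ($data) = @_;
    my $text = '';
    for my $key (sort keys %$data) {
        my $rec = $data->{$key};
        $text .= join('|', $key, map { "$_=$rec->{$_}" } sort keys %$rec) . "|\n";
    }
    return $text;
}

# Rewrite the whole file while holding an exclusive lock.
sub store_all {
    my ($path, $data) = @_;
    # Open without truncating, so the old contents survive until we hold the lock
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "open $path: $!";
    flock($fh, LOCK_EX)                      or die "flock $path: $!";
    truncate($fh, 0)                         or die "truncate $path: $!";
    print {$fh} serialize($data)             or die "print $path: $!";
    close($fh)                               or die "close $path: $!";   # close releases the lock
}

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "close: $!";

store_all($path, { key => { field1 => 'val1', field2 => 'val2' } });

open my $in, '<', $path or die "open $path: $!";
my $written = do { local $/; <$in> };
close $in;
print $written;   # prints "key|field1=val1|field2=val2|"
```

The lock here covers only the write. A script that reads, changes, and writes back would need to hold the same lock across all three steps, which is where the longer locks come from.
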
One more thing: I don't want to use SQL or DBM. Why:
- the files will be several thousand entries/lines at most
- I can edit the files with a text editor.
- I can easily grep the files with regexes to do some things that are horribly hard with SQL.
- A DB is not editable, but it is also an option if the line count goes over 10 000 :"), because it too can be hidden behind a HASH

If I decide to move this into an SQL DB, I will probably hide it behind HASH access (there is such a module, AFAIK); that is one of the reasons I moved to Tie::Hash.

Replies are listed 'Best First'.
Re: tie::hash files..
by Abigail-II (Bishop) on Jan 12, 2004 at 14:17 UTC
    What other techniques can I employ to make things better
    That highly depends on your definition of 'better'. Considering your requirement of:
    I can edit the files with text editor.
    all typical methods to make access or store faster are disqualified.

    There really isn't much Perl in this question. The fact that you are using a tied hash as part of the interface hardly matters for the answer - the tied hash is just syntactic sugar, and your question is really about the internal structure of the data file. And considering your requirements, you're most likely going to end up with flat files anyway, including full dumps, regardless of your choice of implementation language.

    You could of course read in all data at program start, and dump it on program termination, but that won't leave much of the required parallel access.

    I'd use a SQL database server, where all the problems already have been solved for me. I'd be grateful that it takes away the temptation to modify the data in an editor - although it's not going to prevent me from doing it (most databases will allow you to dump the data of one or more tables to a file you can edit in a text editor, and allow you to read in from such a file as well).

    Abigail

      Hmm, you may be right.

      What I had in mind for "edit with text editor" is not having a file open for hours, just quick edits (I'm looking for a way to make MC lock/unlock the file on edit<->exit),
      and in most cases telling the other parties "file is locked, please try again later" is not a big problem, because the other "parties" sit next to me :")
      This is mostly a precautionary measure.

      SQL is a beast for my current task:
      - needs a lot of prerequisites
      - hard to move and change frequently
      - will be slower, I think
      I will do it when I have a clearer picture of the whole thing.
      For now the question is how to make EDITS of a FLAT file faster? With Perl, of course :")
      Thanks anyway.
        what I had in mind for "edit with text editor" is not that I have file open for hours, but just quick edits.
        Indeed. And that limits you a lot. Indices would be out of the question, which means you don't have performance.
        SQL is a beast for my current task.
        - will be slower, i think
        Will be slower than what? A hand-rolled solution in Perl where you lock entire files, and dump the entire data structure on each modification?
        for now the question is how to make EDITS of a FLAT file faster? With Perl, of course :")
        Lower your requirements.

        Abigail

Re: tie::hash files..
by ysth (Canon) on Jan 12, 2004 at 17:20 UTC
    You may want to have a look at Tie::File, which has a read cache and deferred write mode. (It ties an array to lines in the file; you would want to munge it to tie a hash based on the key.) But the caching isn't going to work for you if you really need simultaneous access.
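
A rough sketch of the Tie::File idea, using its documented defer and flush methods. The key-to-line-number index is an addition for illustration (Tie::File itself only knows line numbers), and the temporary file and sample records are made up.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Sample data file with two records
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "a|field1=1|\n", "b|field1=2|\n";
close $tmp_fh or die "close: $!";

tie my @lines, 'Tie::File', $path or die "tie $path: $!";

# Map each record key to its line number (our own bookkeeping)
my %index;
for my $i (0 .. $#lines) {
    my ($key) = split /\|/, $lines[$i], 2;
    $index{$key} = $i;
}

(tied @lines)->defer;                        # hold writes in memory
$lines[ $index{b} ] = 'b|field1=changed|';   # nothing hits the disk yet
(tied @lines)->flush;                        # write the pending changes
untie @lines;

open my $in, '<', $path or die "open $path: $!";
my $contents = do { local $/; <$in> };
close $in;
print $contents;
```

While writes are deferred, other processes reading the file do not see the pending changes, which is the simultaneous-access problem mentioned above.
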

    You've constrained your problem so well it becomes hard to solve except with whole-file writes on every change. About the only thing I can think of that would help would be to have a maximum record length, and pad the end of each line out to that length. Then you can update a single line on each STORE. But it makes it easy to mess yourself up when you directly edit the file.
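
The padding idea could be sketched like this. The 64-byte record length, the helper names pad_record and store_record, and the sample records are all assumptions for illustration; the sketch also assumes single-byte characters, so that length in characters equals length in bytes.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

my $RECLEN = 64;   # assumed fixed record length in bytes, including the newline

# Pad a record with spaces so every line occupies exactly $RECLEN bytes.
sub pad_record {
    my ($record) = @_;
    die "record too long\n" if length($record) > $RECLEN - 1;
    return sprintf("%-*s\n", $RECLEN - 1, $record);
}

# Overwrite line number $n (0-based) in place; other lines are not touched.
sub store_record {
    my ($fh, $n, $record) = @_;
    seek($fh, $n * $RECLEN, SEEK_SET) or die "seek: $!";
    print {$fh} pad_record($record)   or die "print: $!";
}

# Sample file with three padded records
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} map { pad_record($_) } 'a|field1=1|', 'b|field1=2|', 'c|field1=3|';
close $tmp_fh or die "close: $!";

open my $fh, '+<', $path or die "open $path: $!";
store_record($fh, 1, 'b|field1=changed|');
close $fh or die "close: $!";

# Read back, stripping the newline and the space padding
open my $in, '<', $path or die "open $path: $!";
chomp(my @records = <$in>);
close $in;
s/ +\z// for @records;
print "$_\n" for @records;
```

A hand edit that changes the length of any line shifts every later offset, which is the "mess yourself up" risk: the reader has to strip the padding, and the editor has to preserve it.
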

      Thanks a lot, it seems I'm doomed :")

      It seems I will be better off holding the lock while one of my scripts runs, accumulating changes, and doing a flush and unlock at the end of the script (they are short, quick scripts).
      If I place the flush logic in the DESTROY method, will it be called when the script exits if I forget to do it myself?
      I mean, if I forget to untie.

      tia
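
On the DESTROY question: Perl calls DESTROY on the object behind a tie when the tied variable goes out of scope or is untied with no other references left, and objects still alive at a normal exit are destroyed during global destruction (not if the process is killed by a signal or leaves via exec or POSIX::_exit). A minimal sketch of a flush-on-DESTROY tied hash follows; the class name FlatFileHash and its flush method are hypothetical, and it builds on Tie::ExtraHash rather than Tie::StdHash so the file path and a dirty flag can live beside the data.

```perl
package FlatFileHash;            # hypothetical class name, not a CPAN module
use strict;
use warnings;
use Tie::Hash;                   # also provides Tie::ExtraHash
our @ISA = ('Tie::ExtraHash');   # object layout: [ \%data, @extra ]

sub TIEHASH {
    my ($class, $path) = @_;
    return bless [ {}, $path, 0 ], $class;   # data, file path, dirty flag
}

sub STORE {
    my ($self, $key, $value) = @_;
    $self->[0]{$key} = $value;
    $self->[2] = 1;                          # remember there is something to write
}

sub DELETE {
    my ($self, $key) = @_;
    $self->[2] = 1;
    return delete $self->[0]{$key};
}

# Write all records back to the file, but only if something changed.
sub flush {
    my ($self) = @_;
    return unless $self->[2];
    open my $fh, '>', $self->[1] or die "open $self->[1]: $!";
    for my $key (sort keys %{ $self->[0] }) {
        my $rec = $self->[0]{$key};
        print {$fh} join('|', $key, map { "$_=$rec->{$_}" } sort keys %$rec), "|\n";
    }
    close $fh or die "close $self->[1]: $!";
    $self->[2] = 0;
}

sub DESTROY { $_[0]->flush }

package main;
use File::Temp qw(tempfile);

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "close: $!";

{
    tie my %hash, 'FlatFileHash', $path;
    $hash{key} = { field1 => 'val1' };
    # No untie and no explicit flush: DESTROY runs when %hash leaves scope
}

open my $in, '<', $path or die "open $path: $!";
my $written = do { local $/; <$in> };
close $in;
print $written;   # prints "key|field1=val1|"
```

One caveat with the nested layout: an assignment like $hash{key}{field1} = 'x' goes through FETCH and then changes the inner plain hash, so the outer STORE is never called and the dirty flag in this sketch stays unset. The order of destruction at program exit is also not guaranteed, so an explicit untie or flush remains the safer habit.
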