in reply to RFC: Distributed/replicated TIEHASH and shared state algorithm with Corosync

Maybe they could incorporate this concept into Git itself? :-) Sort of a realtime-updated Git repository? But wait, I thought that one of the advantages of an offline Git repository is that you can work if the line goes down. What sort of re-synchronization process does this have in case a few of the realtime sockets fail?

I don't work on these things except as mathematical abstractions in my mind. :-)

After seeing how jQuery and Ajax work, I'm beginning to believe that eventually we all will be running these supposedly non-blocking event-loop programs in all our software; JavaScript will have won in the end. Every square inch of our screens will be controlled by one event loop or another, each dealing with its own sockets. It's called Web 2.0, I believe. :-)

But if I were working on it, I would ponder how to resynchronize after a communications failure. Like, would the histories of all changes that occurred during the downtime be replayed for the benefit of the central repository tree?

I was recently watching a YouTube video of Linus Torvalds, and he said essentially that realtime central systems are not good. Everyone should run independently and update each other on a regular basis.

Another thing Torvalds talked about was the way that in Git, the node names are actually SHA-1 checksums of the full content of the node (not MD5s of diffs). This assures perfect replication... what comes out is what went in, or an error is flagged.
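That self-verifying property is easy to sketch: Git names a blob by the SHA-1 of a tiny header plus its full content, so recomputing the hash on read and comparing it to the name you fetched by catches any corruption. A minimal sketch (`git_blob_id` is my own helper name, mimicking what `git hash-object` computes):

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Git object names are the SHA-1 of a short header plus the content,
# so identical content always gets the same name, and any corruption
# of the content changes the name.
sub git_blob_id {
    my ($content) = @_;
    return sha1_hex("blob " . length($content) . "\0" . $content);
}

my $id = git_blob_id("hello\n");
print "$id\n";    # a 40-hex-digit name, stable for this content
```

Because the name is derived from the content itself, "what comes out is what went in" reduces to re-hashing and comparing two strings.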

But these are just the ramblings of someone having a good Christmas. :-)


I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh

Re^2: RFC: Distributed/replicated TIEHASH and shared state algorithm with Corosync
by dave_car (Novice) on Dec 30, 2013 at 15:02 UTC

    I hope I didn't overload the word "hash" too much - this is as in the Perl sense (%hash, $hash{key}=value) rather than MD5/SHA1 type hashes. Tie your hash variable with this and it becomes replicated over the network.
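    For anyone unfamiliar with the tie mechanism, here is a minimal sketch of the TIEHASH interface the module would implement. The package name `Tie::ReplicatedHash` is hypothetical (standing in for the RFC's module), and this version keeps everything local; a replicated version would also broadcast each STORE/DELETE to the cluster:

```perl
use strict;
use warnings;

package Tie::ReplicatedHash;   # hypothetical name, not the RFC's actual module

sub TIEHASH  { my ($class) = @_; return bless { data => {} }, $class }
sub STORE    { my ($self, $k, $v) = @_; $self->{data}{$k} = $v }      # a real module sends this to peers
sub FETCH    { my ($self, $k) = @_; return $self->{data}{$k} }
sub EXISTS   { my ($self, $k) = @_; return exists $self->{data}{$k} }
sub DELETE   { my ($self, $k) = @_; return delete $self->{data}{$k} } # ...and this
sub CLEAR    { my ($self) = @_; %{ $self->{data} } = () }
sub FIRSTKEY { my $a = scalar keys %{ $_[0]->{data} }; each %{ $_[0]->{data} } }
sub NEXTKEY  { each %{ $_[0]->{data} } }

package main;

tie my %state, 'Tie::ReplicatedHash';
$state{temperature} = 42;    # with the real module, this write would go over the wire
print "$state{temperature}\n";
```

    The point of tie is that ordinary hash syntax ($state{key} = $value) transparently invokes these methods, so replication hides behind plain Perl assignments.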

    At the moment it is an in-memory object, so it is more suitable for tracking the state of a system than for acting as a permanent repository, but there is no reason why it could not use a disk-based backend and do initial state transfers over a separate channel (the current implementation limits the full database to ~1MB in size).

    Resync after failure currently involves a full state transfer, but it could be adapted to checkpoint periodically and transfer diffs (quorum to avoid splitting the cluster would be advisable too). Also, it uses multicast, so it's more of a local-network thing than something for general internet use.
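    The checkpoint-plus-diff idea could be sketched as a key-level comparison between the last checkpoint a peer acknowledged and the current state, shipping only what changed (the sub name `state_diff` is my invention, and values are compared as strings):

```perl
use strict;
use warnings;

# Sketch of diff-based resync: after a failure, instead of a full state
# transfer, send only the keys that changed or disappeared since the
# last checkpoint the rejoining peer is known to have.
sub state_diff {
    my ($checkpoint, $current) = @_;
    my %changed;
    for my $k (keys %$current) {
        $changed{$k} = $current->{$k}
            if !exists $checkpoint->{$k}
            || $checkpoint->{$k} ne $current->{$k};
    }
    # Keys present at the checkpoint but gone now must be deleted on the peer.
    my @deleted = grep { !exists $current->{$_} } keys %$checkpoint;
    return ( \%changed, \@deleted );
}
```

    The peer applies the changed keys and deletes the removed ones, and only falls back to a full transfer when no common checkpoint survives.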