PerlMonks  

Re: best data model for chat program

by {NULE} (Hermit)
on May 21, 2002 at 22:56 UTC ( [id://168308]=note )


in reply to best data model for chat program

Hi dev2000,

In writing my own web chat room (or demo here) I spent much time deciding upon appropriate data structures. Why? Because with the right data structure the program writes itself.

I ultimately decided that for my purposes using a "real" database like MySQL or PostgreSQL was overkill. Nevertheless, the ideas that I worked out should apply in concept to any database system - and, at least in my opinion, work well with the whole chat-room concept.

What I decided to use was two basic tables, one for users and one for messages. The users table has a hard limit, and when that is full - too bad. The messages table also has a hard limit, but when that is full it just wraps around to the beginning (this is my famous, off-by-one-prone linked-list-like data-struct {g}). Basically the two tables are arranged like so:

Users:
    UID | Host (IP)   | Last-time   | Name
    0   | 10.0.0.1    | (epoch sec) | User name
    1   | 192.168....

Messages:
    MID | Time        | Name      | Message....
    0   | (epoch sec) | User name | What they said...
    1   | .....

Now I had to jump through hoops with a DBM file database to actually encode all this information into the database, but this is what the structure actually boils down to.
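
The two tables above can be sketched in memory as follows. This is only an illustration in Python, not the DBM-backed code from the post: the class name `ChatStore`, its method names, and the small default limits are all assumptions made for the example. The users table refuses new rows once full, while the messages table wraps around and overwrites the oldest slot.

```python
import time


class ChatStore:
    """Hypothetical in-memory version of the users and messages tables."""

    def __init__(self, max_users=4, max_messages=3):
        # Fixed-size tables; None marks an empty row. Limits are made up.
        self.users = [None] * max_users
        self.messages = [None] * max_messages
        self.next_mid = 0  # total number of messages ever written

    def add_user(self, host, name, now=None):
        """Take the first free UID, or return None when the table is full."""
        now = time.time() if now is None else now
        for uid, row in enumerate(self.users):
            if row is None:
                self.users[uid] = {"host": host, "last_time": now, "name": name}
                return uid
        return None  # hard limit reached - too bad

    def add_message(self, name, text, now=None):
        """Write into the next slot, wrapping to the beginning when full."""
        now = time.time() if now is None else now
        slot = self.next_mid % len(self.messages)
        self.messages[slot] = {"time": now, "name": name, "message": text}
        self.next_mid += 1
        return slot

    def recent_messages(self):
        """Return the surviving messages, oldest first."""
        size = len(self.messages)
        start = max(0, self.next_mid - size)
        return [self.messages[i % size] for i in range(start, self.next_mid)]
```

Keeping a running count of writes and taking it modulo the table size is one way to sidestep the off-by-one trouble of tracking separate head and tail positions.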

The next version of my own chat program is going to feature user-created rooms as well as XML options so that non-web clients (*ahem* Perl/Tk) can be written to work with it. Private messaging will also be supported. What kind of data structures will I use for that? Hopefully the kind that will work best. :)

For starters I'll be adding another table to hold the rooms. That table will probably contain a list of each user in the room and probably a field with room properties. Included in that will be whether a room is private or whether it is a special "room" that is just used for private messages. Now the question is whether you want the users persistent or temporary.
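
A rough sketch of what one row of such a room table might look like, in Python. Every name here (`Room`, `members`, `private`, `direct`, `visible_rooms`) is a guess at a possible layout, not the design from the post.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Hypothetical row of the room table."""
    name: str
    members: set = field(default_factory=set)  # UIDs of users in the room
    private: bool = False  # hidden from users who are not members
    direct: bool = False   # special "room" used only for private messages


def visible_rooms(rooms, uid):
    """List the room names a given user is allowed to see."""
    return [r.name for r in rooms if not r.private or uid in r.members]
```

For example, a private-message "room" between UIDs 0 and 1 would be `Room("pm-0-1", {0, 1}, private=True, direct=True)`, and only those two users would see it in `visible_rooms`.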

A nice feature of the Monastery is that I can keep my messages around until I get around to them (or until the DB fills up and spews its guts everywhere - the one or two compliments I have received on my postings are much needed ego boosters to one with such a fragile ego so I need to keep them around ;) ). This can be dangerous on a general purpose internet chat server (think IRC here), but it could be an awesome feature too. I'll get back to this (at some point in my rambling).

But back to what I was saying. </segue> With the concept of rooms, message boards become transient properties of items in the room table. In other words, between this and the idea of having permanent users, deprecation of information becomes critical. Using my concept of file system databases, that means at some point the files themselves must be deleted. Deciding upon criteria for deletion depends on how you wish to advertise your software. Personally I like the idea of coming back to a chat room at any time and seeing what happened in the past week. But again, think IRC - if your service became that popular, could you really handle the storage requirements? However, if you applied the concepts of the user table to the room table (i.e. a hard limit - 200 rooms? Sorry, I can't create any more - or: oh, you want a new room? OK, but give me a minute or two to cull some old ones no longer in use) you might be able to balance your desire for persistent storage with your desire to be able to afford your hosting.
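
The hard limit with cull-on-demand could look something like this Python sketch. The 200-room limit and the one-week window come from the paragraph above; the function name, the dict layout (room name mapped to last-activity time in epoch seconds), and the idle test are assumptions.

```python
WEEK = 7 * 24 * 3600  # seconds


def create_room(rooms, name, now, max_rooms=200, idle_after=WEEK):
    """Try to add a room; cull idle rooms first if the table is full.

    rooms maps room name -> last activity time (epoch seconds).
    Returns True if the room was created, False if there is no space.
    """
    if len(rooms) >= max_rooms:
        # Table is full: drop rooms nobody has used for idle_after seconds.
        stale = [n for n, last in rooms.items() if now - last > idle_after]
        for n in stale:
            del rooms[n]
    if len(rooms) >= max_rooms:
        return False  # still full: sorry, can't create any more
    rooms[name] = now
    return True
```

Culling only when the table is actually full keeps the common case (plenty of free rows) cheap.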

So while the concept of a user table remains similar with a multi-room chat application, the message table changes form and some additional information is needed in the shape of a room table. The last thing you'll need to consider is how to handle culling of deprecated information. I think that how I am going to handle this with version three of my own application is to take relatively rare events (a user logging on or requesting a table list) and have their process (poor user...) perform the culling action. But why cull across the entire system? I love biology and model my own programming after it. Pick a number of tables at random each time these rare actions occur and check them for deprecation. That limits the poor user's wait time and still ensures the necessary action occurs. Why random? Occasional random actions are annoying, lots of them are life itself.
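
The random culling idea might be sketched like this in Python. The function name `cull_sample`, the dict of table name to last-activity time, and the default sample size of 3 are assumptions for illustration; the post does not say how many tables to pick or how deprecation is judged.

```python
import random


def cull_sample(tables, now, max_age, sample_size=3, rng=random):
    """Check a few randomly chosen tables and delete the expired ones.

    tables maps table name -> last activity time (epoch seconds).
    Meant to be called from a rare event such as a user logging on.
    Returns the names that were removed.
    """
    # Only sample_size tables are examined per call, so the unlucky
    # user who triggers the cull waits a bounded amount of time.
    picked = rng.sample(sorted(tables), min(sample_size, len(tables)))
    removed = []
    for name in picked:
        if now - tables[name] > max_age:
            del tables[name]
            removed.append(name)
    return removed
```

Any single call may miss an expired table, but over many logins every table gets checked sooner or later, which is the trade the post is describing.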

Good luck in your code. Pardon my waxing poetic - I should never drink and post.
{NULE}
--
http://www.nule.org

P.S. I hope this doesn't come off as just one giant p1mp of my own program. :) But I must say, it is listed here in the monastery as well. It is not the most recent version there, though - it is growing large.

Replies are listed 'Best First'.
Re^2: best data model for chat program
by Aristotle (Chancellor) on May 22, 2002 at 00:02 UTC
    IRC uses a very simple way of handling rooms and private messages: in addition to a From property, all events also have a To property. There is no intrinsic separation between the messages addressed to different rooms. It's a very natural way to go about it, I think.
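
A tiny Python illustration of that From/To idea. The `Event` tuple and its field names `sender` and `target` are made up for the example (`from` is a reserved word in Python); the `#` prefix follows the usual IRC channel naming.

```python
from collections import namedtuple

# One flat stream of events; a room and a private message differ
# only in what the target field holds.
Event = namedtuple("Event", "sender target text")


def events_for(events, target):
    """Select everything addressed to one room or one user."""
    return [e for e in events if e.target == target]


log = [
    Event("alice", "#perl", "anyone used DBM files?"),
    Event("bob", "alice", "psst, private reply"),
    Event("bob", "#perl", "yes, tie them to a hash"),
]
```

Here `events_for(log, "#perl")` yields the two room messages and `events_for(log, "alice")` yields the private one, with no separate per-room storage.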

    Makeshifts last the longest.

      I have never looked much into the technical aspect of how IRC functions. I do appreciate the information though. My primary reason for comparing a web-chat type application with IRC is to make sure the programmer is aware of the sheer volume of information to be handled and how that may affect a storage-based message system. Storing everything is cool, but eventually the volume will get you without some means of deprecation.

      Again, good information, Screamer - as I think about doing a version 3 of my script I'm going to keep this in mind.
      {NULE}
      --
      http://www.nule.org
