Re: Queues, not the usual problem
by pmas (Hermit) on Jul 09, 2001 at 17:25 UTC
Another issue that can arise with MySQL, and why I advise NOT to use MySQL:
MySQL lacks the concept of transactions, as was discussed here or here. Try Super Search with "MySQL" and "transaction". You need a transaction-capable database to make changes to both tables, or to neither at all in case of any error.
What does 'no transactions' mean for you?
If something goes wrong, you may remove the item from the queue while the related info is not removed, or vice versa: the related info is out, but the item is still in the queue. Then next time - guess what can happen? Remember Murphy's law: if anything can go wrong, it will.
pmas
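For illustration, a minimal DBI sketch of the two-table change done atomically (the DSN, credentials, and table names are hypothetical, and this only helps if the underlying tables actually support transactions):

    #!/usr/bin/perl -w
    use strict;
    use DBI;

    # Hypothetical DSN, credentials and table names -- adjust to taste.
    my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'pass',
                           { RaiseError => 1, AutoCommit => 0 });

    my $id = 42;    # example: the queue entry being removed

    eval {
        # Remove the item and its related info together, or not at all.
        $dbh->do('DELETE FROM queue   WHERE id = ?',       undef, $id);
        $dbh->do('DELETE FROM details WHERE queue_id = ?', undef, $id);
        $dbh->commit;
    };
    if ($@) {
        warn "transaction aborted: $@";
        $dbh->rollback;    # both tables stay as they were
    }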
To make errors is human. But to make a million errors per second, you need a computer.
Re: Queues, not the usual problem
by lhoward (Vicar) on Jul 09, 2001 at 17:26 UTC
As arhuman mentioned above, MySQL may not be ideal because of its locking mechanism. However, a slightly different architecture (used by many queueing systems) could work well for you (with MySQL).
- process(es) that insert into the queue - generators
- queue manager process
- process(es) that remove from the queue - consumers
With a single queue manager responsible for assigning queue IDs to new queue entries and for handing queue entries to the consumer processes, you avoid the MySQL lock problem. You could use MySQL (or even flat files) to back up the queue manager in case it fails, or if you have too much data to fit in RAM at once; or the queue manager could just direct and queue the data directly between processes. A sketch of such a manager follows.
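Purely as a sketch of the idea (the socket path and the one-line PUT/GET protocol are made up for illustration):

    #!/usr/bin/perl -w
    # Schematic single queue manager: one process owns the queue, so
    # generators and consumers never contend for database locks.
    use strict;
    use Socket;
    use IO::Socket::UNIX;

    unlink '/tmp/queue.sock';
    my $listener = IO::Socket::UNIX->new(
        Type   => SOCK_STREAM,
        Local  => '/tmp/queue.sock',
        Listen => 5,
    ) or die "listen: $!";

    my @queue;        # in-memory; back it with MySQL or a flat file if needed
    my $next_id = 1;

    while (my $client = $listener->accept) {
        chomp(my $request = <$client>);
        if ($request =~ /^PUT (.*)/) {            # a generator inserting
            push @queue, { id => $next_id, data => $1 };
            print $client "OK $next_id\n";
            $next_id++;
        } elsif ($request eq 'GET') {             # a consumer asking for work
            my $entry = shift @queue;
            print $client $entry ? "$entry->{id} $entry->{data}\n" : "EMPTY\n";
        }
        close $client;
    }

Because only this one process ever touches @queue, the ID assignment and the hand-off to consumers are trivially atomic.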
I've decided to write a script and spool the data instead, due to deadlock. It's bound to happen with the amount of data I'm using and processing.
Re: Queues, not the usual problem
by arhuman (Vicar) on Jul 09, 2001 at 16:42 UTC
Just my 2 cents on your project:
As MySQL lacks row locking (it has only table locks), you must expect a serious performance penalty when inserting/selecting/deleting a lot at the same time...
(MySQL is much more efficient with few updates and a lot of selects...)
Anyway, you could use our monastery's Super Search, which will bring you posts like
Perl/DBI (MySQL) tutorial, which is a good start...
Or this one for interesting external links...
"Only Bad Coders Code Badly In Perl" (OBC2BIP)
Re: Queues, not the usual problem
by eduardo (Curate) on Jul 09, 2001 at 17:39 UTC
A lot of people a lot smarter than either you or I have spent a lot of time writing extremely efficient queueing mechanisms that are guaranteed to scale a lot better than anything we're going to be able to write in Perl. I wrote a proof-of-concept application a month or two ago, and for my queueing mechanism I used Qmail. I could just as easily have used sendmail or any of the MTAs out there. Why? Simple... guaranteed delivery, a well-understood API, guaranteed locking, guaranteed atomicity, and you can bundle any data you want with each queue request (just bundle it as a MIME attachment), etc...
I'm going to argue that this is one of those situations where you just don't want to bother... I mean, hell, Java had to invent the entire JMS (Java Message Service) to create persistent queues and a relatively performant API, and even with all of Sun's resources behind the project, a lot of people are choosing simpler solutions... Seriously, don't reinvent the wheel... write two programs, one that sends email and one that checks the inbox... you'll be happy that you did.
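The sender half could be as small as this sketch (the mail host and addresses are hypothetical; the consumer end would simply poll the queue mailbox):

    #!/usr/bin/perl -w
    # Enqueue by sending one message per queue item; the MTA handles
    # spooling, locking and delivery.
    use strict;
    use Net::SMTP;

    sub enqueue {
        my ($payload) = @_;
        my $smtp = Net::SMTP->new('localhost') or die "no SMTP server";
        $smtp->mail('generator@example.com');
        $smtp->to('queue@example.com');
        $smtp->data;
        $smtp->datasend("Subject: queue item\n\n$payload\n");
        $smtp->dataend;
        $smtp->quit;
    }

    enqueue('process record 42');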
Re: Queues, not the usual problem
by mattr (Curate) on Jul 09, 2001 at 19:01 UTC
You might consider PostgreSQL too, although ten simple operations per second seems well within MySQL's capabilities unless your server is heavily loaded.
It is also possible that you could manage a queue of this size just fine either in memory, as a flat file (perhaps using Tie::Array), or with a BerkeleyDB (which excels as an object storage mechanism). How about just pushing onto / popping off of such a tied array, possibly with a Perl DB as the object store?
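For instance, DB_File's RECNO format ties a file to an array directly (a minimal sketch; the file name is hypothetical):

    #!/usr/bin/perl -w
    # A disk-backed queue as a tied array, using DB_File's RECNO format.
    use strict;
    use Fcntl;
    use DB_File;

    my @queue;
    tie @queue, 'DB_File', '/tmp/queue.db', O_RDWR|O_CREAT, 0666, $DB_RECNO
        or die "tie: $!";

    push @queue, 'some work item';    # enqueue at the tail
    my $item = shift @queue;          # dequeue from the head (FIFO)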
Also, some basic code for a queue can be found in the book Elements of Programming with Perl.
Re: Queues, not the usual problem
by suaveant (Parson) on Jul 09, 2001 at 17:22 UTC
I would suggest having a table with an autoincrement field, so your query would be
SELECT * FROM table ORDER BY row LIMIT x;
where x is how many rows you want to process...
To handle row locking you could add a status field: do your insert and anything else you need to do, then do an
UPDATE table SET status = 1 WHERE row = LAST_INSERT_ID();
and change the select to check for status = 1...
but I think as long as your insert is only a single simple insert, it is not a problem...
Update: bad suaveant, didn't read the post right...
Use the row id from the autoincrement to key into your second table, and then do the update to set status = 1 in the main table... that should do you good. So do the main insert, then
SELECT LAST_INSERT_ID();
use that value to populate table2 with the proper key, then
UPDATE table SET status = 1 WHERE key = $key;
Should work great.
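Strung together with DBI, that flow might look like this (a sketch; the table and column names are hypothetical):

    #!/usr/bin/perl -w
    use strict;
    use DBI;

    my ($data, $extra) = ('some payload', 'related info');    # example values
    my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'pass', { RaiseError => 1 });

    $dbh->do('INSERT INTO main_table (payload) VALUES (?)', undef, $data);
    my ($id) = $dbh->selectrow_array('SELECT LAST_INSERT_ID()');
    $dbh->do('INSERT INTO table2 (main_id, extra) VALUES (?, ?)', undef, $id, $extra);
    # Only now mark the entry ready, so consumers never see a half-built item:
    $dbh->do('UPDATE main_table SET status = 1 WHERE row = ?', undef, $id);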
- Ant
I'd alternatively suggest (from the MySQL manual):
- If you mainly mix INSERT and SELECT statements, the DELAYED attribute to INSERT will probably solve your problems.
- If you have problems with SELECT and DELETE, the LIMIT option to DELETE may help.
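Through DBI, the two hints look like this (a sketch; the table and column names are hypothetical):

    #!/usr/bin/perl -w
    use strict;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'pass', { RaiseError => 1 });

    # Generators: INSERT DELAYED returns at once; the server batches the write.
    $dbh->do('INSERT DELAYED INTO queue (payload) VALUES (?)', undef, 'item');

    # Consumers: delete only a bounded slice per pass, so SELECTs get a turn.
    $dbh->do('DELETE FROM queue WHERE status = 1 LIMIT 100');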
"Only Bad Coders Code Badly In Perl" (OBC2BIP)
Re: Queues, not the usual problem
by rrwo (Friar) on Jul 09, 2001 at 17:27 UTC
Databases have a lot of overhead, so I wouldn't go there.
Maybe drop your elements into a directory, open a directory handle, and read them in date order (maybe not so easy if you have to sort a million file records).
You could split them off into chunks... a manageable chunk of records added to a directory, along with an index that then gets processed. Then the second process looks for another index...
Disk space shouldn't be an issue, since a million elements constantly being added to and dropped from a database are likely to take more space, and to have fragmentation issues related to this, etc.
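A minimal spool-directory sketch along these lines (the directory and naming scheme are made up; sorting a huge directory on every dequeue is the weak spot noted above):

    #!/usr/bin/perl -w
    # One file per element; names sort in arrival order.
    use strict;

    my $spool = '/tmp/spool';
    my $seq   = 0;

    sub enqueue {
        my ($data) = @_;
        my $name = sprintf '%s/%010d.%05d.%03d', $spool, time, $$, $seq++;
        open my $fh, '>', "$name.tmp" or die "open: $!";
        print $fh $data;
        close $fh;
        rename "$name.tmp", $name or die "rename: $!";    # atomic hand-off
    }

    sub dequeue {
        opendir my $dh, $spool or die "opendir: $!";
        my ($oldest) = sort grep { /^\d/ and !/\.tmp$/ } readdir $dh;
        closedir $dh;
        return undef unless defined $oldest;
        open my $fh, '<', "$spool/$oldest" or die "open: $!";
        local $/;
        my $data = <$fh>;
        close $fh;
        unlink "$spool/$oldest";
        return $data;
    }

    enqueue('process record 42');
    my $next = dequeue();

The write-to-temp-then-rename step is what keeps a reader from ever seeing a half-written element.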