in reply to Re: Re: Re: redesign everything engine?
in thread redesign everything engine?

By grabbing the latest Everything from CVS, you're kind of highlighting the problem here: there is no current CVS repository for PerlMonks, because the code is kept in the database. There is no convenient way to get the latest code, let alone branch it for a major revision. It also makes the task of incorporating updates from Everything that much harder. This is why I think storing code in the database is not a good idea at this point. I'm sure there were reasons for it at the time, but it is counterproductive now.

When I referred to XML, what I was really thinking of was the way nodes are stored and the resulting update problems (some of them are described here). I don't think this would be such a problem with a more normalized database schema and a codebase that allowed for finer-grained locking during updates.

About the cache: subrefs are okay as long as Storable can handle them. Objects that can't be serialized can't be cached between processes at this point. At the moment, Perl threads are not very good at sharing objects, so mod_perl 2 may not solve this issue any time soon. I'm not sure what your managed-forking idea is, but I don't see why it wouldn't have to deal with exactly the same issues mod_perl does.
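To make the Storable constraint concrete, here is a minimal sketch (the node structure and field names are made up for illustration): by default Storable refuses CODE refs outright, and even with its `$Storable::Deparse`/`$Storable::Eval` escape hatch, the thawed copy is recompiled via a string eval and shares no state with the original.

```perl
use strict;
use warnings;
no warnings 'once';
use Storable qw(freeze thaw);

# A hypothetical cached node whose render routine is a subref.
my $node = { title => 'foo', render => sub { 'rendered' } };

# Default behaviour: freeze() dies on CODE items.
my $frozen = eval { freeze($node) };
print "freeze failed: $@" unless defined $frozen;

# Opt-in round-trip: the sub is deparsed to source on freeze and
# recompiled with eval() on thaw -- a private copy, not shared code.
$Storable::Deparse = 1;
$Storable::Eval    = 1;
my $copy = thaw( freeze($node) );
print $copy->{render}->(), "\n";   # prints "rendered"
```

This is why anything with an unserializable part (filehandles, XS objects) simply can't cross the process boundary this way.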

I don't want to sound like I'm just whining about the code. I am grateful for the existence of this site and your part in creating the code that made it happen. I do think that some of the design ideas have not scaled well, though, and that it will be hard to fix them completely without fundamental changes.


Replies are listed 'Best First'.
Re: Re: Re: Re: Re: redesign everything engine?
by chromatic (Archbishop) on Jan 29, 2003 at 06:21 UTC

    I do appreciate your comments, perrin, and you're the first person I'll ask about an inter-process cache.

    All of the code for the base install of the system is stored in CVS, though -- including the core nodes. It would be nice to do this with Perl Monks as well. (There'd probably be three or four specific nodeballs.) I'm planning to revise the XML format slightly so it's even easier to see changes between node revisions.

    An inter-process cache with its own locking mechanism could help, but there are other ways to avoid it. I'm inclined to propose a rule that all updates are committed to the database at the end of a request.

    Any suggestions to improve the normalization of the database are welcome. For speed reasons, I'm tempted to move the doctype doctext field to a separate table. I'm definitely going to fix the hacky settings by making a one-to-many table for individual settings. That's another post-1.0 change.
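    A one-to-many settings table might look something like the following sketch (table and column names are assumptions, and DBD::SQLite with an in-memory database stands in for the real backend just to keep the example self-contained):

    ```perl
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
                            { RaiseError => 1 } );

    # One row per setting, keyed by node, instead of one serialized
    # blob of all settings per node.
    $dbh->do(<<'SQL');
    CREATE TABLE setting (
        node_id INTEGER NOT NULL,
        name    TEXT    NOT NULL,
        value   TEXT,
        PRIMARY KEY (node_id, name)
    )
    SQL

    my $ins = $dbh->prepare(
        'INSERT INTO setting (node_id, name, value) VALUES (?, ?, ?)' );
    $ins->execute( 42, 'theme',   'dark' );
    $ins->execute( 42, 'perpage', 25 );

    # Individual settings can now be read or updated without
    # rewriting the whole collection.
    my ($theme) = $dbh->selectrow_array(
        'SELECT value FROM setting WHERE node_id = ? AND name = ?',
        undef, 42, 'theme' );
    print "theme: $theme\n";   # prints "theme: dark"
    ```

    The composite primary key keeps one value per (node, name) pair and gives you an index on exactly the lookup the code performs.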

    The problem with caching subrefs is that you'll still pay the eval() penalty. I'd prefer to cache any calculated field, though, as we do many times more reads than writes. That seems like a web-side enhancement, but if we have an inter-process cache, we can avoid many database hits, which will help.
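    The difference between the two caching strategies can be sketched like this (the helper name and node source are hypothetical): caching the *computed string* means the string eval is paid once per node, after which reads never touch the compiler at all.

    ```perl
    use strict;
    use warnings;

    my %rendered_cache;

    # Hypothetical helper: compile a node's source once, then serve
    # the calculated result from the cache on every later read.
    sub fetch_rendered {
        my ($node_id, $source) = @_;
        unless ( exists $rendered_cache{$node_id} ) {
            my $code = eval $source          # eval() penalty paid once
                or die "compile failed: $@";
            $rendered_cache{$node_id} = $code->();
        }
        return $rendered_cache{$node_id};
    }

    print fetch_rendered( 42, q{ sub { "node 42 body" } } ), "\n";

    # Second read: the source is never recompiled -- the die() here
    # would fire if it were.
    print fetch_rendered( 42, q{ sub { die "recompiled!" } } ), "\n";
    ```

    Caching the subref instead would still require an eval() whenever the cached code is thawed into a new process, which is the penalty mentioned above.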

    My managed-forking updates the parent process whenever the cache changes, so the cache is always in the parent. This includes code. All forked children share that memory. I've not found a way to do that with Apache.

    Finally, I agree about fundamental changes. That's my plan. I'm just changing the existing code, not starting over.