Re: Re: Re: Total speculation?

Re: Re: Re: Re: Total speculation? by BrowserUk (Patriarch) on Oct 03, 2003 at 23:34 UTC
I think that we are mostly in agreement. I agree that a complete rewrite is not on the cards, for the reason of time that you gave, but also because I think that starting from scratch would be throwing away a lot of good code, which is just wasteful. Given that this isn't going to happen, what are the alternatives?

I'm not sure that migration to Everything 2 would be helpful either. Whilst it would make it easier for individuals or groups to set up replications of PM, there would still be a whole lot of customisation in the core, and sensitive data that would be impossible to share openly and very difficult to mock up. Even if this were done, contributions from outside would still need to be submitted to PM with a promise of "I've tested it thoroughly and it's fine!", which isn't going to work. The gods would still need to inspect for backdoors and malicious failures and test to their satisfaction. The same bottleneck would exist.

The only way I could see of alleviating the bottlenecks in the testing and approval mechanism -- beyond recruiting 100 new gods (maybe from the Indian subcontinent, they seem to have more than their fair share :) -- is to make it possible for PMdevers to test their own code in a realistic, but non-critical environment, and in a way that allows the gods to verify the veracity of their testing (by inspecting the logs of the test system to check for errors, the number of times the modification has been exercised, etc.). The only way I can see of doing that is for the test environment to be in the same box and sharing the same (live) data. Obviously, ad-hoc changes to the live system aren't desirable, so logic led me to suggest a test server with limited bandwidth/CPU accessing the same data except for updates. From my very limited, external viewpoint, this is the only possibility that addresses the problems.
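A minimal sketch of that arrangement, in Python with SQLite purely for illustration (PM itself is Perl on a different database, and the `nodes` and `test_overlay` tables and the `OverlayStore` class are hypothetical names): reads fall through to the live table unless the test system has already written its own copy, and writes never touch live data.

```python
import sqlite3


class OverlayStore:
    """Test-side view of live data: reads fall through, writes are diverted."""

    def __init__(self, conn):
        self.conn = conn
        # The overlay table holds only rows the test system has changed.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS test_overlay "
            "(node_id INTEGER PRIMARY KEY, body TEXT)"
        )

    def read(self, node_id):
        # Prefer the test copy; otherwise return the live row unchanged.
        for table in ("test_overlay", "nodes"):
            row = self.conn.execute(
                f"SELECT body FROM {table} WHERE node_id = ?", (node_id,)
            ).fetchone()
            if row is not None:
                return row[0]
        return None

    def write(self, node_id, body):
        # Updates land in the overlay only; the live table is never modified.
        self.conn.execute(
            "INSERT OR REPLACE INTO test_overlay (node_id, body) VALUES (?, ?)",
            (node_id, body),
        )


# Stand-in for the live database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO nodes VALUES (1, 'live text')")

store = OverlayStore(conn)
store.write(1, "patched text")
live_body = conn.execute("SELECT body FROM nodes WHERE node_id = 1").fetchone()[0]
```

In a real deployment the test server's database account would presumably have read-only rights on the live tables, so the separation would be enforced by the database and not only by the wrapper.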
The alternative is to stick with the status quo, which, while an option and currently the only game in town, is the reason for the disquiet in the first place.

The idea is far from unique. Having test systems that referenced live databases read-only and wrote updates to a different database, or to a separate table within the database, was once common practice when disc storage was too expensive to replicate whole databases willy-nilly. This is just an extension of that idea, attempting to work around the specific PM peculiarities.

It cost nothing, except a little of my time to write it up and a little of your time to read. I don't have a good enough view to know if it is feasible or practical, but I thought it worth mentioning anyway.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.
Re: Re: Re: Re: Re: Total speculation? by perrin (Chancellor) on Oct 04, 2003 at 00:11 UTC
"starting from scratch would be throwing away a lot of good code which is just wasteful"

I don't buy the whole Joel Spolsky theory that you should never throw away working code. Sometimes the existing code is written with assumptions that no longer apply, and removing those assumptions piece by piece -- or even discovering what those assumptions were! -- is just too difficult. It's a moot point though, given the amount of work it would likely require. I'd be happy to see PerlMonks drop lots of features that I consider bloat, but every feature seems to have some individual who swears they can't live without it.

"I'm not sure that migration to Everything 2 would be helpful either. Whilst it would make it easier for individuals or groups to set up replications of PM, there would still be a whole lot of customisation in the core, and sensitive data that would be impossible to share openly and very difficult to mock up."

Everything2 has moved the code out of the database and into CVS (at least that's what chromatic told me in another thread), and that's a major structural improvement. Hopefully it provides a better approach to customization as well. As for the difficulty of creating test data, everyone always has an excuse for not doing this, but it's important. Creating test data would mean that developers who don't have access to the live PM database could actually test their code!

"The gods would still need to inspect for backdoors and malicious failures and test to their satisfaction"

If everyone could download and run the code easily, then everyone could help with this too.

"The only way I could see of alleviating the bottlenecks in the testing and approval mechanism (...) is to make it possible for PMdevers to test their own code in a realistic, but non-critical environment"

With access to the code, an easy install, and a test suite, anyone's laptop could be a realistic but non-critical environment.
Your idea might work as a stopgap measure, but I don't think it addresses the real problem, which is the difficulty of contributing substantial, well-tested code.
Re^4: Total speculation? by demerphq (Chancellor) on Jun 16, 2004 at 17:22 UTC
One thing I find interesting is the idea that moving code out of the database would improve things. I don't think it would. In fact my tendency is to go totally the other way. Consider that in order to deploy changes to PM that aren't in the DB we need to ssh into at least two boxes, upload the Perl modules and then force a server restart, whereas we can make on-the-fly changes to in-DB code and have them deployed automatically and seamlessly. I really don't think having the code in CVS or equivalent would be particularly helpful, nor do I think it would increase the number of contributors. On the contrary, in fact.

One aspect of the design of PM (and Everything) is that nodes are both subroutines and objects. CVS'ing the code would severely impact the objectness of the code. Consider something like patch display page. Out of context from the monastery, that code means pretty much nothing. In context it's a window into the soul of the system. Putting it in CVS would lose the important part. It would be kinda like taking a window with a fine view, sticking it in a warehouse, and then wondering why it didn't look as good.

A bunch of us in pmdev have even discussed moving the entire Everything code base into the DB and then bootstrapping from that. We won't ever do it, of course, but the fact that we even think it's a good idea suggests that there is something to this point.

As a last aspect, PM itself is a stone's throw away from CVS anyway. We can currently cross-diff patches both on a single site and between the two. We can view a node's patch history, etc. We can and will expand these features as well. pmdev is very much alive and functional these days, and I don't think it would be if it were that horrible to work with.

--- demerphq
First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi
Re^5: Total speculation? by perrin (Chancellor) on Jun 22, 2004 at 22:32 UTC
First, I want to say thanks for your contributions to the site. Your patches are much appreciated.

There are a few reasons I think that putting the code in the database is a bad idea, and I speak from experience on this, having started my serious programming career with a system that stored all code in a database. Principally, there are lots of great tools for working with files (CVS, grep, diff, emacs, rsync, etc.) and all of them have to be reinvented when the code is in the database instead. At the very least, you need something to import/export the code from the database. Writing test scripts becomes much more difficult. Getting a working copy of the current code becomes a chore.

You mention the difficulty of updating. This is not a hard problem to solve. Simply using Apache::Reload will avoid having to restart the server. However, you'll shred your copy-on-write shared memory this way, which must already be happening with the current system. That hurts the scalability of the site, since it lowers the number of processes you can run without going into swap.

Keeping code in CVS (or Subversion or whatever) is the expected standard, and there's a good reason for it. It allows you to do things like branches, which are not possible with a simple revision system. It also allows people who are familiar with other open source projects to get started quickly. Creating RCS-like functionality in PM itself is not a substitute for the full power of source control.

I also don't really buy into the idea of this being more object-oriented. Looking at the node you linked to, I see no POD, no easy way to write a test script, no easy way to run perltidy on it, no way to perldoc it if there were POD, etc.

The bottom line is that putting all the code in the database is non-standard and unnecessary. It makes things much harder for new people who are familiar with other open source projects and are interested in getting involved with PM.
It obviously has not prevented work from going on, but I consider it a clear negative, and I suspect many others would agree with me.
Re^6: Total speculation? by demerphq (Chancellor) on Jun 23, 2004 at 10:07 UTC
Well, color me unconvinced for this case. And I just had an example of why. tye's funky HTML error catcher / nesting enforcement thingy doesn't handle `</br>` in a way that I think it should. IMO such tags that aren't matched by an opening br tag should be silently "fixed" and not marked as an error. (This is not a complaint, tye, it's a feature request. :-) Anyway, being pmdev etc., I figured that instead of annoying tye I'd just hack in the appropriate changes. So first I searched for the patches that created the changes. I found that User Settings used certain keywords that I could do a code search from. So I did the search, and found that in fact the code used to do it is buried in a PM module. The consequence is that no patch is possible within the time that I have to donate.

The simple act of putting that code in a PM module (which afaict is due to historical reasons) means that in order for me to patch that code I need to do the following:

1. Download the PM module.
2. Patch the PM module.
3. Upload the PM module to two separate machines.
4. Stop the httpd's on those machines (since we aren't using Apache::Reload).
5. Restart them. (Er, I guess I could wait for the periodic restart, but then I wouldn't necessarily be around to see the code go into production and respond to any issues it raised.)

That's a lot of work to effect what I suspect would be only a few lines of code. Now had that subroutine been an htmlcode node I could have done the following:

1. Find the node.
2. Patch the node just as you would write a note.
3. Push the patch to pmdev.
4. Apply it.
5. Assuming it worked, apply it on PM proper.

So all told, because it's in a module, the application and non-development work required to get the patch into production would outweigh by many times the actual development time involved. This isn't a good use of my time, and frankly it is unlikely I will _ever_ do this.

Now I think the problem here is a methodology clash.
PM is a production environment designed to allow in-use/inline development. All of the techniques and tools that you mention are oriented towards a far less dynamic, develop-test-rollout-repeat methodology that is slow and requires vastly more investment in development infrastructure. Sure, developers are more comfortable working with their normal tools. But frankly I'm not in the slightest bit convinced that said tools would make development easier. In fact I'm of the opinion that if that were the case there would be virtually NO development on PM at all. Certainly there would be no opportunity to have a pmdev group interacting in a communal forum.

Now I recognize that your arguments have considerable merit. But I have to say that they have merit in other contexts. They have merit for the Everything Engine itself. They have merit for larger projects where the code will be duplicated and deployed in multiple sites with no reference to each other's implementation other than the fact that they share the core engine. They have merit when there are a considerable number of developers who can spend significant amounts of time managing the development and rollout process (i.e., being the Perl Pumpking is practically a full-time job; Linus and his lieutenants represent multiple full-time jobs). But in this context it's rare that a god or pmdever will have more than a few hours at a time to work on the site.

So all in all I think that you are arguing that the methodology of one type of development should be used in an utterly different context, a context so different that I think the methodology would kill all development outright.

Now, I'd like to address some of your other points. Testing is a good example. I see no reason why PM cannot use the exact same testing methodology as is normal in Perl. Insofar as functionality is written in htmlcode nodes, it is trivial to create a new htmlcode node that is called by a superdoc that will run the tests and report any issues directly.
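The idea of a test stored beside the code it tests, and run from inside the system, can be sketched as follows. This is Python with SQLite for illustration only, not PM's Perl or its real schema; the `htmlcode` table layout, the `shout` node and `run_node_tests` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each row keeps a node's code and its test together (hypothetical layout).
conn.execute("CREATE TABLE htmlcode (name TEXT PRIMARY KEY, code TEXT, test TEXT)")
conn.execute(
    "INSERT INTO htmlcode VALUES (?, ?, ?)",
    (
        "shout",
        "def shout(text):\n    return text.upper() + '!'\n",
        "assert shout('hi') == 'HI!'\n",
    ),
)


def run_node_tests(conn):
    """Run each node's stored test against its stored code; return a report."""
    report = {}
    rows = conn.execute("SELECT name, code, test FROM htmlcode").fetchall()
    for name, code, test in rows:
        namespace = {}
        try:
            exec(code, namespace)  # load the node's code
            exec(test, namespace)  # run its test in the same namespace
            report[name] = "ok"
        except Exception as err:
            report[name] = f"FAILED: {err!r}"
    return report


report = run_node_tests(conn)
```

The report is what a test-runner page would render; a failing assertion or a syntax error in a node shows up as a `FAILED` entry without stopping the other nodes' tests.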
In fact the PM DB-driven framework offers a large number of opportunities in this area that I cannot envision being effective in a conventional development environment, such as tightly binding test code to the code it tests, tightly binding documentation to the code it documents, and allowing multiple people to work simultaneously on different parts of a system in a collaborative fashion without wildly forking the build.

You mentioned that you worked on a DB-based code system before. You didn't expand very much on it, so I don't know what type of environment it was, but I think that you have to consider PM in an exceptional light. All of the problems you outlined have already been solved (uploading code, extracting code, etc.). The site itself is in essence a huge collaborative development and work space on its own.

I see only one real advantage to the approach you outline, and that is that it would increase the comfort of new developers with a similar background to yours (and that of many others :-). For instance, I have never used CVS in my life; I'm a Win32 developer who uses SourceSafe instead; I've never used Emacs beyond the "why the **** don't the keys do NORMAL freaking things!!! &$%$%$@@!" point.

As for POD, we could easily implement a "node POD view" for pmdevers that would pass any node through pod2html before rendering the code. Ditto for Perltidy: a link in pmdevnodelet could automatically pass code through Perltidy. The perldoc comment is superfluous, as we have a host of more powerful ways to search and scan code and documentation as part of the site.

As a last point, I think the only real things stopping experienced programmers from getting involved in pmdev development were the lack of a test/dev environment and a paucity of gods willing and able to do code reviews and code application. Both of these problems are now resolved.
So the only thing stopping folks from digging in is a lack of creativity (I'm a good programmer, but by no means brilliant; if I could figure out ways to develop and test off site _before_ the pmdev server existed, then I know others can too) and an unwillingness to think outside of the box (as in the box that represents "normal" development methodology).

Another point is that in order to do useful stuff on PM you need to understand the architecture. IMO it would be almost impossible to grok said architecture without a live site to play with.

Anyway, thanks for your kind words. :-)

--- demerphq
First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi
Re^7: Total speculation? by perrin (Chancellor) on Jun 23, 2004 at 20:00 UTC
Re^7: Total speculation? (HTML fixing) by tye (Sage) on Jun 23, 2004 at 22:57 UTC
Re^8: Total speculation? (HTML fixing) by demerphq (Chancellor) on Jun 23, 2004 at 23:11 UTC