there has to be a DB to run the tests
Tests can and should be done with test data, not live data. If you find no HTML and JPEGs on the machine (for example because the data hasn't been deployed yet), do you just drop the tests?
"your database"-> "the database of your module". You ask why anyone would care? For example a part of the website might need unchanged connections because a legacy application uses a packet sniffer to generate some statistics, or some custom application connecting to your website doesn't work correctly with your module (I don't know if this is a realistic possibility). What if part of the apache data directory is a loop-back filesystem with virtual files whose content is dynamically created by a program? No sense in including that into the database
"overflow the /var space" -> "fill the /var partition until the partition is full, i.e. there is no disk space left"
How can an Apache2 installation be so non-standard that the DB-making program can't follow the links among its .html files?
Does your script work correctly with circular symbolic links (i.e. links that point back to a directory above them), or would it go into an endless loop? Are you sure you have thought of everything else and eliminated every possible bug that could rain on a simple install?
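Guarding against such loops is not hard: the traversal just has to remember the canonical path of every directory it has already entered and skip any directory that resolves to one of them. A minimal sketch (the directory layout and the `walk` helper are invented for illustration, not taken from the actual DB-making script):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(realpath);
use File::Temp qw(tempdir);

# Walk a directory tree without being trapped by circular symlinks:
# remember the canonical path of every directory already visited and
# skip any directory that resolves to one of them.
sub walk {
    my ($dir, $seen, $cb) = @_;
    my $real = realpath($dir) or return;
    return if $seen->{$real}++;              # already visited -> cycle, stop
    opendir my $dh, $dir or return;
    for my $entry (grep { $_ ne '.' && $_ ne '..' } readdir $dh) {
        my $path = "$dir/$entry";
        if (-d $path) { walk($path, $seen, $cb) }
        else          { $cb->($path) }
    }
}

# Demonstration: build a tree containing a symlink that points back up.
# (symlink() requires a platform that supports symbolic links.)
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die $!;
open my $fh, '>', "$root/sub/page.html" or die $!;
close $fh;
symlink $root, "$root/sub/loop" or die $!;   # circular link

my @files;
walk($root, {}, sub { push @files, $_[0] });
print scalar(@files), " file(s) found\n";    # terminates instead of looping
```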
I'm not arguing that software has to be absolutely bug-free. My point (with all of these examples) is that a sysadmin or potential user expects a CPAN installation of a module to install, not to run lots of fancy routines that could error out or even crash as part of the installation. Someone who just wants to evaluate your module or look at the source doesn't want (or expect) a database to be created. Especially on security-conscious installations, he doesn't want databases he doesn't know about, where he later has to find out who or what created the thing. Sysadmins don't like surprising behaviour.
The same goes for testing. Testing should not create a live database. If you need a database for the test, create it with sample data in /tmp or your test directory (and delete it afterwards), or use mock-up techniques, because you can't expect there to always be live data. (PS: mock-up in this case means simulating the database interface with a script that behaves just like the database; see for example Test::MockDBI.)
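To make the /tmp suggestion concrete: a test can create its own sample pages in a throw-away directory and build the lookup table from those, so the live site is never touched and everything vanishes when the test exits. The file names and contents below are invented sample data, and the JPEG-counting regex is just one plausible way the table might be built:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Sandbox for the test: tempdir(CLEANUP => 1) deletes the whole
# directory (and the sample "database" source data) on exit.
my $sandbox = tempdir(CLEANUP => 1);

# Canned sample pages shipped with the test, not found on the machine.
my %sample = (
    'index.html' => '<img src="a.jpg"><img src="b.jpg">',
    'about.html' => '<img src="c.jpg">',
);
while (my ($name, $html) = each %sample) {
    open my $fh, '>', "$sandbox/$name" or die $!;
    print $fh $html;
    close $fh;
}

# Build the filename -> JPEG-count table from the sample data only.
my %jpeg_count;
for my $file (glob "$sandbox/*.html") {
    open my $fh, '<', $file or die $!;
    local $/;                               # slurp the whole file
    my $html = <$fh>;
    my ($name) = $file =~ m{([^/]+)$};
    $jpeg_count{$name} = () = $html =~ /\.jpe?g\b/gi;
}
print "$_ => $jpeg_count{$_}\n" for sort keys %jpeg_count;
```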
The only thing that CloseKeepAlive ever does is decide whether the connection may be kept alive or not. Other parts of Apache also take part in that decision. Only very low-level network diagnostics would care about the difference.
The DB-making script doesn't care about static vs. dynamic generation: it uses LWP to fetch the pages without regard for how they're generated. But your point is well taken, and I'll add exclusion mechanisms to the DB-making program, namely "don't process this/these file(s)" and "don't process files in/below this/these directory/ies".
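The two planned exclusion mechanisms could be as simple as a skip-list of file paths and a list of directory prefixes whose whole subtree is ignored. A sketch under those assumptions (all names and paths here are invented for illustration):

```perl
use strict;
use warnings;

# Hypothetical exclusion lists, relative to the DocumentRoot:
# individual files to skip, and directories whose subtrees to skip.
my @exclude_files = ('stats/raw.html');
my @exclude_dirs  = ('virtual/', 'legacy/');

# Return true if a relative path should not be processed.
sub excluded {
    my ($rel) = @_;
    return 1 if grep { $rel eq $_ } @exclude_files;          # exact file match
    return 1 if grep { index($rel, $_) == 0 } @exclude_dirs; # directory prefix
    return 0;
}

print excluded('virtual/gen.html') ? "skip\n" : "keep\n";
print excluded('index.html')       ? "skip\n" : "keep\n";
```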
The script is OK w/r/t back-references because my web site has links pointing every which way from almost everywhere. Probably I haven't "thought of everything else and eliminated every possible bug". Does a "simple install" include testing? If so, I'll stipulate that I haven't.
The DB built during testing will in fact be in the 'blib' folder of the build directory. The question is whether this DB should be copied out into the site along with the module, docs, and building script if the testing passes and the install phase occurs. Lots of people would say "yes", because this makes the module more "ready to go out of the box", which is considered a big plus these days. You're obviously saying "no". I need more data to decide.
If such copying is done, it is noted in the install log, which (at least in a formal sense) notifies the person running the test so that he/she has slightly less grounds to be surprised.
If someone wants to evaluate a module in the sense of reading about it, including looking at the source, he/she can do that on CPAN. The presentation there is better than the perldoc or man files created during installation.
In the general case (though maybe not for this module) good testing definitely includes "lots of fancy routines, that could error out or even crash, as part of the installation".
Apache::Test will fire up a test server for my tests to use, and it will have something in its ServerRoot directory. Typically this will be the content of the web site with which CKA will be used. If the contents of ServerRoot won't test CKA very well, maybe the testing routines can add content that comes with the package to the test server, in such a way that it does not become a permanent part of the web site. I can't know whether that's possible without some digging, but I'll try.
If this step is taken, the DB will be deleted so that it's not copied out for retention. If this step isn't possible, all that can be done is to note that the module hasn't been tested very well.
Maybe it's the word "database" that has spooked you. It conjures up Oracle and PostgreSQL and who knows what all? A simple lookup hash from filename to number of JPG files isn't worthy of the name, so let's just call it a "table" :-)
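And indeed, for a filename-to-count map, Perl's core DBM bindings are all the persistence such a "table" needs. A sketch using SDBM_File (the storage path and the sample entries are invented; the real table would be filled by the DB-making script):

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Persist the filename -> JPEG-count map as a tied hash backed by a
# plain DBM file: no Oracle, no PostgreSQL, just a table on disk.
my $dir = tempdir(CLEANUP => 1);
tie my %table, 'SDBM_File', "$dir/cka_table", O_RDWR | O_CREAT, 0666
    or die "tie failed: $!";
$table{'index.html'} = 2;
$table{'about.html'} = 1;
untie %table;

# Reopen the file to show the data survived the untie.
tie my %again, 'SDBM_File', "$dir/cka_table", O_RDWR, 0666
    or die "tie failed: $!";
my $count = $again{'index.html'};
untie %again;
print "index.html -> $count\n";
```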
Maybe it's the word "database" that has spooked you
No, it was the image in my mind of an install routine doing a search on a big disk partition and a clueless admin sitting there wondering what on earth this install script is doing.
I've brought up all those (often far-fetched) examples not as specific grievances you should protect against, but just to show that there are reasons why (some?) people want an install to be just an install. This is independent of the actual module. If I installed a library or module that could cache sound files, I also wouldn't want the install routine to automatically search the disk for sound files and create the cache. For an application program, maybe; for a module/library, definitely not. Or a module that creates rainbow tables for AES should obviously not start to build the rainbow tables as part of the install ;-)
Naturally, if you need a database for testing, that is different. But in that case the problem is that you have to provide the test data yourself, as you can't expect any fitting test data on the machine. And if you have to provide the test data anyway, why make a time-wasting and installation-dependent search for additional data when you already have the test data?
To give an example: I remember a Perl module for connecting over the network. During testing it actually tried to access the internet. That was bad, because it didn't test only the module: it also tested the whole network, the firewall and the internet connection, features extrinsic to the module. An installation that fails because of external circumstances like a switched-off router sends the wrong signal. If your testing depends on data you find on the machine, you run the same danger of testing something external to your module. For example, corrupted HTML files on the web server should not lead to test failures for your module.