To amplify a bit on what Ryszard said: you need to look at what you are trying to do with a project management type of methodology. The steps you need to take are:

  1. Gather requirements
  2. Measure what you have
  3. Formulate a transition plan
  4. Get signoff from the stakeholders
  5. Fine tune the plan based on feedback from step 4
  6. Iterate #4 & #5 as needed
  7. Set a schedule with milestones
  8. Execute the plan. Any changes to the plan at this point need to be renegotiated, with expectations reset (see #4 and #5) as needed.
  9. DOCUMENT EVERYTHING
  10. Wrap up the project with a "what went well and what went wrong" session
A consequence of all this should be a set of publication standards for those who are contributing pages to the site so they can be properly indexed.

Now, implementation-wise: the kinds of technologies you should be looking at would include some sort of spider that reads the meta tags within the pages and builds a database of those tags indexed against titles and authors. The meta tags and how they are used would be part of the standards you establish, and pages that don't conform to the standards don't get indexed. What the consequence of not being indexed is should be "management's" call, and part of the expectation setting that needs to take place.
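To make the meta tag idea a bit more concrete, here is a rough sketch of the extraction and conformance check. It is in Python rather than Perl purely for illustration, and it works on HTML strings you have already fetched, so the crawling itself is left out. The required tag names (`author`, `keywords`) and the function names are my assumptions, not part of any standard; your own publication standards would decide what goes there.

```python
from html.parser import HTMLParser

# Hypothetical publication standard: every page must carry these meta tags.
REQUIRED_META = ("author", "keywords")


class MetaExtractor(HTMLParser):
    """Collects the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = attr.get("name")
            content = attr.get("content")
            if name and content:
                self.meta[name.lower()] = content.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_record(url, html):
    """Return an index record for a page, or None if it breaks the standard."""
    parser = MetaExtractor()
    parser.feed(html)
    title = parser.title.strip()
    if not title or any(key not in parser.meta for key in REQUIRED_META):
        return None  # non-conforming pages are not indexed
    keywords = [k.strip() for k in parser.meta["keywords"].split(",") if k.strip()]
    return {
        "url": url,
        "title": title,
        "author": parser.meta["author"],
        "keywords": keywords,
    }


def build_index(pages):
    """pages: mapping of url -> html. Returns (records, skipped_urls)."""
    records, skipped = [], []
    for url, html in sorted(pages.items()):
        record = extract_record(url, html)
        if record is None:
            skipped.append(url)
        else:
            records.append(record)
    return records, skipped
```

The `skipped` list is there on purpose: it gives you something to hand to whoever owns the non-conforming pages, which ties back into the expectation setting.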

Generating the index page (or pages) can be done several different ways. Run a job once a day that spiders the site and compiles the indices; that much stays pretty constant. The part that is subject to implementation preference is whether you generate a static HTML page as a result of the spidering OR use HTML::Mason, PHP, or some other dynamic web page technology. That choice is in small part a matter of how big a load you expect the web server to have to deal with and how "hefty" the machine is, and otherwise a matter of personal preference and possibly political considerations.
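For the static page route, the daily job can end with something as small as the sketch below. Again this is Python for illustration only, and the record layout (`url`, `title`, `author` keys) and the function names are assumptions of mine; use whatever your spider actually stores.

```python
from html import escape


def render_index(records):
    """Render index records as one static HTML page, sorted by title."""
    rows = []
    for rec in sorted(records, key=lambda r: r["title"].lower()):
        rows.append(
            '<li><a href="{url}">{title}</a> by {author}</li>'.format(
                url=escape(rec["url"], quote=True),
                title=escape(rec["title"]),
                author=escape(rec["author"]),
            )
        )
    return (
        "<html><head><title>Site Index</title></head><body>\n"
        "<h1>Site Index</h1>\n<ul>\n" + "\n".join(rows) + "\n</ul>\n</body></html>\n"
    )


def write_index(records, path):
    """Write the page to disk; the daily job would call this after spidering."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(render_index(records))
```

The trade-off is the one described above: a static file costs the web server next to nothing to serve but is only as fresh as the last run, while a dynamic page would do this rendering on each request.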

When I say political considerations: I worked for a while in a shop that had "Core Engineering Standards" which dictated, in draconian fashion, what technologies were approved for use on company machines, and you were not allowed to even suggest a technology not in the CES document. Something else to consider before making big plans.

Humph... wish I wasn't under NDA or I'd give you the link of the site I did all this work for... :-)


Peter @ Berghold . Net

Sieze the cow! Bite the day!

Nobody expects the Perl inquisition!

Test the code? We don't need to test no stinkin' code!
All code posted here is as is where is unless otherwise stated.

Brewer of Belgian style Ales


In reply to Re: Creating a Directory Site by blue_cowdawg
in thread Creating a Directory Site by CodeJunkie
