PerlMonks  

All:
Judging by the number of posts in Newest Nodes a few days ago, it seemed like a slow day. I asked in the CB if anyone had done statistical analysis on the number and type of posts on any given day. The general response was no, but all the tools are there to do it if you want. I really didn't want to. I wanted to be lazy and enjoy someone else's work.

I was provided with a 24MB (now 51MB) XML file (thanks James) that sat in my inbox until last night. Seeing an opportunity to improve my nonexistent SQL and CGI skills, I decided to load the information into a database, create a few nifty queries, and generate some HTML reports. It took me a lot longer than I expected to even get started, because calling my SQL fu nonexistent was far too kind.

Using the "eyes were bigger than my stomach" analogy, I quickly found myself in over my head. The ideas I had, while all possible, seemed like they would take far more work than the benefit I would get from them. Besides, judging from the Christmas Report, the 24MB XML file didn't contain all the records anyway.

This is where you come in. Take a look at this and tell me what you think. tye and diotalevi helped me get this far. If you feel that continuing on would be a worthwhile endeavour, then I will. If you have any ideas for the type of reports you would like to see, please let me know.

Each record contains the following fields:

  • Year
  • Month
  • Day
  • Hour
  • Type (PMDiscussion, Poetry, etc)
  • Root (to determine type and if sub-note)
  • Day of Week
  • Holiday (I used US Federal Holidays + Valentine's Day)
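
To make the record layout above concrete, here is a minimal sketch of how one record might be derived from a post's creation time. It is an illustration, not the code behind the reports: `make_record`, `posts_per_weekday`, and the `FIXED_HOLIDAYS` table are hypothetical names, the holiday table covers only fixed-date holidays, and it assumes a root node carries no root id.

```python
from collections import Counter
from datetime import datetime

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical holiday table keyed by (month, day). Only fixed-date
# holidays are listed; floating ones (e.g. Thanksgiving) are left out.
FIXED_HOLIDAYS = {
    (1, 1): "New Year's Day",
    (2, 14): "Valentine's Day",
    (7, 4): "Independence Day",
    (11, 11): "Veterans Day",
    (12, 25): "Christmas Day",
}


def make_record(created, node_type, root_id=None):
    """Build one stats record from a post's creation time and type.

    Assumption: root_id is None for a root node and holds the root's
    id for a reply.
    """
    return {
        "year": created.year,
        "month": created.month,
        "day": created.day,
        "hour": created.hour,
        "type": node_type,
        "root": root_id,
        "day_of_week": WEEKDAYS[created.weekday()],
        "holiday": FIXED_HOLIDAYS.get((created.month, created.day)),
    }


def posts_per_weekday(records):
    """Count records by day of week."""
    return Counter(r["day_of_week"] for r in records)
```

Storing the day of week and holiday on each record, rather than recomputing them per report, keeps the reporting queries down to simple grouping on a column.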

    Cheers - L~R

    Update: It turned out that I originally received an old file, but have since rebuilt the database and populated current information.

    Update 2: Thanks to diotalevi, it now takes 1 minute instead of 1 hour to rebuild the database from scratch (yeah COPY). He already pointed out that the hours should be adjusted to GMT, and James mentioned that this may require using two different zones since the Monastery has moved since its inception. Rest assured that I will do this when I get time. I also removed "system" nodes from the statistics, as tachyon feels that they inflate the average number of root nodes to twice the actual average.
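
    The GMT adjustment could look something like the sketch below. Everything specific in it is a placeholder: `MOVE_DATE`, `OFFSET_BEFORE`, and `OFFSET_AFTER` are hypothetical, since the actual move date and server zones are not known here, and fixed offsets ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Placeholder values, NOT the Monastery's real history.
MOVE_DATE = datetime(2002, 1, 1)       # hypothetical date of the server move
OFFSET_BEFORE = timedelta(hours=-5)    # hypothetical zone before the move
OFFSET_AFTER = timedelta(hours=-8)     # hypothetical zone after the move


def to_gmt(local_time):
    """Convert a naive server-local timestamp to a naive GMT timestamp.

    The offset is picked by comparing the timestamp to MOVE_DATE.
    Daylight saving time is ignored in this sketch.
    """
    offset = OFFSET_BEFORE if local_time < MOVE_DATE else OFFSET_AFTER
    aware = local_time.replace(tzinfo=timezone(offset))
    return aware.astimezone(timezone.utc).replace(tzinfo=None)
```

    Note that the conversion can push a post across midnight, so the Day, Day of Week, and Holiday fields would need to be recomputed from the adjusted timestamp rather than just shifting Hour.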


    In reply to More PM Stats by Limbic~Region
