Querying Saints in our Book (2)

Extending Querying Saints in our Book, the script now includes Monk Levels, which helps when trying to group or order the data. More importantly, (if it indeed can be important) the query now includes XP Ratio, which is calculated by XP / Writeups.

Fabricated numbers of some monks and monks with no writeups are outliers that mess up the numbers. So, the query returns two sets of ratios, with and without exclusions. Perhaps a more complicated rule for outliers can be devised.

// ==UserScript==
// @name        Perlmonks Saints in our Book to Oracle
// @namespace   http://www.example.com/chacham/
// @description Turn the data in Perlmonks Saints in our Book into an 
+Oracle query with CTEs
// @include     http://perlmonks.org/?node_id=3559
// @version     1.1.1
// @grant       none
// ==/UserScript==

var query = "WITH\
\n\t-- From: http://perlmonks.org/?node_id=509805\
\n\tMonk_Level(Id, XP_Low, XP_High, Title)\
\nAS\
\n\t(\
\n\t SELECT 13, 003000, 03999, 'Curate'    FROM Dual UNION ALL\
\n\t SELECT 14, 004000, 05399, 'Priest'    FROM Dual UNION ALL\
\n\t SELECT 15, 005400, 06999, 'Vicar'    FROM Dual UNION ALL\
\n\t SELECT 16, 007000, 08999, 'Parson'    FROM Dual UNION ALL\
\n\t SELECT 17, 009000, 11999, 'Prior'    FROM Dual UNION ALL\
\n\t SELECT 18, 012000, 15999, 'Monsignor'    FROM Dual UNION ALL\
\n\t SELECT 19, 016000, 21999, 'Abbot'    FROM Dual UNION ALL\
\n\t SELECT 20, 022000, 29999, 'Canon'    FROM Dual UNION ALL\
\n\t SELECT 21, 030000, 39999, 'Chancellor'    FROM Dual UNION ALL\
\n\t SELECT 22, 040000, 49999, 'Bishop'    FROM Dual UNION ALL\
\n\t SELECT 23, 050000, 59999, 'Archbishop'    FROM Dual UNION ALL\
\n\t SELECT 24, 060000, 69999, 'Cardinal'    FROM Dual UNION ALL\
\n\t SELECT 25, 070000, 79999, 'Sage'    FROM Dual UNION ALL\
\n\t SELECT 26, 080000, 89999, 'Saint'    FROM Dual UNION ALL\
\n\t SELECT 27, 090000, 99999, 'Apostle'    FROM Dual UNION ALL\
\n\t SELECT 28, 100000, NULL, 'Pope'    FROM Dual\
\n\t),\
\n\tData(Ranking, Name, XP, Writeups)\nAS\n\t(";
var table = document.getElementsByTagName('table')[4];

for(var row = 1; row < table.rows.length; row++)
 query += "\n\t SELECT "
  + table.rows[row].cells[0].textContent + ",\t'"
  + table.rows[row].cells[1].textContent + "',\t"
  + table.rows[row].cells[2].textContent + ",\t"
  + table.rows[row].cells[4].textContent.replace('None', 0) + "\t FROM
+ Dual UNION ALL";

// Chop off the extra UNION ALL
query = query.slice(0, -10)
+ "\n\t),\
\n\tData_Level(Monk_Level, Title, Ranking, Name, XP, Writeups)\
\nAS\
\n\t(\
\n\t SELECT\
\n\t\tMonk_Level.Id,\
\n\t\tMonk_Level.Title,\
\n\t\tData.Ranking,\
\n\t\tData.Name,\
\n\t\tData.XP,\
\n\t\tData.Writeups\
\n\t FROM\
\n\t\tData,\
\n\t\tMonk_Level\
\n\t WHERE\
\n\t\tData.XP\tBETWEEN Monk_Level.XP_Low AND Monk_Level.XP_High\
\n\t   OR\tData.XP >\tMonk_Level.XP_Low AND Monk_Level.XP_High IS NULL
+\
\n\t),\
\n\tData_Level_Calc(Monk_Level, Title, Ranking, Name, XP, Writeups, XP
+_Ratio, Exclude)\
\nAS\
\n\t(\
\n\t SELECT\
\n\t\tMonk_Level,\
\n\t\tTitle,\
\n\t\tRanking,\
\n\t\tName,\
\n\t\tXP,\
\n\t\tWriteups,\
\n\t\tXP / CASE Writeups WHEN 0 THEN 1 ELSE Writeups END,\
\n\t\t-- Monks with fabricated numbers or 0 Writeups mess up the calcu
+lations.\
\n\t\tCASE WHEN Ranking > 1 AND Writeups > 0 THEN 1 END\
\n\t FROM\
\n\t\tData_Level\
\n\t),\
\n\tXP_Ratio_By_Level(Monk_Level, Title, Min, Max, Avg, Min_X, Max_X, 
+Avg_X)\
\nAS\
\n\t(\
\n\t SELECT\
\n\t\tMonk_Level,\
\n\t\tTitle,\
\n\t\tMIN(XP_Ratio),\
\n\t\tMAX(XP_Ratio),\
\n\t\tAVG(XP_Ratio),\
\n\t\t-- Exclude from functions by NULLing out the number.\
\n\t\tMIN(XP_Ratio * Exclude),\
\n\t\tMAX(XP_Ratio * Exclude),\
\n\t\tAVG(XP_Ratio * Exclude)\
\n\t FROM\
\n\t\tData_Level_Calc\
\n\t GROUP BY\
\n\t\tMonk_Level,\
\n\t\tTitle\
\n\t UNION ALL\
\n\t SELECT\
\n\t\tNULL,\
\n\t\t'(All)',\
\n\t\tMIN(XP_Ratio),\
\n\t\tMAX(XP_Ratio),\
\n\t\tAVG(XP_Ratio),\
\n\t\tMIN(XP_Ratio * Exclude),\
\n\t\tMAX(XP_Ratio * Exclude),\
\n\t\tAVG(XP_Ratio * Exclude)\
\n\t FROM\
\n\t\tData_Level_Calc\
\n\t),\
\n\tStep_XP_Writeups(Ranking, Name, XP, Writeups, XP_Diff, Writeups_Di
+ff)\
\nAS\
\n\t(\
\n\t SELECT\
\n\t\tRanking,\
\n\t\tName,\
\n\t\tXP,\
\n\t\tWriteups,\
\n\t\tXP - LEAD(XP) OVER(ORDER BY Ranking),\
\n\t\tWriteups - LEAD(Writeups) OVER(ORDER BY Ranking)\
\n\t FROM\
\n\t\tData\
\n),\
\n\tClose(Ranking, XP_Diff, Writeups_Diff)\
\nAS\
\n\t(\
\n\t SELECT\
\n\t\tRanking,\
\n\t\tXP - LEAD(XP) OVER(ORDER BY Ranking),\
\n\t\tWriteups - LEAD(Writeups) OVER(ORDER BY Ranking)\
\n\t FROM\
\n\t\tStep_XP_Writeups\
\n\t WHERE\
\n\t\tXP_Diff\tBETWEEN -30 AND 30\
\n\t  AND\tWriteups_Diff\tBETWEEN -5 AND 5\
\n),\
\n\tMMA(XP_Diff_Min, XP_Diff_Max, XP_Diff_Avg, Writeups_Diff_Min, Writ
+eups_Diff_Max, Writeups_Diff_Avg)\
\nAS\
\n\t(\
\n\t SELECT\
\n\t\tMIN(XP_Diff),\
\n\t\tMAX(XP_Diff),\
\n\t\tAVG(XP_Diff),\
\n\t\tMIN(Writeups_Diff),\
\n\t\tMAX(Writeups_Diff),\
\n\t\tAVG(Writeups_Diff)\
\n\t FROM\
\n\t\tStep_XP_Writeups\
\n\t),\
\n\tStats(Both, Close, XP, XP_Diff_Min, XP_Diff_Max, XP_Diff_Avg, Writ
+eups, Writeups_Diff_Min, Writeups_Diff_Max, Writeups_Diff_Avg)\
\nAS\
\n\t(\
\n\t SELECT\
\n\t\t(SELECT COUNT(COUNT(*)) FROM Data GROUP BY XP, Writeups    HAVIN
+G Count(*) > 1),\
\n\t\t(SELECT COUNT(*) FROM Close),\
\n\t\t(SELECT COUNT(COUNT(*)) FROM Data GROUP BY XP HAVING Count(*) > 
+1),\
\n\t\tXP_Diff_Min,\
\n\t\tXP_Diff_Max,\
\n\t\tXP_Diff_Avg,\
\n\t\t(SELECT COUNT(COUNT(*)) FROM Data GROUP BY Writeups HAVING Count
+(*) > 1) Writeups,\
\n\t\tWriteups_Diff_Min,\
\n\t\tWriteups_Diff_Max,\
\n\t\tWriteups_Diff_Avg\
\n\t FROM\
\n\t\tMMA\n\t)\
\n--SELECT Monk_Level, Title, ROUND(Min, 2) Min, ROUND(Max, 2) Max, RO
+UND(Avg, 2) Avg, ROUND(Min_X, 2) Min_X, ROUND(Max_X, 2) Max_X, ROUND(
+Avg_X, 2) Avg_X FROM XP_Ratio_By_Level ORDER BY Monk_Level\
\n--SELECT Both, Close, XP, XP_Diff_Min, XP_Diff_Max, XP_Diff_Avg, Wri
+teups, Writeups_Diff_Min, Writeups_Diff_Max, Writeups_Diff_Avg FROM S
+tats\
\nSELECT\
\n\tData.Ranking,\
\n\tData.Name,\
\n\tData.XP,\
\n\tData.Writeups\
\nFROM\
\n\tData,\n\tClose\
\nWHERE\
\n\tData.Ranking BETWEEN Close.Ranking AND Close.Ranking + 1\
\nORDER BY\
\n\tData.Ranking;";


var textarea = document.createElement('textarea');
textarea.cols = 100;
textarea.rows = 20;
textarea.value = query;

document.body.appendChild(textarea);
[download]

The current output of the (commented out) query is:

MONK_LEVEL TITLE             MIN        MAX        AVG      MIN_X     
+ MAX_X      AVG_X
---------- ---------- ---------- ---------- ---------- ---------- ----
+------ ----------
        13 Curate            .07       3654         53        .07     
+ 727.8      35.26
        14 Priest           4.66       4560      65.43       4.66     
+  4560      65.43
        15 Vicar            5.43       1604      47.12       5.43     
+  1604      47.12
        16 Parson           6.53       7619     129.17       6.53     
+ 371.8       31.9
        17 Prior            6.16     814.62      46.28       6.16     
+814.62      46.28
        18 Monsignor           5     243.02      33.02          5     
+243.02      33.02
        19 Abbot            4.03    1103.75      73.21       4.03    1
+103.75      73.21
        20 Canon            5.27     127.72      23.32       5.27     
+127.72      23.32
        21 Chancellor       6.45      55.35       14.3       6.45     
+ 55.35       14.3
        22 Bishop           9.51      78.21      21.93       9.51     
+ 78.21      21.93
        23 Archbishop       7.77      16.04       10.9       7.77     
+ 16.04       10.9
        24 Cardinal        10.26      20.19      15.22      10.26     
+ 20.19      15.22
        25 Sage             8.99      11.26      10.44       8.99     
+ 11.26      10.44
        28 Pope             6.48    1659.73     421.35       6.48     
+ 12.06       8.56
           (All)             .07       7619      61.85        .07     
+  4560      43.37
[download]

Notable, perhaps, is the average with exclusions is just over 43. Certainly more outliers can be excluded to "correct" that number. :)

The use of this data can be called into question, though it is data and data is fun, right?

FWIW, i occasionally update some useless data in Re: planetscape and me: 1282 nodes apiece. Though, i still haven't figured out why.

Comment on Querying Saints in our Book (2) Select or Download Code

Replies are listed 'Best First'.
Re: Querying Saints in our Book (2) by hippo (Bishop) on May 13, 2016 at 15:29 UTC
The curate with the somewhat dubious XP per post ratio of 0.07 has to be NodeReaper who must surely qualify as an outlier to be discounted. Nice work, though.	[reply]
Re^2: Querying Saints in our Book (2) by chacham (Prior) on May 13, 2016 at 16:31 UTC
Thanx. Right now, Nodereaper has 3867 XP and 57005 Writeups, so it's definitely the culprit. Adding the exclusion (to my local script) moves the minimum up to 3.92, an honor held by some Moron, and nudges up the average for Curate and All groups as well: MONK_LEVEL TITLE MIN MAX AVG MIN_X + MAX_X AVG_X ---------- ---------- ---------- ---------- ---------- ---------- ---- +------ ---------- 13 Curate .07 3654 53 3.92 + 727.8 35.43 14 Priest 4.66 4560 65.43 4.66 + 4560 65.43 15 Vicar 5.43 1604 47.12 5.43 + 1604 47.12 16 Parson 6.53 7619 129.17 6.53 + 371.8 31.9 17 Prior 6.16 814.62 46.28 6.16 +814.62 46.28 18 Monsignor 5 243.02 33.02 5 +243.02 33.02 19 Abbot 4.03 1103.75 73.21 4.03 1 +103.75 73.21 20 Canon 5.27 127.72 23.32 5.27 +127.72 23.32 21 Chancellor 6.45 55.35 14.3 6.45 + 55.35 14.3 22 Bishop 9.51 78.21 21.93 9.51 + 78.21 21.93 23 Archbishop 7.77 16.04 10.9 7.77 + 16.04 10.9 24 Cardinal 10.26 20.19 15.22 10.26 + 20.19 15.22 25 Sage 8.99 11.26 10.44 8.99 + 11.26 10.44 28 Pope 6.48 1659.73 421.35 6.48 + 12.06 8.56 (All) .07 7619 61.85 3.92 + 4560 43.44 [download] Good catch! I plan to add that after the weekend in a script update to oujs.	[reply] [d/l]
Re^2: Querying Saints in our Book (2) by chacham (Prior) on May 16, 2016 at 13:36 UTC
NodeReaper added to version 1.1.2. Had to change ranking to username, as ranking is not userid. But i knew that already... I'm hoping for more suggestions about whom to exclude or modify, as inactive users who get the overwhelming majority of their points via voting are not really representative of the ratio. OTOH, except for the egregious, the outliers probably do not mess up the ratio much anyway. I'm curious also if there is interest is displaying each monk's ratio and its relation to the level and all groups' ratios. The possibilities for useless information are endless.	[reply]
Re^3: Querying Saints in our Book (2) by 1nickt (Canon) on May 17, 2016 at 12:52 UTC
suggestions about whom to exclude or modify, as inactive users who get the overwhelming majority of their points via voting are not really representative of the ratio I added fields in "Ranking the Saints by XP Efficiency" to require having visited the site in the most recent N weeks, and having made a minimum number of posts. Did not add a filter to require having posted in the last N weeks, which would have improved the quality of the rankings even more, but I think it could be done. Also my script can be configured to deduct 0.5 XP per day in an attempt to get closer to the XP gained from posting rather than longevity. The way forward always starts with a minimal test.	[reply]
Re^4: Querying Saints in our Book (2) by chacham (Prior) on May 17, 2016 at 16:08 UTC
Re^5: Querying Saints in our Book (2) by 1nickt (Canon) on May 17, 2016 at 17:48 UTC
Some notes below your chosen depth have not been shown here
Re: Querying Saints in our Book (2) by LanX (Saint) on May 17, 2016 at 13:31 UTC
I'd exclude the chosen one who started with 1e6 XPs ahead of others ! ;-) Cheers Rolf _{(addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!}	[reply]
Re^2: Querying Saints in our Book (2) by chacham (Prior) on May 17, 2016 at 14:22 UTC
He was excluded from the beginning, as his numbers really mess up the average.	[reply]
Re^3: Querying Saints in our Book (2) by LanX (Saint) on May 17, 2016 at 14:52 UTC
ahhh ... `*_X` are the corrected numbers .... got it. You might be interested in Median and other techniques to eliminate outliers. You could even cluster different user types ( "voters" vs "posters" ) Personally I'd like to see a XP per response level stat ( `Re^n:` ), but accessing the DB is outside of your possibilities. Cheers Rolf _{(addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!}	[reply] [d/l] [select]
Re^4: Querying Saints in our Book (2) by chacham (Prior) on May 17, 2016 at 15:27 UTC
Re^5: Querying Saints in our Book (2) (updated) by LanX (Saint) on May 17, 2016 at 21:36 UTC
Some notes below your chosen depth have not been shown here


Perl-Sensitive Sunglasses
	PerlMonks