PostgreSQL has stored procedures in perl (PL/Perl - Procedural Language/Perl).
Here is one example that retrieves the saints-page and shows the columns (sorry; I just parsed the html with a regex):
It needs plperlu (the 'u' stands for 'untrusted' as it lifts certain security limitations. It can be installed with: "CREATE LANGUAGE plperlu;").
I guess security is out the window when you retrieve stuff from remote sites straight into your database -- but this is just a demonstration. Be careful out there.
create function saints_in_our_book()
returns table (
rank integer
, userid integer
, username text
, xp integer
, userlevel text
, writeups integer
, usersince timestamp
, lasthere text
) language plperlu
as
$plp$
use strict;
use warnings;
use LWP::Simple;
use IO::String;
my $url = "http://perlmonks.org/?node_id=3559";
my $io = IO::String->new(get($url));
while ( <$io> ) {
chomp;
if(m,
^
<tr(?:[ ]class="highlight")?>
<td[ ]align="right">([0-9]+)</td> #
+ rank
<td>
<span[ ]class="user-([0-9]+)"> #
+ userid
<a[ ]href="[?]node_id=[0-9]+">
([^<]+) #
+ username
</a>
</span>
</td>
<td[ ]align="right">([0-9]+)</td> #
+ xp
<td>([a-zA-Z]+\s+\([0-9]+\))</td> #
+ userlevel
<td[ ]align="right">
<a[ ]href="[?]node_id=6364;usersearch=[^"]+">
([0-9]+) #
+ writeups
</a>
</td>
<td[ ]align="right">([^<]+)</td> #
+ usersince
<td[ ]align="right">([^<]+)</td> #
+ lasthere
</tr>
$
,x
)
{
return_next ( {
rank => $1
, userid => $2
, username => $3
, xp => $4
, userlevel => $5
, writeups => $6
, usersince => $7
, lasthere => $8
} );
}
}
return undef;
$plp$;
Now you can query that remote HTML page with SQL. It is slow, of course; I guess it only makes sense to do this if you save it into a table (see CREATE TABLE t AS ...) or a materialized view (see CREATE MATERIALIZED VIEW mv AS ...).
The SQL query below shows the top 50 rows:
$ echo "
select
rank
, userid
, username
, xp
, userlevel
, writeups
, usersince
, lasthere
from saints_in_our_book()
where rank <= 50
;
" | psql -qX
rank | userid | username | xp | userlevel | wr
+iteups | usersince | lasthere
------+--------+----------------------+---------+-----------------+---
+-------+---------------------+----------------
1 | 979 | vroom | 1007449 | Pope (28) |
+ 607 | 1999-11-12 05:53:00 | 37 weeks ago
2 | 171588 | BrowserUk | 156800 | Pope (28) |
+ 22020 | 2002-06-04 18:12:00 | 13 minutes ago
3 | 381608 | ikegami | 126952 | Pope (28) |
+ 19570 | 2004-08-10 14:55:00 | 15 hours ago
4 | 5348 | Corion | 117813 | Pope (28) |
+ 9369 | 2000-03-14 13:56:00 | 4 hours ago
5 | 9073 | merlyn | 71106 | Sage (25) |
+ 6322 | 2000-04-25 22:41:00 | 5 weeks ago
6 | 461912 | GrandFather | 70930 | Sage (25) |
+ 6450 | 2005-05-31 03:02:00 | 8 seconds ago
7 | 22609 | tye | 69775 | Cardinal (24) |
+ 7769 | 2000-07-14 22:45:00 | 13 hours ago
8 | 616540 | moritz | 65723 | Cardinal (24) |
+ 6415 | 2007-05-21 12:33:00 | 23 hours ago
9 | 17000 | Ovid | 60423 | Cardinal (24) |
+ 2990 | 2000-06-08 04:57:00 | 1 week ago
10 | 85580 | dragonchild | 58436 | Archbishop (23) |
+ 5988 | 2001-06-04 23:41:00 | 4 days ago
11 | 1382 | chromatic | 58205 | Archbishop (23) |
+ 5941 | 1999-12-24 11:10:00 | 5 weeks ago
12 | 26179 | tilly | 57174 | Archbishop (23) |
+ 5646 | 2000-08-04 16:59:00 | 1 year ago
13 | 131741 | zentara | 56223 | Archbishop (23) |
+ 7239 | 2001-12-14 00:14:00 | 40 weeks ago
14 | 281137 | davido | 55541 | Archbishop (23) |
+ 4698 | 2003-08-05 19:52:00 | 16 hours ago
15 | 82147 | Zaxo | 54786 | Archbishop (23) |
+ 3416 | 2001-05-22 05:34:00 | 6 years ago
16 | 352046 | ww | 45944 | Bishop (22) |
+ 3539 | 2004-05-10 13:43:00 | 13 hours ago
17 | 290905 | Old_Gray_Bear | 45468 | Bishop (22) |
+ 587 | 2003-09-12 02:59:00 | 11 hours ago
18 | 29008 | grinder | 43661 | Bishop (22) |
+ 1570 | 2000-08-22 17:56:00 | 6 years ago
19 | 198160 | CountZero | 42943 | Bishop (22) |
+ 4284 | 2002-09-16 05:38:00 | 3 hours ago
20 | 324763 | marto | 42094 | Bishop (22) |
+ 4470 | 2004-01-28 20:01:00 | 24 hours ago
21 | 622051 | toolic | 41484 | Bishop (22) |
+ 3483 | 2007-06-19 16:21:00 | 10 hours ago
22 | 169744 | Abigail-II | 40866 | Bishop (22) |
+ 4089 | 2002-05-28 11:53:00 | 6 years ago
23 | 18800 | jeffa | 40192 | Bishop (22) |
+ 2764 | 2000-06-19 19:14:00 | 16 hours ago
24 | 22308 | dws | 38846 | Chancellor (21) |
+ 2536 | 2000-07-13 05:09:00 | 35 weeks ago
25 | 180961 | Limbic~Region | 38012 | Chancellor (21) |
+ 2960 | 2002-07-11 06:53:00 | 1 week ago
26 | 104919 | perrin | 36242 | Chancellor (21) |
+ 4171 | 2001-08-15 04:13:00 | 2 years ago
27 | 114691 | Aristotle | 35627 | Chancellor (21) |
+ 5335 | 2001-09-26 01:49:00 | 20 weeks ago
28 | 2680 | davorg | 34015 | Chancellor (21) |
+ 2988 | 2000-02-02 01:01:00 | 4 weeks ago
29 | 58196 | Fletch | 33799 | Chancellor (21) |
+ 3608 | 2001-02-13 23:08:00 | 2 days ago
30 | 80749 | tachyon | 32350 | Chancellor (21) |
+ 3268 | 2001-05-16 03:26:00 | 10 years ago
31 | 186362 | adrianh | 32159 | Chancellor (21) |
+ 2205 | 2002-07-30 20:57:00 | 2 years ago
32 | 44715 | graff | 31661 | Chancellor (21) |
+ 3880 | 2000-12-04 02:04:00 | 2 weeks ago
33 | 108447 | demerphq | 31508 | Chancellor (21) |
+ 3269 | 2001-08-28 18:16:00 | 1 year ago
34 | 248054 | Your Mother | 30978 | Chancellor (21) |
+ 2536 | 2003-04-04 13:42:00 | 14 hours ago
35 | 534893 | Gavin | 30799 | Chancellor (21) |
+ 598 | 2006-03-07 12:15:00 | 2 hours ago
36 | 533863 | roboticus | 30531 | Chancellor (21) |
+ 2888 | 2006-03-02 12:31:00 | 7 hours ago
37 | 176576 | eyepopslikeamosquito | 30092 | Chancellor (21) |
+ 1295 | 2002-06-23 06:02:00 | 13 minutes ago
38 | 442602 | planetscape | 30055 | Chancellor (21) |
+ 1290 | 2005-03-27 09:28:00 | 2 days ago
39 | 524150 | syphilis | 29621 | Canon (20) |
+ 2714 | 2006-01-19 07:30:00 | 4 hours ago
40 | 157432 | Joost | 29453 | Canon (20) |
+ 3245 | 2002-04-08 13:32:00 | 3 years ago
41 | 708738 | LanX | 29231 | Canon (20) |
+ 4599 | 2008-09-03 12:42:00 | 3 hours ago
42 | 1936 | japhy | 29144 | Canon (20) |
+ 2347 | 2000-01-10 19:44:00 | 42 weeks ago
43 | 224409 | borisz | 27898 | Canon (20) |
+ 913 | 2003-01-05 14:23:00 | 3 weeks ago
44 | 190859 | bart | 27165 | Canon (20) |
+ 2611 | 2002-08-17 14:36:00 | 51 weeks ago
45 | 528646 | Khen1950fx | 26518 | Canon (20) |
+ 1899 | 2006-02-07 22:12:00 | 12 hours ago
46 | 510280 | shmem | 26470 | Canon (20) |
+ 2519 | 2005-11-20 20:58:00 | 19 hours ago
47 | 421114 | Tanktalus | 26170 | Canon (20) |
+ 2178 | 2005-01-10 22:07:00 | 1 day ago
48 | 123953 | strat | 26010 | Canon (20) |
+ 538 | 2001-11-08 03:38:00 | 2 years ago
49 | 832495 | choroba | 25807 | Canon (20) |
+ 3418 | 2010-04-02 15:13:00 | 16 minutes ago
50 | 194920 | diotalevi | 25716 | Canon (20) |
+ 3030 | 2002-09-03 22:06:00 | 3 years ago
(50 rows)
UPDATE:Some lines are missing as I apparently made some error in the regex; I haven't the time to fix it now (maybe later).
UPDATE 2: fixed. (error was: while (<$io>) { and then $io->getline *facepalm*)
UPDATE 3: I added 'User Since' and 'Last Here' columns, and a computed column 'away' (=lasthere cast to postgres data type interval).
-- clear earlier stuff
drop table if exists saints_in_our_book_20150801 cascade;
-- create the current html page download into a table:
create table saints_in_our_book_20150801 as select * from sa
+ints_in_our_book();
-- create a view out of that table (and add the 'away' colum):
create or replace view saints_in_our_book as select *, replac
+e(lasthere, ' ago', '')::interval as away from saints_in_our_book_201
+50801;
-- query the view:
$ echo "
select *
from saints_in_our_book -- the view on top of the table
where away < '4 hours'
order by
now() - away desc
, xp desc
;
" | psql -qX
rank | userid | username | xp | userlevel | wr
+iteups | usersince | lasthere | away
------+---------+----------------------+--------+-----------------+---
+-------+---------------------+----------------+----------
284 | 399498 | erix | 6400 | Vicar (15) |
+ 362 | 2004-10-15 13:22:00 | 4 seconds ago | 00:00:04
6 | 461912 | GrandFather | 70930 | Sage (25) |
+ 6450 | 2005-05-31 03:02:00 | 5 seconds ago | 00:00:05
108 | 70929 | atcroft | 15133 | Monsignor (18) |
+ 633 | 2001-04-09 08:59:00 | 12 seconds ago | 00:00:12
80 | 401112 | johngg | 18402 | Abbot (19) |
+ 1682 | 2004-10-21 10:23:00 | 17 seconds ago | 00:00:17
72 | 634253 | AnomalousMonk | 20401 | Abbot (19) |
+ 2819 | 2007-08-22 06:27:00 | 52 seconds ago | 00:00:52
634 | 310300 | wjw | 3212 | Curate (13) |
+ 371 | 2003-11-26 16:34:00 | 1 minute ago | 00:01:00
2 | 171588 | BrowserUk | 156800 | Pope (28) |
+ 22020 | 2002-06-04 18:12:00 | 5 minutes ago | 00:05:00
49 | 832495 | choroba | 25807 | Canon (20) |
+ 3418 | 2010-04-02 15:13:00 | 10 minutes ago | 00:10:00
37 | 176576 | eyepopslikeamosquito | 30092 | Chancellor (21) |
+ 1295 | 2002-06-23 06:02:00 | 13 minutes ago | 00:13:00
287 | 1001958 | karlgoethebier | 6345 | Vicar (15) |
+ 925 | 2012-11-02 12:43:00 | 20 minutes ago | 00:20:00
561 | 905403 | tangent | 3558 | Curate (13) |
+ 363 | 2011-05-18 02:41:00 | 21 minutes ago | 00:21:00
254 | 491819 | ioannis | 6917 | Vicar (15) |
+ 131 | 2005-09-14 11:10:00 | 27 minutes ago | 00:27:00
302 | 885521 | hippo | 6076 | Vicar (15) |
+ 505 | 2011-02-01 15:37:00 | 27 minutes ago | 00:27:00
350 | 221725 | poj | 5228 | Priest (14) |
+ 623 | 2002-12-22 10:42:00 | 27 minutes ago | 00:27:00
325 | 749850 | VinsWorldcom | 5667 | Vicar (15) |
+ 326 | 2009-03-11 12:42:00 | 38 minutes ago | 00:38:00
267 | 436161 | davies | 6628 | Vicar (15) |
+ 446 | 2005-03-03 11:39:00 | 42 minutes ago | 00:42:00
132 | 992570 | Laurent_R | 12485 | Monsignor (18) |
+ 1850 | 2012-09-09 08:56:00 | 48 minutes ago | 00:48:00
110 | 200365 | Tux | 14738 | Monsignor (18) |
+ 1440 | 2002-09-24 13:49:00 | 1 hour ago | 01:00:00
295 | 919638 | RichardK | 6195 | Vicar (15) |
+ 396 | 2011-08-10 11:34:00 | 1 hour ago | 01:00:00
35 | 534893 | Gavin | 30799 | Chancellor (21) |
+ 598 | 2006-03-07 12:15:00 | 2 hours ago | 02:00:00
71 | 968231 | Athanasius | 20547 | Abbot (19) |
+ 1233 | 2012-05-01 11:13:00 | 2 hours ago | 02:00:00
147 | 747201 | afoken | 11033 | Prior (17) |
+ 1139 | 2009-02-28 20:01:00 | 2 hours ago | 02:00:00
184 | 341121 | dave_the_m | 8997 | Parson (16) |
+ 849 | 2004-03-30 22:38:00 | 2 hours ago | 02:00:00
331 | 1015412 | soonix | 5615 | Vicar (15) |
+ 290 | 2013-01-25 21:42:00 | 2 hours ago | 02:00:00
367 | 222702 | parv | 5011 | Priest (14) |
+ 690 | 2002-12-28 08:05:00 | 2 hours ago | 02:00:00
19 | 198160 | CountZero | 42943 | Bishop (22) |
+ 4284 | 2002-09-16 05:38:00 | 3 hours ago | 03:00:00
39 | 524150 | syphilis | 29621 | Canon (20) |
+ 2714 | 2006-01-19 07:30:00 | 3 hours ago | 03:00:00
41 | 708738 | LanX | 29231 | Canon (20) |
+ 4599 | 2008-09-03 12:42:00 | 3 hours ago | 03:00:00
74 | 295576 | ambrus | 19603 | Abbot (19) |
+ 2740 | 2003-10-01 11:47:00 | 3 hours ago | 03:00:00
88 | 446266 | salva | 17322 | Abbot (19) |
+ 2243 | 2005-04-09 17:47:00 | 3 hours ago | 03:00:00
576 | 1039428 | Eily | 3498 | Curate (13) |
+ 278 | 2013-06-17 17:59:00 | 3 hours ago | 03:00:00
(31 rows)
Now I am at the top of 'Saints in our book' ;)