Google Earth Monks

by McDarren (Abbot)
on Jul 02, 2006 at 12:27 UTC ( [id://558846] : monkdiscuss )

Howdy :)

Over the past week or so, I've been playing around with Google Earth. Very very cool stuff.

Anyway, I thought it would be kinda neat to put Perlmonks on Google Earth, in a similar way that jcwren did with the Big Monk Map, and theorbtwo did with pmplanet. So I hacked together a script to pull the monk location data from tinymicros, and then pull in the other basic Monk data (such as XP level, date joined, number of writeups) from Perlmonks.

The end result is a KMZ (originally KML; see update 2 below) placemark file which can be directly imported into Google Earth. For those that have Google Earth installed, or are sufficiently motivated to install it - the placemark file is available here.

At the moment it's fairly rudimentary, with Monks ascii-betically grouped into folders. Clicking on any single monk name brings up some basic info about that monk, and double-clicking zooms to their location (all standard Google Earth behaviour). I did manage to find a rather cute little "monk" to use as an icon for the placemarks :)

I'm a little bit hesitant to post the code I used, mainly because when run it will hit Perlmonks about 800 times, to gather the monk stats. And I don't think we'd suddenly want to have dozens of people doing that ;)

But basically, the code does the following:

  • Uses LWP::UserAgent to successively pull the Monk location data (Lat/Long) from tinymicros. Because around 800 monks have supplied location data [1], and I can only get 50 records at a time, this takes about 16 hits.
  • From the returned pages, it pulls the node ID, username, Lat and Long for each Monk.
  • It stuffs this data into a hash, keyed by Monk homenode ID, and uses Storable to save it to disk.
  • It then iterates through the list of Monks and, using a combination of LWP::UserAgent and HTML::TokeParser, pulls in the date joined, XP level, and number of writeups for each monk from their homenode.
  • Finally, it writes the KML file.
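
As a rough illustration of the last two data steps, here is a minimal sketch of turning such a hash into KML. The hash layout (`name`, `lat`, `lon` keys), the sample monks and the `monks_to_kml` and `xml_escape` helpers are all assumptions made up for the example, not the actual script. The namespace shown is the current OGC one rather than whatever the 2006 file used.

```perl
use strict;
use warnings;

# Hypothetical data layout: homenode ID => { name, lat, lon }.
my %monks = (
    1001 => { name => 'ExampleMonk', lat => -33.8675, lon => 151.2070 },
    1002 => { name => 'A & B',       lat => 51.5,     lon => -0.33    },
);

# Escape the characters that are special in XML text.
sub xml_escape {
    my $text = shift;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build a KML document with one Placemark per monk, sorted by name.
sub monks_to_kml {
    my $monks = shift;
    my $kml = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
            . qq{<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n};
    for my $id (sort { $monks->{$a}{name} cmp $monks->{$b}{name} } keys %$monks) {
        my $m = $monks->{$id};
        # KML coordinates are written as longitude,latitude[,altitude].
        $kml .= sprintf "  <Placemark>\n    <name>%s</name>\n"
              . "    <Point><coordinates>%s,%s,0</coordinates></Point>\n"
              . "  </Placemark>\n",
              xml_escape($m->{name}), $m->{lon}, $m->{lat};
    }
    return $kml . "</Document>\n</kml>\n";
}

my $kml = monks_to_kml(\%monks);
print $kml;
```

The one detail that is easy to get wrong is the coordinate order: KML wants longitude first, then latitude.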

I used Storable to save the data to disk, so that I could re-use the data on subsequent runs (whilst tweaking the script), and not have to continually scrape the data over and over again. This means that the data is already out of date - but shrug, at this stage it's just an exercise for a bit of fun and learning :)
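
The caching idea can be sketched roughly like this. The `scrape_monks` stub, the `load_monks` helper and the cache file name are assumptions for illustration only; the real script's structure may well differ.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical stand-in for the slow scraping step.
my $scrape_count = 0;
sub scrape_monks {
    $scrape_count++;
    return { 1001 => { name => 'ExampleMonk', lat => 1.5, lon => 2.5 } };
}

# Return cached data if the cache file exists, otherwise scrape and save it.
sub load_monks {
    my $cache_file = shift;
    return retrieve($cache_file) if -e $cache_file;
    my $monks = scrape_monks();
    store($monks, $cache_file);
    return $monks;
}

# Assumed file name; a temp dir keeps the example self-contained.
my $cache_file = tempdir(CLEANUP => 1) . '/monks.sto';
my $first  = load_monks($cache_file);   # scrapes and writes the cache
my $second = load_monks($cache_file);   # reads the cache, no scraping
print "scraped $scrape_count time(s), cached name: $second->{1001}{name}\n";
```

Checking the file's age with `-M` (age in days) before calling `retrieve` would be one way to refresh stale data without scraping on every run.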

The other thing that occurred to me is that something like this would probably be better integrated into jcwren's existing pmstats at tinymicros - especially as he is already pulling the required data on a daily basis.

It's also possible that somebody has already done this, and so I may have just re-invented another wheel. Although if they have - I couldn't find it.

So for now I just thought I'd throw this out there, and see what the reaction is.

Darren :)

Update 1. (2006-07-04): - I've discussed this with jcwren this morning, and he has supplied me with an XML feed directly from tinymicros for all this data (on a single page). So once I have the time (over the next few days), I'll provide a link to a daily updated KML file. Stay tuned! :)

Update 2. (2006-07-06): - Made a few changes over the past few days as follows:

  • I'm using KMZ (compressed) format instead of KML for the monkfile. This has reduced the filesize from 701K to 47K.
  • There are now coloured icons to represent each different monk level, and monks are grouped by level and ordered alphabetically within the levels (in the "places" menu).
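
A KMZ file is just a zip archive containing the KML, conventionally under the name doc.kml, so the conversion can be sketched with the core IO::Compress::Zip module. The sample KML below is made up for the example, and this is not necessarily how the monkfile itself is built.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);

# Assumed sample content: repetitive KML compresses well, like the real file.
my $placemark = "<Placemark><name>ExampleMonk</name>"
              . "<Point><coordinates>0,0,0</coordinates></Point></Placemark>\n";
my $kml = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
        . qq{<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n}
        . $placemark x 800
        . "</Document></kml>\n";

# Zip the KML in memory, storing it under the conventional member name.
my $kmz;
zip \$kml => \$kmz, Name => 'doc.kml'
    or die "zip failed: $ZipError\n";

printf "KML: %d bytes, KMZ: %d bytes\n", length($kml), length($kmz);
```

Passing a file name instead of `\$kmz` writes the archive straight to disk.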

Update 3. (2006-07-10): - This now has a "permanent" home. Two files are generated each day, and there are also a few screenshots.

Update 4. (2006-07-11): - The code.

[1] For those that are not yet on the Big Monk Map, but would like to be - simply follow the instructions here.

Replies are listed 'Best First'.
Re: Google Earth Monks
by davis (Vicar) on Jul 02, 2006 at 13:36 UTC

    So for now I just thought I'd throw this out there, and see what the reaction is.
    Very, very cool, McDarren! Especially seeing as how Google Earth is now available for Linux. One minor nit (and I think this afflicts the Big Monk Map too, so it's possibly some shared code somewhere)... negative longitude values between 0 and -1 seem to get parsed as their absolute values.... e.g. my -0.33" position is interpreted as 0.33"
    Other than that, very very cool. ++

    Kids, you tried your hardest, and you failed miserably. The lesson is: Never try.
      "..negative longitude values between 0 and -1 seem to get parsed as their absolute values.."

      I guess I'll have to blame jcwren for that one :p
      If you have a look at the values on the MonkMap page (which is where I got the data from), you'll see that your longitude is given as 0.556111 (positive), even though it's definitely entered as a negative value on your homenode.

      /me looks at jcwren... ;)

      Update (2006-07-04): I managed to speak to jcwren this morning, and he has fixed this problem. Apparently it was something to do with the way that perl handles a "negative 0" value.
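
jcwren's code and fix aren't shown in this thread, so the following is only a guess at how this class of bug can arise: if a coordinate is split into whole degrees plus minutes and the sign is carried on the degrees part, it is lost whenever the degrees part is zero. All function names here are made up for the sketch.

```perl
use strict;
use warnings;

# Buggy variant: the sign lives only on the integer degrees.
sub to_dm_buggy {
    my $d   = shift;
    my $deg = int($d);                  # int(-0.33) is zero: the sign is gone
    my $min = abs($d - $deg) * 60;
    return ($deg, $min);
}

sub from_dm_buggy {
    my ($deg, $min) = @_;
    # Zero (even a "negative zero") is not < 0, so this takes the positive branch.
    return $deg < 0 ? $deg - $min / 60 : $deg + $min / 60;
}

# Safer variant: carry the sign as a separate value.
sub to_sdm {
    my $d    = shift;
    my $sign = $d < 0 ? -1 : 1;
    my $abs  = abs($d);
    my $deg  = int($abs);
    return ($sign, $deg, ($abs - $deg) * 60);
}

sub from_sdm {
    my ($sign, $deg, $min) = @_;
    return $sign * ($deg + $min / 60);
}

for my $lon (-0.33, -1.5) {
    printf "%5.2f  buggy: %5.2f  fixed: %5.2f\n",
        $lon,
        from_dm_buggy(to_dm_buggy($lon)),
        from_sdm(to_sdm($lon));
}
```

With the buggy pair, -0.33 comes back as 0.33 while -1.5 survives, which matches the symptom of only values between 0 and -1 being affected.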

Re: Google Earth Monks
by explorer (Chaplain) on Jul 02, 2006 at 16:38 UTC
Re: Google Earth Monks
by spiritway (Vicar) on Jul 03, 2006 at 04:12 UTC

    This is fun... but when I tried locating some monks, I discovered that they lived offshore, apparently in the ocean. Possibly something needs more work - but I couldn't identify whether it was with Google Earth, converting between formats, or what... Of course, it might be that these monks work on offshore oil rigs or boats or something.

      Yeah, could be many reasons for this. The most likely one is that the location data supplied by the individual monks isn't that accurate. For example, when I checked my own I was also out in the ocean. The good thing about Google Earth is that you can use it (in many areas) to zoom right into the building you work/live in and get an accurate fix (which I did with mine, and updated my home node).

      One other possibility is that some Monks (such as tye) like to play funny buggers and pretend that they live at the South Pole, or other similarly "exotic" locations ;)

      Of course, some others may not be comfortable with the idea of supplying "precise" location data - which is fair enough also.

      Darren :)

        I checked myself and I'm less than 500 meters away from where I really am. Pretty good actually :)

        Hi, McDarren. Yes, that's how I got my own coordinates - I used Google Earth. Homed right in on my own building (I could see myself waving as I looked on). Had to translate a bit, and I'm sure there was some rounding error, but I'm pretty close. I find this utterly fascinating...

        I don't blame tye for wanting to live at the South Pole. Rents are cheap, though it's really *really* expensive to run a T1 line out there.

      But that is where SpongeBob lives!

      DWIM is Perl's answer to Gödel
Re: Google Earth Monks
by petdance (Parson) on Jul 03, 2006 at 06:56 UTC
    Haven't seen your code, but if you're extracting links from the monk page, take a look at using WWW::Mechanize and having it help you out on that. Should make your code simpler.


      Thanks :)

      Actually, I've never used WWW::Mechanize, so it didn't occur to me to try that. The routine I use for scraping the data from the Monk homenodes is given below. I think the main performance hit is the fact that I need to issue a separate request for each Monk. Ideally, it would be good to be able to grab all this information in a single go. But I'm not aware of any way that this is currently possible.

      use strict;
      use warnings;
      use LWP::UserAgent;
      use HTTP::Request;
      use HTML::TokeParser;

      sub get_monk_stats {
          my $ref = shift;
          my $monk_url = '';   # base URL of a homenode; the node ID is appended

          # Labels of the homenode table cells whose values we want to keep
          my %monk_fields = (
              'User since:' => 1,
              'Last here:'  => 1,
              'Experience:' => 1,
              'Level:'      => 1,
              'Writeups:'   => 1,
          );

          # One user agent is enough for all requests
          my $ua = LWP::UserAgent->new();

          MONK:
          foreach my $id (keys %{$ref}) {
              print "Getting data for $ref->{$id}{name} ($id)\n";
              my $req    = HTTP::Request->new(GET => "$monk_url$id");
              my $result = $ua->request($req);
              next MONK if !$result->is_success;

              my $content = $result->content;
              my $p = HTML::TokeParser->new(\$content);

              # Walk the table cells; when a cell holds a wanted label,
              # the next cell holds its value
              while (my $tag = $p->get_tag("td")) {
                  my $text = $p->get_trimmed_text("/td");
                  if ($monk_fields{$text}) {
                      $p->get_tag("td");
                      $ref->{$id}{$text} = $p->get_trimmed_text("/td");
                  }
              }
          }
          return $ref;
      }

        Ideally, it would be good to be able to grab all this information in a single go. But I'm not aware of any way that this is currently possible.

        You can work in parallel using POE::Component::Client::HTTP. Check it out.

        David Serrano