
Get chatbox lines

by ZZamboni (Curate)
on May 26, 2000 at 20:29 UTC ( [id://15025]=sourcecode )
Category: Utilities
Author/Contact Info ZZamboni
Description: It gets lines from the Perlmonks chatbox. It can return all the lines that are currently there, or only the new lines since the last time the getnewlines subroutine is called. This piece of code prints the chat to the terminal:
#!/usr/local/perl/bin/perl
use PerlMonksChat;
$p=PerlMonksChat->new();
while (1) {
  print map { "$_\n" } $p->getnewlines();
  sleep 10;
}
It is very rough, but it works :-)
Update: The code posted here was only the first version, and is now grossly outdated. Please see the web page where I keep the latest version of the script. It has grown a lot with the contributions and encouragement of fellow monks.
# Note - don't use this code. See link above.
package PerlMonksChat;

use LWP::UserAgent;
use HTTP::Request;
use HTML::Entities;

sub new {
  my $class=shift;
  my $url=shift||'';
  my $self={};
  $self->{ua}=LWP::UserAgent->new;
  $self->{req}=HTTP::Request->new('GET', $url);
  $self->{cache}=[];   # most recent line first
  bless $self, $class;
  return $self;
}

sub getalllines {
  my $self=shift;
#  print "(* grabbing *)\n";
  my $response=$self->{ua}->request($self->{req});

  if ($response->is_success) {
    my $c=$response->content;
    #  print $c;
    if ($c =~ /<td.*?Chatterbox.*?<input[^>]*?>(.*?)<input/msi) {
      my $chatline=$1;
      # Split in lines and remove html tags
      my @chatlines=grep { $_ }
        map { s/<[^>]+?>//g; decode_entities($_); $_ }
          split(/\s*<br>\s*/, $chatline);
      return @chatlines;
    }
    return ();   # page fetched, but no chatterbox found in it
  }
  else {
    return ("error");
  }
}

sub getnewlines {
  my $self=shift;
  # Make sure the cache lives in the object so it survives between calls
  my $cache=$self->{cache}||=[];
  my @allines=$self->getalllines();
  my @newcache;
  # Don't use a regular cache, instead go back through them until we
  # find the first one that is in the cache.
  foreach (reverse @allines) {
    last if ($cache->[0] && $_ eq $cache->[0]);
    push @newcache, $_;
  }
  # Add the new lines to the cache
  unshift @$cache, @newcache;
  # Trim the cache to the last 50 lines
  splice(@$cache, 50) if @$cache > 50;
  return reverse @newcache;
}

1;

Replies are listed 'Best First'.
RE: Get chatbox lines
by swiftone (Curate) on May 26, 2000 at 20:35 UTC
    You shame me. I'll have to pore over it before I make a more detailed comment, but here's my version of a similar thing. I didn't make it into a package, and it still has a few problems.

    I just have it run in a window in the background so that I can lookup anything I miss when I don't reload fast enough. Comments appreciated!

    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;

    my($newmessages, $oldmessages);
    while(1){
      get("")=~/<!-- nodelets start here -->(.*)/s;
      $_=$1;
      my(@nodelet)=split(/<!--Nodelet Break -->/);
      foreach (@nodelet){
        if (/Chatterbox/){
          s/\n//g;
          s/\r//g;
          while(m%(<b>&lt;</b>|<i>)<a href=[^>]*>([^<]*)</a>(<b>&gt;</b>)?(.*?)(</i>)?<br>%ig){
            if(!defined($oldmessages->{"$2:$4"})){
              $newmessages->{"$2:$4"}=1;
              print "$2: $4\n";
            }
          }
          last;
        }
      }
      undef $oldmessages;
      $oldmessages=$newmessages;
      undef $newmessages;
      sleep(15);
    }
      I've noticed it gets funkier and funkier as it runs. I suspect there is something wrong with the section where I check to make sure it isn't a repeat message.

      Concept: I snag the poster and comment from the line, then check to see if that key exists in the hash referenced by $oldmessages (basically using the hash as an easy lookup). If it isn't there, I drop it into $newmessages and print it out. Once I've gone through the page, I undef $oldmessages, and recreate it to point at $newmessages. Are there problems with this?

      And yes, I realize that any nodelet that mentions "Chatterbox" will screw this program. Not sure how to fix that without breaking theme independence.

      I like the way you get the chatterbox, by breaking into the nodelet units. I may do that instead of the match on the whole page I do right now.

      I think the problem with your cache is that you are resetting old messages every time. Here's what happens. Assume the first time through the chatbox has the following lines:

      user1: blah
      user2: bleh
      user3: blih
      So you add those to $newmessages, which then becomes $oldmessages. The next time, the box contains:
      user1: blah
      user2: bleh
      user3: blih
      user1: howdy
      As you go through the lines, the only one that gets added to $newmessages is the last one, because the others are already in $oldmessages.

      Then, and here is the problem, you remove $oldmessages! So the only message you have a record of is the last one. So the next time through, the first three messages are printed again, because they are no longer in the cache.
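
A minimal sketch of that failure, using plain strings in place of the fetched page so it runs without network access. The helper name poll_original and the sample lines are made up for illustration; the loop body mirrors the if(!defined(...)) check and the hash swap in the script above, and the third pass fetches an unchanged page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for one pass of the polling loop. Takes the hash left
# over from the previous pass and the lines currently on the page; returns the
# hash that will replace $oldmessages plus the lines that would be printed.
sub poll_original {
    my ($oldmessages, @page) = @_;
    my (%newmessages, @printed);
    for my $line (@page) {
        if (!defined $oldmessages->{$line}) {
            $newmessages{$line} = 1;    # recorded only when unseen
            push @printed, $line;
        }
    }
    return (\%newmessages, \@printed);
}

my @first  = ('user1: blah', 'user2: bleh', 'user3: blih');
my @second = (@first, 'user1: howdy');

my $old = {};
my @passes;
for my $page (\@first, \@second, \@second) {
    my ($next, $printed) = poll_original($old, @$page);
    $old = $next;                       # old hash is thrown away here
    push @passes, $printed;
    print "printed: @$printed\n";
}
```

On the third pass the three original lines come out again, because after the second pass the hash only remembers "user1: howdy".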

      I don't think you need the juggling of $old and $newmessages. You can just keep one hash where you cache all the messages. The problem with this (and the reason why I didn't do it that way in my code) is that you have no way of knowing which messages are older or newer, so unless you attach a timestamp to each entry, your cache will grow indefinitely. Furthermore, if the same user says the same thing on two different occasions, the second time through it will not be printed because your program will think it's a repeated message.
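
For comparison, here is a rough standalone sketch of the ordered cache that getnewlines relies on, again on plain strings. The function name new_lines, its argument order, and the sample lines are assumptions for illustration only. The cache keeps the most recent line first, so the scan can stop at the first line it recognizes, and the size cap keeps it from growing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical standalone version of the cache logic. $cache is an array ref
# holding the most recent line first; $limit caps its size; @all is the
# current content of the chatbox, oldest line first.
sub new_lines {
    my ($cache, $limit, @all) = @_;
    my @new;
    for my $line (reverse @all) {              # walk from newest to oldest
        last if @$cache && $line eq $cache->[0];
        push @new, $line;
    }
    unshift @$cache, @new;                     # cache stays newest first
    $#$cache = $limit - 1 if @$cache > $limit; # shrink only, never extend
    return reverse @new;                       # oldest first, for printing
}

my @cache;
my @a = new_lines(\@cache, 50, 'user1: blah', 'user2: bleh');
my @b = new_lines(\@cache, 50, 'user1: blah', 'user2: bleh', 'user1: howdy');
my @c = new_lines(\@cache, 50, 'user1: blah', 'user2: bleh', 'user1: howdy',
                  'user2: bleh');
print "$_\n" for @a, @b, @c;
```

The third call reports the second "user2: bleh" as new even though the same text was seen earlier. The trade-off is that a line identical to the newest cached one, posted immediately after it, would be missed.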

      Hope this helps,


        As you go through the lines, the only one that gets added to $newmessages is the last one, because the others are already in $oldmessages.

        Then, and here is the problem, you remove $oldmessages! So the only message you have a record of is the last one. So the next time through, the first three messages are printed again, because they are no longer in the cache.

        Good catch on the lost messages. Rather than your suggestion, however, I simply moved the line that places the message into the hash outside the if(!defined()) block.

        This means that only the messages that showed up are cached...the page should never return old messages, and my cache remains a usable size (actually, much smaller than yours! :) )

        As for the repeat messages thing, that's a feature. Quite often repeated messages ARE repeats. With a limited cache, non-accidental repeat collisions should be rare.

        Thanks for the help, it's much cleaner now with that one little change. Still not nicely packaged for an interface layer like yours is, but I admit to a small attachment to code I have written.
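
A small sketch of that one-line change, assuming the same shape as the loop in the script above but fed plain strings rather than a fetched page. The helper name poll_fixed and the sample lines are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for one pass of the polling loop after the change:
# every line on the page goes into the new hash, whether or not it was printed.
sub poll_fixed {
    my ($oldmessages, @page) = @_;
    my (%newmessages, @printed);
    for my $line (@page) {
        push @printed, $line unless defined $oldmessages->{$line};
        $newmessages{$line} = 1;        # moved outside the if
    }
    return (\%newmessages, \@printed);
}

my @first  = ('user1: blah', 'user2: bleh', 'user3: blih');
my @second = (@first, 'user1: howdy');

my $old = {};
my @passes;
for my $page (\@first, \@second, \@second) {
    my ($next, $printed) = poll_fixed($old, @$page);
    $old = $next;
    push @passes, $printed;
    print scalar(@$printed), " new line(s): @$printed\n";
}
```

With this version the hash always holds exactly the lines of the last page fetched, so an unchanged page prints nothing and the cache never grows past one page.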

RE: Get chatbox lines
by neshura (Chaplain) on May 26, 2000 at 21:10 UTC
    I love this utility - after checking out both scripts, I believe I prefer ZZamboni's implementation because it was trivial to add a line to handle a firewall.

    Very nice job!

    Postscript: it would be nice to throw in a little Tk so I don't have to devote an entire terminal...but maybe that would have dangerous implications...the chatterbox is already just AOL Instant Messenger with a funny penguin hat.

    e-mail neshura

      Would you care to share that line?

      And yes, I was thinking Tk too. My idea is to have a module that you can use to get and post things to the chat. Then, using that module, you can implement any kind of user interface you want :-)



        I didn't put this in the constructor but that is where it should have gone. (quoth the non-expert)
        $ua->proxy(['http'], '');

        e-mail neshura

RE: Get chatbox lines
by mdillon (Priest) on May 26, 2000 at 20:47 UTC
    there's no need to use splice in a void context to trim the cache in your getnewlines sub. just use the following code:
    my $CACHE_LIMIT = 50;
    $#$cache = $CACHE_LIMIT - (1 - $[);
      The problem with this (now that I have tested it well) is that it enlarges the array when it is originally smaller than the limit. splice doesn't do that. Any ideas? Thanks,


        not besides just appending 'if @$cache > $CACHE_LIMIT';
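
Put together, the guarded version might look like the sketch below. It assumes the default array base of 0, so the $[ term drops out and the last index is simply $CACHE_LIMIT - 1. The helper name trim_cache is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $CACHE_LIMIT = 50;

# Hypothetical helper: truncate the array in place when it is too long,
# and leave shorter arrays untouched instead of padding them with undef.
sub trim_cache {
    my ($cache) = @_;
    $#$cache = $CACHE_LIMIT - 1 if @$cache > $CACHE_LIMIT;
    return $cache;
}

my $short = trim_cache([1 .. 10]);   # below the limit
my $long  = trim_cache([1 .. 80]);   # above the limit
print scalar(@$short), " ", scalar(@$long), "\n";
```

Assigning to $#$cache without the guard would set the last index unconditionally, which is what stretches a short array up to the limit.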
Get chatbox lines - Gtk Version
by mdillon (Priest) on Jun 01, 2000 at 10:46 UTC
    This client has not yet been updated to use the XML feed, so until further notice, it should not be used. Check out the original instead.

    so, after a bit of work, my first Gtk-Perl application is ready. as i've said before, i had taken ZZamboni's code (along with a few bits from swiftone's posted version) and made some changes to create my own version. i now have my version working pretty well and i have created a pretty complete Gtk+ frontend and a minimal terminal frontend for monitoring (no posting).

    it is available here.

RE: Get chatbox lines Win32::GUI client
by Shendal (Hermit) on Jun 21, 2000 at 18:57 UTC
    Please check out my Win32::GUI Chatterbox Client. It uses and Win32::GUI to create an NT gui chatterbox client.

    Fully functional, using a separate server process (automatically spawned) to reduce lag. Supports XP, with progress bar. Userlist. Graphical login.

RE: Get chatbox lines
by mdillon (Priest) on May 27, 2000 at 10:05 UTC
    so, i've taken this code and made a preliminary interface using Gtk-Perl. it has a number of fatal flaws at the moment but i think i should have it working well quite soon.

    once i have it working fairly well, i'll post a URL here so people can download it and check it out.

RE: Get chatbox lines
by ZZamboni (Curate) on May 31, 2000 at 00:48 UTC
    Hey everyone,

    This module has kept growing. Thanks a lot to all the people who have contributed ideas and (most of all) code. Instead of posting the whole thing here, I'll just give you a link where you can find the latest version. It can now post messages, check off personal messages (/msg ones), etc. Enjoy,


RE: Get chatbox lines
by swiftone (Curate) on May 26, 2000 at 22:02 UTC
    In order to properly tweak this for the interface like you were saying, you should add the ability to identify sender, msg, and type of comment. Are you looking into that, or should I take a stab at it (Don't want us duplicating effort again :) )
      I'm looking at it, but I may not have much more time to devote to it today, and I'm leaving town for the extended weekend, so feel free to take a stab at it :-)

