Domino Twisty Expansion using Lightweight Proxy

inman has asked for the wisdom of the Perl Monks concerning the following question:

Anyone who has used Domino (a.k.a. Lotus Notes) to produce and serve documents using the HTTP service will know all about the problem that I am about to describe. Have pity on me, fellow monks, and help me out if you can!

There is a feature in Lotus Notes called a 'twisty', which is basically a control that allows a section of a document to be hidden or shown. The easiest way to think about this is as a <readmore> tag. A twisty can contain further twisty sections, and so on. This works fine in a Lotus Notes client since the whole document is loaded and there are functions that allow you to expand all of the twisty sections.

In a web client (served by Domino R5), however, the implementation of a twisty is less straightforward. The part of the document that is outside of the twisty is displayed, with each twisty being a link back to the original document. Each link has a set of arguments that describe which sections of the document should be shown. The argument structure looks something like:

ExpandSection=2,1,1_1

This shows that sections 1, 2 and the first section inside of section 1 are expanded. Showing all of the sections is simply a question of constructing the correct argument. The problem is that the sections do not always start at 1 (there are often sections that are hidden from public view) and that there are an unknown number of sections, sub-sections, sub-sub-sections etc.
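As a rough illustration of "constructing the correct argument", here is a minimal sketch that splits an ExpandSection value into section IDs and merges several ID lists back into one argument. The names parse_sections, by_section and build_argument are made up for this example, and the parents-first ordering is only an assumption; the example value above (2,1,1_1) suggests that Domino may not care about order.

```perl
use strict;
use warnings;

# Split an ExpandSection value such as "2,1,1_1" into section IDs.
sub parse_sections {
    my ($value) = @_;
    return grep { length } split /,/, $value;
}

# Sort comparator: compare the numeric components of "1_1" style IDs
# one level at a time, so that a parent sorts before its children.
sub by_section {
    my @x = split /_/, $a;
    my @y = split /_/, $b;
    while (@x && @y) {
        my $cmp = shift(@x) <=> shift(@y);
        return $cmp if $cmp;
    }
    return @x <=> @y;    # the shorter (shallower) ID comes first
}

# Merge any number of IDs into one de-duplicated argument string.
# The ordering is an assumption, not documented Domino behaviour.
sub build_argument {
    my %seen;
    my @ids = grep { !$seen{$_}++ } @_;
    return 'ExpandSection=' . join(',', sort by_section @ids);
}

print build_argument(parse_sections('2,1,1_1'), '3', '1_1'), "\n";
```

Because the section numbers do not always start at 1, the IDs have to come from the links Domino actually serves rather than from counting upwards.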

My proposed solution
I want to create a CGI script that acts as a lightweight proxy for Domino pages. The idea is that I can append the actual location of the content (the full Domino URL) and do the twisty expansion by analysing the links in the document and requesting further documents until all of the twisties are expanded. The user (in this case the 'user' is actually a search engine spider) will just see the whole document without having to expand any twisties.

My first step has been to use LWP and HTML::LinkExtor to get the document from Domino and get the links. The further steps that I need to take are to process the links in order to compile the arguments for the new URL. This step will be repeated until the document is expanded. I will add additional functionality to handle authorization (basic and cookie based) when I get over the link parsing. The code so far is lifted from the HTML::LinkExtor POD and looks like this:

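use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTML::LinkExtor;
use URI;

# Fetch $url and return the absolute URL of every <a href> in the document.
sub getLinks
{
    my $url = shift;
    my $ua = LWP::UserAgent->new;
    my @links;

    # Collect the href of each anchor tag; all other tags are ignored
    my $callback = sub {
        my ($tag, %attr) = @_;
        return if $tag ne 'a' or !defined $attr{href};
        push @links, $attr{href};
    };
    my $p = HTML::LinkExtor->new($callback);

    # Request document and parse it as it arrives
    my $res = $ua->request(HTTP::Request->new(GET => $url),
                  sub { $p->parse($_[0]) });
    $p->eof;
    warn "GET $url: ", $res->status_line, "\n" unless $res->is_success;

    # Expand all link URLs to absolute ones
    my $base = $res->base;
    return map { URI->new_abs($_, $base)->as_string } @links;
}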
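
The repeat-until-expanded step described above might look roughly like the following. This is a minimal sketch, not tested against a real Domino server. It assumes that every twisty link in the returned page carries an ExpandSection argument, that the union of all section IDs seen in those links is a valid argument for the next request, and that the base URL already has a query part such as ?OpenDocument so the argument can be appended with &. The names expand_all, $fetch and $fake_fetch are hypothetical; in practice $fetch would wrap LWP::UserAgent, and a regex stands in for proper link extraction only to keep the sketch short.

```perl
use strict;
use warnings;

# Fetch a document, collect every section ID that appears in an
# ExpandSection link, and re-request with the union of all IDs seen so far.
# Stops when a fetch reveals no new IDs. $fetch is a callback that takes a
# URL and returns the page's HTML.
sub expand_all {
    my ($base_url, $fetch, $max_rounds) = @_;
    $max_rounds ||= 20;    # guard against a server that never settles
    my %seen;
    my $html = $fetch->($base_url);

    for (1 .. $max_rounds) {
        my $found_new = 0;
        while ($html =~ /ExpandSection=([0-9_,]+)/g) {
            for my $id (grep { length } split /,/, $1) {
                $found_new = 1 unless $seen{$id}++;
            }
        }
        last unless $found_new;
        my $arg = join ',', sort keys %seen;
        $html = $fetch->("$base_url&ExpandSection=$arg");
    }
    return $html;
}

# Canned stand-in for a server: sections 1 and 2 at the top level,
# with section 1_1 only visible once section 1 is expanded.
my %children = ('' => ['1', '2'], '1' => ['1_1']);
my @requested;
my $fake_fetch = sub {
    my ($url) = @_;
    push @requested, $url;
    my %open = map { ($_ => 1) }
        ($url =~ /ExpandSection=([0-9_,]+)/ ? split(/,/, $1) : ());
    my @top    = @{ $children{''} };
    my @nested = map { @{ $children{$_} || [] } } grep { $open{$_} } @top;
    return join "\n",
        map { sprintf '<a href="doc?OpenDocument&ExpandSection=%s">section %s</a>', $_, $_ }
        (@top, @nested);
};

my $html = expand_all('http://example.invalid/db.nsf/doc?OpenDocument', $fake_fetch);
print scalar(@requested), " requests made\n";
```

Passing the fetcher in as a callback keeps the loop independent of how authorization is eventually handled.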

Naturally, I have the usual feelings that a) this has been done before and b) there is a good chance that there is code out there already that solves the problem. Any help and advice is appreciated.

The real answer to the problem is to install and use Domino R6 which (as far as my research tells me) represents twisty sections as <DIV> sections with JavaScript doing the expanding / collapsing on the client browser. This solution is not available to me at this time since the number of Domino R5 servers is large and an upgrade is not planned for some time.

Thanks in advance.


Replies are listed 'Best First'.
Re: Domino Twisty Expansion using Lightweight Proxy by diotalevi (Canon) on Feb 13, 2004 at 16:02 UTC
I've never seen anyone write this before, but that may not indicate much. Mostly I don't think this is the sort of thing a Domino developer would care about, so there isn't likely to be much code around for extracting the contents of documents via the HTTP service.

Actually, now that I think about this further, there's a better way. Use the normal Domino APIs to extract the data without even bothering with the web service. You're going a very twisty way when you could just get a RichText object, translate it to text (with optional formatting) and index that. Another idea is to just use the normal FullText index. It sounds like you are trying to avoid using Domino's pre-existing indexing service. Why? You'd have to be crazy to give it up; it already works really darn well.

As for your current problem, I'd transform the HTML to XML with XML::LibXML and just use some XPath to fetch the expansion URLs. The in-place editing might be somewhat ugly, so now I wonder whether providing a more expansive URL command like ExpandAll might be a better idea anyway.
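
A minimal sketch of the XPath idea, assuming XML::LibXML is installed and that the twisty links are ordinary anchors whose href contains ExpandSection=. The function name expansion_hrefs and the sample markup are invented for illustration and are not taken from real Domino output.

```perl
use strict;
use warnings;
use XML::LibXML;

# Return the href of every anchor whose URL carries an ExpandSection argument.
sub expansion_hrefs {
    my ($html) = @_;
    # recover => 2 asks libxml2 to repair malformed HTML without warnings.
    my $doc = XML::LibXML->load_html(string => $html, recover => 2);
    return map { $_->value }
           $doc->findnodes('//a[contains(@href, "ExpandSection=")]/@href');
}

# Hypothetical sample page: two twisty links and one unrelated link.
my $sample = <<'HTML';
<html><body>
<a href="doc?OpenDocument&amp;ExpandSection=1">First</a>
<a href="doc?OpenDocument&amp;ExpandSection=2">Second</a>
<a href="other?OpenDocument">Unrelated</a>
</body></html>
HTML

my @hrefs = expansion_hrefs($sample);
print "$_\n" for @hrefs;
```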
Re: Re: Domino Twisty Expansion using Lightweight Proxy by inman (Curate) on Feb 13, 2004 at 16:58 UTC
The search engine that I use to index the content has a connector for Lotus Notes that sucks the content out using the Notes API. This works very well for Domino servers that are hosted on our internal network. The twisty problem still causes an issue when a document from a user search is viewed. At that point, the document is generated as HTML by Domino and streamed through the search engine so that keyword highlights can be added. The users get annoyed when the search engine returns results that they can't immediately see because they are hidden in a twisty.

Another problem is that there are a large number of Domino servers hosted elsewhere (by an external hosting provider, for example) that I cannot connect to using the Notes API. These servers must be indexed via HTTP, and the representation of the 'ExpandSection' argument plays havoc with the indexing: the search spider views each version of the URL as a new document.

The 'ExpandAll' argument is something that Domino web developers have been asking about for years. Lotus just couldn't do it in R5 and it can't be done using LotusScript. The Perl app that I am developing is the most recent in a long line of potential solutions.
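
To illustrate the duplicate-URL problem, here is a minimal sketch of reducing every expansion variant of a document to a single key. It assumes the argument always appears as &ExpandSection=... after the URL command and that any #fragment can be discarded. The name canonical_url and the sample URLs are hypothetical.

```perl
use strict;
use warnings;

# Reduce a Domino document URL to one canonical form by removing the
# fragment and any ExpandSection argument. The URL layout is an assumption.
sub canonical_url {
    my ($url) = @_;
    $url =~ s/#.*\z//s;                    # drop any fragment
    $url =~ s/&ExpandSection=[^&#]*//gi;   # drop the expansion argument
    return $url;
}

my @variants = (
    'http://example.invalid/db.nsf/view/doc?OpenDocument',
    'http://example.invalid/db.nsf/view/doc?OpenDocument&ExpandSection=1',
    'http://example.invalid/db.nsf/view/doc?OpenDocument&ExpandSection=2,1,1_1#_Section2',
);

my %unique = map { (canonical_url($_) => 1) } @variants;
print scalar(keys %unique), " distinct document(s)\n";
```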
Re: Re: Re: Domino Twisty Expansion using Lightweight Proxy by diotalevi (Canon) on Feb 13, 2004 at 17:08 UTC
I was extrapolating ExpandAll from the view URLs - I've not tried to do it against a document before (but then our web standards generally don't put things like collapsible sections onto documents). Oh well. Good luck with the HTML fetching!
Re: Domino Twisty Expansion using Lightweight Proxy by Zero_Flop (Pilgrim) on Feb 14, 2004 at 07:05 UTC
Depending on how many forms you have, you could generate a second form that does not have the twisties. This form has the same fields as the one with the twisties. Then, when you are doing a search, use this form rather than the one normally associated with the document. You would basically be using the DialogBox feature. R6 is nice; we are making our transition now. It has been transparent for the users but a little frustrating for the developers, more because of our own stupidity than anything else.