PerlMonks  

The problem is to maintain persistent session information without using cookies. The solution is to encode the session as a GET parameter on every link that points to another internal page.
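To make the idea concrete, here is a minimal sketch of what "encode the session as a GET parameter" means for a single URL. The helper name `with_session` and the sample URLs are made up for illustration, and the session id is assumed to contain only URL-safe characters.

```perl
use strict;
use warnings;

# Hypothetical helper: append a session id to one URL as a GET parameter.
# Assumes the session id needs no escaping.
sub with_session {
    my ( $url, $session ) = @_;

    # Leave the URL alone if it already carries a session parameter.
    return $url if $url =~ /[?&]session=/;

    # Use '&' when a query string is already present, '?' otherwise.
    my $sep = $url =~ /\?/ ? '&' : '?';
    return $url . $sep . "session=$session";
}

print with_session( '/catalog/index.html', 'abc123' ), "\n";
# /catalog/index.html?session=abc123
print with_session( '/search.cgi?q=perl', 'abc123' ), "\n";
# /search.cgi?q=perl&session=abc123
```

Doing this by hand on every link is exactly the chore the page maintainers should not have to learn.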

The second problem is that most of the web pages are maintained by somebody who just barely understands HTML. We don't want to have to teach them about sessions and stuff.

The solution is to take the HTML content and re-write all of the links just before the page is displayed. I'm using HTML::TreeBuilder to solve this. There are probably other ways.

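use HTML::TreeBuilder;

# Hosts whose links should carry the session.
my $owned_sites = qr/mysite\.(com|net|org)/i;

# Takes the page HTML and a session id; returns the re-written HTML.
sub add_sessions {
    my $root    = HTML::TreeBuilder->new_from_content( shift() );
    my $session = shift;

    foreach my $link ( $root->look_down( '_tag', 'a' ) ) {
        next unless my $url = $link->attr('href');

        if ( $url =~ m|://([^/]*)/| ) {
            # Absolute URL: skip links to sites we don't own.
            next if ( $1 !~ $owned_sites );
        }
        elsif ( $url =~ m|^[^/]*:| ) {
            # Look for mailto: links (and other scheme-only URLs).
            # This is an elsif so that absolute links to our own
            # sites are not skipped as well.
            next;
        }

        my ( $path, $params ) = split /\?/, $url, 2;
        $params = '' unless defined $params;    # no query string
        my %params = map { split( /=/, $_, 2 ) } split( /&/, $params );
        $params{session} ||= $session;
        $url = join( '?', $path,
            join( '&', map { "$_=$params{$_}" } keys(%params) ) );
        $link->attr( 'href', $url );
    }

    my $html = $root->as_HTML;
    $root->delete();
    return $html;
}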

Now, I just know somebody is going to tell me that I should be using URI::URL and that my session info is not going to be escaped, etc. But let's just consider that an exercise for another day. The point here is mainly to provide an example where HTML::TreeBuilder saves the day.
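For anyone who wants to try that exercise, here is one possible sketch of the per-link step. It uses the URI module (the current interface that superseded URI::URL) rather than URI::URL itself. The helper name `session_url` is hypothetical, and the host pattern is simply copied from the code above. This is not a drop-in replacement, just an illustration of letting a library handle parsing and escaping.

```perl
use strict;
use warnings;
use URI;

# Assumption: same "owned" hosts as in the original code.
my $owned_sites = qr/mysite\.(com|net|org)/i;

# Hypothetical helper: return $url with a session parameter added,
# or return it unchanged if the link should not be re-written.
sub session_url {
    my ( $url, $session ) = @_;
    my $uri    = URI->new($url);
    my $scheme = $uri->scheme;    # undef for relative URLs

    if ( defined $scheme ) {
        # Skip mailto:, javascript:, ftp: and the like.
        return $url unless $scheme =~ /^https?$/i;

        # Skip absolute links to hosts we do not own.
        my $host = $uri->host;
        return $url unless defined $host && $host =~ $owned_sites;
    }

    # query_form returns the existing key/value pairs decoded and
    # escapes them again when they are assigned back.
    my @pairs = $uri->query_form;
    my %have  = @pairs;
    push @pairs, session => $session unless $have{session};
    $uri->query_form( \@pairs );

    return $uri->as_string;
}

print session_url( '/page.html?x=1',         'abc123' ), "\n";
print session_url( 'mailto:bob@example.com', 'abc123' ), "\n";
print session_url( 'page.html#top',          'abc123' ), "\n";
```

Inside add_sessions, the split/map/join lines could then be replaced by a single call such as `$link->attr( 'href', session_url( $url, $session ) )`. A side benefit is that fragments like `#top` stay at the end of the URL instead of ending up in front of the query string.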


In reply to Re-write all internal links on a web page. by ehdonhon
