Getting the HTML and URI escaping right for creating links and labels is fun, not! But if you've got the right template to start from, it's not so bad. Here's a basic snippet. It handles whitespace correctly, as well as %, &, <, and >. Beware hand-rolled solutions that don't.
#!/usr/bin/perl
use CGI qw(:all);
use URI::Escape;
print header, start_html("sample"), h1("sample");
print map { a({ -href => uri_escape($_)}, escapeHTML($_)), br } glob "*";
print end_html;
Note that CGI.pm's a() HTML shortcut does an escapeHTML on the HREF attribute automatically, which is why the snippet only calls uri_escape there. The manual equivalent would be:
print '<A HREF="', escapeHTML(uri_escape($_)), '">', escapeHTML($_), '</A>';
Yes, all that escaping is necessary. Do not shirk.
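
To see why both layers matter, here is a minimal dependency-free sketch. The helper names (sketch_uri_escape, sketch_escape_html, sketch_link) are hypothetical stand-ins, not the real module functions; the URI layer borrows the character class quoted from URI::Escape later in this thread, and it assumes plain byte strings. In real code, stick with uri_escape and escapeHTML as shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for uri_escape, using the character class quoted
# from URI::Escape later in this thread. Assumes byte strings, not UTF-8.
sub sketch_uri_escape {
    my ($text) = @_;
    $text =~ s/([^;\/?:@&=+\$,A-Za-z0-9\-_.!~*'()])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical stand-in for escapeHTML: escape the characters that are
# special in HTML text and in double-quoted attribute values.
sub sketch_escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build one link: URI-escape the name for the URL, then HTML-escape that
# result for the attribute; HTML-escape the raw name for the label.
sub sketch_link {
    my ($name) = @_;
    return '<A HREF="' . sketch_escape_html(sketch_uri_escape($name)) . '">'
         . sketch_escape_html($name) . '</A>';
}

print sketch_link('100% <fun> & games.txt'), "\n";
# prints <A HREF="100%25%20%3Cfun%3E%20&amp;%20games.txt">100% &lt;fun&gt; &amp; games.txt</A>
```

With this character class the & survives the URI layer untouched, so it is the HTML layer that keeps the attribute valid. Skip either layer and some filename will eventually break the page.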

Re: Dump a directory as links from CGI
by epoptai (Curate) on May 31, 2001 at 01:52 UTC
    merlyn,

    I've noticed that CGI::escape is similar to uri_escape and have been using it successfully to avoid using URI::Escape when CGI is already imported. My tests show that CGI::escape also encodes characters like ! and ? that uri_escape leaves alone.

    Here's the relevant source code from CGI.pm's escape():

    $toencode=~s/([^a-zA-Z0-9_.-])/uc sprintf("%%%02x",ord($1))/eg;

    And here's the code from URI::Escape's uri_escape():

    # Build a char->hex map
    for (0..255) {
        $escapes{chr($_)} = sprintf("%%%02X", $_);
    }
    $text =~ s/([^;\/?:@&=+\$,A-Za-z0-9\-_.!~*'()])/$escapes{$1}/g;

    My question: Is it safe to use CGI::escape instead of uri_escape?
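
One way to see exactly where the two differ is to run both patterns over printable ASCII. This is only a sketch built from the two lines quoted above, wrapped in hypothetical subs (cgi_style_escape, uri_style_escape); the modules you have installed may use different character sets, so treat the result as a statement about the quoted code only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same substitution as the CGI.pm escape() line quoted above.
sub cgi_style_escape {
    my ($toencode) = @_;
    $toencode =~ s/([^a-zA-Z0-9_.-])/uc sprintf("%%%02x", ord($1))/eg;
    return $toencode;
}

# Same character class as the uri_escape() line quoted above, with the
# lookup table replaced by an inline sprintf.
sub uri_style_escape {
    my ($text) = @_;
    $text =~ s/([^;\/?:@&=+\$,A-Za-z0-9\-_.!~*'()])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Collect the printable ASCII characters on which the two disagree.
my @differ = grep { cgi_style_escape($_) ne uri_style_escape($_) }
             map  { chr } 0x20 .. 0x7E;

print "Escaped only by the CGI-style pattern: @differ\n";
print cgi_style_escape('Hi there! What?'), "\n";   # Hi%20there%21%20What%3F
print uri_style_escape('Hi there! What?'), "\n";   # Hi%20there!%20What?
```

The characters that come out are the ones with a structural meaning in a URL (such as / ? : & = ;) plus a few marks like ! and ~. The CGI-style pattern encodes all of them, which suits a single value but would mangle a complete URL.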

    --
    Check out my Perlmonks Related Scripts like framechat, reputer, and xNN.

        Thanks for indicating the subtle difference.

        I've been struggling with this very issue recently and came across a URL here at PerlMonks that uri_escape doesn't handle and that needs CGI::escape instead. This small CGI program compares uri_escape and escape by producing two links to PerlMonks:

        #!/usr/bin/perl
        use CGI qw(:all escape escapeHTML);
        use URI::Escape;
        $site = 'http://www.perlmonks.org/index.pl?node=';
        $user = 'Clive ;-)'; # my test case for whitespace and funny chrs
        print header,h1("compare");
        print '<A HREF="', escapeHTML(uri_escape($site.$user)), '">', escapeHTML($user), '</A> - <tt>escapeHTML(uri_escape($site.$user))</tt><br>'; # doesn't work
        print '<A HREF="', escapeHTML($site.escape($user)), '">', escapeHTML($user), '</A> - <tt>escapeHTML($site.escape($user))</tt><p>'; # works

        The 1st print, using uri_escape on the whole URL, returns:
        http://www.perlmonks.org/index.pl?node=Clive%20;-)

        The 2nd print, using CGI::escape on the param only, yields:
        http://www.perlmonks.org/index.pl?node=Clive%20%3B-%29

        As you can see, the semicolon is causing a problem in the uri_escape URL.
        Is this a general condition or just a peculiarity of perlmonks?
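
Judging only from the uri_escape pattern quoted earlier in the thread, ; is in the set that pattern leaves alone, so the unescaped semicolon would show up for any site, not just this one. A minimal sketch of the approach the 2nd print takes, generalized to several parameters, follows. The helpers escape_value and build_url are hypothetical names, the safe set is the one from the CGI.pm line quoted earlier, and the & separator is an assumption about what the receiving script expects.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything outside a small safe set,
# the same set as the CGI.pm escape() line quoted earlier in the thread.
sub escape_value {
    my ($value) = @_;
    $value =~ s/([^a-zA-Z0-9_.-])/sprintf("%%%02X", ord($1))/ge;
    return $value;
}

# Hypothetical helper: append name=value pairs to a base URL, escaping each
# name and value on its own so the structural ? = and & stay literal.
sub build_url {
    my ($base, @pairs) = @_;
    my @parts;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        push @parts, escape_value($name) . '=' . escape_value($value);
    }
    return $base . '?' . join('&', @parts);
}

print build_url('http://www.perlmonks.org/index.pl', node => 'Clive ;-)'), "\n";
# prints http://www.perlmonks.org/index.pl?node=Clive%20%3B-%29
```

The result is still only a URL: it needs an escapeHTML pass before it goes into an HREF attribute, since the & separators are special in HTML. If you would rather stay with URI::Escape, uri_escape also takes an optional second argument naming the characters to escape, which can serve the same purpose for a single value.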
