bobafifi has asked for the wisdom of the Perl Monks concerning the following question:

Occasionally somebody posts a very long URL on my site.

For example:

Beautiful Yamaha 581 <flute_gal@msn.com> IL USA - http://www.larryandnatalie.com/isapi.dll/c/content/f/viewproperty/siteid/wb5KAM/contentclass/PICT/contentid/ZZZZZZVR/propertyname/Original/~/yamaha_58_page_11_jpg.jpg - Wednesday, March 26, 2003

Great sounding YLF 581. This is a professional solid silver flute with a CY headjoint, french open holes, pointed arms, white gold springs, inline G, and a C foot. I just had it serviced and adjusted and the pads are in great shape. This flute has a beautiful and bright sound. Asking price is $1200 USD.

=====

I'd like to modify my script to have a set value for the display, for example if the URL is over 53 characters long it gets changed to display the first 35 characters, then ... in the middle, then the last 15 characters. The above URL would then display like this in the browser with the original URL unchanged in the HTML:

http://www.larryandnatalie.com/isap...page_11_jpg.jpg

=====

The part of my MySQL script that deals with the URL ($val[5]) looks like this:

my $page_link;
if (defined ($val[5]) && $val[5] ne "") {
    # construct link if value is not NULL (undef) or empty
    $page_link = a ({-href => "$val[5]"}, $val[5]);
}
### other variables here ### then
print " - $page_link" if defined ($page_link);
=====

Anybody know how to do this?

Many thanks in advance,

-Bob

2003-04-06 edit by ybiC - add code and small tags

Re: Short URL?
by The Mad Hatter (Priest) on Apr 06, 2003 at 01:15 UTC
    Use substr. Also, testing if $val[5] is defined and also not an empty string is redundant; I've changed it below.

    my $page_link;
    if ($val[5]) { # construct link if value is not NULL (undef) or empty
        my $linktext = $val[5];
        $linktext = substr($linktext, 0, 35) . "..." . substr($linktext, -15)
            if length($linktext) > 53;
        $page_link = a ({-href => $val[5]}, $linktext);
    }
    ### other variables here ### then
    print " - $page_link" if defined ($page_link);

    Update Fixed typo.

    Update2 Added note about if condition.

    Update3 Just noticed that in the root node the 5's are linked. Since they aren't in code blocks, I assumed that you typed $val[5] and they were turned into links without you realizing. Next time use the <code> tags. (OT, looking at all the updates I've done, I should probably stop popping those penguin mints... ;-)
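
One edge case on the "redundant" test, offered as a side note rather than a correction: a plain truth test and the explicit defined/non-empty test agree for undef and the empty string, but differ for the string "0". That is unlikely to matter for URLs. The helper names below are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this comparison.
sub passes_truth_test    { my $v = shift; return $v ? 1 : 0 }
sub passes_explicit_test { my $v = shift; return (defined($v) && $v ne "") ? 1 : 0 }

# Compare both tests on the interesting values.
for my $value (undef, "", "0", "http://example.com/") {
    my $shown = defined $value ? "'$value'" : 'undef';
    printf "%-22s truth=%d explicit=%d\n",
        $shown, passes_truth_test($value), passes_explicit_test($value);
}
```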

      Thanks so much! Your fix works great :-) -Bob p.s. Thanks to everybody else on this post too -- I haven't had a chance to look at all your replies yet, but will. Thanks again.
Re: Short URL?
by hossman (Prior) on Apr 06, 2003 at 01:17 UTC
    A couple of tips to get you started...
    • You can use the length function to find out how long a string is.
    • You can use substr to get the first 35 characters. And to get the last 15.
    • Assuming you put the first 35 characters in $start and the last 15 characters in $end, you can put your variables in quotes to get the new link label...
      my $label = "$start...$end";
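
Putting those three tips together might look like the sketch below. The subroutine name shorten_label and its optional limit arguments are assumptions for illustration; the defaults are the 53/35/15 numbers from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the display label for a URL.
# Limits default to the numbers from the original question.
sub shorten_label {
    my ($url, $max, $head, $tail) = @_;
    $max  = 53 unless defined $max;
    $head = 35 unless defined $head;
    $tail = 15 unless defined $tail;

    return $url if length($url) <= $max;   # short enough: leave it alone

    my $start = substr($url, 0, $head);    # first $head characters
    my $end   = substr($url, -$tail);      # last $tail characters
    return "$start...$end";
}

my $url = 'http://www.larryandnatalie.com/isapi.dll/c/content/f/viewproperty/siteid/wb5KAM/contentclass/PICT/contentid/ZZZZZZVR/propertyname/Original/~/yamaha_58_page_11_jpg.jpg';

print shorten_label($url), "\n";
# prints http://www.larryandnatalie.com/isap...page_11_jpg.jpg
```

Only the label changes; the href should still be built from the untouched $url.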
Re: Short URL?
by caedes (Pilgrim) on Apr 06, 2003 at 01:54 UTC
    All you need to do is to run the suspected long url through one regex. Eg.

    $url =~ s/^(.{35}).{3,}(.{15})$/$1...$2/;
    If the url isn't too long then the regex won't match and it will remain unchanged. Otherwise an ellipsis will replace the overflow characters in the middle.

    Update: replaced an asterisk with a plus so the ellipsis will replace at least one character.

    Update: changed the plus to {3,} and the 53 to 35. Thanks madhatter

    -caedes

      I think bobafifi wanted the first 35 characters, the ellipsis, and the last 15 characters if the URL was over 53 characters long.

      Update The updated regex works unless the URL is exactly 53 characters long. In that case I believe bobafifi wants the URL to stay the same (although I could be wrong). Here is one that (as far as I can tell) works for any length...

      $url =~ s/^(.{35})(?=.{19}).+(.{15})$/$1...$2/;
        The updated regex works unless the URL is exactly 53 characters long. In that case I believe bobafifi wants the URL to stay the same
        In that case, the ".{3,}" in caedes' code should become ".{4,}". Indeed, I see no reason to replace the correct string of 3 characters, by 3 dots. The code would become:
        $url =~ s/^(.{35}).{4,}(.{15})$/$1...$2/;
        I would be tempted to throw in a "?" to reduce the greediness of the middle subpattern, but it wouldn't help one bit, here.
        Yeah, I'll see about fixing that up.

        -caedes
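
To see where the three substitutions in this subthread differ, one can run them against strings of 52, 53 and 54 characters. This is only a sketch: the sample() helper and the labels in %variant are invented for the test, and the test strings are plain letters rather than real URLs.

```perl
use strict;
use warnings;

# The three substitutions from this subthread, each applied to a copy.
my %variant = (
    'caedes {3,}' => sub { my $u = shift; $u =~ s/^(.{35}).{3,}(.{15})$/$1...$2/; $u },
    'lookahead'   => sub { my $u = shift; $u =~ s/^(.{35})(?=.{19}).+(.{15})$/$1...$2/; $u },
    'bart {4,}'   => sub { my $u = shift; $u =~ s/^(.{35}).{4,}(.{15})$/$1...$2/; $u },
);

# Hypothetical helper: a test string of exactly $n lowercase letters.
sub sample {
    my $n = shift;
    return join '', map { chr(ord('a') + $_ % 26) } 0 .. $n - 1;
}

# Report, per length and variant, whether the string was altered.
for my $len (52, 53, 54, 80) {
    my $in = sample($len);
    for my $name (sort keys %variant) {
        my $out = $variant{$name}->($in);
        printf "%-12s input %2d chars: %s, output %d chars\n",
            $name, $len, ($out eq $in ? 'unchanged' : 'shortened'), length $out;
    }
}
```

At exactly 53 characters only the {3,} version alters the string (it swaps three real characters for three dots); from 54 characters on, all three produce the same 53-character label.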

Re: Short URL?
by The Mad Hatter (Priest) on Apr 06, 2003 at 02:46 UTC
    First time I've done a benchmark, but I think I did it right...
    use Benchmark qw(timethese cmpthese);

    $url = 'http://www.larryandnatalie.com/isapi.dll/c/content/f/viewproperty/siteid/wb5KAM/contentclass/PICT/contentid/ZZZZZZVR/propertyname/Original/~/yamaha_58_page_11_jpg.jpg';

    $r = timethese( -5, {
        'regex'  => sub{$url =~ s/^(.{35})(?=.{19}).+(.{15})$/$1...$2/;},
        'substr' => sub{$url = substr($linktext, 0, 35) . "..." . substr($linktext, -15) if length($linktext) > 53;},
    });
    cmpthese $r;
    __END__
    Benchmark: running regex, substr for at least 5 CPU seconds...
         regex:  2 wallclock secs ( 5.07 usr + -0.07 sys =  5.00 CPU) @ 633939.80/s (n=3169699)
        substr:  6 wallclock secs ( 4.69 usr +  0.34 sys =  5.03 CPU) @ 3532631.81/s (n=17769138)
                Rate  regex substr
    regex   633940/s     --   -82%
    substr 3532632/s   457%     --
    It would seem as though substr is faster for doing it many times, but the regex for doing it once. Don't quote me on that though, it's quite likely that I've interpreted the results wrong... ;-)
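
Two things in the benchmark as posted may skew the numbers: $linktext is never assigned, so the length test in the substr sub is never true, and the regex sub shortens $url on its first call, so later calls no longer match. A possible variant that has each sub work on a fresh copy is sketched below; the names by_regex and by_substr are made up here, and no timings are claimed for it.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $url = 'http://www.larryandnatalie.com/isapi.dll/c/content/f/viewproperty/siteid/wb5KAM/contentclass/PICT/contentid/ZZZZZZVR/propertyname/Original/~/yamaha_58_page_11_jpg.jpg';

# Each sub copies its argument, so every iteration shortens the full URL.
sub by_regex {
    my $text = shift;
    $text =~ s/^(.{35})(?=.{19}).+(.{15})$/$1...$2/;
    return $text;
}

sub by_substr {
    my $text = shift;
    $text = substr($text, 0, 35) . '...' . substr($text, -15)
        if length($text) > 53;
    return $text;
}

# Sanity check before timing: both approaches should agree.
die "results differ" unless by_regex($url) eq by_substr($url);

# Run each candidate for at least 1 CPU second and print the comparison.
cmpthese(-1, {
    regex  => sub { my $r = by_regex($url) },
    substr => sub { my $r = by_substr($url) },
});
```

The Rate column from cmpthese is the figure to compare; the wallclock seconds mostly reflect how long each test was asked to run.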
Re: Short URL?
by Cody Pendant (Prior) on Apr 06, 2003 at 03:47 UTC
    Just a couple of usability thoughts.

    Nothing to do with Perl, but when I wrote some code to do roughly this, I added a TITLE attribute to the link, something like TITLE="URL displayed has been shortened for formatting reasons" just so people would get a bit of a hint that they couldn't actually cut and paste or print the URL from display for later use, that they should bookmark, or use the "copy link" function.

    And I put square brackets around the ellipsis to make it more apparent too.

    And I was going to get around to some code that showed you what the actual document at the end of the URL was, if applicable, for instance a long URL ending in .jpg would get truncated to http://blah.com/[...]/flute.jpg and the like. I never did get around to doing that, and anyway, most of the really long ones were caused by query-strings, not huge directory structures.
    --

    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.”
    M-J D
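
A rough sketch of the first two suggestions (the TITLE hint and the bracketed ellipsis), assuming the 53/35/15 limits from the question. The names escape_html and link_with_hint are invented for this example, and the escaping is deliberately minimal; a real script would more likely lean on whatever HTML-generation module it already uses.

```perl
use strict;
use warnings;

# Minimal HTML escaping for text and attribute values (hypothetical helper).
sub escape_html {
    my $text = shift;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Build an anchor whose label is shortened with "[...]" and which carries
# a title attribute only when the label was actually shortened.
sub link_with_hint {
    my ($url) = @_;
    my $label = $url;
    my $title = '';
    if (length($url) > 53) {
        $label = substr($url, 0, 35) . '[...]' . substr($url, -15);
        $title = ' title="URL displayed has been shortened for formatting reasons"';
    }
    return sprintf '<a href="%s"%s>%s</a>',
        escape_html($url), $title, escape_html($label);
}

print link_with_hint('http://example.com/'), "\n";
```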
Re: Short URL?
by jmcnamara (Monsignor) on Apr 06, 2003 at 21:33 UTC

    Another approach might be to use a short URL generator which can turn your long link into something like this: http://tinyurl.com/8y17.

    The WWW::Shorten module provides an interface to several short URL generators.

    --
    John.

    Update: Fixed WWW::Shorten link.