in reply to Regex to Truncate URLs Nicely

Since I would now guess this is an "academic" endeavour, I will give you the way I would approach it without modules. First, I would use two regexes: the first to reduce something like:

http://some-shop.com/dir1/dir2/buystuff.cgi?x=1&y=2&z=3

to something like

http://some-shop.com/(...)/buystuff.cgi?x=1&y=2&z=3

and the second regex to trim the end of the URL if it carries a long query string, but only doing anything if the URL is over 50 chars. (Then again, I might just use a couple of splits and some concatenation magic instead, but that would depend on what all my data looked like.)
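
The two-regex idea above could be sketched roughly as follows. This is an illustration in Python, not code from the post; the same patterns should carry over to Perl's s/// almost unchanged. The function names (collapse_dirs, trim_query, shorten) and the "?..." marker are assumptions made for the example; only the 50-character threshold and the "(...)" placeholder come from the post.

```python
import re

MAX_LEN = 50  # threshold mentioned in the post


def collapse_dirs(url):
    """Regex 1: keep scheme, host and last path segment; replace the
    directories in between with "(...)". Assumes (for this sketch) that
    no slashes appear after the path, e.g. inside the query string."""
    return re.sub(r'^(\w+://[^/]+)/(?:[^/]+/)+', r'\1/(...)/', url)


def trim_query(url):
    """Regex 2: replace everything after the first "?" with "...".
    The "?..." marker is a hypothetical choice, not from the post."""
    return re.sub(r'\?.*$', '?...', url)


def shorten(url, max_len=MAX_LEN):
    """Apply the two regexes in order, but only to over-long URLs."""
    if len(url) <= max_len:
        return url
    url = collapse_dirs(url)
    if len(url) > max_len:
        url = trim_query(url)
    return url


if __name__ == "__main__":
    print(shorten("http://some-shop.com/dir1/dir2/buystuff.cgi?x=1&y=2&z=3"))
```

Note that for a URL with a single short directory, the "(...)" placeholder can be longer than what it replaces, so a real version would probably want to check that the replacement actually saves characters.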

Good Luck.

-enlil

Re^2: Regex to Truncate URLs Nicely
by Aristotle (Chancellor) on Nov 02, 2002 at 07:48 UTC
    I'd do it the other way around. The query parameters may contain slashes, but the path cannot contain question marks. If you try to reduce directories first, you will have to resolve the ambiguity of slashes in the path vs slashes in the query parameters. If you remove the query parameters first, for which there is an unambiguous criterion, then the slashes suddenly are unambiguous too.
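
A minimal sketch of the ordering described in this reply, written in Python for illustration (the thread itself contains no code, and the patterns should port to Perl directly). The helper names, the "?..." marker and the example URL with slashes in its query are assumptions for the sketch; the 50-character limit and the "(...)" placeholder are taken from the parent post.

```python
import re


def split_query(url):
    """Cut at the first "?": everything before it is the path part,
    everything from it onward is the query string (possibly empty)."""
    path, sep, query = url.partition('?')
    return path, sep + query


def collapse_dirs(path):
    """Replace the directories between the host and the last path
    segment with "(...)". Only unambiguous when given a query-free path."""
    return re.sub(r'^(\w+://[^/]+)/(?:[^/]+/)+', r'\1/(...)/', path)


def shorten(url, max_len=50):
    """Separate the query first, then reduce directories in the path only."""
    if len(url) <= max_len:
        return url
    path, query = split_query(url)
    short = collapse_dirs(path)
    if len(short) + len(query) <= max_len:
        return short + query
    # Still too long: drop the query contents ("?..." is a hypothetical marker).
    return short + ('?...' if query else '')


if __name__ == "__main__":
    # Hypothetical URL whose query parameter itself contains slashes.
    print(shorten("http://some-shop.com/dir1/dir2/go.cgi?next=/cart/view"))
```

Applying collapse_dirs to the whole URL instead would treat the slashes inside the query as directory separators and swallow the script name along with them, which is the ambiguity that splitting at the question mark first avoids.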

    Makeshifts last the longest.