hacker has asked for the wisdom of the Perl Monks concerning the following question:
I have a need to compress and encode a string of characters, in a way that will be easy to reverse later. I'm not trying to obfuscate the string, just make it easier to manipulate later. Turning the string into a series of numerals would be ideal, but alphanumeric would also work.
The string I'm dealing with is a series of param() values coming back from an HTML form. One example of the string looks like this:
/h/plkr/3/www.plkr.org/rss.pl
This breaks down into:
/h = Scheme (http in this case)
/plkr = Format (output format)
/3 = Fetch limit
/www.plkr.org/rss.pl = Feed URL
I tried IO::Zlib, Compress::Zlib, Digest::MD4, Digest::MD5 and others, hoping that I could compress the string and then encode it, but the result is still an ASCII string that is longer than the original input (there are not enough redundant characters to make compression worthwhile).
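Part of that growth is structural rather than a matter of redundancy. As a rough sketch (assuming only MIME::Base64, and leaving compression out entirely), Base64 emits 4 output characters for every 3 input bytes, so the encoded form is longer than the input even before zlib adds its own 2-byte header and 4-byte checksum:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

my $string = '/h/plkr/3/www.plkr.org/rss.pl';

# Base64 emits 4 characters for every 3 input bytes, rounding up ('=' padding).
my $predicted = 4 * int((length($string) + 2) / 3);

# The empty second argument suppresses the trailing newline that
# encode_base64 appends by default.
my $encoded = encode_base64($string, '');

printf "input: %d chars, base64 only: %d chars (predicted %d)\n",
    length($string), length($encoded), $predicted;
```

For a string this short, the compressor would have to save more than that fixed overhead before the encoded result could come out smaller than the input.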
Is there a way to do something like this? The shortest I can get the encoded string is 45 characters, with a 30-character input string, using this code:
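use Compress::Zlib;   # provides compress()
use MIME::Base64;     # provides encode_base64()

my $string = '/h/plkr/3/www.plkr.org/rss.pl';
# A limit of 5 keeps the slashes inside the feed URL intact;
# field 0 is the empty string before the leading '/'.
my ($type, $format, $limit, $feed) = (split '/', $string, 5)[1..4];
my @tokens = split('/', $string);
my $compressed = compress($string);
my $encoded = encode_base64($compressed);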
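Since the encoding has to be easy to reverse, here is a minimal round-trip sketch using the same two modules (Compress::Zlib and MIME::Base64). It only illustrates the decode direction and prints the lengths involved; it is not a shorter encoding:

```perl
use strict;
use warnings;
use Compress::Zlib qw(compress uncompress);
use MIME::Base64 qw(encode_base64 decode_base64);

my $string = '/h/plkr/3/www.plkr.org/rss.pl';

# Forward: zlib-compress, then Base64-encode without the trailing newline.
my $encoded = encode_base64(compress($string), '');

# Reverse: Base64-decode, then inflate back to the original string.
my $decoded = uncompress(decode_base64($encoded));

print "round trip ok\n" if defined $decoded && $decoded eq $string;
printf "%d chars in, %d chars encoded\n", length($string), length($encoded);
```

Note that standard Base64 output can contain `+`, `/` and `=`, which would need escaping before being placed in a query string.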
The reason I'm trying to do this is so that I can pass this string as a value in a URL into my application later on, such as: index.pl?eJzTz9AvyMku0jfWLy8v1wMx9fKL0vWLiouBHACWdQpk
I'm trying to make it easier for the user to use and bookmark these URLs for later use with my application. I'm storing the unique URL in a database, but I can't store the depth and output format in the db, because hundreds of users could use the same feed url, but apply different depths or output formats to it, and these are random/anonymous users, so I can't store this in a user table in the db.
I could build a set of hashes that have numeric lookups for output format and scheme, but that doesn't really gain me much in terms of making the value sitting in the URI field any smaller.
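To make the lookup idea concrete, here is a small sketch. The `%scheme_code` and `%format_code` tables and their numeric values are hypothetical, made up purely for illustration; the real lists of schemes and formats are not shown here:

```perl
use strict;
use warnings;

# Hypothetical code tables (assumed values, not the application's real ones).
my %scheme_code = (h => 1);
my %format_code = (plkr => 1);

my $string = '/h/plkr/3/www.plkr.org/rss.pl';
my ($scheme, $format, $limit, $feed) = (split '/', $string, 5)[1..4];

# Swap the two short tokens for numeric codes; the feed URL is left untouched.
my $packed = join '/', '', $scheme_code{$scheme}, $format_code{$format}, $limit, $feed;

printf "%s (%d chars) -> %s (%d chars)\n",
    $string, length($string), $packed, length($packed);
```

The feed URL makes up most of the string, so replacing the short leading tokens with codes removes only a few characters.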
Any useful hints or tips on how I can optimize this further?