Can you use a number for the subject code and the author code? If you used the following packing:

  Year:         2 bytes    (0..65535)
  Subject code: 2 bytes(?) (0..65535)
  Author code:  2 bytes(?) (0..65535)
  Chemicals:    2 bytes each (0..65535)

That allows you to have up to 11 chemicals. It's possible to do better (by using field widths that aren't a whole number of bytes), but you lose efficiency.
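To make the size trade-off concrete, here is a minimal sketch of both layouts. The sample values and the 12-bit width are made up for illustration (the real ranges haven't been given yet), so treat the numbers as an example rather than a recommendation.

```perl
use strict;
use warnings;

# Hypothetical sample values; the real ranges were not given in the thread.
my ($year, $subject, $author) = (2006, 1234, 567);
my @chemicals = (1 .. 11);    # eleven chemical codes

# Byte-aligned layout: 'n' is an unsigned 16-bit big-endian integer.
my $packed    = pack('n*', $year, $subject, $author, @chemicals);
my $bytes     = length $packed;                 # 14 fields * 2 bytes
my $b32_chars = int((8 * $bytes + 4) / 5);      # base32 carries 5 bits per character

# Bit-level layout. Assumption: every value fits in 12 bits (0..4095).
my $WIDTH  = 12;
my $bitstr = join '', map { sprintf("%0${WIDTH}b", $_) }
             $year, $subject, $author, @chemicals;
my $tight  = pack('B*', $bitstr);               # pack the bit string into bytes

# Unpacking means slicing the bit string back into fixed-width fields.
my @back = map { oct("0b$_") } unpack('B*', $tight) =~ /(.{$WIDTH})/g;

printf "byte-aligned: %d bytes, %d base32 characters (unpadded)\n",
    $bytes, $b32_chars;
printf "12-bit fields: %d bytes\n", length $tight;
```

The byte-aligned version is a single pack/unpack call. The bit-level version saves space but has to go through a string of '0' and '1' characters (or equivalent shifting and masking), which is the efficiency cost mentioned above.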

If the above is ok, let me know and I'll code it tonight.

If the above is not sufficient, let me know more precise ranges for the year, the subject codes, the author codes and chemical codes.

Update: Promised code:

use Carp;

sub encode_base32 { ... based on encode_base64 ... }
sub decode_base32 { ... based on decode_base64 ... }

sub compress_data {
    my ($year, $subject, $author, @chemicals) = @_;
    carp(...) if $year    < 0 || $year    > 65535;
    carp(...) if $subject < 0 || $subject > 65535;
    carp(...) if $author  < 0 || $author  > 65535;
    carp(...) if @chemicals > 11;
    # 65535 is reserved as the "empty slot" marker, so it can't be a real code.
    carp(...) if grep { $_ == 65535 } @chemicals;
    # Pad to exactly 11 chemical slots so the packed size is fixed.
    push(@chemicals, (65535)x(11-@chemicals));
    return pack('n*', $year, $subject, $author, @chemicals);
}

sub decompress_data {
    my ($data) = @_;
    my ($year, $subject, $author, @chemicals) = unpack('n*', $data);
    # Drop the padding slots.
    @chemicals = grep { $_ != 65535 } @chemicals;
    return ($year, $subject, $author, @chemicals);
}

$file_name = encode_base32(compress_data(...));
(...) = decompress_data(decode_base32($file_name));

Untested.
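Since the code above is untested and leaves the base32 routines elided, here is a self-contained variant for checking the round trip. The b32_encode and b32_decode helpers are my own stand-ins (lowercase RFC 4648 alphabet, no '=' padding), not the encode_base64-derived routines referred to above. This version also uses croak instead of carp so bad input stops the call, and the sample record is made up.

```perl
use strict;
use warnings;
use Carp qw(croak);

my @ALPHABET = ('a' .. 'z', '2' .. '7');    # RFC 4648 alphabet, lowercased
my %VALUE_OF;
@VALUE_OF{@ALPHABET} = (0 .. 31);

my $SENTINEL      = 65535;    # marks an unused chemical slot
my $MAX_CHEMICALS = 11;

# Stand-in base32 encoder: 5 bits per output character, no '=' padding.
sub b32_encode {
    my ($bytes) = @_;
    my $bits = unpack('B*', $bytes);
    $bits .= '0' x ((5 - length($bits) % 5) % 5);    # pad to a multiple of 5 bits
    return join '', map { $ALPHABET[ oct("0b$_") ] } $bits =~ /(.{5})/g;
}

sub b32_decode {
    my ($text) = @_;
    $text = lc $text;
    croak "not base32: '$text'" if $text =~ /[^a-z2-7]/;
    my $bits  = join '', map { sprintf('%05b', $VALUE_OF{$_}) } split //, $text;
    my $whole = length($bits) - length($bits) % 8;   # drop the padding bits
    return pack('B*', substr($bits, 0, $whole));
}

sub compress_data {
    my ($year, $subject, $author, @chemicals) = @_;
    for my $v ($year, $subject, $author, @chemicals) {
        croak "value '$v' is outside 0..65535" if $v !~ /^\d+$/ || $v > 65535;
    }
    croak "more than $MAX_CHEMICALS chemicals" if @chemicals > $MAX_CHEMICALS;
    croak "chemical code $SENTINEL is reserved"
        if grep { $_ == $SENTINEL } @chemicals;
    push @chemicals, ($SENTINEL) x ($MAX_CHEMICALS - @chemicals);
    return pack('n*', $year, $subject, $author, @chemicals);
}

sub decompress_data {
    my ($data) = @_;
    my ($year, $subject, $author, @chemicals) = unpack('n*', $data);
    return ($year, $subject, $author, grep { $_ != $SENTINEL } @chemicals);
}

# Hypothetical record: year, subject code, author code, then chemical codes.
my @record    = (2006, 1234, 567, 42, 1001, 65000);
my $file_name = b32_encode(compress_data(@record));
my @decoded   = decompress_data(b32_decode($file_name));

print length($file_name), " characters: $file_name\n";
print "round trip ok\n" if "@decoded" eq "@record";
```

With 11 chemical slots the packed data is always 28 bytes, so the name is always 45 characters. Lowercase letters keep the name safe on case-insensitive file systems; switch the alphabet if you need something else.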


In reply to Re^7: URL string compression? by ikegami
in thread URL string compression? by punch_card_don
