in reply to Re^5: Using filepath method to identify an .html page
in thread Using filepath method to identify an .html page

Re^7: Using filepath method to identify an .html page
by Your Mother (Archbishop) on Jan 22, 2013 at 16:50 UTC

    Maybe there is confusion about how you really want to go about this, because the spec as given is ludicrous. You have four digits to work with, which means 10,000 numbers available. Assume only lowercase letters and three-character names, for example (like abc.html):

    perl -le 'print 26**3'
    17576

    You're already out of room with this most trivial example. Any real file names/paths will certainly not be able to fit in any translation scheme. The information must be *somewhere*. Your quest to save space by disappearing it seems magical. Pretty much everyone here is telling it to you straight.
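
    To make the "out of room" point concrete, here is a minimal sketch. The toy_id() function is a made-up mapping chosen purely for illustration (it is not anything the original poster described); any other function from names to 4-digit numbers would run into the same counting problem.

```perl
use strict;
use warnings;

# Hypothetical toy mapping: read the name as a base-26 number
# and keep only its last four decimal digits.
sub toy_id {
    my ($name) = @_;
    my $n = 0;
    $n = $n * 26 + (ord($_) - ord('a')) for split //, $name;
    return sprintf '%04d', $n % 10_000;
}

# Every three-letter lowercase name, 'aaa' through 'zzz'.
my @names = ('aaa' .. 'zzz');

# Group the names by the id they were assigned.
my %names_for;
push @{ $names_for{ toy_id($_) } }, $_ for @names;

my $used   = keys %names_for;                         # distinct ids in use
my $shared = grep { @$_ > 1 } values %names_for;      # ids with 2+ names

printf "%d names, %d ids used, %d ids shared by more than one name\n",
    scalar @names, $used, $shared;
```

    With 17576 names and only 10,000 ids, at least 7576 names have to share an id with another name, so the number alone cannot tell you which file was meant.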

    If there is still a misunderstanding, perhaps you could give a concrete example of input (and its range) and output you expect. If what you want is possible someone will help you.

      The third of Clarke's three laws is at play here:

      Any sufficiently advanced technology is indistinguishable from magic.

      "sufficiently advanced" is really a relative term. To one who understands it, it is not sufficiently advanced to be indistinguishable from magic. But to one who doesn't understand it, the line between reality and magic is obscured.

      Once technology becomes indistinguishable from magic, it becomes impossible to distinguish between what is possible and what is impossible. With magic everything should be possible, right? What we need to do is shed our understanding so that the technology we're discussing appears as magical to us as it does to someone who believes that through the magic of technology everything must be possible. Only then will we be able to come up with solutions based on the boundless nature of magic rather than the finite constraints of well-understood technology.


        I have always found that bit of Clarke silly. "Sufficiently advanced" is not relative, it's self-fulfilling. "Sufficiently A is B" is always true because it's what the clause means.

Re^7: Using filepath method to identify an .html page
by Anonymous Monk on Jan 22, 2013 at 16:31 UTC

    So, what you want is a function foo() and its inverse foo'() that does this:

    foo( "some long string" )  -->  1234
    foo'( 1234 ) --> "some long string"


    In other words, you want to keep the entire information content of the original string.

    There are only two ways to do this:

    1. keep the original string, i.e. store it in a database of some sort, or
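    2. lossless compression of the original string into a short number. Compression can shrink some strings, but with only 10,000 possible 4-digit values no scheme can give every possible path its own number, so the compression ratio you're asking for isn't going to be possible.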

    Which leaves the first option, as plenty of people here have described.

    So: You've said you already have a database with a column that stores a 4-digit number. If that database table doesn't already have a column that stores the HTML page's absolute path, then add one. You'll also want a UNIQUE constraint on the column with the number (and/or the other column, depending on your database design). The rest is "just" SQL...
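
    A minimal sketch of the "keep the original string" option, assuming two in-memory hashes as stand-ins for the two database columns. The names id_for_path() and path_for_id() and the example path are hypothetical; in a real setup the hashes would be a table reached through something like DBI, with the UNIQUE constraint doing the job the hash keys do here.

```perl
use strict;
use warnings;

# Stand-ins for the two columns: path -> id and id -> path.
# Hash keys are unique, which mimics a UNIQUE constraint on each column.
my (%id_for, %path_for);
my $next_id = 0;

# foo(): hand out a 4-digit id for a path, reusing the id if the path is known.
sub id_for_path {
    my ($path) = @_;
    return $id_for{$path} if exists $id_for{$path};
    die "no 4-digit ids left\n" if $next_id > 9999;
    my $id = sprintf '%04d', $next_id++;
    $id_for{$path} = $id;
    $path_for{$id} = $path;
    return $id;
}

# foo'(): look the path back up from its id (undef if the id was never issued).
sub path_for_id {
    my ($id) = @_;
    return $path_for{$id};
}

my $id = id_for_path('/var/www/html/abc.html');   # hypothetical path
print "$id -> ", path_for_id($id), "\n";
```

    The number carries no information about the path by itself; the inverse works only because the path is still stored somewhere, and the scheme stops at 10,000 pages.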

Re^7: Using filepath method to identify an .html page
by Plankton (Vicar) on Jan 23, 2013 at 05:51 UTC
    How can you turn the "path" into a 4-digit number? That makes no sense. You could maybe generate a checksum on each path, but checksums are more than 4 digits long, unless you use a large base, like base 1000 instead of base 10. But then you'd have to invent a bunch of characters.
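
    For what it's worth, here is a rough sketch of the checksum idea using the core Digest::MD5 module. The short_id() helper and the example path are assumptions made up for illustration. An MD5 digest is 32 hex digits, so squeezing it into 4 decimal digits means throwing most of it away:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: checksum the path, then keep only a 4-digit slice of it.
sub short_id {
    my ($path) = @_;
    my $digest = md5_hex($path);                 # 32 hex digits
    # Take the first 8 hex digits as a number and reduce it to 0000..9999.
    return sprintf '%04d', hex(substr $digest, 0, 8) % 10_000;
}

my $path = '/var/www/html/abc.html';             # hypothetical path
printf "full checksum: %s (%d hex digits)\n", md5_hex($path), length md5_hex($path);
printf "4-digit id:    %s\n", short_id($path);
```

    The result is repeatable for a given path, but different paths can land on the same 4-digit id and there is no way to get the path back from it, so a lookup table is still needed.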