John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

In my previously-posted Work Backup program, the script copies changed files to another location. One use is for "backup buddies" on a network (or with DSL etc.), so I'm adding features to make the files encrypted. I've got that part well in hand.

However, what about the file name? I'd like that to be safe from prying eyes (even accidentally), too.

Now I could maintain a dictionary and simply assign meaningless names like file123456.dat to the destination. But I want to do without a central list, and make each file individually 1:1 reversible. That is, encrypt the file name and be able to decrypt it.

But a regular cipher will produce byte values in a larger range than the input. File names may draw on the 94 printable ASCII characters, for example, but the output will be spread fairly uniformly over 0..255. That won't work for a filename!

A simple idea is to then BASE64 encode the results. That should not increase the length =too= badly. But I wonder if there may be a more elegant way, such as encrypting using a stream cypher that maps 94 input chars to the same 94 output chars. Does anybody know of such a thing?
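As a rough illustration of the encrypt-then-Base64 idea, here is a minimal sketch. The `toy_crypt` routine is only a stand-in (a repeating-key XOR, not secure) so the example runs with core modules alone; a real cipher would take its place. The function names and the swap of `+` and `/` for `-` and `_` are assumptions made for this example.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Stand-in for a real cipher: XOR with a repeating key. NOT secure;
# it only serves to produce arbitrary bytes in 0..255.
sub toy_crypt {
    my ($bytes, $key) = @_;
    my $pad = $key x (1 + int(length($bytes) / length($key)));
    return $bytes ^ substr($pad, 0, length $bytes);
}

sub name_to_safe {
    my ($name, $key) = @_;
    my $b64 = encode_base64(toy_crypt($name, $key), '');  # '' = no line breaks
    $b64 =~ tr{+/}{-_};   # replace the two characters that are awkward in file names
    $b64 =~ s/=+\z//;     # padding can be recomputed from the length
    return $b64;
}

sub safe_to_name {
    my ($safe, $key) = @_;
    (my $b64 = $safe) =~ tr{-_}{+/};
    $b64 .= '=' x (-length($b64) % 4);   # restore the stripped padding
    return toy_crypt(decode_base64($b64), $key);
}
```

Base64 turns every 3 bytes into 4 characters, so names grow by about a third. It also relies on upper and lower case being distinct, so it would not survive a destination that folds case.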

—John

(tye)Re: Encrypting a Filename
by tye (Sage) on Jun 15, 2001 at 02:31 UTC

    I'd probably do something like in (tye)Re: Portably transforming a string to a valid filename to transform each file name into a string that only uses 64 common characters ('0'..'9','a'..'z','A'..'Z','.', and '_'), treat that like a base-64 encoded value and unencode it to get a binary string, encrypt that string, base-64 encode that and use the result as the file name. (Note that I said "like" as that node creates file names taken from a set of 65 characters.)

    The file name will remain the same size (if it doesn't use any unusual characters) or increase slightly in size (if it does) and you can use any general-purpose encryption you like.
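One way to read this, assuming the cipher is an XOR stream cipher, is that each of the 64 characters stands for 6 bits and gets XORed with 6 bits of keystream. The sketch below does that directly, which is equivalent to decoding, XORing and re-encoding. The keystream (SHA-256 over the key and a block counter) and all names here are assumptions for illustration, not a vetted construction.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);

# The 64-character repertoire from the post, in an arbitrary fixed order.
my @ALPHABET = ('0'..'9', 'a'..'z', 'A'..'Z', '.', '_');
my %INDEX;
@INDEX{@ALPHABET} = 0 .. $#ALPHABET;

# Hypothetical keystream: SHA-256 of key plus block counter, 6 bits kept per byte.
sub keystream6 {
    my ($key, $n) = @_;
    my @out;
    for (my $block = 0; @out < $n; $block++) {
        push @out, map { $_ & 0x3f } unpack 'C*', sha256($key . pack('N', $block));
    }
    return @out[0 .. $n - 1];
}

# XOR is its own inverse, so the same routine encrypts and decrypts.
sub crypt_name {
    my ($name, $key) = @_;
    my @ks  = keystream6($key, length $name);
    my $out = '';
    for my $i (0 .. length($name) - 1) {
        my $v = $INDEX{ substr($name, $i, 1) };
        defined $v or die "character outside the 64-symbol set\n";
        $out .= $ALPHABET[ $v ^ $ks[$i] ];
    }
    return $out;
}
```

The output has the same length as the input and uses the same 64 characters. Names containing other characters would first need the escaping step described above. Reusing one keystream for every name lets an observer XOR two encrypted names together, so a real design would need something that varies per name.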

            - tye (but my friends call me "Tye")
      treat that like a base-64 encoded value and unencode it to get a binary string, encrypt that

      I like that! Thanks. It's inspiring comments like that that make me glad I posted to the group.

      —John

Re: Encrypting a Filename
by btrott (Parson) on Jun 15, 2001 at 02:19 UTC
    Shameless plug, yes, but how about using Net::SFTP? SFTP runs as a subsystem over a standard SSH connection, so automatically everything you send over the network is encrypted, including both filename and file contents.

    This way you can just do something like this:

    use Net::SFTP;
    my $sftp = Net::SFTP->new("backup_host");
    $sftp->put("local_file", "remote_file");
    Or, what about rsync, using ssh as the rsh replacement? I don't know enough about that to know whether it encrypts filenames going over the network, but I would *guess* that it does, because I believe it just runs over ssh.

    If you want to stick w/ the method you're talking about, I think your best option is to do what you suggested: encrypt the filename, then base64-encode it. Considering that you're going to be sending the entire contents of files over the network, adding a couple of bytes per filename isn't going to be the deciding factor in terms of efficiency. :)

      Re SSH: Since I don't have that, I'm really looking for a self-contained solution. This script backs up to another part of the file system, which can be a different spindle or could be a "backup buddy" over the network, but it doesn't require a network. I will check it out, though.

      As for efficiency, that's not my worry. I'm more concerned with file name limitations of the receiving system. Burning to a standard CD-R, for example, limits the length and character set. I can punt on that if I just say "ZIP the files, then burn that", because ZIP has no practical length limit and accepts pretty much any 8-bit character string as a name. However, what if your source is case-sensitive and the dest is case-insensitive? In my situation at home, the source is NTFS with full Unicode names, but the dest is FAT32.

      So, I'm thinking of URL-encoding the names and sticking to a small destination character repertoire. Preserves case. BUT, what about case clashing if the dest is case-insensitive? I suppose I need to encode that too, and assume the dest is case smashing!!
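A small sketch of that idea, under the assumption that the destination may fold case: everything except lowercase letters, digits, `.` and `-` is escaped, uppercase letters included. The escape character `_` (instead of URL-encoding's `%`) and the function names are choices made for this example only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Keep only lowercase letters, digits, '.' and '-'; write every other byte
# (including uppercase letters and '_' itself) as '_' plus two hex digits.
sub smash_safe {
    my ($name) = @_;
    my $bytes = encode('UTF-8', $name);
    $bytes =~ s/([^a-z0-9.\-])/sprintf '_%02x', ord $1/ge;
    return $bytes;
}

sub smash_restore {
    my ($safe) = @_;
    my $bytes = lc $safe;      # undo any case folding done by the destination
    $bytes =~ s/_([0-9a-f]{2})/chr hex $1/ge;
    return decode('UTF-8', $bytes);
}
```

Because no uppercase letter survives encoding, `README` and `readme` map to different names, and lowercasing on the way back is harmless. The cost is length: each escaped byte takes three characters.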

      Any thoughts, on any of that? Don't you hate contradictory requirements?

      —John

        I'm more concerned with file name limitations of the receiving system.

        One more suggestion, then: calculate the encrypted file name with some kind of digest algorithm (with a numeric suffix to make it unique), and encrypt the original name together with the file contents (by prepending or appending it to the original file).

        Pro:
        you can make the filename generator use very small character set.

        Cons:
        1) decoding the file name requires more work (the encrypted name alone is not sufficient),
        2) finding an encrypted file might require proximity searching.
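The name-generation half of this suggestion might look like the sketch below. It uses a keyed digest (HMAC) so that names cannot be guessed by hashing likely candidates. The 6-hex-digit stem, the two-digit suffix, the `.dat` extension, and the `%$taken` hash (a stand-in for whatever is learned by reading the original names stored inside the files at the destination) are all assumptions for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# %$taken maps each short name already in use to the original name it holds.
sub digest_name {
    my ($name, $key, $taken) = @_;
    my $stem = substr(hmac_sha256_hex($name, $key), 0, 6);
    for (my $n = 0; ; $n++) {
        my $candidate = sprintf '%s%02d.dat', $stem, $n;   # hypothetical naming scheme
        my $owner = $taken->{$candidate};
        if (!defined $owner or $owner eq $name) {
            $taken->{$candidate} = $name;   # claim it (or confirm an earlier claim)
            return $candidate;
        }
    }
}
```

Stepping through the suffixes until the stored original name matches is the "proximity searching" mentioned in the cons.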


        -mk
Re: Encrypting a Filename
by marcink (Monk) on Jun 15, 2001 at 01:59 UTC
    How about using something like <shamelessplug>this</shamelessplug>? ;-) It's very simple, but seems to do exactly what you need -- (en|de)cryption (after simple modification -- that implementation squashes all letters to lowercase) using alphabet permutation. The nice thing is that the result has the same length as before encryption.

    -mk
      I saw that, and supposed at the time it was a Perl implementation of the Vigenère cypher, probably because your chart was formatted as a square. Looking again now, it seems to be just a simple 8-way Caesar cypher, which I guess is the same in principle with an 8-character password built-in rather than allowing the user to supply a password.

      It will be trivially cracked. You can suppose that files end in ".exe" etc., and very often have dot as the 4th from the end character. Since the encoding rule depends on the mod-8 position, various length filenames will allow you to probe the cypher and get a few matches. From there you can pick out bits of words, and continue the process.
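To make the attack concrete, here is a sketch against a hypothetical cipher of the kind described: a shift that depends only on position mod 8. It is not the linked implementation; the 27-symbol alphabet and the function names are assumptions. Guessing that each name ends in `.exe` pins down whichever shifts those last four positions fall on.

```perl
use strict;
use warnings;

my @ALPHA = ('a'..'z', '.');          # hypothetical 27-symbol alphabet
my %POS;
@POS{@ALPHA} = 0 .. $#ALPHA;

# Hypothetical cipher: shift by $shifts->[position mod 8]; $sign is +1 or -1.
sub shift_crypt {
    my ($text, $shifts, $sign) = @_;
    my $out = '';
    for my $i (0 .. length($text) - 1) {
        my $p = $POS{ substr($text, $i, 1) };
        $out .= $ALPHA[ ($p + $sign * $shifts->[$i % 8]) % @ALPHA ];
    }
    return $out;
}

# Assume every ciphertext ends in $crib and solve for the shifts it covers.
sub probe_shifts {
    my ($crib, @ciphertexts) = @_;
    my @found;
    for my $ct (@ciphertexts) {
        my $start = length($ct) - length($crib);
        for my $j (0 .. length($crib) - 1) {
            my $c = $POS{ substr($ct, $start + $j, 1) };
            my $p = $POS{ substr($crib, $j, 1) };
            $found[ ($start + $j) % 8 ] = ($c - $p) % @ALPHA;
        }
    }
    return @found;
}
```

With names of a few different lengths, the four known characters land on different positions mod 8, and together they can cover the whole key.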

      Hmm, I wonder if the old Vigenère or other substitution cypher could be used in a chaining mode...?

      —John

        Well, I _did_ write it was a simple solution, didn't I? ;-)

        I like tye's suggestion, although increasing the file name length might mean exceeding the filesystem's name length limit... Hmm... perhaps good ol' enigma?

        -mk
Re: Encrypting a Filename
by petdance (Parson) on Jun 15, 2001 at 08:09 UTC
    If you've got the file contents hidden successfully, and you just want to obscure the file name as it passes over the network, how about making a .tar file with an arbitrarily meaningless filename? Then, you don't have to decrypt the filename, but rather just untar it. The tar is just a meaningless container/shroud.

    xoxo,
    Andy

    %_=split/;/,".;;n;u;e;ot;t;her;c; ".   #   Andy Lester
    'Perl ;@; a;a;j;m;er;y;t;p;n;d;s;o;'.  #   http://petdance.com
    "hack";print map delete$_{$_},split//,q<   andy@petdance.com   >
    
      Because the backup process compares current files against the backup, and only copies newer files. Given changed file X, it needs to know to look at remote file Y to see if it's older.
        So take note of the timestamp on secretfile.dat, tar it up, and then set the timestamp on the tar to that timestamp.

        If your entire tree is done with this tar mechanism, then the backup process should work fine.
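The timestamp step can be sketched with Perl's built-in `stat` and `utime`. The function name and the idea of passing both paths in are assumptions for this example; creating the tar itself is left out.

```perl
use strict;
use warnings;

# Give $archive the modification time of $source, so a newer-than check
# against the archive behaves as it would against the original file.
sub copy_mtime {
    my ($source, $archive) = @_;
    my $mtime = (stat $source)[9];          # element 9 of stat is mtime
    defined $mtime or die "cannot stat $source: $!";
    utime $mtime, $mtime, $archive or die "cannot set time on $archive: $!";
    return $mtime;
}
```

If the destination is FAT, modification times are stored with 2-second resolution, so the comparison should allow for that much slack.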

        xoxo,
        Andy
