leocharre has asked for the wisdom of the Perl Monks concerning the following question:
i have a web interface file sharing application i'm developing
(the files are on a per user basis,
there can be 2 files that two different users can access,
that is, one file may be accessed by usera, but not by userb)
i keep a table for files- file information-
things like- filepath, inode, creation date, file description, etc.
Here's the kicker, I am using inode as unique row identifier- instead of an auto increment id.
Why?
- inode is already unique by the filesystem.
- people can rename the files via the filesystem- and i have a checkup cron for example, that can make sure the file names match the inodes- and update the db as needed, with new file names for existing entries in table
- a file can be queried for its inode, and that lets us know what record to look up in the table
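A minimal sketch, in Python for illustration, of the lookup described in the last point: ask the filesystem for a file's inode and use it as the key into the file table. The `records` dict is a hypothetical stand-in for that table, and `inode_of` and `find_record` are illustrative names only.

```python
import os

def inode_of(path):
    """Return the inode number the filesystem reports for path."""
    return os.stat(path).st_ino

def find_record(records, path):
    """Look up a file's row by inode.

    records is a hypothetical dict keyed by inode number, standing in
    for the database table described in the post.
    """
    return records.get(inode_of(path))
```

A rename within the same filesystem keeps the inode, so the lookup still finds the row under the new name.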
My questions:
- I understand we must never return sensitive data to the browser- i therefore should be returning a file id number or reference code, instead of an inode number, that sounds sensitive, right?
- users select files to download, etc from a list- therefore the browser client returns to the code- inodes... how sensitive is this, am i doing something really dangerous ?
background
the main purpose of this app is to let specific users download specific files. when a user requests a file via the browser by inode (instead of reference id, etc) - the code checks that a specific permission for that user to that one file exists - a user without a permission to file x would be turned down and kicked out, and the error logged
Re: letting a browser client select a file to download by inode
by Fletch (Bishop) on Dec 27, 2005 at 02:11 UTC
|
Erm, that just strikes me as a bad idea. It's the same reason you don't use a "product number" or similar external identifier as a primary key: it's subject to change outside the database that's just begging to get out of sync. What happens if you move to a different machine (or even just a different filesystem on the same box)? What happens if the drive crashes and things get reloaded from a backup? You've just set yourself up to write more code to deal with these contingencies (which means more development time, more testing (and are you really sure you've covered all the cases?)).
Just seems like you're being overly clever to "save" yourself or your DB from doing a trivial bit of work. If you want something tied to the file itself an MD5 or the like would be a better choice than the inum.
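A minimal sketch, in Python for illustration, of deriving an identifier from the file's contents rather than from its inode. The function name `file_digest` and its parameters are illustrative assumptions; only the standard `hashlib` module is used.

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file's contents, read in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

A content digest survives renames, moves, and restores from backup, but it changes whenever the file is edited, and two identical files share the same digest.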
|
Yes I am being clever.
Sometimes being clever and using bits and pieces of what the community has offered (ext 3 in this case) is what open source is all about- but then sometimes you *are* simply.. being clever for no good reason.
Isn't inode *the* way that ext3 and its db trusts to keep up with what files are in the system? If it's good enough for my computer, shouldn't it be good enough for everything else ? This is why I thought maybe this was indeed the right way to go; because inode is the way that the machine keeps track of the files. And I kind of trust it more than me.
The environment is one of file-sharing with specific people. People who know little about machines (windows users) will be creating these files to share with even less computer savvy people (more windows users) - on a per person basis.
The people creating the files have the power to rename the files through the filesystem! There has to be a way to keep track of the file. MD5 had some problems in 2004, some stuff about collisions .. dunno.. not my field.. but.. Is it still safe to check data on MD5 sum? - incredibly interesting suggestion!
|
Sure, it's good enough for the file system. Applications, however, use filenames because those won't change even if you have to delete and recreate files, reload files from backup, install a new hard drive, etc. You could always store the location and the inode, then use your inode link to update names when they change the filename on the system. Or you could give them a web based way to change the name so that you can keep track of it that way.
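A rough sketch, in Python for illustration, of storing both the location and the inode and using the inode to catch renames. The `records` dict (inode to stored path) is a hypothetical stand-in for the database table, and `reconcile_paths` is an illustrative name.

```python
import os

def reconcile_paths(root, records):
    """Update stored paths for files that were renamed under root.

    records is a hypothetical dict mapping inode number -> stored path.
    Returns a list of (inode, old_path, new_path) for every change made.
    """
    # Map every inode currently found under root to its present path.
    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            current[os.stat(full).st_ino] = full

    changes = []
    for inode, old_path in list(records.items()):
        new_path = current.get(inode)
        if new_path is not None and new_path != old_path:
            records[inode] = new_path
            changes.append((inode, old_path, new_path))
    return changes
```

This assumes everything under `root` lives on one filesystem and that no file has several hard links; with hard links, whichever path the walk visits last would win.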
|
no, MD5 is not the choice but any better digest function may be it.
I suggest SHA512 (well, you could also use SHA256 - there is also a SHA384, but it's essentially SHA512 computed with different initial values and with the output truncated to 384 bits)
<edit>typo fixed</edit>
|
Is it still safe to check data on MD5 sum?
For this purpose absolutely. The collision attack that was discovered against MD5 means that some very smart people have managed to create two different bits of data which produce the same MD5 hash. The creation of a modified file the MD5 of which matches that of an existing, "real-world" file is as yet only theoretically possible. And the chance of this happening by accident on your machine is less than that of your server, all backups and your pants spontaneously combusting :-).
A computer is a state machine. Threads are for people who can't program state machines. -- Alan Cox
Re: letting a browser client select a file to download by inode
by blazar (Canon) on Dec 27, 2005 at 15:29 UTC
|
Others already expressed their perplexities with this idea. Personally, I've never had to do anything like this - but I've been considering the problem of "serving" (in a loose sense) files without "exposing" them. Now, an option that occurred to me is to create a temporary symlink to the actual file with File::Temp, having a separate process removing it asynchronously after a suitable timeout. I'd like to hear more experienced programmers' opinion about this scheme...
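A minimal sketch of that scheme, in Python for illustration, assuming a Unix-like system. A random name from `secrets.token_hex` stands in for File::Temp here, and a purge function that a separate process could call periodically stands in for the asynchronous removal. `make_temp_link` and `purge_expired_links` are hypothetical names.

```python
import os
import secrets
import time

def make_temp_link(target, link_dir):
    """Create a randomly named symlink to target in link_dir; return its path."""
    link_path = os.path.join(link_dir, secrets.token_hex(16))
    os.symlink(target, link_path)  # raises FileExistsError on a name clash
    return link_path

def purge_expired_links(link_dir, max_age_seconds, now=None):
    """Remove symlinks in link_dir older than max_age_seconds.

    Only the links are removed; the files they point to are untouched.
    Returns the list of removed link paths.
    """
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(link_dir):
        path = os.path.join(link_dir, name)
        # lstat reports the link's own timestamp, not the target's.
        if os.path.islink(path) and now - os.lstat(path).st_mtime > max_age_seconds:
            os.unlink(path)
            removed.append(path)
    return removed
```

Anyone who learns the link name during the timeout window can fetch the file, so the name has to be hard to guess and the window short.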
|
blazar, i want to point out that this method only tells the server what file you want- that's all!
Your mention of a symlink is very interesting, it's a thought i had chewed on and i solved in another way thanks to the help of this posting:
at job help
What i had thought of.. was to create a temporary sym link to a file, an at job would delete the sym link in x time..
I was very lucky to get some incredibly useful thoughts on that link up there.. and ended up streaming the thing .. much better. (the original doc resides outside of http accessible realm).
Take a look at the streamer code
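The linked streamer code is not reproduced in this thread. As a rough sketch of the general idea only, in Python for illustration: the script reads a file that lives outside the web-accessible tree and hands it to the client in chunks. `stream_file` and `download_headers` are hypothetical names, and the header values are assumptions, not what the original code sends.

```python
import os

def stream_file(path, chunk_size=8192):
    """Yield a file's bytes in chunks so it is never held in memory whole."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk

def download_headers(path, download_name):
    """Build response headers for sending path as an attachment."""
    return [
        ("Content-Type", "application/octet-stream"),
        ("Content-Disposition", 'attachment; filename="%s"' % download_name),
        ("Content-Length", str(os.path.getsize(path))),
    ]
```

The client only ever sees `download_name`, never the real location of the file on disk.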
Re: letting a browser client select a file to download by inode
by esskar (Deacon) on Dec 27, 2005 at 02:10 UTC
|
why do you think that publishing an i-node will be dangerous?
|
I want to underscore the following: submitting an inode num to the server is not the only requirement to download a file
There is already a whole method of identifying the user by ip, auth, session time, etc etc- it does happen in ssl. The validity of your 'pageview' is checked with every action etc.. If one were to try to view or download a file they cannot, they are pooped out.
So, first an inode is submitted. Then the inode *must* be in my valid 'files' table, which does *not* record anything but "document" files like .doc, .pdf, etc.
Second, there is a "files to users" (normalisation) table that simply establishes a relationship between a user and a file. To download a file, you must have an entry in the files to users table. If not, the app freaks the hell out, ends your session, sends a notice for the admin to view the logs.
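A minimal sketch of those two checks, in Python with an in-memory SQLite database for illustration. The table and column names (`files`, `files_to_users`, `file_id`, `user_id`, `path`) are assumptions, not the actual schema; `file_id` could hold the inode or any other identifier.

```python
import sqlite3

def make_db():
    """Create the two hypothetical tables in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE files (
            file_id INTEGER PRIMARY KEY,
            path    TEXT NOT NULL
        );
        CREATE TABLE files_to_users (
            file_id INTEGER NOT NULL,
            user_id INTEGER NOT NULL,
            PRIMARY KEY (file_id, user_id)
        );
    """)
    return db

def path_if_allowed(db, user_id, file_id):
    """Return the file's path only if it is a known file AND the user
    has an entry for it in files_to_users; otherwise return None."""
    row = db.execute(
        "SELECT f.path FROM files f "
        "JOIN files_to_users fu ON fu.file_id = f.file_id "
        "WHERE f.file_id = ? AND fu.user_id = ?",
        (file_id, user_id),
    ).fetchone()
    return row[0] if row else None
```

A `None` result is where the app would end the session and log the attempt. Because the check is keyed on the user as well as the file, guessing another identifier gets a client nothing.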
Re: letting a browser client select a file to download by inode
by superfrink (Curate) on Dec 28, 2005 at 00:00 UTC
|
inode is already unique by the filesystem.
This is not true. Under many unix filesystems the inode (Index NODE) number is unique only within a single filesystem (partition), not across the whole machine.
This is important because multiple filesystems (each on its own partition) can be mounted by a unix system at the same time at different directories. This means your web server's /usr and /var can each have a file with inode number 4615 below its mount point.
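To make that concrete, a small sketch in Python for illustration: `os.stat` reports the device number alongside the inode number, and it is the pair that identifies a file on a running system. `file_key` is a hypothetical helper name.

```python
import os

def file_key(path):
    """Return (device, inode) for path.

    The inode number alone can repeat across mounted filesystems;
    the device number tells them apart.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)
```

Even this pair is only meaningful on the machine and mount layout where it was read; it does not survive a move to another filesystem or a restore from backup.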
Some filesystems do not use inodes, e.g. FAT and FAT-32. When Linux (used because I'm familiar with it) mounts a fat32 filesystem it assigns inode numbers as the directories and files are accessed.
This means if you unmount and remount the filesystem you are quite likely to get different inode numbers assigned to the same file. UPDATE: Just to be clear this refers to filesystems that do not use inodes.
Personally I would be tempted to use a database "sequence" since I would be storing account info, etc in a database anyway. In MySQL you can use an "auto increment" field.
If you are not using a database you can still keep track of a sequence number in a file but you have to be sure your scripts lock the file so you never reuse a sequence number.
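A minimal sketch of such a file-based sequence, in Python for illustration, assuming a Unix-like system where `fcntl.flock` is available. The function name `next_sequence` and the counter file layout (a single integer as text) are assumptions.

```python
import fcntl
import os

def next_sequence(counter_path):
    """Return the next sequence number, stored as text in counter_path.

    An exclusive lock is held across the read-increment-write, so two
    scripts running at once cannot hand out the same number.
    """
    fd = os.open(counter_path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until the lock is free
        text = fh.read().strip()
        value = int(text) + 1 if text else 1
        fh.seek(0)
        fh.truncate()
        fh.write(str(value))
        fh.flush()
        os.fsync(fh.fileno())
        # Closing the file at the end of the block releases the lock.
    return value
```

The lock is advisory: it only protects against other scripts that also take the lock before touching the file.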
|
Yes indeed. Mounting and umounting.. Great hell fun that was. A lot of the junk we serve will possibly be on .. guess what.. ntfs.. yeah.
I want to support this kind of activity (renaming via fs and keeping db reliable) - but i just see no way. with reg files, i use md5sums.. and that's a charm. But with directories ? oh boy..
if a mounted section of the data being served is umounted, then changes happen, dirs get renamed.. and it gets mounted again a day later.. i'm screwed!