ikegami has asked for the wisdom of the Perl Monks concerning the following question:

According to perlport's note on stat, the device and inode information returned by stat is not meaningful or reliable on some systems. Is there a way of knowing whether it is meaningful on the current system? Perhaps a Config variable contains this information?

Update: Background info:

I'm trying to determine if a given file has already been read and processed. If it has, I'll just use the data in memory instead of processing the file a second time. False negatives are acceptable. False positives are not.

sub canonize_file_name { ??? }

my $id = canonize_file_name($file_name);
if (not exists $forest{$id}) {
    $forest{$id} = Tree->new($file_name);
}
return $forest{$id};

I plan to use File::Spec's rel2abs and case_tolerant, but I thought I might use the device+inode on systems where it is supported. Is it possible to determine if this is such a system? Is there an existing module that does any of this?
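For illustration only, here is one way the plan above could be sketched. Treating a nonzero inode as a sign that device+inode are usable is purely an assumption of this sketch (perlport makes no such promise), and the key prefixes `inode:` and `path:` are made up for the example. The path fallback only folds case when File::Spec reports the system as case tolerant, so it leans toward false negatives.

```perl
use strict;
use warnings;
use File::Spec;

# Sketch of canonize_file_name: returns a key intended to be equal only
# for names that refer to the same file. Not a definitive implementation.
sub canonize_file_name {
    my ($file_name) = @_;

    # ASSUMPTION: a nonzero inode means stat's dev/ino are meaningful here.
    # This is a heuristic, not something perlport guarantees.
    my ($dev, $ino) = stat($file_name);
    if (defined $ino && $ino != 0) {
        return "inode:$dev:$ino";
    }

    # Fallback: absolute path with redundant "." and "/" segments removed.
    # canonpath is purely syntactic; it does not resolve symlinks or "..".
    my $path = File::Spec->canonpath(File::Spec->rel2abs($file_name));

    # Fold case only where File::Spec says names are case tolerant.
    $path = lc $path if File::Spec->case_tolerant;
    return "path:$path";
}
```

With this, "./file" and "file" produce the same key either through the inode branch or, failing that, through rel2abs plus canonpath.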

Update: Progress:

Cache::AgainstFile looks good. Still need to handle canonisation.

Re: Is device+inode meaningful?
by gaal (Parson) on Dec 30, 2006 at 08:12 UTC
    I recently heard that Ubuntu gives volumes UUIDs as names, presumably to make this sort of thing possible. But it's not something I would bet on: what happens when you use removable media, or for example when you take a disk out of one machine, dd it, and insert the copy into another?

      I would keep track of mtime and size to make sure the file hasn't changed since it was processed. I can accept a false positive in the case where the user has two files with identical paths, mtime and size on different media and he switches the media.
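A rough sketch of that mtime-and-size check might look like the following. The names `%seen` and `fetch_processed` are hypothetical, and `$loader` stands in for whatever actually processes the file (Tree->new in the question).

```perl
use strict;
use warnings;

my %seen;   # hypothetical cache: id => { mtime, size, data }

# Returns the cached data when the file still has the recorded mtime and
# size; otherwise calls $loader->($path) and records the fresh result.
sub fetch_processed {
    my ($id, $path, $loader) = @_;

    # stat returns an empty list on failure, which makes the assignment false.
    my @st = stat($path) or die "Cannot stat $path: $!";
    my ($size, $mtime) = @st[7, 9];

    my $entry = $seen{$id};
    if ($entry && $entry->{mtime} == $mtime && $entry->{size} == $size) {
        return $entry->{data};
    }

    my $data = $loader->($path);
    $seen{$id} = { mtime => $mtime, size => $size, data => $data };
    return $data;
}
```

A file rewritten within the same second with the same length would go unnoticed, which is the kind of false positive described above.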

      ( I'm not worried about removable media anyway. The problem is not identifying volumes, but rather determining that "./file" and "file" are the same file. )

        Cwd::abs_path could help you...
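As a minimal sketch of that suggestion: Cwd's abs_path resolves symlinks and relative segments, so comparing its results shows that "./file" and "file" name the same file. The helper name `same_resolved_path` is made up for the example, and hard links to one file would still compare as different.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);

# Hypothetical helper: true when both names resolve to the same path.
# If abs_path cannot resolve a name and returns undef, report "not the
# same" (a false negative) rather than guess.
sub same_resolved_path {
    my ($first, $second) = @_;
    my $resolved_first  = abs_path($first);
    my $resolved_second = abs_path($second);
    return 0 unless defined $resolved_first && defined $resolved_second;
    return $resolved_first eq $resolved_second ? 1 : 0;
}
```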
Re: Is device+inode meaningful?
by ysth (Canon) on Dec 31, 2006 at 03:35 UTC
    I don't think you'll find some flag that indicates whether they are reliable or not, but I had the impression that various pieces of GNU coreutils have conditional code to use them to tell when two files are in fact the same file; you might look to see under what conditions it trusts them.

    Update: there is nothing helpful in the coreutils source, sorry.

Re: Is device+inode meaningful?
by aufflick (Deacon) on Jan 02, 2007 at 01:29 UTC
    Cache::AgainstFile looks like an absolute monty. Very useful find.

    Good old Aunty :)