Joel has asked for the wisdom of the Perl Monks concerning the following question:

I just cloned a drive and wanted to verify that all of the files got copied. What a great way to teach myself Perl. I've searched the net and can't find an explanation for what I'm seeing. Can anyone help? Here is the code (reduced to the basic elements):
    use strict;
    use warnings;
    use File::Find;

    find(\&process, 'F:');

    sub process() {
        my $f = uc($File::Find::name);
        if ( -f $f) {
            my $c = $f;
            $c =~ s/^F:/C:/;
            if (-e $c) {
                if (-s $f != -s $c) {
                    print "unequal size: $f\n";
                }
            }
            else {
                print "missing in destination: $c\n";
            }
        }
    }
Here is some of the output (out of hundreds of errors):
    Can't opendir(F:Users/Joel/Documents/My Music): Invalid argument at f1.pm line 6
    Can't opendir(F:Users/Joel/Documents/My Pictures): Invalid argument at f1.pm line 6
    Can't opendir(F:Users/Joel/Documents/My Videos): Invalid argument at f1.pm line 6
Here are the results of a DIR command:

    C:\Users\Joel>dir
     Volume in drive C has no label.
     Volume Serial Number is DE8B-53F2

     Directory of C:\Users\Joel

    11/29/2010  01:51 PM    <DIR>          .
    11/29/2010  01:51 PM    <DIR>          ..
    02/07/2011  03:05 PM    <DIR>          Contacts
    02/11/2011  01:11 PM    <DIR>          Desktop
    02/07/2011  03:36 PM    <DIR>          Documents
    02/11/2011  01:11 PM    <DIR>          Downloads
    08/07/2010  02:16 AM    <DIR>          Favorites
    08/07/2010  02:16 AM    <DIR>          Links
    12/24/2010  07:21 PM    <DIR>          Music
    02/07/2011  10:28 AM    <DIR>          Pictures
    08/07/2010  02:16 AM    <DIR>          Saved Games
    08/07/2010  02:16 AM    <DIR>          Searches
    12/12/2009  05:08 PM                 0 Sti_Trace.log
    09/07/2010  06:58 AM    <DIR>          Videos
                   1 File(s)              0 bytes
                  13 Dir(s)  280,595,611,648 bytes free
Notice that there is no "My " anything. Why is File::Find acting like there is? Even stranger, I can CD into "My Documents", but then it doesn't exist (even though I'm already there).

    C:\Users\Joel>cd "my documents"

    C:\Users\Joel\My Documents>dir
     Volume in drive C has no label.
     Volume Serial Number is DE8B-53F2

     Directory of C:\Users\Joel\My Documents

    File Not Found

    C:\Users\Joel\My Documents>
Is there a way to get Find to ignore these bogus directories? And can anyone tell me what they are?

OK, I just figured out that they are JUNCTIONS. Now how can I get find to ignore them?

Replies are listed 'Best First'.
Re: File::Find giving unexpected results under windows
by BrowserUk (Patriarch) on Feb 15, 2011 at 00:27 UTC

    Get yourself a copy of SysInternals Junction.exe to see what is going on.

    Here's a typical user directory seen via dir and then junction:

    C:\Users\postgres>dir
     Volume in drive C has no label.
     Volume Serial Number is 8C78-4B42

     Directory of C:\Users\postgres

    03/09/2009  11:20    <DIR>          .
    03/09/2009  11:20    <DIR>          ..
    26/12/2010  07:31    <DIR>          Desktop
    24/03/2009  06:46    <DIR>          Documents
    02/11/2006  12:34    <DIR>          Downloads
    02/11/2006  12:34    <DIR>          Favorites
    02/11/2006  12:34    <DIR>          Links
    02/11/2006  12:34    <DIR>          Music
    02/11/2006  12:34    <DIR>          Pictures
    02/11/2006  12:34    <DIR>          Saved Games
    02/11/2006  12:34    <DIR>          Videos
                   0 File(s)              0 bytes
                  11 Dir(s)  300,875,481,088 bytes free

    C:\Users\postgres>junction *

    Junction v1.05 - Windows junction creator and reparse point viewer
    Copyright (C) 2000-2007 Mark Russinovich
    Systems Internals - http://www.sysinternals.com

    \\?\C:\Users\postgres\Application Data: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming
       Substitute Name: C:\Users\postgres\AppData\Roaming

    \\?\C:\Users\postgres\Cookies: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Cookies
       Substitute Name: C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Cookies

    \\?\C:\Users\postgres\Local Settings: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Local
       Substitute Name: C:\Users\postgres\AppData\Local

    \\?\C:\Users\postgres\My Documents: JUNCTION
       Print Name     : C:\Users\postgres\Documents
       Substitute Name: C:\Users\postgres\Documents

    \\?\C:\Users\postgres\NetHood: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Network Shortcuts
       Substitute Name: C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Network Shortcuts

    \\?\C:\Users\postgres\PrintHood: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Printer Shortcuts
       Substitute Name: C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Printer Shortcuts

    \\?\C:\Users\postgres\Recent: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Recent
       Substitute Name: C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Recent

    \\?\C:\Users\postgres\SendTo: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming\Microsoft\Windows\SendTo
       Substitute Name: C:\Users\postgres\AppData\Roaming\Microsoft\Windows\SendTo

    \\?\C:\Users\postgres\Start Menu: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Start Menu
       Substitute Name: C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Start Menu

    \\?\C:\Users\postgres\Templates: JUNCTION
       Print Name     : C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Templates
       Substitute Name: C:\Users\postgres\AppData\Roaming\Microsoft\Windows\Templates

    C:\Users\postgres>

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Is there any way to control how File::Find sees this?

        Sorry. I don't use File::Find, so I don't know for sure, but I doubt it.

Re: File::Find giving unexpected results under windows
by cdarke (Prior) on Feb 15, 2011 at 08:39 UTC
    How are you creating the Junction? Are you using the sysinternals tool? http://technet.microsoft.com/en-us/sysinternals/bb896768
    It seems to work for me. You mentioned "cloning" at first, what method are you using to create your clone?
    A slightly more tedious alternative to File::Find is opendir/readdir/closedir.

    Not related to your question, but be careful of using:
    sub process()
    The parentheses are a prototype and mean that no arguments should be passed. The prototype is ignored here for two reasons: the subroutine is defined after the call, and the call is made through a reference, which bypasses prototypes. If you check the File::Find documentation, the example "wanted" subroutine is not declared with a prototype.
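    To make the point about references concrete, here is a minimal sketch (the names $code and $count are illustrative only, not from the original script). It shows a sub with an empty prototype still receiving arguments when it is called through a reference, which is how File::Find calls the "wanted" routine:

```perl
use strict;
use warnings;

# An empty prototype declares that the sub takes no arguments.
sub process() { return scalar @_ }

# A call through a code reference is not checked against the prototype,
# so the arguments arrive in @_ as usual.
my $code  = \&process;
my $count = $code->('a', 'b');
print "received $count argument(s)\n";
```

    A direct call such as process('a', 'b') placed after the definition would instead be rejected at compile time, which is why the stray parentheses are worth dropping.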
      Thanks for the tip on the prototype. Perl is new to me; the parentheses got added out of habit.

      I'm not creating the junctions; Windows did. The way I found out what they were was a DIR in the correct directory. Someone pointed out in the chatterbox that I had done my initial DIR in the wrong directory, so here is the right one:

      C:\Users\Joel\Documents>dir /ad
       Volume in drive C has no label.
       Volume Serial Number is DE8B-53F2

       Directory of C:\Users\Joel\Documents

      02/07/2011  03:36 PM    <DIR>          .
      02/07/2011  03:36 PM    <DIR>          ..
      09/14/2010  05:56 AM    <DIR>          Archive
      12/24/2010  07:24 PM    <DIR>          Misc
      12/11/2009  06:42 AM    <JUNCTION>     My Music [C:\Users\Joel\Music]
      12/11/2009  06:42 AM    <JUNCTION>     My Pictures [C:\Users\Joel\Pictures]
      12/11/2009  06:42 AM    <JUNCTION>     My Videos [C:\Users\Joel\Videos]
      09/04/2010  05:52 AM    <DIR>          WORD
                     0 File(s)              0 bytes
                    25 Dir(s)  280,587,014,144 bytes free

      C:\Users\Joel\Documents>
      As for the clone, I had a WD drive going bad, got a new one under warranty (WD treated me right), and used their software, "Acronis True Image", to clone it. It copies sector by sector. I did have a number of unreadable sectors, but I am now booting off the new drive. It is difficult to know which files are on a particular sector, and the clone log didn't list enough info anyway, so I just wanted to see if there were any missing files before erasing the old drive and shipping it back.

      I just ran this code:

      my $x = 'c:/users/joel/documents';
      opendir(my $dh, $x) || die;
      while(readdir $dh) {
          if (-d "$x/$_") { print "dir $_\n";}
          if (-l "$x/$_") { print "lnk $_\n";}
      }
      closedir $dh;

      and get this:

      dir .
      dir ..
      dir Archive
      dir Misc
      dir My Music
      dir My Pictures
      dir My Videos
      dir WORD

      From the DIR listing above, the "My x" entries are junctions (or links?). Shouldn't they get caught by the -d/-l test?

        You ought to be able to use Win32::Symlink to disambiguate junctions, but it fails to build as is, and if I 'fix' it, it fails its own tests.
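        One possible way around Win32::Symlink, offered only as a sketch: junctions carry the Windows reparse-point attribute, so testing that bit and filtering such entries out in File::Find's preprocess hook should keep find from descending into them. The assumptions here are that Win32::File is installed and that its GetAttributes call returns the raw attribute bits; the constant 0x400 is FILE_ATTRIBUTE_REPARSE_POINT from the Windows API, and is_reparse_point and files_skipping_links are hypothetical helper names. On other systems the sketch falls back to the -l test.

```perl
use strict;
use warnings;
use File::Find;

# FILE_ATTRIBUTE_REPARSE_POINT from the Windows API; junctions and
# symbolic links both carry this attribute bit.
use constant REPARSE_POINT => 0x400;

# Hypothetical helper: true if $path is a junction or a symbolic link.
sub is_reparse_point {
    my ($path) = @_;
    if ($^O eq 'MSWin32' && eval { require Win32::File; 1 }) {
        my $attrs;
        # Assumption: GetAttributes fills $attrs with the raw attribute bits.
        Win32::File::GetAttributes($path, $attrs) or return 0;
        return ($attrs & REPARSE_POINT) ? 1 : 0;
    }
    return -l $path ? 1 : 0;    # elsewhere, fall back to the symlink test
}

# Hypothetical helper: list the plain files under $root without
# entering (or reporting) any reparse point.
sub files_skipping_links {
    my ($root) = @_;
    my @files;
    find({
        no_chdir   => 1,    # keeps "$File::Find::dir/$_" valid in preprocess
        # preprocess receives the entries of the directory about to be
        # scanned and returns the ones File::Find should keep.
        preprocess => sub {
            return grep { !is_reparse_point("$File::Find::dir/$_") } @_;
        },
        wanted => sub {
            push @files, $File::Find::name if -f $File::Find::name;
        },
    }, $root);
    return @files;
}
```

        Filtering in preprocess rather than in the wanted routine means the junction is dropped before File::Find tries to open it, which is where the opendir errors came from.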


Re: File::Find giving unexpected results under windows
by Jim (Curate) on Feb 14, 2011 at 23:34 UTC

    I just searched the File::Find documentation for the word "junction" and found nothing. Under the two functions that relate to symbolic links, the File::Find documentation states "This is a no-op on Win32." For these reasons, I suspect File::Find isn't programmed to handle Windows file systems correctly. (Windows 7 makes use of junction points in a way that previous versions of Windows did not.)

    I suspect you won't be able to use Perl's core File::Find module for your Windows file system work. It also doesn't handle Unicode file names correctly on Windows. (Perl doesn't, so File::Find can't.)

    I'm sorry the news isn't better.

      I've also been doing some searching, and a junction appears to be a symbolic link. I did find that by using no warnings 'File::Find'; I can suppress the messages (but I'm not really sure whether the links are being followed or not). Thanks.
Re: File::Find giving unexpected results under windows
by ikegami (Patriarch) on Feb 15, 2011 at 18:40 UTC

    Sounds like a bug in its handling of drive-prefixed relative paths. Keep in mind that "«F:Users/Joel/Documents/My Music»" means "«Users/Joel/Documents/My Music» relative to F:'s current directory".

    Telling File::Find not to chdir (which is not a bad idea anyway) and/or switching to find(\&process, 'F:\\'); (which is probably what you meant anyway) should fix it.
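    A minimal sketch of both suggestions combined, assuming the comparison logic from the original post: no_chdir and wanted are documented File::Find options, while compare_trees is a hypothetical wrapper name introduced here so the two roots can be passed in.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical wrapper: compare every file under $src with its
# counterpart under $dst and return a list of problem reports.
sub compare_trees {
    my ($src, $dst) = @_;
    my @problems;
    find({
        no_chdir => 1,    # never chdir; $_ holds the full path instead
        wanted   => sub {
            my $from = $File::Find::name;
            return unless -f $from;

            # Map the source path onto the destination tree.
            (my $to = $from) =~ s/^\Q$src\E/$dst/;

            if (!-e $to) {
                push @problems, "missing in destination: $to";
            }
            elsif (((-s $from) || 0) != ((-s $to) || 0)) {
                push @problems, "unequal size: $from";
            }
        },
    }, $src);
    return @problems;
}

# With rooted paths (drive letters taken from the original post):
# print "$_\n" for compare_trees('F:/', 'C:/');
```

    Because the wanted routine only ever sees full paths, it no longer depends on whichever directory happens to be current on either drive.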

      Hi,

      When I used File::Find to search a whole drive under Windows, my script sometimes died with, e.g.:

      Can't cd to ../../../../../../../../../../../../.. from C:/Dokumente und Einstellungen/User/Anwendungsdaten/Macromedia/Flash Player/#SharedObjects/3V6TWZWS/static.sf-cdn.com/MD5=02b6557eb0995f2859ece9db3a0ce8fb/default/swf/v4_0/platform/bin/com/snapfish/modules/controllers/upload/UploadController.swf at D:/usr/lib/File/Find.pm line 983, <STDIN> line 1.

      I've learned that this is due to the Windows MAX_PATH length of 260 chars and the chdir() implementation (which seems to append all the ../../../../../.... to the actual path).

      A simple solution shifts that problem almost out of the way (up to a max. path length of 258 chars in Windows; it does not interfere with other OSes).

      Insert in Find.pm at line 983:

      chdir ('..') or die "Can't cd to .. from $dir_name" while $tmp =~ s/^\.\.\///o;

      Then that code should look like this:

      $tmp = join('/',('..') x ($CdLvl-$Level));
      }
      chdir ('..') or die "Can't cd to .. from $dir_name" while $tmp =~ s/^\.\.\///o;
      die "Can't cd to $tmp from $dir_name" unless chdir ($tmp);
Re: File::Find giving unexpected results under windows
by Anonymous Monk on Feb 15, 2011 at 12:10 UTC
    One thing came to mind: find(\&process, 'F:') and find(\&process, 'F:/') give very different results (at least for me). Not sure, but I think it might relate to your problem.
      I get the same results either way. Are you running under Windows? find(\&process, 'F:/');
        Windows, yes.
        find( sub { print $File::Find::name, "\n"; }, 'C:');
        returns only files and directories in my home Documents dir, like
        C:My Music
        whereas with the slash I get all the contents of the C: drive, and the corresponding path looks like
        C:/Documents and Settings/micgra/My Documents/My Music
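        The difference between the two forms can be checked without touching the disk. As a small sketch, the core module File::Spec::Win32 (loadable on any platform) reports whether a path is absolute; the sample paths are the ones from this thread and the helper name describe is illustrative only.

```perl
use strict;
use warnings;
use File::Spec::Win32;

# Illustrative helper: classify a Windows-style path.
sub describe {
    my ($path) = @_;
    return File::Spec::Win32->file_name_is_absolute($path)
        ? 'absolute'
        : "relative to the drive's current directory";
}

for my $path ('C:', 'C:My Music', 'C:/', 'C:/Documents and Settings') {
    print "$path => ", describe($path), "\n";
}
```

        A bare drive letter such as 'C:' is classified as relative, which matches find starting in whatever directory is current on that drive rather than at its root.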