My cousin has a habit of naming files in the most peculiar way. He uses Unicode Arabic characters in the filenames, and his laptop runs an Arabic version of Windows 98.
The other day he came to me for a backup because his computer wouldn't boot. So I booted it with a Virtual Linux CD, and transferred his files over the network.
The Arabic filenames were a problem, so I tarred the files together. But when the time came to extract them, they were all named things like ????? ??? ????.doc, and since those are not allowed characters, the tar couldn't be extracted.
After digging around for a while, I hacked this script together to recover files with illegal filenames from a tar archive.
The script takes the tar archive filename as an argument and generates the directory structure and files in the current directory, replacing illegal names with generated ones (_recovered_folderN for folders, _recovered_fileN.ext for files).
This is still in hackish form; I just added the bits of comments before posting it.
    #!perl
    use strict;
    use warnings;
    use Archive::Tar;

    my $t = Archive::Tar->new();
    $t->read($ARGV[0]) or die "Must specify valid input file - $!\n";

    # this regex can be changed to fit the platform specific
    # illegal characters to watch for.
    my $BAD_CHARACTERS = '[?]';
    my %sig;    # original folder name or path => replacement
    my $i;      # counter shared by recovered folders and files

    # data() and the hash fields used below (typeflag, name, data) are
    # the Archive::Tar interface this was written against.
    foreach my $h ($t->data()) {
        # typeflag 5 is a folder.
        if ($h->{typeflag} eq '5') {
            my @folders = split '/', $h->{name};
            my @newf = ();
            foreach (@folders) {
                my $new;
                if (/$BAD_CHARACTERS/) {
                    if ($sig{$_}) {
                        # this happens if we step on a previously
                        # recovered folder.
                        $new = $sig{$_};
                    } else {
                        $new = "_recovered_folder" . ++$i;
                        $sig{$_} = $new;
                    }
                } else {
                    $new = $_;
                }
                push @newf, $new;
            }
            $sig{$h->{name}} = join '/', @newf;
            foreach (@newf) {
                mkdir $_;    # actually there would be a lot of errors here
                             # had it been possible,
                             # an mkdir -p $_ would have been better.
                chdir $_;
            }
            my $out = '../' x @newf;
            chdir $out or die "I must have overdone myself. $!\n";
        } else {
            my $name = $h->{name};
            my ($nm) = (split '/', $name)[-1];
            my $new = $nm;
            if ($nm =~ /$BAD_CHARACTERS/) {
                my ($ext) = $nm =~ /\.(.*)$/;    # pickup extension.
                $new = "_recovered_file" . ++$i;
                $new .= ".$ext" if defined $ext;
            }
            my $folder = $name;
            $folder =~ s/\Q$nm\E$//;    # strip only the trailing file name
            my @times;
            if (length $folder) {       # top-level files need no chdir
                $folder = $sig{$folder};
                defined $folder or die "no folder entry seen for '$name'\n";
                @times = split '/', $folder;
                chdir $folder or die "where did that '$folder' go? $!\n";
            }
            open my $fh, '>', $new or die "$!\n";
            binmode($fh);
            print {$fh} $h->{data};
            close $fh;
            if (@times) {
                my $myout = '../' x @times;
                chdir $myout or die "again? $!\n";
            }
        }
    }
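The comment in the script wishes for an mkdir -p. As a rough sketch only (it is not part of the original script), the renaming and the folder creation could be pulled apart like this, using make_path from the core File::Path module in place of the mkdir/chdir dance. The helper names safe_path and write_entry, the demo entry names, and the decision to key replacements on the full original path rather than on the bare folder name are all assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Same idea as the script's pattern: adjust for the platform at hand.
my $BAD_CHARACTERS = qr/[?]/;
my (%seen, $count);    # original path prefix => replacement, shared counter

# Hypothetical helper: return the archive path with every component that
# contains a bad character replaced by a generated name. A replacement is
# reused whenever the same original path prefix shows up again.
sub safe_path {
    my ($tar_name, $is_dir) = @_;
    my @parts = grep { length } split m{/}, $tar_name;
    my @safe;
    for my $idx (0 .. $#parts) {
        my $part = $parts[$idx];
        if ($part =~ $BAD_CHARACTERS) {
            my $key = join '/', @parts[0 .. $idx];
            unless (exists $seen{$key}) {
                if ($idx == $#parts && !$is_dir) {
                    # keep a trailing extension such as ".doc" if there is one
                    my ($ext) = $part =~ /(\.[^.?]+)$/;
                    $seen{$key} = '_recovered_file' . ++$count
                                . (defined $ext ? $ext : '');
                } else {
                    $seen{$key} = '_recovered_folder' . ++$count;
                }
            }
            $part = $seen{$key};
        }
        push @safe, $part;
    }
    return join '/', @safe;
}

# Hypothetical helper: create the parent folders (like mkdir -p) and write
# the data in binary mode, without ever changing the working directory.
sub write_entry {
    my ($root, $rel_path, $data) = @_;
    my $target = "$root/$rel_path";
    make_path(dirname($target));
    open my $out, '>', $target or die "Cannot write '$target': $!\n";
    binmode $out;
    print {$out} $data;
    close $out or die "Cannot close '$target': $!\n";
    return $target;
}

# Demo against a throwaway directory, with made-up entry names.
my $root    = tempdir(CLEANUP => 1);
my $dir     = safe_path('docs/???? ???/', 1);
my $file    = safe_path('docs/???? ???/?????.doc', 0);
my $written = write_entry($root, $file, "dummy contents");
print "$dir\n";     # docs/_recovered_folder1
print "$file\n";    # docs/_recovered_folder1/_recovered_file2.doc
```

Because the target path is built up front, a file whose folder never appeared as its own archive entry still gets its parent folders created, and nothing depends on climbing back out with the right number of ../ steps.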