Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

File Name Pattern

by LostS (Friar)
on Dec 13, 2002 at 16:37 UTC ( [id://219630]=perlquestion: print w/replies, xml ) Need Help??

LostS has asked for the wisdom of the Perl Monks concerning the following question:

Today's Issues is:

OK I have a directory full of files. They should all be *.jpg's and be in number format. Here is what I need to do.

Look at file name. If file name is not 6 digits (exp: 123456.jpg) add 0 to the front until it is (exp: so 1234.jpg becomes 001234.jpg). Then I need to get those first 3 digits. OK then I need to create a directory by those first 3 digits if it does not already exist. Then move the file with it's new name to that directory. It needs to do this with every file in this directory (approx 42,000 images).

OK I guess what I need help with is trying to figure out the regex junk (never been my strong point). I read in the directory how do I do the the match to see if the file is 3 or 4 or 5 or 6 digits long and how would I add the characters to the front of the files and get those first 3 digits??


-----------------------
Billy S.
Slinar Hardtail - Hand of Dane
Datal Ephialtes - Guildless
RallosZek.Net Admin/WebMaster

perl -e '$cat = "cat"; if ($cat =~ /\143\x61\x74/) { print "Its a cat! +\n"; } else { print "Thats a dog\n"; } print "\n";'

Replies are listed 'Best First'.
(jeffa) Re: File Name Pattern
by jeffa (Bishop) on Dec 13, 2002 at 16:44 UTC
    "how do I do the the match to see if the file is 3 or 4 or 5 or 6 digits long and how would I add the characters to the front of the files and get those first 3 digits??"

    use printf or sprintf:

    printf("%06d\n",$_) for (123, 1234, 12345, 123456); my $val = sprintf("%06d",123);
    To get the first three digits:
    my $name = '012345'; my ($three) = $name =~ /^(\d\d\d)/;
    You might want to look at File::MMagic if you have to ensure that the files really are jpg's.

    UPDATE:
    more code for you :)

    use strict; while (<DATA>) { chomp; my ($numbs) = $_ =~ /^(\d+)/; my $formatted = sprintf("%06d",$numbs) . '.jpg'; my ($dir) = $formatted =~ /^(\d\d\d)/; print "filename: $formatted\t\tdirname: $dir\n"; } __DATA__ 123.jpg 1234.jpg 12345.jpg 123456.jpg

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
      You might also want to make sure that all the filenames are in the format you think they are:
      warn "Invalid file $file" unless $file =~ /^\d{3,6}\.jpe?g$/; This will complain unless the filename consists of 3-6 digits, followed by a period and the suffix "jpg" or "jpeg". You might want to make a pass before you start moving files, just to make sure your directories contain what you think they contain.

Re: File Name Pattern
by MarkM (Curate) on Dec 13, 2002 at 17:07 UTC

    Requirement 1:

    Find filenames within a directory that look like '\A\d+\.jpg\z'. (I prefer \A and \z over ^ and $ as \A and \z match 'beginning of string' and 'end of string' as opposed to 'beginning of line' and 'end of line')

    my(@filenames, $directory); opendir($directory, $directory_path) or die "opendir of $directory_path failed: $!\n"; @filenames = grep /\A\d+\.jpg\z/, readdir($directory); closedir($directory);

    Requirement 2:

    Extend filenames such that they follow the form '\A\d{6,}\z'.

    $filename = sprintf("%06d.jpg", $1) if $filename =~ /\A(\d+)\.jpg\z/;

    Requirement 3:

    Creating directory (if it does not already exist) and moving files into the directory.

    use File::Spec; -d $subdir_path or mkdir($subdir_path, 0777) or die "mkdir of $subdir_path failed: $!\n"; rename($filename, File::Spec->catfile($subdir_path, $filename)) or die "rename of $filename failed: $!\n";

    Requirement 4:

    Matching the first three (or more?) digits in the string.

    $filename =~ /\A(\d*)\d{3}\.jpg\z/; # result stored in $1

    Final Solution:

    Putting this all together, we get:

    use File::Spec; for my $directory_path (@ARGV) { my $directory; my @filenames; # Determine @filenames to work with in $directory_path. opendir($directory, $directory_path) or die "opendir of $directory_path failed: $!\n"; @filenames = grep /\A\d+\.jpg\z/, readdir($directory); closedir($directory); for (@filenames) { my($basename, $prefix) = /\A((.+).{3})\.jpg\z/s or next; my $subdir_path = File::Spec->catdir($directory_path, $prefix) +; # Create the subdir if it does not yet exist. -d $subdir_path or mkdir($subdir_path, 0777) or die "mkdir of $subdir_path failed: $!\n"; my $old_path = File::Spec->catfile($directory_path, $_); my $new_path = File::Spec->catfile($subdir_path, sprintf("%06d +.jpg", $basename)); # Move the file to the subdir. rename($old_path, $new_path) or die "rename of $old_path to $new_path failed: $!\n"; } }

    Edit to by tye to remove PRE tags around CODE tags

Re: File Name Pattern
by lemming (Priest) on Dec 13, 2002 at 16:56 UTC

    You'll probably want to use something like grep /\.jpg/, readdir(DIR) to get your list of files. You might even want to sort them, as you go through, split off the back extension, and check the prefix.

    You'll want to check to make sure your prefix is just made of digits. If not, figure out what you want to do with it. If so, sprintf("%06d", $var) for your new prefix and look into substr for getting the first three chars of that.

    Then go ahead and make your sub-directories as needed, and make sure you already don't have the numbered file before you rename it. Since your original dir may have a 001234.jpg and 01234.jpg in it.

Re: File Name Pattern
by jdporter (Paladin) on Dec 13, 2002 at 16:57 UTC
    Here's one way.
    opendir D, "."; my @old_files = grep { /^\d+\.jpg$/ } # keep only those of interest readdir D; closedir D; my %dirs; my %renames; @renames{ @old_files } = # insert slash; remember dir: map { s#(...)#$1/#; $dirs{$1}++; $_ } # pad with zeros: map { s/(\d+)/ sprintf "%06d", $1 /e; $_ } @old_files; for my $dir ( keys %dirs ) { mkdir $dir; -d $dir or die "Error: failed to assert director $dir"; } while ( my( $old, $new ) = each %renames ) { rename $old, $new or warn "Failed to rename $old as $new: $!\n"; }

    jdporter
    ...porque es dificil estar guapo y blanco.

Re: File Name Pattern
by pg (Canon) on Dec 13, 2002 at 17:05 UTC
    try those two:
    s/(\d*)/sprintf("%06d",$1)/e; m/^(\d{3})/
Re: File Name Pattern
by CountZero (Bishop) on Dec 13, 2002 at 17:01 UTC

    To get the first three digits, you could also use the substr-function (substring). It is probably faster than using a regex.</p

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: File Name Pattern
by dingus (Friar) on Dec 14, 2002 at 17:27 UTC
    use File::Dosglob; $dir = '/path/to/dir'; my @m = File::DosGlob::doglob(1,$dir.'/*'); # trailing /* is vital! for (@m) { next unless m!/(\d*)(\d{3}\.jpg)$!; # skip files which do not h +ave at least 3 nos and .jpg my $sub = ('0'x(3-length($1))).$1; # append 0s if req'd my $rest= $2; mkdir ($dir.$sub) unless -e ($dir.$sub); # create dir if req'd rename $_, $dir.$sub.'/'.$sub.$rest; # move file using rename }
    *untested code

    Dingus


    Enter any 47-digit prime number to continue.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://219630]
Approved by lemming
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (3)
As of 2024-04-20 00:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found