rookie_monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all Monks, I need help with a program that I am writing. The logic is not robust enuff for what I need it to do in order to cover unexpected file names and user input. Below is the code and the comments should help explain the issues that I could possible have. How would you change the program to make it more robust? As you can see my programming skill isnt very good but Im trying. Thanks in advance everyone.
$filepath = "C:\\Perl scripts\\test"; # $GetName is a filename that they want to use as a template # to change all the filename that are similar within a # directory(user input, they will add the parentheses also). # The parentheses () are used for a parameter # they want to keep for adding to the final file name $GetName = "test_0(02.txt)"; # $StartCh is the 1st number they want to start with, so # for this example $StartCh = 1; $EndtCh = 12; # $SetName is the final filename so for example # the 1st file lets say is test_001.txt. # Based on the $GetName and the SetName the final filename # should be changed to "Chapter 01.txt". $SetName = "Chapter \1"; opendir (DIR, $filepath) || die("Cannot open directory"); while ($file = readdir DIR) { print "$file\n"; # Matches for anything before parentheses if ($GetName =~ m/(.*?)\(/) { $subname = $1; if ($file =~ /^$subname/) { for ($i = $StartCh; $i <= $EndtCh ; $i++) { # This reg ex could possible match alot more than I wa +nt. # It would give me the wrong result, for example # if the filename is "test_005_1.txt", it would pick # it up even though thats not what I want if ($file =~ /$i/) { # This is where I am going to have problems # especially if they do something like this # $GetName = "test_(02-092010.doc)" the # number of digits would be totally off my(@origdig) = ($GetName =~ m/\d/g); my $orig_cnt = scalar(@origdig); if ($StartCh => 1 and $EndtCh =< 9 and $orig_cnt=1 +) { $num1 = $origdig[0]; $replace = $subname . $i; } if ($StartCh => 1 and $EndtCh =< 9 and $orig_cnt=2 +) { $num1 = "0" . $origdig[0]; $replace = $subname . $i; } # In case they may include the extra 0 # example $GetName = "test_(002.txt)". if ($StartCh => 1 and $EndtCh =< 9 and $orig_cnt=3 +) { $num1 = "00" . $origdig[0]; $replace = $subname . $i; } # More If stmts to do the same for >= 10 # How would I go about writing this to be more # efficient and be cleaner? } } } else { print "$file does not match\n"; } } else { print "GetName didnt have parentheses\n"; } } closedir DIR; Example of file names in the directory in this case: test_001.txt test_002.txt ... ... test_012.txt

Replies are listed 'Best First'.
Re: Help with my rookie logic
by graff (Chancellor) on Sep 17, 2010 at 01:12 UTC
    JavaFan has nailed the essential problems (as he usually does), but I'll add a few more (which are admittedly less important):

    There's nothing inside your "while" loop that changes the value of $GetName, yet every iteration of your loop checks whether the value of $GetName contains an open paren. You could just check for this once, before going into the loop.

    The inner "for" loop checks whether the numbers "1" through "12" occur in the current file name, but the way you're doing it, a file name containing "21" will match on two iterations (because it contains "1" and "2") and a name containing "1123" will match five times ("1", "2", "3", "11" and"12").

    There seems to be a dependency between the value of $GetName and the values of $StartCh and $EndCh, but I'm not sure how these are related, exactly. If you can clear that up, things might make more sense.

    If you are dealing with file names that use fixed-width digit strings with leading zeros, you might want to use a regex that checks for a specific number of digits between non-digit characters -- for example, if you're looking for file names in the range "test_0001.txt" through "test_0012.txt":

    my $digit_width = 4; my $min = 1; my $max = 12; my $prefix = "test_"; my $ext = ".txt"; while ( my $file = readdir DIR ) { next unless ( $file =~ /^ $prefix (\d{$digit_width}) $ext $/x ); if ( $1 >= $min and $1 <= $max ) { # we have a match... now what? } }
      Yes there is a dependency on the $GetName value because that is what the user is going to input for the program to use as a search criteria when it scans the directory for files. The $StarCh and $EndCh are basically chapter numbers. starting and ending chapters. Each of these files that will be in the directory The files will have the chapter number within the file name somewhere. For example "Test_001.txt" will be chapter 1 file but depending on who created the files, someone could have a totally different naming convention such as "Ch 1 title 3rd edition". Something like this will throw off my whole program and thats where I need the program to be more robust. I need code to handle many possibilities. Now thinking about it, maybe there isnt a way for to actually code all possibilties. Thanks for your input and some of your suggestions does help me for what its worth. Thanks!
      Using your code: my $digit_width = 4; # This isnt necessarily true every time my $min = 1; my $max = 12; my $prefix = "test_"; my $ext = ".txt"; while ( my $file = readdir DIR ) { next unless ( $file =~ /^ $prefix (\d{$digit_width}) $ext $/x ); if ( $1 >= $min and $1 <= $max ) { # we have a match... now now we want to create the new file na +me # Whatever $SetName was so if $SetName = "Chapter \1" which me +ans # The new file name for the program to modify in the directory + is # to "Chapter 01.txt" for the 1st chapter file which wouldve b +een # Test_001.txt in the directory it searched in. I will then # have the code rename the file. } }
Re: Help with my rookie logic
by toolic (Bishop) on Sep 17, 2010 at 00:49 UTC
    If I download your code and try to run it, I get a compile error:
    Unterminated <> operator

    The following:

    ... and $orig_cnt=1)

    is an assignment. It should probably be a numeric check (==):

    ... and $orig_cnt==1)

    I also see that in two other places in your code.

    Other advice:

Re: Help with my rookie logic
by JavaFan (Canon) on Sep 16, 2010 at 22:27 UTC
    To make programs robust, you need to have a clear understanding of what the program needs to do. Aka, you need a spec. I couldn't deduce a notion of what the program needs to do from the rambling in the comments, so no pointers of me on how to make it more robust.

    I did notice however a curious string: "Chapter \1". It made me wonder what you want to do with the ctrl-A, aka "SOH" character. But then I noticed it's assigned to a variable that isn't further used.

      Yes the string "Chapter \1" is a string I would get from an textbox which the user inputs this text, the backslash is suppose to be a real backslash so maybe the string probably should look like this: "Chapter \\1" perhaps.
Re: Help with my rookie logic
by apl (Monsignor) on Sep 17, 2010 at 12:28 UTC
    opendir (DIR, $filepath) || die("Cannot open directory");
    should be modified to
    opendir (DIR, $filepath) || die("Cannot open directory '$filepath': $!");

    Similarly,
    print "$file does not match\n";
    could be modified to
    print "'$file' does not match '$subname'\n";