kennedymr2 has asked for the wisdom of the Perl Monks concerning the following question:

I have a string that is used to create an output file name. I have to ensure there are no characters in the string that will cause an error when the file is opened for output, e.g. file!~$?/!@ .txt would cause an error. Is there a subroutine which 1. checks for bad characters 2. removes bad characters? Appreciate any ideas. Regards, kennedymr2

Replies are listed 'Best First'.
Re: Removing Bad Characters from a String
by jhourcle (Prior) on May 09, 2005 at 01:15 UTC

    The concept of 'bad' characters should be avoided.

    If you have reason to believe that there are characters that may cause problems, you will do better to make sure that you only keep characters that you know to be good. This keeps you from accidentally letting through characters that you hadn't thought of.

    It doesn't need to be a subroutine (as it's a whole one line), but you can do it so you're consistent between various points in the program (or cross program, depending on how you handle it.)

    There are normally three ways to handle this (removal, replacement, and reversible encoding):

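    # Drop every character that is not known to be good.
    sub remove {
        my $string = shift;
        $string =~ s/[^a-zA-Z0-9.\-_]//g;
        return $string;
    }

    # Replace every character that is not known to be good with '_'.
    sub replace {
        my $string = shift;
        $string =~ s/[^a-zA-Z0-9.\-_]/_/g;
        return $string;
    }

    # Encode every other character as '=' plus two hex digits.
    # The zero padding (%02x) keeps the result unambiguous to decode.
    sub reversable {
        my $string = shift;
        $string =~ s/([^a-zA-Z0-9.\-_])/sprintf('=%02x',unpack('C',$1))/eg;
        return $string;
    }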
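For the reversible variant, a decoder might look like the sketch below. The names encode_name and decode_name are made up for illustration, and the sketch assumes the string holds single-byte characters and that each escape is padded to exactly two hex digits, which is what makes decoding unambiguous.

```perl
use strict;
use warnings;

# Hypothetical helper: escape everything outside the known-good set
# as '=' followed by exactly two lowercase hex digits.
sub encode_name {
    my ($string) = @_;
    $string =~ s/([^a-zA-Z0-9.\-_])/sprintf('=%02x', unpack('C', $1))/eg;
    return $string;
}

# Hypothetical helper: turn each '=HH' escape back into the original byte.
# '=' itself is never in the known-good set, so every '=' starts an escape.
sub decode_name {
    my ($string) = @_;
    $string =~ s/=([0-9a-fA-F]{2})/chr(hex($1))/eg;
    return $string;
}

my $original = 'file!~$?/!@ .txt';
my $encoded  = encode_name($original);
my $decoded  = decode_name($encoded);

print "$encoded\n";   # file=21=7e=24=3f=2f=21=40=20.txt
print "$decoded\n";   # file!~$?/!@ .txt
```

Because the encoded name only contains characters from the known-good set plus '=', it should be safe to use as a file name while the original string stays recoverable.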
      I appreciate the help offered by all the members of this forum. This is the first time I have used this forum, and I am very impressed with the response to my question. The server I am using is Unix. I have done some more testing, and the main characters I seem to have a problem with are \ and /, which seem to indicate a change of directory in the output file name. I will use the code offered by jhourcle to get around the problem. Also, thanks to all the other members who have given advice. Once again, I really appreciate the help. Regards, kennedymr2
Re: Removing Bad Characters from a String
by moot (Chaplain) on May 08, 2005 at 23:15 UTC
    $bad_chars = qr{[!~?/@]};
    # remove any potential bad characters
    $filename =~ s/$bad_chars//g;
    You might want to read perldoc perlre for more.
Re: Removing Bad Characters from a String
by merlyn (Sage) on May 08, 2005 at 23:35 UTC
    In Unix, there are no "bad characters", except that NUL is interpreted as end-of-string, and "/" is reserved for a directory separator. What precisely are you asking about?

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      Unless this is a Windows machine, in which case there are more 'bad characters'.

      From a Windows error message:

      A filename cannot contain any of the following characters:
      / : * ? " < > |

        That's the list for NTFS; in FAT, more characters are banned. Off the top of my head, I think the additional ones are \x20 + = [ ] %

        I think that control characters (\x00-\x1f but not \x7f IIRC) and backslash (\\) are also banned in filenames, despite that error message not mentioning them.
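As a rough sketch of the blacklist approach for Windows, the sub below strips the characters from that error message plus the backslash and the control characters mentioned above. The name strip_windows_unsafe is invented for this example, and the character list is only the one given in this thread, not a complete statement of the file system's rules.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the characters this thread lists as
# forbidden in Windows file names: / \ : * ? " < > | and \x00-\x1f.
sub strip_windows_unsafe {
    my ($name) = @_;
    $name =~ s{[/\\:*?"<>|\x00-\x1f]}{}g;
    return $name;
}

my $cleaned = strip_windows_unsafe('re:port*2005?.txt');
print "$cleaned\n";   # report2005.txt
```

A whitelist like the one jhourcle posted is still the safer default, since a list of bad characters can miss cases nobody thought of.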

Re: Removing Bad Characters from a String
by ambrus (Abbot) on May 09, 2005 at 06:21 UTC

    Why don't you just try creating the file, and see if it succeeds or returns an error? See what error it gives if you create a file with a wrong filename (EINVAL maybe?) and check for that error.
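A minimal sketch of that idea follows. The sub name try_create is hypothetical, and the example uses a temporary directory from File::Temp so it does not touch real files. Note that a successful test leaves an empty file behind, and the exact error text in $! depends on the platform.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: try to open the file for output and report
# whether it worked, along with the system error message if it did not.
sub try_create {
    my ($path) = @_;
    if (open my $fh, '>', $path) {
        close $fh;
        return (1, '');
    }
    return (0, "$!");
}

# Scratch directory that is removed when the program exits.
my $dir = tempdir(CLEANUP => 1);

# A plain name inside an existing directory should succeed.
my ($ok, $err) = try_create("$dir/report.txt");

# A name containing '/' refers to a subdirectory that does not exist here.
my ($ok2, $err2) = try_create("$dir/missing/report.txt");

print $ok  ? "created\n" : "failed: $err\n";
print $ok2 ? "created\n" : "failed: $err2\n";
```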