katgirl has asked for the wisdom of the Perl Monks concerning the following question:

I have a script on my site which lets people upload files into an image gallery. Unfortunately, some people are uploading files with names like "me~swimming.jpg" and "J.R,ewing.jpg".

I tried to sort this by using this snippet:

$imagename =~ s/\W/\_/g;
But this gives me "me_swimming_jpg" and "J_R_ewing_jpg"

What's the best way to take out the non-alphanumeric symbols from the file name, but leave the .jpg extension intact?

Replies are listed 'Best First'.
Re: Changing file names as they are uploaded
by fireartist (Chaplain) on Sep 19, 2002 at 12:00 UTC
    This doesn't have the elegance of blakem's solution above, but it does let you define which extensions you will allow.
    # strip a trailing .jpg, .gif or .png and remember it
    $ext = $1 if ($imagename =~ s/(\.(?:jpg|gif|png))$//);
    $imagename =~ s/\W/_/g;
    $imagename .= $ext if ($ext);
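The "define which extensions you will allow" idea can be sketched as a small helper. This is only an illustration: the name clean_upload_name and the contents of @allowed are assumptions, not something from the thread, and the list should be adjusted to whatever the gallery actually accepts.

```perl
use strict;
use warnings;

# Hypothetical whitelist of extensions the gallery accepts.
my @allowed = qw(jpg jpeg gif png);
my $ext_re  = join '|', map { quotemeta } @allowed;

# Hypothetical helper: sanitise the name but keep a whitelisted extension.
sub clean_upload_name {
    my ($name) = @_;
    my $ext = '';
    # Remove a whitelisted extension from the end (case-insensitive), if present.
    if ($name =~ s/(\.(?:$ext_re))\z//i) {
        $ext = $1;
    }
    # Everything that is left has its non-word characters replaced.
    $name =~ s/\W/_/g;
    return $name . $ext;
}

print clean_upload_name($_), "\n"
    for 'me~swimming.jpg', 'J.R,ewing.jpg', 'notes.txt';
```

A name whose extension is not on the list, such as "notes.txt", has its dot replaced like any other non-word character.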
Re: Changing file names as they are uploaded
by Joost (Canon) on Sep 19, 2002 at 12:09 UTC
    How about:

    my $ext = "";
    if ($imagename =~ s/(\.jpe?g|\.gif|\.png)\z//i) {   # updated
        $ext = $1;
    }
    $imagename =~ s/\W/_/g;          # updated
    $imagename = $imagename.$ext;    # updated
    You do need to specify valid suffixes, but as you are only interested in images, it shouldn't be too much of a problem.

    Update: Fixed some bugs pointed out by blakem. That'll teach me to test my code :-)

    -- Joost downtime n. The period during which a system is error-free and immune from user input.
      tr/\W/_/
      tr doesn't work that way... You'll have to stick with s/// if you want to use character classes. Also your logic will add a '.' on to the end of a file w/o an extension. i.e. "xyzfile" => "xyzfile."

      -Blake
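To illustrate the point about tr: it takes literal character lists rather than regex classes, so a tr version has to spell out the word characters and complement the list with /c. A minimal sketch, assuming plain ASCII names (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $name = 'J.R,ewing';

# tr/// works on literal character lists, so list the word characters
# explicitly and use /c to translate everything NOT in that list.
(my $with_tr = $name) =~ tr/a-zA-Z0-9_/_/c;

# The s/// form used elsewhere in the thread, for comparison.
(my $with_s = $name) =~ s/\W/_/g;

print "$with_tr\n";
print "$with_s\n";
```

For ASCII input both lines print the same result; s/// is still the simpler choice when a character class is wanted.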

Re: Changing file names as they are uploaded
by BrowserUk (Patriarch) on Sep 19, 2002 at 15:38 UTC

    How about a one-liner...

    $imagename =~ s/^(.*?)(\.\w*)$/my $temp = $2; ($_ = $1) =~ s!\W!_!g; $_ .= $temp/e;

    It deals with your examples plus the most pathological tests I could think of.

    #! perl -sw
    use strict;

    my @filespecs = map {
        $_ =~ s/^(.*?)(\.\w*)$/my $temp = $2; ($_ = $1) =~ s!\W!_!g; $_ .= $temp/e;
        $_
    } <DATA>;

    print $_, $/ for @filespecs;

    __DATA__
    me~swimming.jpg
    J.R,ewing.jpg
    A¬!£$%^&*()-_+={}[]'@;:~\#\"<>,./|\?.jpg
    pic.

    Output

    C:\test>199139
    me_swimming.jpg
    J_R_ewing.jpg
    A___________________________________.jpg
    pic.
    C:\test>

    Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
Re: Changing file names as they are uploaded
by davorg (Chancellor) on Sep 19, 2002 at 11:38 UTC

    Invert the character class.

    $imagename =~ s/[^\w\.]/_/g;
    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

      But this leaves all periods in the filename, whereas I think katgirl wanted only the 'extension' period left alone.
      What's the best way to take out the non-alphanumeric symbols from the file name, but leave the .jpg extension intact?
      Wouldn't that still leave me with "J.R_ewing.jpg"?
        How about:
        my $rindex = rindex($imagename,'.');
        $rindex = length($imagename) if $rindex == -1;
        substr($imagename,0,$rindex) =~ s/\W/_/g;
        It will exclude the extension from the substitution.

        Update: Fixed bug for filenames w/o extensions that end in a nonword char.

        Update2: Thought I'd golf this one a bit:

        substr($file,0,rindex($file,'.')%length"a$file") =~ s/\W/_/g;
        works correctly for $files such as "me~swimming.jpg", "J.R,ewing.jpg" and "abc;;;"

        -Blake
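The golfed version relies on two details: rindex returns -1 when there is no dot, and Perl's % takes the sign of its right operand, so -1 % (length + 1) equals the length of the string and the substr then covers the whole name. A small sketch of that behaviour, wrapped in a helper whose name (clean_base) is made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the golfed expression.
sub clean_base {
    my ($file) = @_;
    # With a dot: the index of the last dot (always less than length + 1).
    # Without a dot: -1 % (length + 1) == length, i.e. the whole string.
    my $end = rindex($file, '.') % length("a$file");
    # substr is an lvalue, so s/// edits only that part of $file in place.
    substr($file, 0, $end) =~ s/\W/_/g;
    return $file;
}

print clean_base($_), "\n"
    for 'me~swimming.jpg', 'J.R,ewing.jpg', 'abc;;;';
```

The "a" prepended inside length also keeps the right operand of % non-zero when the name is an empty string.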

Re: Changing file names as they are uploaded
by Washizu (Scribe) on Sep 19, 2002 at 14:11 UTC
    I love regex problems:
    if ($filename =~ /(.*)\.(.*)?/ig) {
        my $fname = $1;
        my $ext   = $2;
        $fname =~ s/\W/_/g;
        $filename = $fname . '.' . $ext;
    } else {
        $filename =~ s/\W/_/g;
    }
    The only case I could find where it breaks is when you have a filename containing a dot (.) without an extension, but I don't know if that's a real case or not.

    -----------------------------------
    Washizu
    Acoustic Rock

Re: Changing file names as they are uploaded
by helgi (Hermit) on Sep 19, 2002 at 14:00 UTC
    Here's one way:
    # single quotes, so that $& and $= in the last name are not interpolated
    my @files = ('me~swimming.jpg', 'J.R,ewing.jpg', '#&%$&$=Ö-.GIF');
    foreach my $file (@files) {
        my @name_parts = split /\W{1,}/, $file;
        my $extension  = pop @name_parts;
        my $newname    = (join '_', @name_parts) . "\.$extension";
        print "$newname\n";
    }
    Regards,
    Helgi Briem