newatperl has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to split the string:
C:\Documents and Settings\Administrator\Desktop\china.txt
(this string is generated when I upload a file using the upload file form element part of CGI.pm) so that I can grab the file extension of the file. I'm trying to do this by
my $pubupload = $query->param('pubupload');
my ($junk, $extension) = split / ./, $pubupload;
but this puts C:\Documents into $junk and nd into $extension. Can anyone help me out with this? Thanks.

Replies are listed 'Best First'.
Re: using split on a string
by MeowChow (Vicar) on May 14, 2001 at 07:30 UTC
    Sure. Your problems are that:
    • You have a space inside your split pattern. The space is treated literally, and is a required part of the match, which you obviously don't want.
    • You have a dot (.) inside your pattern. The dot has special meaning inside a pattern, which is to match any character. If you want to split on a literal dot, you need to escape it with a backslash.
    • If you correct the two problems above, you will find that your pattern still fails on directories or files that have dots in their names before the extension, such as C:\mydir.old\foo.txt. To fix this, you should use a pattern match instead of split, and anchor your match to the end of the string.
    After these fixes, your code should look something like:
    my ($extension) = $pubupload =~ /\.(\w+)$/;
    update: as chromatic pointed out, the /g modifier on my regex was unnecessary, so I nixed it.
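A minimal sketch of the third point, comparing split with the anchored match. The path is the C:\mydir.old\foo.txt example from the list above; the variable names $path and $wrong are made up for illustration.

```perl
use strict;
use warnings;

# Example path from the list above: a directory name that contains a dot.
my $path = 'C:\mydir.old\foo.txt';

# Splitting on a literal dot yields three pieces, so the second
# variable ends up holding part of the directory name.
my ($junk, $wrong) = split /\./, $path;
print "split: $wrong\n";          # old\foo

# Anchoring the match with $ only accepts a dot-suffix at the very end.
my ($extension) = $path =~ /\.(\w+)$/;
print "match: $extension\n";      # txt
```

Note that \w+ only covers letters, digits and underscore, so an extension containing other characters would not be captured by this pattern.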
       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
      Thanks MeowChow, your line worked like a charm. Now I'm just trying to decipher everything you did. Thanks again!
Re: using split on a string
by Daddio (Chaplain) on May 14, 2001 at 07:21 UTC

    First, your split seems to be a bit off. You are asking for a "space and one character" and I think you are looking for "dot and anything else." If that is the case, try this for the split:

    my ($junk, $extension) = split /\./, $pubupload;

    That should get you this:

    $junk = C:\Documents and Settings\Administrator\Desktop\china
    $extension = txt

    Hope that helps...

    D a d d i o

Re: using split on a string
by damian1301 (Curate) on May 14, 2001 at 07:25 UTC
    Your pattern splits on a space followed by any character, and because you only have two variables to put the results in, the rest is thrown away. What you want is:
    my $pubupload = q"C:\Documents and Settings\Administrator\Desktop\china.txt";
    $pubupload =~ /(?:.*)\\(.*?\..*)$/;
    print $1;

    When run, the code gives this output:
    china.txt
    And of course to find the extension, you can split it up from there.

    ($junk, $extension) = split /\./, $1;
    NOTE: I couldn't get split/\./,$pubupload; to work on the whole string, any ideas why?

    Tiptoeing up to a Perl hacker.
    Dave AKA damian

Re: using split on a string
by Masem (Monsignor) on May 14, 2001 at 07:23 UTC
    The period is a special character in regexes, and will match any character. So / ./ will match a space followed by any other character; in this case, the first match is right after Documents; the split will also get "nd", and "ettings\Administrator\Desktop\china.txt", and return all 3 parts as a list, of which you only grab the first two parts.

    What you want instead is to split with /\./. The backslash 'escapes' the period so that the regex doesn't interpret it as "match any character" but instead as "match the period". This will give you the split you are looking for.
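To make the explanation above concrete, here is a small sketch that prints what each pattern returns for the path from the question. The @parts array is only there for illustration.

```perl
use strict;
use warnings;

my $pubupload = 'C:\Documents and Settings\Administrator\Desktop\china.txt';

# Original pattern: a space followed by any one character.
my @parts = split / ./, $pubupload;
print "[$_]\n" for @parts;
# [C:\Documents]
# [nd]
# [ettings\Administrator\Desktop\china.txt]

# Escaped dot: split on the literal period instead.
my ($junk, $extension) = split /\./, $pubupload;
print "$extension\n";   # txt
```

This works for this particular path because it contains exactly one dot; a path with a dot in a directory name would split into more pieces.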


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: using split on a string
by converter (Priest) on May 14, 2001 at 07:55 UTC
    If all you want is the filename extension, try a regular expression pattern with capturing group:
    if ($pubupload =~ m![^.]\.([^./\\]+)$!) { $extension = $1; }
    This pattern will match at least one non-dot character, a dot, then capture one or more non . / and \ characters until the end of string.
    $pubupload = 'C:\Documents and Settings\Administrator\Desktop\china.txt';
    DB<1> if ($pubupload =~ m![^.]\.([^./\\]+)$!) { $extension = $1; }
    DB<2> print "[$extension]"
    [txt]

    Update:

    Modified pattern so that it won't be fooled by dot files (filenames with leading dot) like .notanextension and will require at least one non-dot character in the filename, for example: a.txt.
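A quick way to check the pattern against a few names is sketched below. The helper ext_of and the sample names are hypothetical, chosen only to exercise the cases mentioned in the update.

```perl
use strict;
use warnings;

# ext_of is a hypothetical helper wrapping the pattern from this post.
sub ext_of {
    my ($name) = @_;
    return $name =~ m![^.]\.([^./\\]+)$! ? $1 : undef;
}

for my $name ('a.txt', '.notanextension', 'archive.tar.gz', 'C:\dir\.profile') {
    my $ext = ext_of($name);
    printf "%-18s => %s\n", $name, defined $ext ? $ext : '(no match)';
}
```

With these inputs, a.txt gives txt, the bare .notanextension does not match, and archive.tar.gz gives gz. The last case does match and gives profile, because the backslash before the dot satisfies [^.]; so the dot-file guard only holds for a bare filename, not for a dot file given with its directory.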

Re: using split on a string
by jorg (Friar) on May 14, 2001 at 16:33 UTC
    You could forget all this regex stuff and use a standard Perl module instead :
    use File::Basename;
    use Data::Dumper;
    my $path='C:\Documents and Settings\Administrator\Desktop\china.txt';
    my @fileinfo=fileparse($path, '\..*');
    print Dumper(@fileinfo);
    will give you
    $VAR1 = 'china';
    $VAR2 = 'C:\\Documents and Settings\\Administrator\\Desktop\\';
    $VAR3 = '.txt';
    The fileparse() function takes a path as its first argument and a pattern describing the extension as its second argument.
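One caveat worth sketching: fileparse() parses paths by the rules of the OS the script runs on, so a Windows path uploaded to a Unix server is not split at the backslashes by default. The version below is a sketch under that assumption; it uses fileparse_set_fstype() from File::Basename to force Windows rules, and a suffix pattern that only takes the last dot-suffix. The variable names are illustrative.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse fileparse_set_fstype);

# Assumption: the script may run on a non-Windows server, so tell
# File::Basename to parse paths using Windows rules.
fileparse_set_fstype('MSWin32');

my $path = 'C:\Documents and Settings\Administrator\Desktop\china.txt';

# qr/\.[^.]*/ matches only the final dot-suffix; the '\..*' pattern
# above would take everything from the first dot in the file name.
my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);

print "name:   $name\n";     # china
print "dir:    $dir\n";      # C:\Documents and Settings\Administrator\Desktop\
print "suffix: $suffix\n";   # .txt
```

The suffix comes back with its leading dot, so strip it if only the bare extension is needed.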

    Jorg

    "Do or do not, there is no try" -- Yoda
Re: using split on a string
by AidanLee (Chaplain) on May 14, 2001 at 07:39 UTC

    This is because your regex matches a space followed by any character. What you want is

    split /\./, $pubupload;