blackadder has asked for the wisdom of the Perl Monks concerning the following question:

Hi All

This is simple for most of you guys, but I have been trying for a couple of hours now and getting nowhere fast…

I have folder paths like the following,
C:\my folder
c:\perl
f:\cus shared folder.net-03\shated-data
D:\Jan03\acc\emp acc
s:\top lever\lowerdirs
w:\share_A\scin share folder\docs

I need to capture the drive letter and the top-level folders only. I used the code below,
use strict;
my $path = shift;
$path =~ /^(\w):\\(.+)\\?/;
print "Drive : $1, Top Level : $2\n";
However, it’s failing. I understand that the error is in the \\? bit of the regex. What I am trying to say is: grab the first character, then anything between the first pair of backslashes, or up to the end of the line if there is no second backslash. It’s easier said than done for me.

Thanks

Replies are listed 'Best First'.
Re: filtering folder path using regex.
by PodMaster (Abbot) on Jul 13, 2003 at 13:56 UTC
    #!/usr/bin/perl -l
    my $yo = 'C:/as/d/f/e/r/g/a/d/f/g/';
    use File::Spec;
    print for File::Spec->splitpath($yo);
    warn $yo;
    print for File::Spec->splitdir($yo);
    die $yo;
    __END__
    C:
    /as/d/f/e/r/g/a/d/f/g/
    C:/as/d/f/e/r/g/a/d/f/g/ at - line 5.
    C:
    as
    d
    f
    e
    r
    g
    a
    d
    f
    g
    C:/as/d/f/e/r/g/a/d/f/g/ at - line 7.
    Now onto the regex. Pay special attention to the ".+" explanation, and then either choose a different quantifier (*?) or add a modifier (namely "?").
    use YAPE::Regex::Explain;
    die YAPE::Regex::Explain->new(qr/^(\w):\\(.+)\\?/)->explain;
    __END__
    The regular expression:

    (?-imsx:^(\w):\\(.+)\\?)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      ^                        the beginning of the string
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        \w                       word characters (a-z, A-Z, 0-9, _)
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      :                        ':'
    ----------------------------------------------------------------------
      \\                       '\'
    ----------------------------------------------------------------------
      (                        group and capture to \2:
    ----------------------------------------------------------------------
        .+                       any character except \n (1 or more times
                                 (matching the most amount possible))
    ----------------------------------------------------------------------
      )                        end of \2
    ----------------------------------------------------------------------
      \\?                      '\' (optional (matching the most amount
                               possible))
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
    update: yuckyish (File::Spec rocks)
    perl -le"print for split m{[:\\/]+}, shift, 3" C:\y\o\d\a
    C
    y
    o\d\a
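For the original question (drive letter plus top-level folder only), the File::Spec route might look like the sketch below. It calls File::Spec::Win32 directly so that Windows path rules apply on any platform, which is an assumption on my part rather than something the post above does. `drive_and_top` is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Spec::Win32;

# Hypothetical helper: returns (drive letter, top-level folder) for a Windows path.
sub drive_and_top {
    my ($path) = @_;
    # A true second argument tells splitpath that the path has no file part.
    my ($volume, $dirs) = File::Spec::Win32->splitpath($path, 1);
    # splitdir yields a leading empty string for the root separator; skip it.
    my ($top) = grep { length } File::Spec::Win32->splitdir($dirs);
    # Strip the trailing colon from the volume to get the bare drive letter.
    (my $drive = $volume) =~ s/:\z//;
    return ($drive, $top);
}

for my $path ('C:\my folder',
              'f:\cus shared folder.net-03\shated-data',
              'w:\share_A\scin share folder\docs') {
    my ($drive, $top) = drive_and_top($path);
    print "Drive : $drive, Top Level : $top\n";
}
```

The `grep { length }` step is only there because a path that starts with a separator splits into an empty first element.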

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      WOW, the power of Perl: "...it helps if you know what you are doing". I didn't know there were things like YAPE::Regex::Explain to explain it all. Thanks.
        Thanks PodMaster. Oh, forget the regex; for what I need, File::Spec is really what I should use. This is magic.
      \\? '\' (optional (matching the most amount possible))

      Is that correct? I always understood '?' to mean exactly 0 or 1 occurrences; '*' or '+' would be the most possible.

      --
      Barbie | Birmingham Perl Mongers | http://birmingham.pm.org/

        The question to ask yourself is: Is it incorrect? The answer is no.

        perlre says "? Match 1 or 0 times", which means optional (like YAPE::Regex::Explain says).

        Now, the extra "(matching the most amount possible)" comment may seem a bit misleading, but it's not wrong. Consider the pattern f?. It means match "f" optionally. Since "f" is fixed width, it'll always match the most amount possible. This may seem a bit silly, and may or may not be an unintended side effect of how YAPE::Regex::Explain works, but it's perfectly fine by me (if it is unintended, I don't think it's worth a workaround; if it's not, there is no issue).
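One way to see what "the most amount possible" means for `?` is to compare it with its non-greedy form `??`. This is a minimal sketch with a made-up input string: the greedy quantifier prefers one occurrence over zero, while the non-greedy one prefers zero.

```perl
use strict;
use warnings;

my $text = 'ff';   # made-up sample input

# Greedy "?": tries one occurrence first, falls back to zero.
my ($greedy) = $text =~ /(f?)/;

# Non-greedy "??": tries zero occurrences first.
my ($lazy) = $text =~ /(f??)/;

printf "greedy captured '%s', non-greedy captured '%s'\n", $greedy, $lazy;
```

Both patterns match at the start of the string; they only differ in how much they capture.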


Re: filtering folder path using regex.
by BrowserUk (Patriarch) on Jul 13, 2003 at 13:59 UTC

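    You're using a greedy regex, '\\.+\\', which will grab as much as it can. You can either use a non-greedy version (anchored so that it has to stop at a backslash or at the end of the string; followed only by an optional \\?, a non-greedy .+? would stop after a single character)

    $path =~ /^(\w):\\(.+?)(?:\\|$)/;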

    Or better, match anything except the terminator char (\).

    $path =~ /^(\w):\\([^\\]+)\\?/;

    That said, you'd almost certainly be better off using the File::Spec module, which is part of the core, for this sort of thing.
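To see the difference on real input, here is a small sketch that runs the original greedy pattern and the negated-character-class pattern over a few of the sample paths from the question. The hash keys are just labels for this sketch.

```perl
use strict;
use warnings;

my @paths = ('C:\my folder', 'c:\perl', 'D:\Jan03\acc\emp acc');

# Labels are for this sketch only.
my %pattern = (
    greedy  => qr/^(\w):\\(.+)\\?/,       # .+ runs on to the end of the string
    negated => qr/^(\w):\\([^\\]+)\\?/,   # stops before the next backslash
);

for my $path (@paths) {
    for my $name (sort keys %pattern) {
        my ($drive, $top) = $path =~ $pattern{$name} or next;
        print "$name: Drive : $drive, Top Level : $top\n";
    }
}
```

For paths with only one folder level the two patterns agree; they diverge as soon as there is a second backslash.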


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


      Thanks Sir,

      the $path =~ /^(\w):\\(^\\+)\\?/ worked for me, but what I don't understand is how it gets the $2 value. There is no “.” or “\w” in the second set of parentheses. Oh well, as long as it works…
        I think you were talking about this one:
        $path =~ /^(\w):\\([^\\]+)\\?/;
        In this one, [\\] would match just backslashes, but with the ^, [^\\] matches all characters except backslashes, and because of the +, it matches as much as possible up to the first backslash. $2 is populated because the character class (i.e. [^\\]) is in parentheses, which happen to be the second set of them, so the match is stored in $2.
Re: filtering folder path using regex.
by Zaxo (Archbishop) on Jul 13, 2003 at 14:01 UTC

    You should take a look at File::Spec. What you show can be done with File::Spec->splitpath and File::Spec->splitdir. That approach will make your code more portable, and the parsing more accurate.

    After Compline,
    Zaxo

      Thanks Zaxo,

      File::Spec... I need to learn this one too. There is no end to Perl libraries. Cheers!
        You may want to test the input "\\someserverhere\c$\foo\bar" as well as it is a valid path.

        -Waswas
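A quick sketch of that test, under the assumption that File::Spec::Win32 is called directly so it behaves the same on any platform. The drive-letter regex from earlier in the thread does not match a UNC path, whereas splitpath, as far as I know, treats \\server\share as the volume.

```perl
use strict;
use warnings;
use File::Spec::Win32;

my $unc = '\\\\someserverhere\\c$\\foo\\bar';   # i.e. \\someserverhere\c$\foo\bar

# The drive-letter regex expects "X:\" at the start, so a UNC path fails to match.
my $matched = $unc =~ /^(\w):\\([^\\]+)\\?/ ? 'yes' : 'no';
print "regex matched: $matched\n";

# splitpath with a true second argument: the path has no file part.
my ($volume, $dirs) = File::Spec::Win32->splitpath($unc, 1);
# Skip the empty element produced by the leading separator.
my ($top) = grep { length } File::Spec::Win32->splitdir($dirs);
print "Volume : $volume, Top Level : $top\n";
```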
Re: filtering folder path using regex.
by mildside (Friar) on Jul 14, 2003 at 01:13 UTC

    Another core module that can be used for this purpose is: File::Basename.

    However, File::Spec looks a lot more powerful!

    Cheers

Re: filtering folder path using regex.
by l2kashe (Deacon) on Jul 14, 2003 at 04:05 UTC
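    Another non-regex way to get this data is:
    use strict;
    my $path = shift;
    my ($drive, $top) = ( split(/\\/, $path) )[0,1];
    The split pattern has to be \\ rather than \, since a lone backslash would escape the closing delimiter. Note that $drive keeps its colon here (e.g. "C:").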

    MMMMM... Chocolaty Perl Goodness.....
      Oh, this is neat. I did think about split and splice to start off with, but then I went the regex way! Thanks, brethren.

      I have found many regex and pattern-matching tutorials on this site, which all made a nice read over the very warm weekend. Anyway...

      God blesses.