I think you are going to have to decide just "how good" the regex really needs to be; this V2 kind of stuff could be tricky. It could be that something rather simple solves your problem, or not...

One trick is to anchor at the end of the string with $ so that you can work "backwards". Below, the intent is just to capture the last string consisting of digits and "." characters in the path. I put in a restriction that a minimum of 2 characters must exist, and I allow an optional "v" in front. You can make this case insensitive by adding the /i switch to the regex. v2 won't match because "2" is just one character, but if you had, say, V24, that would match once /i is on. You have to decide whether this is "good enough" or not.
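A minimal sketch of the two-character minimum and the /i switch described above. The helper name `first_version` and the sample paths are made up for illustration; the pattern is the same one used in the full script, with /i added.

```perl
use strict;
use warnings;

# Hypothetical helper: first "/"-prefixed run of 2+ digits/dots,
# with an optional leading "v" (either case, because of /i).
sub first_version {
    my ($path) = @_;
    my ($version) = $path =~ m|/(v?[\d.]{2,})|i;
    return $version;    # undef when nothing matched
}

for my $path ('/a/V2/Foo.pm', '/a/V24/Foo.pm') {
    my $version = first_version($path);
    print defined($version) ? "$version\n" : "undefined\n";
}
# prints:
# undefined
# V24
```

V2 fails because only one character follows the optional "v"; V24 passes because two do.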

I don't know how big your project is (how many people are involved), but sometimes agreeing on something easy to parse is a good way to go, like: "hey folks, for the version in a path name, use verxxx".
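If a naming convention like that were adopted, the parsing gets much simpler. This is only a sketch under an assumed convention (a path component made of "ver" followed by a dotted number); the helper name `ver_component` and the sample path with `ver1.36.0` are invented for illustration.

```perl
use strict;
use warnings;

# Assumed convention: the version directory is "ver" plus a dotted number.
sub ver_component {
    my ($path) = @_;
    # The component must end at a "/" or at the end of the string.
    my ($version) = $path =~ m{/ver(\d+(?:\.\d+)*)(?:/|\z)};
    return $version;    # undef when the path has no such component
}

for my $path ('/tool/a/r/boost/ver1.36.0/lib', '/home/me/cvs/V2/V2DepCheck.pm') {
    my $version = ver_component($path);
    print defined($version) ? "$version\n" : "undefined\n";
}
# prints:
# 1.36.0
# undefined
```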

Update: if this "tool" part of the path is a key differentiator between "good paths" and "bad paths", add that to the pattern, like my commented-out regex below.
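A small sketch of what the prefix requirement buys you, using the commented-out pattern from the script (minus its trailing `.*$`). The helper name `tool_version` and the second sample path are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: only accept a version that follows "tool/a/r".
sub tool_version {
    my ($path) = @_;
    my ($version) = $path =~ m|tool/a/r.*?/(v?[\d.]{2,})|i;
    return $version;    # undef for paths outside tool/a/r
}

for my $path ('/tool/a/r/boost/1.36.0', '/home/me/cvs/1.2.3/Foo.pm') {
    my $version = tool_version($path);
    print defined($version) ? "$version\n" : "undefined\n";
}
# prints:
# 1.36.0
# undefined
```

The second path contains something that looks like a version, but it is rejected because it does not live under tool/a/r.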

Of course the easy answer is that if you are satisfied with your regex and it does what you want...just leave it alone! Perl regex is so fast that I seriously doubt that any slight imperfection will be noticeable at all in terms of performance.

A goof in my first version: I saw that I was matching the 32 in bd32b on that line, so I added a "/" qualifier for the match. Still not "perfect", but I think the question here is "good enough" or not.

I guess another update, brain isn't working great today...I got frustrated with the regex complications needed to ensure that only the last matching string on the line was matched. One easy way to deal with this is a Perl list slice. You can just match them all with /g and then take the "last one" via [-1] (below), with or without the "/" required in front. Anyway, a Perl slice is a good tool for your toolbox, as are shortcuts like \d for digits.

    # without "/" required in front:
    (my $version) = (m|(v?[\d\.]{2,})|g)[-1];

    # with "/" required in front:
    (my $version) = (m|/(v?[\d\.]{2,})|g)[-1];
    #!/usr/bin/perl -w
    use strict;

    # qw() splits on whitespace, so the paths must not be quoted
    # (quotes inside qw would become part of each string)
    my @paths = qw(
        /tool/a/r/V2/V2DepCheck/1109.2.1/V2DepCheck.pm
        /tool/a/r/p4/r/main/V2/V2DepCheck/169441/V2DepCheck.pm
        /tool/a/r/p4/r/branches/bd32b/V2/V2DepCheck/175507/V2DepCheck.pm
        /home/me/cvs/V2/V2DepCheck.pm
        /tool/a/r/boost/1.36.0
        /tool/a/r/cadence/itk/itkvd/v007
    );

    foreach (@paths)
    {
        chomp;   # not needed here, use if reading from a file

        # alternative that also requires the tool/a/r prefix:
        #(my $version) = (m|tool/a/r.*?/(v?[\d.]{2,}).*$|i);
        (my $version) = (m|/(v?[\d\.]{2,}).*?$|);

        print defined($version) ? "$version\n" : "undefined\n";
    }
    __END__
    prints:
    1109.2.1
    169441
    175507
    undefined
    1.36.0
    v007
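To show where the list slice actually changes the answer, here is a sketch with a made-up path that contains two version-like components. The helper names `first_match` and `last_match` and the path itself are hypothetical; the patterns are the ones from above.

```perl
use strict;
use warnings;

# Hypothetical helpers comparing "first match" with "last match".
sub first_match {
    my ($path) = @_;
    my ($version) = $path =~ m|/(v?[\d.]{2,})|;
    return $version;
}

sub last_match {
    my ($path) = @_;
    # /g in list context returns every capture; [-1] keeps the last one.
    my ($version) = ($path =~ m|/(v?[\d.]{2,})|g)[-1];
    return $version;    # undef when there were no matches at all
}

my $path = '/tool/1.0/archive/2.5.1/Foo.pm';    # made-up example path
print first_match($path), "\n";    # prints 1.0
print last_match($path),  "\n";    # prints 2.5.1
```

On the six sample paths in the script, both approaches give the same result, since each path has at most one "/"-prefixed candidate.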

In reply to Re: Using a regex to extract a version from a Un*x path by Marshall
in thread Using a regex to extract a version from a Un*x path by gcmandrake
