I, too, very much like JavaFan's approach in Re: Using a regex to extract a version from a Un*x path. However, since I've already composed this reply, you might as well see it.

To avoid the confusion introduced by the presence of 'V2' in some paths, I depend on the presence of the magical 'V2DepCheck' substring. (JavaFan neatly avoids this issue by parsing right-to-left, but the regex approach I use must parse left-to-right.) Many more regexes are defined here than in the other approaches, but I find that it sometimes pays to be painfully explicit when the problem set is ill-defined and mutable, and maintenance may be an issue.
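As a minimal illustration of the 'V2' ambiguity (a stripped-down sketch only, not the full solution; the names `$naive` and `$guarded` are mine and assume '/' as the only path separator): a pattern that accepts any path component of the form V<digits> picks up the '2' of the 'V2' directory, while adding a negative look-ahead for '/V2DepCheck' rejects it.

```perl
use warnings;
use strict;

my $path = '/tool/a/r/V2/V2DepCheck/169441/V2DepCheck.pm';

# naive: any path component of the form V<digits> counts as a version.
my $naive   = qr{ (?<= / [Vv]) \d+ (?= / | \z) }xms;

# guarded: same, but not when the component is followed by /V2DepCheck.
my $guarded = qr{ (?<= / [Vv]) \d+ (?! /V2DepCheck) (?= / | \z) }xms;

# in list context a failed match returns an empty list, leaving undef.
my ($naive_hit)   = $path =~ m{ ($naive)   }xms;
my ($guarded_hit) = $path =~ m{ ($guarded) }xms;

print "naive:   ", defined $naive_hit   ? "'$naive_hit'"   : 'no match', "\n";
print "guarded: ", defined $guarded_hit ? "'$guarded_hit'" : 'no match', "\n";
```

The naive pattern reports '2' for this path; the guarded one reports no match, leaving the real version number to be picked up by a separate pattern anchored on the 'V2DepCheck' component.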

Code:

use warnings;
use strict;

my ($dotted, $digits, $v_num) = do {
    # - defining regex components and regexes in a do-block
    #   avoids propagation of a bunch of extraneous lexicals.
    # - NONE OF THESE REGEXES MAY HAVE CAPTURE GROUPS.
    #   capture groups in any of the 'private' regexes defined in
    #   this do-block will tend to confuse capture group counting
    #   in the regex in which they are ultimately used.
    #   (assumes perl 5.8. regex enhancements of 5.10+ ease this
    #   restriction considerably.)
    # - except as noted, NONE OF THESE REGEXES MAY HAVE ELEMENTS
    #   THAT ARE CAPTURED, and should all be or be used within
    #   zero-width look-around assertions.

    my $pathsep = qr{ [/\\] }xms;
    my $v_tag   = qr{ V2DepCheck }xms;

    my $after_pathsep         = qr{ (?<= $pathsep) }xms;
    my $before_pathsep        = qr{ (?= $pathsep) }xms;
    my $before_eos_or_pathsep = qr{ (?= \z | $pathsep) }xms;
    my $after_v_tag           = qr{ (?<= $v_tag $pathsep) }xms;
    my $no_v_tag              = qr{ (?! $pathsep $v_tag) }xms;

    # validation assertions for various types of version numbers.
    my $ok_pre_dotted  = qr{ $after_pathsep }xms;
    my $ok_post_dotted = qr{ $before_eos_or_pathsep }xms;

    my $ok_pre_digits  = qr{ $after_v_tag }xms;
    my $ok_post_digits = qr{ $before_pathsep }xms;

    my $ok_pre_v_num   = qr{ (?<= $pathsep [Vv]) }xms;
    my $ok_post_v_num  = qr{ $no_v_tag $before_eos_or_pathsep }xms;

    # any of the regexes that follow may have captured elements.
    my $digits = qr{ \d+ }xms;
    my $dotted = qr{ $digits (?: \. $digits)+ }xms;

    # define regexes returned by do-block.
    qr{ $ok_pre_dotted $dotted $ok_post_dotted }xms,  # $dotted
    qr{ $ok_pre_digits $digits $ok_post_digits }xms,  # $digits
    qr{ $ok_pre_v_num  $digits $ok_post_v_num  }xms;  # $v_num
};

while (<DATA>) {
    chomp;
    my $ver    = '?????';
    my $indent = '';
    if (m{ ($dotted | $digits | $v_num) }xms) {
        $ver    = "'$1'";
        $indent = ' ' x $-[1];
    }
    print "str: '$_' \n";
    print "ver: $indent$ver \n";
}

__DATA__
/tool/a/r/V2/V2DepCheck/1.109.2.1/V2DepCheck.pm
/tool/a/r/p4/r/main/V2/V2DepCheck/169441/V2DepCheck.pm
/tool/a/r/p4/r/branches/bd32b/V2/V2DepCheck/175507/V2DepCheck.pm
/home/me/cvs/V2/V2DepCheck.pm
/tool/a/r/boost/1.36.0
/tool/a/r/cadence/itk/itkvd/v007

Output:

>perl extract_ver_1.pl
str: '/tool/a/r/V2/V2DepCheck/1.109.2.1/V2DepCheck.pm'
ver:                         '1.109.2.1'
str: '/tool/a/r/p4/r/main/V2/V2DepCheck/169441/V2DepCheck.pm'
ver:                                   '169441'
str: '/tool/a/r/p4/r/branches/bd32b/V2/V2DepCheck/175507/V2DepCheck.pm'
ver:                                             '175507'
str: '/home/me/cvs/V2/V2DepCheck.pm'
ver: ?????
str: '/tool/a/r/boost/1.36.0'
ver:                 '1.36.0'
str: '/tool/a/r/cadence/itk/itkvd/v007'
ver:                              '007'
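The caution in the code comments about capture groups in component regexes can be seen in a small sketch (the names `$with_capture` and `$without_capture` are hypothetical, chosen just for this illustration): a capturing group buried in an interpolated component shifts the numbering of every group that follows it.

```perl
use warnings;
use strict;

my $str = 'v007';

# component regex WITH a capture group: it becomes group 1 of any
# regex it is interpolated into, pushing later groups to 2, 3, ...
my $with_capture    = qr{ ([Vv]) }xms;

# same component using a non-capturing group instead.
my $without_capture = qr{ (?: [Vv]) }xms;

# in list context a successful match returns all captured groups.
my @shifted = $str =~ m{ $with_capture    (\d+) }xms;
my @clean   = $str =~ m{ $without_capture (\d+) }xms;

print "with capture:    @shifted\n";   # digits land in group 2, not group 1
print "without capture: @clean\n";     # digits are in group 1 as intended
```

With the capturing component the digits arrive in the second group rather than the first, which is why the 'private' regexes above stick to non-capturing groups and look-arounds. On perl 5.10+, named captures (`(?<name>...)` read back through `%+`) are one way to sidestep the counting problem.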

In reply to Re^3: Using a regex to extract a version from a Un*x path by AnomalousMonk
in thread Using a regex to extract a version from a Un*x path by gcmandrake
