in reply to simple parse question

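#!/bin/perl -w
use strict;
use warnings;

my ($filename, $lineOfCode);

open (FH, "filename.txt") || die "could not open filename.txt: $!";
while (<FH>) {
    # leading digits followed by whitespace
    if (m/^(\d+)\s/) {
        $lineOfCode = $1;
    }
    # word characters (with an optional dot) after the last forward slash
    if (m/\/(\w+\.?\w*)$/) {
        $filename = $1;
    }
}
close (FH);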

The loop above overwrites $filename and $lineOfCode on every matching line, so only the last values survive. If you want to keep the values from many lines, push them onto arrays instead.
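A minimal sketch of the array approach, in case it helps. The helper name collect_matches and the sample records are made up for illustration, and the "count, then path" layout is only an assumption about what the input looks like.

```perl
use strict;
use warnings;

# Hypothetical helper: keep every match instead of only the last one.
sub collect_matches {
    my @lines = @_;
    my (@filenames, @linesOfCode);
    for (@lines) {
        # same two patterns as in the loop above, applied to $_
        push @linesOfCode, $1 if m/^(\d+)\s/;
        push @filenames,   $1 if m/\/(\w+\.?\w*)$/;
    }
    return (\@filenames, \@linesOfCode);
}

# Made-up sample records in an assumed "count path" layout.
my ($files, $counts) = collect_matches(
    "120 X:/src/main.c",
    "45 X:/src/util.h",
);
print "$files->[$_]: $counts->[$_]\n" for 0 .. $#$files;
```

Note that the two arrays only stay in step if every line matches both patterns. If that is not guaranteed, a hash keyed on the filename may be the safer structure.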

Neil Watson
watson-wilson.ca

Re: Re: parse
by shelob101 (Sexton) on Jul 24, 2002 at 20:59 UTC
    Actually, a finer point of regular expressions works against this Perl code. Consider this line:

    if (m/\/(\w+\.?\w*)$/){ $filename = $1; }

    The assumption here is that the leading "\/" means a literal forward slash, and that (\w+\.?\w*)$ means "one or more word characters, followed by an optional literal period, followed by zero or more word characters, then the end of the line."

    However, the "\w" metacharacter matches only alphanumerics and the underscore character, which leaves out a whole bevy of other characters that may be present in file names, e.g. spaces, hyphens, parentheses, etc.

    Granted, "sensible" UNIX filenames often avoid those characters because many of them are also shell metacharacters, but this sample dataset looks suspiciously like DOS/Win32 filenames (the X: being a giveaway), and I can't count the number of times I've had to deal with filenames like "Sales Figures - Dec 19 - Dec 26.doc"!

    One possible alternative regex, which captures everything after the last forward slash up to the end of the string, is:

    /\/([^\/]+)$/
    Meaning "a literal forward slash, followed by one or more characters that are NOT forward slashes, up to the end of the string."
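    To see the difference, here is a rough side-by-side sketch. The helper names name_by_word_chars and name_by_non_slash and the path itself are invented for the example; only the two patterns come from this thread.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two patterns discussed above.
sub name_by_word_chars {
    my ($path) = @_;
    return $path =~ m/\/(\w+\.?\w*)$/ ? $1 : undef;
}

sub name_by_non_slash {
    my ($path) = @_;
    return $path =~ m/\/([^\/]+)$/ ? $1 : undef;
}

# Made-up path with spaces and hyphens in the file name.
my $path = "X:/reports/Sales Figures - Dec 19 - Dec 26.doc";

my $old = name_by_word_chars($path);
my $new = name_by_non_slash($path);

print "word-char pattern: ", (defined $old ? $old : "(no match)"), "\n";
print "non-slash pattern: ", (defined $new ? $new : "(no match)"), "\n";
```

    With this path the \w pattern does not match at all, because the first space stops \w+ long before the end of the string, while the negated character class captures the whole file name.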

    Or, as was pointed out in another reply to this post, File::Basename is an alternative if you would rather extract the entire path expression and work out the $filename from there.
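    A short sketch of the File::Basename route, assuming the path has already been pulled out of the line into $path (the path here is made up). basename and fileparse are the module's own functions; the suffix pattern is just one reasonable choice.

```perl
use strict;
use warnings;
use File::Basename qw(basename fileparse);

# Made-up path, assumed to have been extracted from the input line already.
my $path = "X:/reports/Sales Figures - Dec 19 - Dec 26.doc";

# Everything after the last directory separator.
my $filename = basename($path);

# fileparse also splits off the directory and, given a pattern, the suffix.
my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);

print "file name: $filename\n";
print "name:      $name\n";
print "suffix:    $suffix\n";
```

    One caveat: the module parses paths by the rules of the OS the script runs on, so backslash-separated paths read on a UNIX box would need fileparse_set_fstype("MSWin32") first. Forward slashes, as in this dataset, work under either setting.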

    Hope that helps,

    Paul

    When there is no wind, row.