Meaning of the regex?? Help!!

sas429s has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Meaning of the regex?? Help!! by chakram88 (Pilgrim) on Jan 30, 2008 at 15:19 UTC
Allow me to introduce you to a handy CPAN module that I discovered by way of another kind monk some time ago (paying it forward, if you will, even though I no longer recall who pointed me in this direction): YAPE::Regex::Explain, a handy tool for explaining regular expressions. I use it frequently when I come across a regex that I don't understand. Here's a quick example based on your request:

    #!/usr/bin/perl -wl
    use YAPE::Regex::Explain;
    my $regex = qr/[a-zA-Z][a-zA-Z][a-zA-Z]\d\d,\s+\interlock/;
    print YAPE::Regex::Explain->new($regex)->explain;

You will see that the `\interlock` yields an "Unrecognized escape '\i' passed through" warning, which YAPE ignores. moritz pointed this out above. YAPE will produce the following output. (Notice that the backslash of the '\i' has been dropped from the regex.)

    The regular expression:

    (?-imsx:[a-zA-Z][a-zA-Z][a-zA-Z]\d\d,\s+interlock)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      [a-zA-Z]                 any character of: 'a' to 'z', 'A' to 'Z'
    ----------------------------------------------------------------------
      [a-zA-Z]                 any character of: 'a' to 'z', 'A' to 'Z'
    ----------------------------------------------------------------------
      [a-zA-Z]                 any character of: 'a' to 'z', 'A' to 'Z'
    ----------------------------------------------------------------------
      \d                       digits (0-9)
    ----------------------------------------------------------------------
      \d                       digits (0-9)
    ----------------------------------------------------------------------
      ,                        ','
    ----------------------------------------------------------------------
      \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      interlock                'interlock'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
Re: Meaning of the regex?? Help!! by moritz (Cardinal) on Jan 30, 2008 at 13:56 UTC
It's "three letters, two digits, followed by a comma, by one or more whitespace characters, and" then... uhm... there is no `\i` defined for Perl regexes. So probably just "... the word interlock". `split` looks for this pattern and returns the part to the left of the first match, followed by everything after the first match (the `2` at the end of the line limits the result to two items, so any later matches are left unsplit in the second item). The `join` concatenates the first part of the previous expression (that is, the part of the string before the first match) with `$date_intlck_rel_res`. (Update: I forgot the digits... johngg++)
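To make the effect of the limit concrete, here is a minimal sketch. The input string is made up (the original data was not posted), and the pattern is written between slashes without the stray backslash before `interlock`; treat both as assumptions.

```perl
use strict;
use warnings;

# Hypothetical input containing two places where the pattern matches.
my $line = 'A Jan30, interlock B Feb01, interlock C';

# Three letters, two digits, a comma, whitespace, then the word "interlock".
# The limit of 2 means split stops after producing two fields.
my @words = split /[a-zA-Z][a-zA-Z][a-zA-Z]\d\d,\s+interlock/, $line, 2;

# Field 0 is the text before the first match.
# Field 1 is everything after the first match, second match included.
print "count: ", scalar(@words), "\n";
print "0: '$words[0]'\n";
print "1: '$words[1]'\n";
```

Without the limit, the same call would split at both matches and return three fields.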
Re: Meaning of the regex?? Help!! by toolic (Bishop) on Jan 30, 2008 at 14:17 UTC
If you `use warnings;`, you should get a warning message stating: `Scalar value @words[0] better written as $words[0]`. So, you should use `$words[0]`. You may find it more straightforward to just use the concatenation operator instead of `join`: `$line = $words[0] . $date_intlck_rel_res;`
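A small sketch of that equivalence, with invented values standing in for `@words` and `$date_intlck_rel_res` (neither was shown in the thread). It only illustrates that `join` with an empty separator and the `.` operator give the same string here.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the variables in the original post.
my @words               = ('text before the match ', 'text after the match');
my $date_intlck_rel_res = 'Jan30, interlock released';

# join with an empty separator, using the scalar element $words[0]
# rather than the one-element slice @words[0].
my $joined = join '', $words[0], $date_intlck_rel_res;

# The same result with the concatenation operator.
my $concat = $words[0] . $date_intlck_rel_res;

print $joined eq $concat ? "identical\n" : "different\n";
print "$concat\n";
```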
Re: Meaning of the regex?? Help!! by johngg (Canon) on Jan 30, 2008 at 14:58 UTC
There are a couple of other things to note. You can use a `{n}` quantifier to get an exact number of things, so the `[a-zA-Z][a-zA-Z][a-zA-Z]` could be written `[a-zA-Z]{3}`. (You can also use `{m,n}` for a range of occurrences and `{m,}` for `m` or more occurrences, but `{,n}` for up to `n` occurrences is not accepted by Perl versions before 5.34; write `{0,n}` instead.) `split` is most commonly used with a pattern as its first argument, so your `split '...', ...` should perhaps be `split /.../, ...`. You can have an expression as the first argument, but that is usually for patterns that may change at run time; since your expression is a single-quoted string which will not change, you should be using a pattern. (Read the documentation for the special case of splitting on the empty string `''`.) I hope this is of interest. Cheers, JohnGG
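As a quick check that the quantifier form behaves like the spelled-out one, here is a minimal sketch. The test strings are invented, and using `\d{2}` for the two digits is my own extension of the same idea; the stray backslash before `interlock` is left out.

```perl
use strict;
use warnings;

# The spelled-out pattern and the version using {n} quantifiers.
my $long  = qr/[a-zA-Z][a-zA-Z][a-zA-Z]\d\d,\s+interlock/;
my $short = qr/[a-zA-Z]{3}\d{2},\s+interlock/;

# Hypothetical test strings: one that should match, three that should not.
my @samples = (
    'Jan30, interlock',   # three letters, two digits, comma, space
    'Ja30, interlock',    # only two letters
    'Jan3, interlock',    # only one digit
    'Jan30,interlock',    # no whitespace after the comma
);

my %result;
for my $s (@samples) {
    my $l = $s =~ $long  ? 1 : 0;
    my $r = $s =~ $short ? 1 : 0;
    $result{$s} = [$l, $r];
    printf "%-18s long=%d short=%d\n", $s, $l, $r;
}
```

Both patterns should agree on every sample; only the first string satisfies all parts of the pattern.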
Re: Meaning of the regex?? Help!! by planetscape (Chancellor) on Jan 31, 2008 at 05:31 UTC
In addition to the Pattern Matching, Regular Expressions, and Parsing section of our very fine Tutorials, you may also find the following helpful: Mastering Regular Expressions; perlrequick; perlretut; perlre; YAPE::Regex::Explain; GraphViz::Regex; The Regex Coach (Win32 only, IIRC). HTH, planetscape