in reply to Re^2: Perl program to look into the phone directory and in case of a match, print the name along with the number
in thread Perl program to look into the phone directory and in case of a match, print the name along with the number

Hi dk27,

In the example shown, the result for last element "john" is missing.

Marshall seems to be correct that the problem occurs when the file does not end with a newline character. However, the final line should still be read even without the newline; you can confirm this by printing $line before the next statement. The problem appears (at least to me) to be your regular expression instead. Note that \s+ requires whitespace after the name, even for phone book queries, which consist of a name only. In most lines of input, that whitespace is the newline at the end of the line, since you're not chomping the lines, but the last line is missing the newline, so it does not match the regular expression.

Try this regular expression instead; it works for me: /^\s*([a-zA-Z]+)(?:\s+(\d+))?/
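
A minimal sketch of the difference, for illustration only. The original pattern is not shown in this post, so the "old" pattern below is an assumed shape (mandatory \s+ after the name), and the sample lines are hypothetical:

```perl
use strict;
use warnings;

# Assumed shape of the original pattern: whitespace is mandatory after the name.
my $old = qr/^\s*([a-zA-Z]+)\s+(\d+)?/;
# The pattern suggested above: whitespace is only required in front of a number.
my $new = qr/^\s*([a-zA-Z]+)(?:\s+(\d+))?/;

# Hypothetical input lines; the last one has no trailing newline.
my @lines = ("alice 12345\n", "bob\n", "john");

for my $line (@lines) {
    (my $shown = $line) =~ s/\n/\\n/;   # make the newline visible when printing
    printf "%-16s old: %-9s new: %s\n", qq("$shown"),
        ($line =~ $old ? "match" : "no match"),
        ($line =~ $new ? "match" : "no match");
}
```

Only the final "john" line should be rejected by the assumed old pattern, because there is nothing left for \s+ to consume.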

Hope this helps,
-- Hauke D

Re^4: Perl program to look into the phone directory and in case of a match, print the name along with the number
by Marshall (Canon) on Feb 14, 2017 at 23:07 UTC
    Hauke D, you are completely right!

    My regex required at least one whitespace character after the name. A line ending ("\n" in Perl lingo) counts as whitespace, although on Windows the line ending stored in the file is actually two characters (CR LF).

    This regex also appears to work.

    next unless ($name,$phone) = $line =~/^\s*([a-zA-Z]+)\s*(\d+)?/;
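
One place where this \s* variant and the (?:\s+(\d+))? variant behave differently, sketched with a hypothetical input that has no space between the name and the digits (whether such input can occur in the phone directory is an assumption):

```perl
use strict;
use warnings;

my $line = "john123";   # hypothetical input: digits directly after the name

# \s* variant: the digits are accepted as the phone number.
my ($n1, $p1) = $line =~ /^\s*([a-zA-Z]+)\s*(\d+)?/;
# (?:\s+(\d+))? variant: no whitespace, so the optional group is skipped.
my ($n2, $p2) = $line =~ /^\s*([a-zA-Z]+)(?:\s+(\d+))?/;

printf "%s / %s\n", $n1, $p1 // 'undef';
printf "%s / %s\n", $n2, $p2 // 'undef';
```

Both patterns match the line; they differ only in whether the phone capture is filled.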
    Thanks haukex!
Re^4: Perl program to look into the phone directory and in case of a match, print the name along with the number
by dk27 (Novice) on Feb 15, 2017 at 06:09 UTC

    Hi Hauke,
    The regex you suggested works for me and qualifies for all other test cases.
    Correct me if I am wrong but when you used the operator "?:", this is used to form a non-capturing group so that \s+ is explicitly not captured while checking for regex and this makes it a match to the last test case and matches that for execution right?
    Thanks!

      Hi dk27,

      when you used the operator "?:", this is used to form a non-capturing group so that \s+ is explicitly not captured while checking for regex and this makes it a match to the last test case and matches that for execution right?

      Both (...) and (?:...) group regex patterns, so both /ab(cd)?/ and /ab(?:cd)?/ mean "match ab, optionally followed by cd", and both /ab(cd|ef)gh/ and /ab(?:cd|ef)gh/ mean "match abcdgh or abefgh". The difference between the two is that capturing groups (...) will populate the $1, $2, etc. variables, while non-capturing groups (?:...) will not populate those variables, so the latter are typically used for grouping only.
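
A small sketch of that difference, using the /ab(cd|ef)gh/ example from above and a made-up test string:

```perl
use strict;
use warnings;

my $str = "abcdgh";

# Capturing group: the alternative that matched ends up in $1.
my $matched_cap = $str =~ /ab(cd|ef)gh/;
my $cap1 = $1;

# Non-capturing group: same match, but there is no group to fill $1.
my $matched_non = $str =~ /ab(?:cd|ef)gh/;
my $non1 = $1;

print "capturing:     matched=$matched_cap, \$1 = ", $cap1 // 'undef', "\n";
print "non-capturing: matched=$matched_non, \$1 = ", $non1 // 'undef', "\n";
```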

      Here is how I read the regex I showed, /^\s*([a-zA-Z]+)(?:\s+(\d+))?/:

      1. ^: Anchor at beginning of string (Update: Note that if the /m modifier were in use, this would anchor to the beginning of any line within the string.)
      2. \s*: Zero or more whitespace characters
      3. ([a-zA-Z]+): One or more letters, storing the match in $1 (because this is the first set of capturing parentheses)
      4. (?:...)?: Optionally match the following:
        1. \s+: One or more whitespace characters
        2. (\d+): One or more digits, storing the result in $2 (because this is the second set of capturing parentheses)
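
The same breakdown can be written into the pattern itself with the /x modifier, which allows whitespace and comments inside a regex. This is just a sketch of an equivalent spelling; the variable names and sample lines are hypothetical:

```perl
use strict;
use warnings;

my $re = qr{
    ^               # anchor at the beginning of the string
    \s*             # zero or more whitespace characters
    ([a-zA-Z]+)     # one or more letters, first capture group
    (?:             # optional non-capturing group:
        \s+         #   one or more whitespace characters
        \d+       #   placeholder, replaced below
    )?
}x;

# The pattern actually used: digits are captured as the second group.
$re = qr{
    ^ \s* ([a-zA-Z]+)   # name, first capture group
    (?: \s+ (\d+) )?    # optional whitespace plus number, second capture group
}x;

for my $line ("  alice 12345\n", "john") {
    my ($name, $phone) = $line =~ $re or next;
    printf "name=%s phone=%s\n", $name, $phone // '(none)';
}
```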

      If I had used (...) instead of (?:...), the match would be exactly the same; the difference would be that $2 would be populated with the match of (\s+(\d+)), and $3 would be populated with the match of (\d+). One would then have to write ($name, undef, $phone) = $line =~/^\s*([a-zA-Z]+)(\s+(\d+))?/, which is possible too, but is simply a bit of a waste.
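
A quick sketch comparing the two spellings side by side, with a hypothetical input line:

```perl
use strict;
use warnings;

my $line = "alice 12345";   # hypothetical phone book entry

# All three groups capture: the middle one (whitespace plus digits) is thrown away.
my ($name_a, undef, $phone_a) = $line =~ /^\s*([a-zA-Z]+)(\s+(\d+))?/;

# Non-capturing outer group: only the name and the digits are returned.
my ($name_b, $phone_b) = $line =~ /^\s*([a-zA-Z]+)(?:\s+(\d+))?/;

print "three groups: $name_a / $phone_a\n";
print "two groups:   $name_b / $phone_b\n";
```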

      For the documentation see perlretut, perlrequick, and perlre.

      (As a side note, Perl v5.22 introduced the /n modifier, which turns all (...) in the regular expression into non-capturing groups.)
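
A rough sketch of /n, assuming Perl v5.22 or later. With /n, plain (...) only groups, while named groups (?<name>...) still capture, so the names used here are hypothetical labels chosen for this example:

```perl
use v5.22;
use warnings;

my $line = "alice 12345";   # hypothetical phone book entry

# Without /n, every set of parentheses captures.
my @plain = $line =~ /^\s*([a-zA-Z]+)(\s+(\d+))?/;

# With /n, the plain outer (...) no longer captures; the named groups still do.
my @with_n = $line =~ /^\s*(?<name>[a-zA-Z]+)(\s+(?<phone>\d+))?/n;
my ($name, $phone) = ($+{name}, $+{phone});

say "captures without /n: ", scalar @plain;
say "captures with /n:    ", scalar @with_n;
say "name=$name phone=$phone";
```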

      Hope this helps,
      -- Hauke D