in reply to Re: Perl program to look into the phone directory and in case of a match, print the name along with the number
in thread Perl program to look into the phone directory and in case of a match, print the name along with the number

Thanks, Marshall, for the valuable input. Your code is crisp and easy to understand. I appreciate the tips from you, Corion, and Hauke. However, I am not able to figure out why it doesn't display the result for the last name from STDIN. In the example shown, the result for the last element, "john", is missing.

Input:

    4
    tom 332211
    harry 112233
    ryan 445566
    john 334455
    jay
    harry
    ryan
    kelly
    john

Output:

    Not found
    harry=112233
    ryan=445566
    Not found

Code:

    use strict;
    use warnings;
    use Data::Dumper;

    my %phone_num;

    while (my $line = <STDIN>)
    {
        my $name;
        my $phone;
        next unless ($name, $phone) = $line =~/^\s*([a-zA-Z]+)\s+(\d+)*/;

        if (defined $phone)   # A new phone book entry
        {
            $phone_num{$name} = $phone;
        }
        else                  # Just a name.
        {
            if ($phone_num{$name})
            {
                print "$name=$phone_num{$name}\n";
            }
            else
            {
                print "Not found\n";
            }
        }
    }

The reason I used a for loop inside a for loop was to get the correct index for mapping the name-value pairs. Once I spliced the array @N, it held only the names that need to be checked against the phone book. Then I checked @N against the keys of the hash inside @data. The index i for @N need not be the same as the index for @data, which is why I used another for loop to match the correct key-value pairs once the if condition (the grep) holds. That approach is a lot more cumbersome. Your algorithm and approach seem to be much better and faster. Thanks!

Re^3: Perl program to look into the phone directory and in case of a match, print the name along with the number
by haukex (Archbishop) on Feb 14, 2017 at 22:48 UTC

    Hi dk27,

    > In the example shown, the result for the last element, "john", is missing.

    Marshall seems to be correct that the problem occurs when the file does not end with a newline character. However, the final line should still be read even without the newline; you can confirm this by printing $line before the next statement. The problem appears (at least to me) to be your regular expression instead. Note that \s+ requires whitespace after the name, even for phone book queries, which consist of a name only. In most lines of input, that whitespace is the newline at the end of the line, since you're not chomping the lines, but the last line is missing the newline, so it does not match the regular expression.

    Try this regular expression instead, it works for me: /^\s*([a-zA-Z]+)(?:\s+(\d+))?/
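As a rough check of the explanation above, the following sketch runs the original pattern and the suggested one against lines with and without a trailing newline. The helper name classify and the sample lines are made up for illustration; they are not part of the original program.

```perl
use strict;
use warnings;

# Original pattern from the post and the suggested replacement.
my $original  = qr/^\s*([a-zA-Z]+)\s+(\d+)*/;
my $suggested = qr/^\s*([a-zA-Z]+)(?:\s+(\d+))?/;

# Hypothetical helper: returns "no match", "name", or "name=phone".
sub classify {
    my ($re, $line) = @_;
    my ($name, $phone) = $line =~ $re or return "no match";
    return defined $phone ? "$name=$phone" : $name;
}

# "john" appears once with and once without a trailing newline.
for my $line ("tom 332211\n", "john\n", "john") {
    (my $shown = $line) =~ s/\n/\\n/;   # make the newline visible
    printf "%-14s original: %-12s suggested: %s\n",
        "\"$shown\"", classify($original, $line), classify($suggested, $line);
}
```

Only the last case should differ: the original pattern fails on "john" without a newline because \s+ has nothing to match, while the suggested pattern treats the whitespace and digits as one optional unit.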

    Hope this helps,
    -- Hauke D

      Hauke D, you are completely right!

      My regex required at least one "space" character after the "name". A line ending ("\n" in Perl lingo) counts as a "space" character although in Windows "\n" might actually be a couple of characters.

      This regex also appears to work.

      next unless ($name,$phone) = $line =~/^\s*([a-zA-Z]+)\s*(\d+)?/;
      Thanks haukex!

      Hi Hauke,
      The regex you suggested works for me and qualifies for all other test cases.
      Correct me if I am wrong, but when you used "?:", it forms a non-capturing group, so that \s+ is explicitly not captured, and that is what makes the regex match the last test case, right?
      Thanks!

        Hi dk27,

        > when you used "?:", it forms a non-capturing group, so that \s+ is explicitly not captured, and that is what makes the regex match the last test case, right?

        Both (...) and (?:...) group regex patterns, so both /ab(cd)?/ and /ab(?:cd)?/ mean "match ab, optionally followed by cd", and both /ab(cd|ef)gh/ and /ab(?:cd|ef)gh/ mean "match abcdgh or abefgh". The difference between the two is that capturing groups (...) will populate the $1, $2, etc. variables, while non-capturing groups (?:...) will not populate those variables, so the latter are typically used for grouping only.
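A small sketch of that difference, using a made-up test string: both patterns match the same text, but only capturing parentheses contribute to the list returned by the match (and to $1, $2, and so on).

```perl
use strict;
use warnings;

my $text = "abefgh";

# Both groups capture: the match returns the contents of $1 and $2.
my @both = $text =~ /(ab)(cd|ef)gh/;

# The first group only groups, so the alternation becomes $1.
my @one = $text =~ /(?:ab)(cd|ef)gh/;

print "capturing:     @both\n";   # capturing:     ab ef
print "non-capturing: @one\n";    # non-capturing: ef
```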

        Here is how I read the regex I showed, /^\s*([a-zA-Z]+)(?:\s+(\d+))?/:

        1. ^: Anchor at beginning of string (Update: Note that if the /m modifier were in use, this would anchor to the beginning of any line within the string.)
        2. \s*: Zero or more whitespace characters
        3. ([a-zA-Z]+): One or more letters, storing the match in $1 (because this is the first set of capturing parentheses)
        4. (?:...)?: Optionally match the following:
          1. \s+: One or more whitespace characters
          2. (\d+): One or more digits, storing the result in $2 (because this is the second set of capturing parentheses)

        If I had used (...) instead of (?:...), the match would be exactly the same; the difference is that $2 would be populated with the match of (\s+(\d+)), and $3 would be populated with the match of (\d+). One would then have to write ($name, undef, $phone) = $line =~/^\s*([a-zA-Z]+)(\s+(\d+))?/, which is possible too, but is simply a bit of a waste.
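To illustrate the two spellings side by side, here is a minimal sketch with a made-up input line. The variable names are only for this example.

```perl
use strict;
use warnings;

my $line = "harry 112233\n";

# Non-capturing outer group: only the name and the digits are captured.
my ($name_a, $phone_a) = $line =~ /^\s*([a-zA-Z]+)(?:\s+(\d+))?/;

# Capturing outer group: $2 holds the whitespace plus digits, so it is
# skipped with undef in the assignment list.
my ($name_b, undef, $phone_b) = $line =~ /^\s*([a-zA-Z]+)(\s+(\d+))?/;

# All three captures of the second pattern, for comparison.
my @all = $line =~ /^\s*([a-zA-Z]+)(\s+(\d+))?/;

print "$name_a=$phone_a\n";             # harry=112233
print "$name_b=$phone_b\n";             # harry=112233
print join("|", @all), "\n";            # harry| 112233|112233
```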

        For the documentation see perlretut, perlrequick, and perlre.

        (As a side note, Perl v5.22 introduced the /n modifier, which turns all (...) in the regular expression into non-capturing groups.)
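A short sketch of /n, assuming Perl v5.22 or later and a made-up input line. As far as I know, named groups still capture under /n, which is what the last part relies on.

```perl
use v5.22;
use warnings;

my $line = "harry 112233";

# Without /n, all three sets of parentheses capture.
my @plain = $line =~ /^([a-zA-Z]+)(\s+(\d+))?/;

# With /n, plain parentheses only group. A successful match with no
# capture groups returns the list (1) in list context.
my @none = $line =~ /^([a-zA-Z]+)(\s+(\d+))?/n;

say scalar(@plain);   # 3
say "@none";          # 1

# Named groups keep capturing under /n.
if ($line =~ /^(?<name>[a-zA-Z]+)(\s+(?<phone>\d+))?/n) {
    say "$+{name}=$+{phone}";   # harry=112233
}
```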

        Hope this helps,
        -- Hauke D

Re^3: Perl program to look into the phone directory and in case of a match, print the name along with the number
by Marshall (Canon) on Feb 14, 2017 at 20:51 UTC
    I was able to re-create your symptom on my Win XP platform if the last line "john" does not have a line ending. In other words, if the file's EOF (End Of File) occurs right after "john" instead of there being a normal "end of line" sequence of characters before the EOF.

    my $line = <STDIN> is a line-oriented I/O method. It returns a line when it sees the EOL (End Of Line) character sequence. If EOF happens before EOL, then it should return the line, as it does with C readline (added to post). However, in my case this returns "false", so this last "not quite complete" line is not processed.

    At the moment, I do not know how to get this unusual "john" last line processed. I am sure that there is a way, but right now I don't know it. Of course, if you end the "john" line with "enter" in your text editor, all will be fine.

    Update: The symptoms that I am seeing above are on an ancient Win XP laptop. The result completely surprised me! It would be helpful if the OP can replicate my results.

    Update2: Hauke D got it right at Re^3: Perl program to look into the phone directory and in case of a match, print the name along with the number

      > If EOF happens before EOL, then it returns "false" so this last "not quite complete line" is not processed.

      Such a behaviour would be strange, inconsistent, and against the documentation, see readline:

      > each call reads and returns the next line until end-of-file is reached, whereupon the subsequent call returns undef .

      I'd report it as a bug.
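The documented behaviour can be checked with a small sketch like the one below. It reads from an in-memory filehandle rather than STDIN, so it only illustrates what readline is documented to do and says nothing about the Win XP setup described above; the sample data is made up.

```perl
use strict;
use warnings;

# Three lines; the last one has no trailing newline.
my $data = "ryan\nkelly\njohn";

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @lines;
while (my $line = <$fh>) {   # implicit defined() test on $line
    chomp $line;             # harmless when there is no newline
    push @lines, $line;
}
close $fh;

print join(",", @lines), "\n";   # ryan,kelly,john
```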

      ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
        This is indeed strange and contrary to my C experience. No doubt about that! I just re-ran my test cases and get the same results. This is the only way I could duplicate the OP's symptoms. This did surprise me! I did not expect this result.

        I am using:
        "This is perl 5, version 20, subversion 2 (v5.20.2) built for MSWin32-x86-multi-thread-64int (with 1 registered patch, see perl -V for more detail)"
        On Win XP with all patches before end of life.

        It would be helpful if the OP or others could replicate this result on a different Windows version. Something is very odd about this, I agree! I suppose that there could be a Windows problem with my ancient laptop. For a bug report, I'd like additional independent verification with exact versioning information. The more detail a bug report contains and the easier the problem is to reproduce, the higher the probability that it will get fixed.