Re^3: Perl program to look into the phone directory and in case of a match, print the name along with the number
by haukex (Archbishop) on Feb 14, 2017 at 22:48 UTC
Hi dk27,
In the example shown, the result for the last element "john" is missing.
Marshall seems to be correct that the problem occurs when the file does not end with a newline character. However, the final line should still be read even without the newline; you can confirm this by printing $line before the next statement. The problem appears (at least to me) to be your regular expression instead. Note that \s+ requires whitespace after the name, even for phone book queries, which consist of a name only. In most lines of input, that whitespace is the newline at the end of the line, since you're not chomping the lines. The last line is missing the newline, so it does not match the regular expression.
Try this regular expression instead, it works for me: /^\s*([a-zA-Z]+)(?:\s+(\d+))?/
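To make the difference visible, here is a minimal sketch. The OP's regex is not quoted in this node, so $old is only an assumption reconstructed from the description above (mandatory \s+ after the name); $new is the regex suggested above, and parse_line is a hypothetical helper.

```perl
use strict;
use warnings;

# ASSUMPTION: stand-in for the OP's regex, reconstructed from the description
# (whitespace is mandatory after the name).
my $old = qr/^\s*([a-zA-Z]+)\s+(\d+)?/;
# The regex suggested above: whitespace is only required before a number.
my $new = qr/^\s*([a-zA-Z]+)(?:\s+(\d+))?/;

# Hypothetical helper: returns (name, phone), or an empty list on no match.
# Call it in list context.
sub parse_line {
    my ($re, $line) = @_;
    return $line =~ $re;
}

for my $line ("sam 99912222\n", "john\n", "john") {
    (my $shown = $line) =~ s/\n/\\n/;    # make the newline visible in output
    for my $pair ([old => $old], [new => $new]) {
        my ($label, $re) = @$pair;
        my ($name, $phone) = parse_line($re, $line);
        printf "%-16s %s: %s\n", qq{"$shown"}, $label,
            defined $name
                ? "name=$name phone=" . ($phone // '(none)')
                : 'no match';
    }
}
```

Only the last input, "john" without a newline, should behave differently between the two patterns.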
Hope this helps, -- Hauke D
|
Hauke D, you are completely right!
My regex required at least one "space" character after the "name". A line ending ("\n" in Perl lingo) counts as a "space" character, although on Windows the line ending stored in the file is actually two characters (CR LF).
This regex also appears to work.
next unless ($name,$phone) = $line =~/^\s*([a-zA-Z]+)\s*(\d+)?/;
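A minimal sketch of how that statement behaves inside a loop; parse_all is a hypothetical helper and the sample lines are made up for illustration. The point is that a list assignment in scalar context returns the number of values the match produced, so "next unless" skips lines that do not match.

```perl
use strict;
use warnings;

# Hypothetical helper: collect [name, phone] pairs from the lines that match.
sub parse_all {
    my @lines = @_;
    my @out;
    for my $line (@lines) {
        my ($name, $phone);
        # List assignment in scalar context yields the number of values the
        # match returned: 0 on failure, so non-matching lines are skipped.
        next unless ($name, $phone) = $line =~ /^\s*([a-zA-Z]+)\s*(\d+)?/;
        push @out, [$name, $phone];
    }
    return @out;
}

for my $rec (parse_all("sam 99912222\n", "john", "12345\n", "john123")) {
    printf "%s => %s\n", $rec->[0], $rec->[1] // '(none)';
}
```

One difference worth knowing: because \s* allows zero whitespace, this regex reads "john123" as name "john" with phone "123", whereas the (?:\s+(\d+))? version reads it as name "john" with no phone. Which of the two is preferable depends on the input format.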
Thanks haukex!
|
Hi Hauke,
The regex you suggested works for me and passes all other test cases.
Correct me if I am wrong, but when you used the operator "?:", this is used to form a non-capturing group so that \s+ is explicitly not captured while checking for regex and this makes it a match to the last test case and matches that for execution right?
Thanks!
|
Hi dk27,
> when you used the operator "?:", this is used to form a non-capturing group so that \s+ is explicitly not captured while checking for regex and this makes it a match to the last test case and matches that for execution right?
Both (...) and (?:...) group regex patterns, so both /ab(cd)?/ and /ab(?:cd)?/ mean "match ab, optionally followed by cd", and both /ab(cd|ef)gh/ and /ab(?:cd|ef)gh/ mean "match abcdgh or abefgh". The difference between the two is that capturing groups (...) will populate the $1, $2, etc. variables, while non-capturing groups (?:...) will not populate those variables, so the latter are typically used for grouping only.
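A small sketch of that difference, using the same alternation patterns as in the paragraph above; describe is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: try both forms of the pattern on one string.
sub describe {
    my ($str) = @_;
    my @cap   = $str =~ /ab(cd|ef)gh/;     # list context: returns ($1) on success
    my $plain = $str =~ /ab(?:cd|ef)gh/;   # scalar context: true or false
    return sprintf "%s: capturing %s, non-capturing %s",
        $str,
        (@cap   ? "matched with \$1='$cap[0]'" : 'did not match'),
        ($plain ? 'matched with nothing captured' : 'did not match');
}

print describe($_), "\n" for qw(abcdgh abefgh abxxgh);
```

Both patterns accept and reject exactly the same strings; only the capturing form hands back the alternative that matched.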
Here is how I read the regex I showed, /^\s*([a-zA-Z]+)(?:\s+(\d+))?/:
- ^: Anchor at beginning of string (Update: Note that if the /m modifier were in use, this would anchor to the beginning of any line within the string.)
- \s*: Zero or more whitespace characters
- ([a-zA-Z]+): One or more letters, storing the match in $1 (because this is the first set of capturing parentheses)
- (?:...)?: Optionally match the following:
  - \s+: One or more whitespace characters
  - (\d+): One or more digits, storing the result in $2 (because this is the second set of capturing parentheses)
If I had used (...) instead of (?:...), the match would be exactly the same. The difference would be that $2 would be populated with the match of (\s+(\d+)), and $3 would be populated with the match of (\d+). One would then have to write ($name, undef, $phone) = $line =~/^\s*([a-zA-Z]+)(\s+(\d+))?/, which is possible too, but is simply a bit of a waste.
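Here is a short sketch of that renumbering; the sample line is made up for illustration.

```perl
use strict;
use warnings;

my $line = "sam 99912222\n";

# Non-capturing outer group: the phone number lands in $2.
my ($name, $phone) = $line =~ /^\s*([a-zA-Z]+)(?:\s+(\d+))?/;

# Capturing outer group: $2 now holds the whitespace plus the digits and the
# digits alone move to $3, so the middle value is thrown away with undef.
my ($name2, undef, $phone2) = $line =~ /^\s*([a-zA-Z]+)(\s+(\d+))?/;

# The same match again, keeping all three captured values for inspection.
my @all = $line =~ /^\s*([a-zA-Z]+)(\s+(\d+))?/;

print "non-capturing: $name / $phone\n";
print "capturing:     $name2 / $phone2\n";
print "all groups:    ", join(' | ', map { "[$_]" } @all), "\n";
```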
For the documentation see perlretut, perlrequick, and perlre.
(As a side note, Perl v5.22 introduced the /n modifier, which turns all (...) in the regular expression into non-capturing groups.)
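A hedged sketch of /n, assuming Perl 5.22 or later. Under /n the plain (...) only groups, while named groups (?<name>...) still capture, so the names "name" and "phone" below are my own choice for illustration.

```perl
use strict;
use warnings;
use v5.22;    # the /n modifier is not available in earlier versions

my $line = "sam 99912222\n";
my %got;

# With /n, the plain (...) around \s+ and the digits only groups;
# the named groups are still captured and available through %+.
if ($line =~ /^\s*(?<name>[a-zA-Z]+)(\s+(?<phone>\d+))?/n) {
    %got = (name => $+{name}, phone => $+{phone});
}

print "name=$got{name} phone=", $got{phone} // '(none)', "\n";
```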
Hope this helps, -- Hauke D
Re^3: Perl program to look into the phone directory and in case of a match, print the name along with the number
by Marshall (Canon) on Feb 14, 2017 at 20:51 UTC
I was able to re-create your symptom on my Win XP platform if the last line "john" does not have a line ending. In other words, the file's EOF (End Of File) occurs right after "john" instead of there being a normal "end of line" sequence of characters before the EOF.
my $line =<STDIN> is a line-oriented I/O method. It returns a line when it sees the EOL (End Of Line) character sequence. If EOF happens before EOL, then it should return the line, as it does with C readline (added to post). However, in my case this returns "false", so this last "not quite complete line" is not processed.
At the moment, I do not know how to get this unusual "john" last line processed. I am sure that there is a way, but right now I don't know it. Of course, if you end the john line with "enter" in your text editor, all will be fine.
Update: The symptoms that I am seeing above are on an ancient Win XP laptop. The result completely surprised me! It would be helpful if the OP can replicate my results.
Update2: Hauke D got it right at Re^3: Perl program to look into the phone directory and in case of a match, print the name along with the number
|
> If EOF happens before EOL, then it returns "false" so this last "not quite complete line" is not processed.
Such a behaviour would be strange, inconsistent, and against the documentation, see readline:
> each call reads and returns the next line until end-of-file is reached, whereupon the subsequent call returns undef .
I'd report it as a bug.
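For reference, a minimal sketch of the documented behaviour. It reads from an in-memory filehandle with made-up sample data, so it is not a reproduction of the Win XP STDIN case, only a check of what readline does with a final line that lacks a newline.

```perl
use strict;
use warnings;

# In-memory "file" whose last line has no trailing newline.
my $data = "sam 99912222\ntom 11122222\njohn";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @lines;
while (my $line = <$fh>) {
    chomp $line;    # removes a trailing newline if present, otherwise a no-op
    push @lines, $line;
}
close $fh;

print scalar(@lines), " lines read; last line is '$lines[-1]'\n";
```

Running the same loop against the actual file on the affected machine would help narrow down whether the partial last line is really being dropped there.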
($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord
}map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
|
This is indeed strange and contrary to my C experience. No doubt about that! I just re-ran my test cases and get the same results. This is the only way I could duplicate the OP's symptoms. This did surprise me! I did not expect this result.
I am using:
"This is perl 5, version 20, subversion 2 (v5.20.2) built for MSWin32-x86-multi-thread-64int
(with 1 registered patch, see perl -V for more detail)"
On Win XP with all patches before end of life.
It would be helpful if the OP or others could replicate this result on a different Windows version. Something is very odd about this. I agree! I suppose that there could be a Windows problem with my ancient laptop. For a bug report, I'd like additional independent verification with exact versioning information. The more detail the bug report contains and the easier the problem is to reproduce, the higher the probability that it will get fixed.