Re: Perl program to look into the phone directory ...
by haukex (Archbishop) on Feb 12, 2017 at 10:22 UTC
|
Hi dk27,
First, some general things: I would suggest using proper indentation to make your code easier to read. See perltidy for a program that does it for you. Also, Use strict and warnings, it is a good habit to get into. (Update: Your code is simply missing the declarations my @data; my @value; to make it work under strict.)
Note that it is not necessary to read the entire input into an array and then process it. You can read the input line-by-line using a while (<>) loop (documented in I/O Operators). Also, you can read a single line of input (such as the number of entries) using <> in scalar context (see example below).
There appears to be some munging going on that your sample input does not explain. For example, what is split( /[\[\],]/, $s ) for?
You appear to first be checking whether an entry exists in the array @data using grep, which looks though the entire array, and if there is a match with the regex /^$N[$i]/, you search the entire @data array a second time, looking for an exact match. Either the first or the second check is not necessary, and note that in Perl, hashes are much easier to use for exact string lookups.
Here's how I might have written the same script:
- Read the number of phone book entries using my $numentries = <>;.
- Read the phone book entries themselves using a while (<>) loop, using last to break out of the loop when the number of entries has been reached. Each line of input can be broken down into name and number using split, and then stored in a hash, e.g. $phonedirectory{$name} = $number;.
- Read the rest of the input using a second while (<>) loop, using simple hash lookups ($phonedirectory{$name}) to fetch the numbers. You can use exists to see if a certain entry exists in the hash.
Hope this helps, -- Hauke D
| [reply] [d/l] [select] |
Re: Perl program to look into the phone directory and in case of a match, print the name along with the number
by Corion (Patriarch) on Feb 12, 2017 at 10:02 UTC
|
Maybe you can add some comments to your code about which section of the code deals with which section of the input.
In doing so, maybe also add some comments to your code about which format each part of the code expects its input to be in.
When writing these comments, you should find some errors in your code.
Note that "optimizing" should only come after your program works correctly, which it currently doesn't.
Also, this task very much looks like homework. I find it a better approach to be upfront about asking for help with homework because that allows us to give you the appropriate teaching links so that you not only pass your homework but also learn the necessary skills along the way.
| [reply] |
|
|
Hi Corion,
This is not a hw problem. This was a challenge from hackerrank and it seems to be working for all their test cases. Just that in 2 of the test cases, it gets timed out. So I was looking for a solution to optimize the code in order to run the code within the given time frame. I am a beginner and that's why the code may seem a bit messy.
| [reply] |
|
|
$s =~ s/\s+/=/g;
my @v = split( /[\[\],]/, $s ); #create an array from string $s
# assign hash value pairs and insert '=' between them to create phoneb
+ook
my %hash = map { split( /=/, $_, $x + 1 ) } @v;
entry
Why do you convert all whitespace in $s to = and then split on = again? What use is @v?
Also, the fasted way to do general lookups in Perl is to use a hash. In your case, the fastest data structure to look up the name from a number would be a hash that maps each number to a name:
my %reverse_phonebook = (
123 => 'jack',
456 => 'jill',
...
);
Building this data structure allows for quick repeated lookups of names if all you have is the number. | [reply] [d/l] [select] |
|
|
Hi dk27,
Just that in 2 of the test cases, it gets timed out.
I see the following inefficiencies, which I already touched upon:
- Reading the entire input into memory is unnecessary, it is enough to process the input line-by-line.
- As Corion already pointed out, you're doing what appears to be unnecessary work on the input. Unless there are other requirements for the input format that you haven't told us about?
- When looking up a phone number, you're searching the entire @data array twice, first with grep, then with for ( my $m = 0 ; $m <= $len1 ; $m++ ). Also, in the grep you're using a regular expression, which appears unnecessary and takes up CPU cycles since later on you're just doing an exact comparison with if ( $data[$m] eq $N[$i] ).
- Using grep is inefficient for finding a single array element, as it searches the entire array. There would be the alternative first from the core module List::Util, which returns the first match, but if all you need is exact matches, even better would be hash lookups.
Hope this helps, -- Hauke D
| [reply] [d/l] [select] |
|
|
# assign hash value pairs and insert '=' between them to create phoneb
+ook
my %hash = map { split( /=/, $_, $x + 1 ) } @v;
entry
should be:
# assign hash value pairs and insert '=' between them to create phoneb
+ook entry
my %hash = map { split( /=/, $_, $x + 1 ) } @v;
Please re-run the code as a double check before posting on Monks to find these simple cut-n-paste goofs. It is easy to wind up with a goof like this (I've certainly made similar mistakes), but do the best you can. If your code runs "right off the bat", even if there are problems with it, you will tend to get much better answers. Get rid of all of the compile errors and warnings to the extent that you can. And if you can't do that, post the error messages and get advice about them.
I don't know what kind of performance or other test case criteria are required. I can only go upon what you have posted. See my code at Re: Perl program to look into the phone directory and in case of a match, print the name along with the number.
In terms of performance, your code besides being pretty obtuse, has many loops and loops within loops.
Some hints:
- Input/Output (I/O) is expensive CPU wise. Even just locating the end-of-line character(s) in an input stream takes CPU power to examine every single input character.
- Looping is "expensive". There are loops within your code that you don't recognize as loops.Example: my @N = <STDIN>; That is a loop that builds a dynamically growing data structure for each line of input. Another example: my %hash = map { split( /=/, $_, $x + 1 ) } @v; Map is basically a short hand way of writing a foreach loop.
Some general principles:
- When processing an input file (or stream from STDIN), to the extent possible, try to take a definitive action based upon the information contained in that line. Do something with that information ASAP. In general making a copy of a file's data in memory, only to process that data again line by line to extract information from that data structure is a bad idea. Please note the distinction between "data" and "information". "Information" is what you get when you process the "raw data".
- Don't save stuff in memory when it is not necessary to do so.
I hope these comments and my code helps you learn more about Perl.
| [reply] [d/l] [select] |
Re: Perl program to look into the phone directory and in case of a match, print the name along with the number
by Marshall (Canon) on Feb 12, 2017 at 19:21 UTC
|
Your code appears to do one heck of a lot of unnecessary work!
My brain started hurting trying to figure it out. Rather than
making suggestions to your code, it was easier for me to just
re-code it in the hopes that this will be instructive to you.
I guess it is nice that the number of phone book entries
appears first. However, I consider it "noise" and throw it
away. There is no need to use it and this would not be known in
any "real" phone book application.
A valid line consists of a name, some space and optionally a phone number.
If we have the optional phone number, then this line is a
new phone book entry. If it is just a name, then just look up the
number in the phone_num hash and print the result. There is no need
to "keep track of the sections".
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %phone_num;
while (my $line =<DATA>)
{
my $name;
my $phone;
# this throws away malformed lines, like "4 (represents..."
next unless ($name,$phone) = $line =~/^\s*([a-zA-Z]+)\s+(\d+)?/;
if (defined $phone) # A new phone book entry
{
$phone_num{$name} = $phone;
}
else # Just a name. Look up its phone number
{
if ($phone_num{$name})
{
print "$name $phone_num{$name}\n";
}
else
{
print "Not found\n";
}
}
}
print "\nJust to show what phone_num hash is:\n";
print Dumper \%phone_num;
=Program Output:
Not found
john 334455
Not found
Just to show what phone_num hash is:
$VAR1 = {
'harry' => '112233',
'tom' => '332211',
'ryan' => '445566',
'john' => '334455'
};
=cut
__DATA__
4 (represents no. of entries in phone directory)
tom 332211
harry 112233
ryan 445566
john 334455
jack
john
jay
4576654 this is also junk line (just a number)
Now of course in a "real" application, this would be more complicated
because for example you would have to allow for there to be more than
one "john" each with a different phone number.
I hope this gets your thinking moving in more of a "Perl direction".
Have Fun.
As a comment, I encourage you to space the code out, using appropriate comments and sometimes just a blank line to separate "thought units". White space consumes no MIPs! But that and comments can be some of the most important "code" that you write! | [reply] [d/l] |
|
|
Thanks Marshall for the valuable input. Your code is crisp and easy to understand. I appreciate the tips provided by you, Corion and Hauke
Although I am not able to figure out why it doesnt display the result for the last name from STDIN. In the example shown, the result for last element "john" is missing.
Input:
4
tom 332211
harry 112233
ryan 445566
john 334455
jay
harry
ryan
kelly
john
Output:
Not found
harry=112233
ryan=445566
Not found
Code:
use strict;
use warnings;
use Data::Dumper;
my %phone_num;
while (my $line = <STDIN>)
{
my $name;
my $phone;
next unless ($name, $phone) = $line =~/^\s*([a-zA-Z]+)\s+(\d+)*/;
if (defined $phone) # A new phone book entry
{
$phone_num{$name} = $phone;
}
else # Just a name.
{
if ($phone_num{$name})
{
print "$name=$phone_num{$name}\n";
}
else
{
print "Not found\n";
}
}
}
The reason why I used a for loop inside a for loop was to get the correct value index to properly map name value pairs. Once I spliced the array N, I only have the names which needs to be checked for phonebook entry inside @N. Then I checked @N against the keys of the hash inside @data. Now the index i for @N need not be same for @data, that's why used another for loop to match the correct key-value pairs once the if condition i.e. grep holds good.
This kind of approach makes it a lot more cumbersome. Your algo and approach seems to be way better and fast. Thanks!
| [reply] [d/l] |
|
|
Hi dk27,
In the example shown, the result for last element "john" is missing.
Marshall seems to be correct in that the problem occurs when the file does not end on a newline character, however the final line should still be read even without the newline - confirm by printing $line before the next statement. The problem appears (at least to me) to be your regular expression instead. Note that \s+ requires there to be whitespace after the name, even for phone book queries, which have a name only. In most lines of input, that whitespace is the newline at the end of the line, since you're not chomping the lines, but the last line is missing the newline so it does not match the regular expression.
Try this regular expression instead, it works for me: /^\s*([a-zA-Z]+)(?:\s+(\d+))?/
Hope this helps, -- Hauke D
| [reply] [d/l] [select] |
|
|
|
|
|
|
|
|
I was able to re-create your symptom on my Win XP platform if the last line "john" does not have a line ending. In other words if the file EOF (End Of File) occurs right after "john" instead of there being a normal "end of line" sequence of characters before the EOF.
my $line =<STDIN> is a line oriented I/O method. It returns a line when it sees the EOL (End Of Line) character sequence. If EOF happens before EOL, then it (added to post: should return the line as it does with C readline) However, in my case this returns "false" so this last "not quite complete line" is not processed.
At the moment, I do not know how to get this unusual "john" last line processed. I am sure that there is a way, but right now I don't know it. Of course in your texteditor, end the john line with "enter" and all will be fine.
Update: The symptoms that I am seeing above are on an ancient Win XP laptop. The result completely surprised me! It would be helpful if the OP can
replicate my results.
Update2: Hauke D got it right at Re^3: Perl program to look into the phone directory and in case of a match, print the name along with the number
| [reply] [d/l] |
|
|
|
|
|