add missing elements and remove duplicates....and some more questions.

pritesh_ugrankar has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I've been here since past few weeks and gleaned a lot of knowledge here. I recently started getting back to Perl, the only language that I know, for fun and may be more than fun in future. I had last week posted a question anonymously and got astounding replies and encouragement to post questions here with my login name. Hence this time, I have logged in with my name and asking questions.

I was asked to come up with the following without using sorting.

Given a random series of numbers, which may include hex numbers, write a script that will remove the duplicates, fill in the gaps if any, and print the numbers in a sequence. Do not use hash, do not use any existing libraries, but "do it yourself" as much as you can.

So I came up with this

Update: Please disregard the code below and refer to the Important Update section.

$ more /home/pritesh/nosort.pl use warnings; use strict; my @arr = (10,10,20,0x47,1,30,45,45); sub do_it_all { my $biggest = shift @_; foreach my $num (@_) { if ($num > $biggest) { $biggest = $num; } } my $smallest = shift @_; foreach my $smallnum (@_) { if ($smallnum < $smallest) { $smallest = $smallnum; } } my @unique = ($smallest..$biggest); print "@unique\n"; } &do_it_all(@arr);
[download]

Which prints the following.

$perl nosort.pl 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 2 +7 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 5 +0 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
[download]

I was given the following feedback.

That I did not try to address the issue, but circumvented the issue by simply taking a shortcut of finding the smallest number and then the biggest number and generating the numbers.

That I should have done more research and come up with a better solution which would check each number with the preceding one and done some better programming.

This is going to sound funny, but I was told not to use Linux Mint as it is not a "real Linux distro" and that people who do programming/scripting do not use Mint/Ubuntu as they are not "up to the mark". This was the remark that frankly made me suspicious of his line of thought.

I have not till date seen such remarks here by any of the esteemed monks. Can you please shed more light on any of the points mentioned above? Is this kind of shortcuts considered a "non programmer's approach" to problem solving?

Important Update

AnomalousMonk just made me understand the big flaw in my code. I was shifting @_ recklessly without proper checking. Hence if my array contained (1,2,3,4) it would print (I added the $smallest and $biggest lines just to highlight the error):

$smallest: 2    $biggest: 4
2 3 4
[download]

That's because I was shifting off @_, but what I should have done is used $_[0]. If I use that, then I get the right answer. So the code is now changed to:

use strict;
use warnings;
my @arr = (1, 2, 3, 4);

sub do_it_all {
    my $biggest = $_[0] ;
    foreach my $num (@_) {
        if ($num > $biggest) {
            $biggest = $num;
        }
    }

        my $smallest =  $_[0];
        foreach my $smallnum (@_) {
            if ($smallnum < $smallest) {
                $smallest = $smallnum;
            }
        }
            print "\$smallest = $smallest\t\$biggest = $biggest\n";
            my @unique = ($smallest..$biggest);
        print "@unique\n";
        
}

&do_it_all(@arr);
[download]

And the output is:

$smallest = 1   $biggest = 4
1 2 3 4
[download]

If I now try it with the original array: my @arr = (10,10,20,0x47,1,30,45,45), that works as well.

$smallest = 1   $biggest = 71
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 2
+7 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 5
+0 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
[download]

I owe all the monks a sincere apology. I should have checked this as a basic safety measure and it should not have been missed. Thanks once again to Anomalous Monk.

Thinkpad T430 with Ubuntu 16.04.2 running perl 5.24.1 thanks to plenv!!

Comment on add missing elements and remove duplicates....and some more questions. Select or Download Code

Replies are listed 'Best First'.
Re: add missing elements and remove duplicates....and some more questions. by choroba (Cardinal) on Apr 22, 2017 at 22:34 UTC
Programming is problem solving. You solved the problem (in a way I like, in fact). If they wanted you to implement a particular algorithm, they should've told you so. Note that your input list doesn't contain hex numbers, though. You included `0x47` to the list, but it's a literal, so Perl will translate it into number during compilation. Try to define the array with strings (as if coming from an input file): `my @arr = qw( 10 10 20 0x47 1 30 45 45 );` [download] Can you translate this into a list of numbers with the same elegance and wit? ($q=q:Sq=~/;[c](.)(.)/;chr(-\|\|-\|5+lengthSq)`"S\|oS2"`map{chr \|+ord }map{substrSq`S_+\|`\|}3E\|-\|`7**2-3:)=~y+S\|`+$1,++print+eval$q,q,a, [download]	[reply] [d/l] [select]
Re^2: add missing elements and remove duplicates....and some more questions. by pritesh_ugrankar (Monk) on Apr 22, 2017 at 22:49 UTC
Hi Choroba, Thank you for your valuable input. I was reading Learning Perl book and used the logic given in one of the chapters therein to find the largest and smallest number :) I was given the array numbers by the trainer himself. I was told that my answer is right, but method is incorrect. The number 0x47 should resolve to decimal 71 and then I was asked to generate the list in a sequence. I am sorry but I did not understand your question. Could you please elaborate? Thinkpad T430 with Ubuntu 16.04.2 running perl 5.24.1 thanks to plenv!!	[reply] [d/l]
Re^3: add missing elements and remove duplicates....and some more questions. by NetWallah (Canon) on Apr 23, 2017 at 00:54 UTC
Congrats on coming up with a creative and correct solution. One improvement you could make is to get the largest and smallest number in a single pass "for" loop. The point that choroba was making is the subtle difference between the way "hex" representation of numbers is stored. In your code, perl will automatically convert 0x47 into it's internal binary representation, thereby losing the "original" representation. However, if you maintain the quoted string "0x47" as a string (which is what the "qw" operator would do), you could maintain the "original" representation. All this is "extra" since the teacher did not mention any requirement to maintain original representation. Also - Trying to maintain the original - will add a small amount of complexity - you will need to convert to binary, before making comparisons. ~~The simplest way to do that is to add 0 to it~~. ...Disinformation is not as good as datinformation. Don't document the program; program the document.	[reply]
Re^4: add missing elements and remove duplicates....and some more questions. by pritesh_ugrankar (Monk) on Apr 23, 2017 at 17:41 UTC
Re: add missing elements and remove duplicates....and some more questions. by davido (Cardinal) on Apr 23, 2017 at 02:43 UTC
Finding a min and a max can be done in linear time, as can generating a list with no duplicates and no gaps. And the algorithm can be adapted to work in constant space as well. Ask this person who did not care for your algorithm to explain the time complexity and space requirements of the solution that he or she considers better. I suspect this is one of those exercises that is intended to demonstrate the flexibility of a hash, but the problem's specification yields itself better to a more specific, and more optimal algorithm. Dave	[reply]
Re: add missing elements and remove duplicates....and some more questions. by hippo (Archbishop) on Apr 23, 2017 at 08:52 UTC
I was told ... that people who do programming/scripting do not use Mint/Ubuntu as they are not "up to the mark". That is demonstrably contradictory. How does the holder of such opinions believe that such distros are created and maintained if not by "people who do programming/scripting"? I would say that this assertion casts doubt on any other pronouncements that they might make. Always ask for evidence. If you have a distro which enables you to work efficiently (in whichever field) then continue to use it. If someone opines that said distro is not "up to the mark" for that task, ask them to suggest a better one. Try it out (if you haven't done so already) and see for yourself. If you prefer your original choice explain why. That's the only way that both of you will learn.	[reply]
Re: add missing elements and remove duplicates....and some more questions. by AnomalousMonk (Archbishop) on Apr 23, 2017 at 00:58 UTC
Here's another approach. Actually, I don't recommend you use this sort of thing in the "real" world, but it may have some degree of interest. (I'm addicted to regexes.) The main thing I'd like to do is to bring to your attention a basic approach to algorithm development: unit test-based development. Specify a set of test cases, inputs and outputs, write your function, then check that every input yields the specified output. A new test case is easily added to the test set and the whole set can be very quickly retested. The function can be quickly changed and thoroughly retested. See Test::More and friends. (Caution: The example function needs Perl version 5.10+ for regex extensions.) Read more... (2 kB) Update: As an example of a quick-turnaround function re-write, let's say you wanted to use the slightly saner function `sub range { return @_ unless @_ > 1; my ($min, $max) = (sort { 0+$a <=> 0+$b } @_)[0, -1]; return $min .. $max; }` [download] instead of the regex nightmare posted above. Just how many steps to completely retest? Give a man a fish: `<%-{-{-{-<`	[reply] [d/l] [select]
Re: add missing elements and remove duplicates....and some more questions. by BillKSmith (Monsignor) on Apr 23, 2017 at 04:05 UTC
This is going to sound funny, but I was told not to use Linux Mint as it is not a "real Linux distro" and that people who do programming/scripting do not use Mint/Ubuntu as they are not "up to the mark". This was the remark that frankly made me suspicious of his line of thought. Monks disagree on many issues. Most of us welcome opposing views as long as they do not become personal. Bill	[reply]
Re: add missing elements and remove duplicates....and some more questions. by AnomalousMonk (Archbishop) on Apr 23, 2017 at 20:59 UTC
The `do_it_all()` functions in the OP and here use methods for determining the biggest and smallest numbers from the argument list that I've extracted and respectively named `do_it_all_1()` and `do_it_all_2()` below. c:\@Work\Perl\monks\pritesh_ugrankar>perl -wMstrict -le "do_it_all_1(1, 2, 3); do_it_all_2(0, 1, 2, 3); ;; ;; sub do_it_all_1 { print qq{do_it_all_1: (@_)}; ;; my $biggest = shift @_; foreach my $num (@_) { if ($num > $biggest) { $biggest = $num; } } ;; my $smallest = shift @_; foreach my $smallnum (@_) { if ($smallnum < $smallest) { $smallest = $smallnum; } } ;; print qq{smallest == $smallest, biggest == $biggest}; } ;; sub do_it_all_2 { print qq{do_it_all_2: (@_)}; ;; my $biggest = @_; foreach my $bignum (@_) { if ($bignum > $biggest) { $biggest = $bignum; } } ;; my $smallest = @_; foreach my $smallnum (@_) { if ($smallnum < $smallest) { $smallest = $smallnum; } } ;; print qq{smallest == $smallest, biggest == $biggest}; } " do_it_all_1: (1 2 3) smallest == 2, biggest == 3 do_it_all_2: (0 1 2 3) smallest == 0, biggest == 4 [download] Questions: Why does the call `do_it_all_1(1, 2, 3)` return 2 as the smallest number in the argument list? Why does `do_it_all_2(0, 1, 2, 3)` return 4 as the biggest? (The foregoing is intended as propaganda for a Test::More development approach that might have uncovered these inconsistencies earlier.) One other small perplexity. You describe the feedback given you regarding the code of the OP as "[t]hat I did not try to address the issue, but circumvented the issue ..." But the "issue" seems to be described in the paragraph beginning "Given a random series of numbers, ...", and your code seems to address this requirement statement directly. (Others have touched on this.) Can you clarify the dissatisfaction of your instructor/TA? It doesn't make sense to me. (The whole business about Linux Mint/Ubuntu/whatever seems entirely irrelevant.) Give a man a fish: `<%-{-{-{-<`	[reply] [d/l] [select]
Re^2: add missing elements and remove duplicates....and some more questions. by pritesh_ugrankar (Monk) on Apr 24, 2017 at 13:31 UTC
Hi, I think for some reason, it's taking the scalar value of the array as the biggest value. I could be wrong, but just a hunch. I tried again and my script seems to work ok. `use warnings; use strict; my @arr = (10,10,20,0x47,1,30,45,45); sub do_it_all { my $biggest = shift @_; foreach my $num (@_) { if ($num > $biggest) { $biggest = $num; } } my $smallest = shift @_; foreach my $smallnum (@_) { if ($smallnum < $smallest) { $smallest = $smallnum; } } print "\$smallest = $smallest\t\$biggest = $biggest\n"; my @unique = ($smallest..$biggest); print "@unique\n"; } &do_it_all(@arr);` [download] I added the `print "\$smallest = $smallest\t\$biggest = $biggest\n";` line for verification, and the output I get is the following: `C:\>perl no_sort.pl $smallest = 1 $biggest = 71 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 2 +7 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 5 +0 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71` [download] Thinkpad T430 with Ubuntu 16.04.2 running perl 5.24.1 thanks to plenv!!	[reply] [d/l] [select]
Re^3: add missing elements and remove duplicates....and some more questions. by AnomalousMonk (Archbishop) on Apr 24, 2017 at 16:34 UTC
Hi pritesh_ugrankar. If this was posted by you, then you have recognized the problem I wanted to bring to your attention. If it was not, please run the `do_it_all()` function given here with the arguments `(1, 2, 3, 4)` and see what you get. And just as a general observation, code that looks great and doesn't work is worthless. I've written a lot of code that looked wonderful, but... Give a man a fish: `<%-{-{-{-<`	[reply] [d/l] [select]
Re^4: add missing elements and remove duplicates....and some more questions. by pritesh_ugrankar (Monk) on Apr 24, 2017 at 20:04 UTC
Re^5: add missing elements and remove duplicates....and some more questions. by Marshall (Canon) on Apr 25, 2017 at 15:10 UTC
Some notes below your chosen depth have not been shown here
Re^3: add missing elements and remove duplicates....and some more questions. by Anonymous Monk on Apr 24, 2017 at 14:15 UTC
The code looks fine. During edge testing, am having to make 2 line changes. `my @arr = (1,10,10,20,0x47,30,45,45); sub do_it_all { my $biggest = $_[0]; ... my $smallest = $_[0]; ... }` [download]	[reply] [d/l]
Re: add missing elements and remove duplicates....and some more questions. by Anonymous Monk on Apr 23, 2017 at 11:39 UTC
check each number with the preceding one This seems to imply a sort-based approach was expected. But sort has the time complexity `O(n log n)` whereas your linear scan is `O(n)`. It's not uncommon to have TA-s generate teaching problems from random solutions, inadvertently ending up with a few non-solvable ones in the mix. And then, some teachers take it poorly when a student outsmarts them.	[reply]
Re^2: add missing elements and remove duplicates....and some more questions. by pritesh_ugrankar (Monk) on Apr 23, 2017 at 12:12 UTC
Hi, Thanks for the encouraging comments. However. I was not thinking in those very high level complex terms. I just tried to keep it simple. I have not studied algorithms. May be in future I will be able to write this using a cool complex algorithm. Here's how I thought about it. Before writing this script I thought its better to keep it simple so that if I get a large array, then checking an element, say x with the next and previous could be computationally intensive. I will Google about the log n stuff and try to understand it. Thinkpad T430 with Ubuntu 16.04.2 running perl 5.24.1 thanks to plenv!!	[reply] [d/l]
Re^3: add missing elements and remove duplicates....and some more questions. by Anonymous Monk on Apr 23, 2017 at 12:54 UTC
This is the "big O notation". You can read about it on wikipedia: analysis of algorithms. But don't get sidetracked too much with the "dry stuff". Maybe ask for the next problem, or another variant, because you appear to have conquered this one already.	[reply]