perl-diddler has asked for the wisdom of the Perl Monks concerning the following question:

A simple demo of my confusion follows. I have lines in an array and am trying to create a new array of unique lines.

I first deblank, and the line count goes from 4->3. Then I try uniq, and it goes down to 1, but the lines aren't duplicates (see '@data'), and I don't see why each line matches in the grep after the first one is pushed onto the results array. I.e., every grep after the first claims that each successive line (all differing) is found in the results array (which after the first match contains only the first line). I even print the line being tested and the results array, and they are different, but the claim is that there is a match. Why? What am I missing? Sigh. Thanks...

#!/usr/bin/perl -w
use strict;
use feature ':5.10';

my @data=( "abcdefg\n", "hijklmno\n", "pqrstuvw\n", "\n");
my $start_lines;

sub show ($\@) {
    $start_lines or return;
    my ($when,$lines)=($_[0],1+$#{$_[1]});
    my $diff = $start_lines - $lines;
    my $prc = $diff ? 100.0*$diff/$start_lines : 0;
    printf "%10.10s %d -> %d lines or %.2f%% reduction\n",
        $when, $start_lines, $lines, $prc;
}

sub printar(\@) {
    my $ar=$_[0];
    printf "\n(#lines=%d,", 0+@$ar;
    for (my $i=0; $i<=$#$ar; ++$i) {
        printf " line %d: '%s'", $i, $ar->[$i];
    }
    print ")\n";
}

sub uniq(\@) {
    my $ar=$_[0];
    my @res;
    foreach (@{$ar}) {
        my $ans=grep /^$_$/, @res;
        printf "line \"%s\" in ar?: %s, ar: ", $_ // "undef" , $ans?"yes":"no";
        printar @res;
        if (!$ans) {push @res, $_}
    }
    @res;
}

### main
#my @msgs=<>;
my @msgs=@data;
$start_lines=@msgs;
show("before", @msgs);
my $msgs=join "", @msgs;
my @deblanked=split /\n/, $msgs;
show "deblanked", @deblanked;
my @uniq=uniq @deblanked;
show "after uniq", @uniq;

Re: Braino - why is this not working?
by kcott (Archbishop) on Oct 18, 2010 at 22:37 UTC

    The short answer would be to use the uniq function provided by List::MoreUtils.

    What you have here seems incredibly complicated. Compare with the function I've just mentioned:

    sub uniq (@) { my %h; map { $h{$_}++ == 0 ? $_ : () } @_; }

    -- Ken

      I agree, that's an alternate way to do this, but I don't see why the /^$_$/, which successively holds lines from the source, would come up 'true' on the second pass when the lines don't match.

      Yeah, your example would circumvent the problem that 3-4 people haven't seen so far, but that just makes me wonder why it's so hard to see what is going wrong. Something this simple couldn't be a perl bug, it's trivial code, so why is it so hard to see what's broken? GRRR....*knocking head against the wall*...(when I stop it will feel good?)...:-)

        Between the time I logged off yesterday and logged back in today, the discussion has extended considerably and you appear to have your answer.

        In general, I tend to use a lexical variable in for loops and the block form of map, grep and similar constructs.

        for my $list_element (@list) {
            # use $list_element here
        }
        map { ... } @list;
        grep { ... } @list;
        sort { ... } @list;

        (for and foreach are synonymous, in case you didn't know)

        This generally avoids any confusion over what $_ refers to.
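The advice above can be sketched as a rewrite of the poster's uniq. This is only an illustration, not the poster's final code: the names uniq_lines, $line and $seen are made up for the example, and it assumes exact-line matching is what is wanted. With a lexical loop variable, the $_ inside grep's block can only mean the element of @res being tested.

```perl
use strict;
use warnings;

# Hypothetical rewrite of uniq(): the outer loop uses a lexical variable,
# so it cannot be shadowed by the $_ that grep sets for each element of @res.
sub uniq_lines {
    my @res;
    for my $line (@_) {
        # \Q...\E quotes any regex metacharacters contained in $line
        my $seen = grep { /^\Q$line\E$/ } @res;
        push @res, $line unless $seen;
    }
    return @res;
}

my @uniq = uniq_lines(qw(aaa bbb aaa ccc));
print "@uniq\n";    # prints "aaa bbb ccc"
```

If no pattern matching is needed, comparing with `$_ eq $line` inside the block would do the same job without a regex.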

        Perl 5.10 introduced a new lexically scoped $_ (that's in addition to the current globally scoped $_) so there's even more chance of confusion.

        Hoping your head isn't hurting too much and the wall doesn't need fixing. :-)

        -- Ken

Re: Braino - why is this not working?
by duelafn (Parson) on Oct 18, 2010 at 22:46 UTC

    kcott is right

    Your function fails here:

    my $ans=grep /^$_$/, @res;

    Which $_ are you matching $_ against? :)

    Also, (not relevant in this example) you should protect metacharacters: /^\Q$_\E$/

    Good Day,
        Dean
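A small sketch of why the metacharacter advice above matters. The variable names and sample strings are invented for the example; \Q...\E applies Perl's built-in quotemeta to the interpolated text.

```perl
use strict;
use warnings;

my $needle   = 'a.c';            # '.' is a regex metacharacter
my @haystack = ('abc', 'a.c');

# Unquoted: '.' matches any character, so both elements match.
my $loose = grep { /^$needle$/ } @haystack;

# Quoted with \Q...\E: only the literal string 'a.c' matches.
my $exact = grep { /^\Q$needle\E$/ } @haystack;

print "unquoted: $loose, quoted: $exact\n";   # prints "unquoted: 2, quoted: 1"
```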

      It's the $_ from the 'foreach' from the array that is passed.

      Array is passed in that has lines, say,
      aaa,
      bbb,
      ccc.

      Create empty array for results, is 'aaa' in it? No? add to results, next, is 'bbb' in results array? (well I just added 'aaa', and that's it, so results should be no, BUT results are 'yes').

      That's the problem.

      Why is 'bbb' in an array that only contains 'aaa'?

        It's the $_ from the 'foreach' from the array that is passed.

        The closest such construct is the grep, not the foreach

        my @a = 1 .. 3;
        my @res = 0;
        for(@a){
            my $ans = grep { warn ">>$_<< ";
                             /^$_$/
                           } @res;

            warn "ans >>>$ans<<< "
        }
        __END__
        >>0<<  at - line 4.
        ans >>>1<<<  at - line 8.
        >>0<<  at - line 4.
        ans >>>1<<<  at - line 8.
        >>0<<  at - line 4.
        ans >>>1<<<  at - line 8.

      Something I need in the larger prog where this is from. Just hadn't gotten there yet, was expecting to have to use some 'quote::meta' cpan type function -- completely forgot about \Q\E -- (never have had the occasion to use them, till now... )...

      Thanks!

Re: Braino - why is this not working?
by suhailck (Friar) on Oct 19, 2010 at 01:29 UTC
    You are using $_ from @res instead of using $_ from outer for loop in grep statement.
    Instead try this
    perl -le '@arr=qw(a b a c); my @res; foreach(@arr) { $__=$_;$ans= grep /^$__$/,@res; print "\$_: $_ ans: $ans";push @res,$_ unless $ans;} print "@res"'
    $_: a ans: 0
    $_: b ans: 0
    $_: a ans: 1
    $_: c ans: 0
    a b c
      Yup -- the grep's use of $_ was over-writing my 'for' loops usage.

      Thanks much for...helping me get this problem out of my head (like a song that gets stuck....). I could work around it, but it still bugged me, if ya know what I mean, as it had to be something obvious staring me in the face that I couldn't see...

      Hate it when that happens...

Re: Braino - why is this not working?
by aquarium (Curate) on Oct 18, 2010 at 23:20 UTC
    you're on unix/linux right?
    sort input.txt | uniq
    or if you want the counts of the unique input lines
    sort input.txt | uniq -c
    or is this (perl) homework?
    the hardest line to type correctly is: stty erase ^H
      Um, I'm on linux, but I'm trying to do something similar in perl, since I have a bunch of pattern matching to do that isn't in the problem example.

      The problem with narrowing down a problem is that I throw out the other reasons why things were done a certain way. I see it with others "answering" (sic) people's questions too: instead of answering the problem, they try to tell the person they should do it a different way. That is fine in some cases, but about 1/3 of the time (or most of the time with my questions), it's because something I'm trying in a larger program isn't working and I've tried to isolate a smaller test case.

      I assume anyone who has tried to run it gets the same result, so it's not just my version of perl. So it's just something weird I'm not seeing that I'll probably feel real dumb about once it's out in the open, but for now, I'd like to find my mistake...

Re: Braino - why is this not working?
by jacuro (Initiate) on Oct 20, 2010 at 01:58 UTC
    You must know that in 'grep', the special variable '$_' refers to each current item in @res, which hides the reference to the item in @{$ar}. The first time through the loop, @res is empty, so $ans is 0, and the first item in @{$ar} is pushed onto @res. The 2nd time, @res contains that first item, and grep selects every item that equals itself. There's 1 item, and it equals itself, so $ans is 1: no push. The 3rd time, @res still contains 1 item, so still no push.
    ...
    @res always contains only that first item.
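The walkthrough above can be reproduced in a few lines. This is a minimal sketch with made-up sample data ('aaa', 'bbb', 'ccc') and hypothetical variable names; it runs the original grep next to one that uses a saved copy of the outer loop's value, so the two results can be compared.

```perl
use strict;
use warnings;

my @res = ('aaa');          # results array after the first push
my @hits;

for ('bbb', 'ccc') {        # the outer loop sets $_ ...
    my $outer = $_;         # ... so keep a copy before calling grep

    # Inside grep, $_ is aliased to each element of @res, so /^$_$/
    # tests 'aaa' against itself and always succeeds.
    my $buggy = grep /^$_$/, @res;

    # Using the saved copy tests the outer value against @res.
    my $fixed = grep /^\Q$outer\E$/, @res;

    push @hits, "$outer: buggy=$buggy fixed=$fixed";
}

print "$_\n" for @hits;
# bbb: buggy=1 fixed=0
# ccc: buggy=1 fixed=0
```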