in reply to Removing doubles and printing only unique values

Thanks for the replies everyone!

This is the code I eventually ended up with:

    use strict;
    use warnings;
    use autodie;

    my $input  = 'D:/Some/Specific/Path/To/Input.CSV';
    my $output = 'D:/Some/Specific/Path/To/Output.CSV';

    open IN,$input;
    binmode(IN);
    open OUT,'>'.$output;

    my $count = 0;
    my @vk;
    my %seen;

    while (my $line = <IN>) {
        chomp $line;
        next if $. < 2;
        (undef, my $vk) = split ";" , $line;
        push (@vk,$vk) unless $seen{$vk}++;
    }
    close IN;

    for my $vk (@vk) {
        if ($count != 10) {
            print OUT $vk.';';
            ++ $count;
        }
        else {
            print OUT "\n";
            $count = 0;
        }
    }
    close OUT;

    exit 0;

It needed only minimal changes to my last version of the script and keeps it looking clean.

I have looked up quite a bit about filtering out duplicate values, but I don't think I had come across that $seen{$vk}++ technique. I will definitely be able to use it more often; it seems easy to use and powerful for jobs like this.
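For anyone else meeting the idiom for the first time, here is a minimal sketch of why it works, using a made-up list rather than the CSV from the script above. The post-increment in `$seen{$value}++` returns the previous count (undefined, hence false, on the first sighting) and only then bumps it, so the `unless` lets each value through exactly once. The sub name `unique_in_order` is hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Return the distinct values of a list, keeping first-seen order.
# unique_in_order is an illustrative name, not something from the thread.
sub unique_in_order {
    my %seen;
    my @unique;
    for my $value (@_) {
        # Post-increment yields the previous count: false the first time
        # a value appears, true on every repeat.
        push @unique, $value unless $seen{$value}++;
    }
    return @unique;
}

my @vk = unique_in_order(qw(a b a c b d));
print join(';', @vk), "\n";    # prints "a;b;c;d"
```

The same test fits in a one-line filter, `grep { !$seen{$_}++ } @list`, if a loop is not needed for anything else.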

Re^2: Removing doubles and printing only unique values
by holli (Abbot) on Oct 31, 2017 at 14:09 UTC
    I am curious why you chose this approach over the functional one by johngg above?

    As you can see, it is significantly shorter, clearer and arguably easier to understand [1], once you have wrapped your head around the concept.

    [1] If you don't believe me, imagine explaining the iterative approach and the functional approach to a 10-year-old.


    holli

    You can lead your users to water, but alas, you cannot drown them.

      perl -F';' -ane '$seen{$F[1]} //= print'

Re^2: Removing doubles and printing only unique values
by AnomalousMonk (Archbishop) on Oct 31, 2017 at 15:07 UTC
        my $count = 0;
        ...
        for my $vk (@vk) {
            if ($count != 10) {
                print OUT $vk.';';
                ++ $count;
            }
            else {
                print OUT "\n";
                $count = 0;
            }
        }

    I think there's a problem with the quoted code:

        c:\@Work\Perl\monks>perl -wMstrict -e
        "my $count = 0;
         my @vk = qw(zero one two THREE four five six SEVEN eight nine ten);
         ;;
         for my $vk (@vk) {
             if ($count != 3) { print $vk.';'; ++ $count; }
             else { print qq{\n}; $count = 0; }
         }
        "
        zero;one;two;
        four;five;six;
        eight;nine;ten;

    Where did THREE and SEVEN go? Another vote for a functional approach. (But the quoted code is easily fixed.)
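One possible fix, as a sketch rather than the only way to do it: the `else` branch prints the newline instead of the current element, so that element is lost. Emitting the item unconditionally and only then deciding whether to break the line avoids this. The sub name `format_rows` and its `$per_line` parameter are hypothetical, and it builds a string instead of printing to OUT so the result is easy to check.

```perl
use strict;
use warnings;

# Join items with a trailing ';' and start a new line after every
# $per_line items. format_rows is an illustrative name.
sub format_rows {
    my ($per_line, @items) = @_;
    my $out   = '';
    my $count = 0;
    for my $item (@items) {
        $out .= $item . ';';                          # always emit the item
        $out .= "\n" if ++$count % $per_line == 0;    # then break after every Nth
    }
    return $out;
}

print format_rows(3, qw(zero one two THREE four five six SEVEN eight nine ten)), "\n";
```

With three items per line, THREE and SEVEN now appear at the start of the second line and in the middle of the third; the original script would call this with 10.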


    Give a man a fish:  <%-{-{-{-<

Re^2: Removing doubles and printing only unique values
by AnomalousMonk (Archbishop) on Oct 31, 2017 at 15:37 UTC
    ... that $seen{$vk}++ technique.

    This is the technique on which List::Util::uniq() and related functions are based. (uniq() originally appeared in List::MoreUtils, but was added to the core List::Util in version 1.45.)
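A short sketch of the same filtering done with uniq(), assuming a List::Util of at least version 1.45 is installed; the sample list is made up for illustration.

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);    # uniq needs List::Util 1.45 or newer

my @vk     = qw(a b a c b d);
my @unique = uniq @vk;           # first occurrence of each value, order kept

print join(';', @unique), "\n";  # prints "a;b;c;d"
```

On an older List::Util the `use` line fails at compile time with a version error, which is easier to diagnose than a missing-function error later on.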


    Give a man a fish:  <%-{-{-{-<