zarath has asked for the wisdom of the Perl Monks concerning the following question:
Hello everybody!
Not yet very experienced with Perl, but getting there at a slow and steady pace.
I'm working on a new script that needs to do the following:
Read $input (.csv, semicolon-separated), extract the field that comes after the first semicolon, and print each value found exactly once (no more, no less).
It is that last part I'm having trouble with. I have tried different things; below are 2 quite different versions that I thought would work, but alas.
An extract of $input:
AgreementId;SalesId;PriceComponentId;ProductTermsId;FromDate;ToDate;TransType;FixedPrepaym;NoteType
C000004923;VK11070778;Delta;;16/08/2017;15/09/2017;Prepayment;Yes;Additional note
C000004923;VK11070778;Rounding;;16/08/2017;15/09/2017;Prepayment;Yes;Additional note
C000004924;VK11070778;Delta Gas;;16/08/2017;15/09/2017;Prepayment;Yes;Additional note
C000858948;VK11070783;Delta;;3/01/2017;2/02/2017;Prepayment;Yes;Additional note
C001028127;VK11070844;Delta;;1/07/2017;31/07/2017;Prepayment;Yes;Additional note
C000863388;VK11070869;Delta;;14/03/2016;13/04/2016;Prepayment;Yes;Additional note
C000863388;VK11070869;Rounding;;14/03/2016;13/04/2016;Prepayment;Yes;Additional note
C000863389;VK11070869;Delta Gas;;14/03/2016;13/04/2016;Prepayment;Yes;Additional note
C001041275;VK11070873;Delta;;14/04/2017;13/05/2017;Prepayment;Yes;Additional note
C000457921;VK11070913;Delta;;11/12/2014;10/01/2015;Prepayment;Yes;Additional note
C000457922;VK11070913;Delta Gas;;11/12/2014;10/01/2015;Prepayment;Yes;Additional note
C000354278;VK11070920;Delta;;21/09/2015;20/10/2015;Prepayment;Yes;Additional note
C000354278;VK11070920;Rounding;;21/09/2015;20/10/2015;Prepayment;Yes;Additional note
C001139698;VK11070923;Delta;;12/08/2017;11/09/2017;Prepayment;Yes;Additional note
C001139698;VK11070923;Rounding;;12/08/2017;11/09/2017;Prepayment;Yes;Additional note
C001072986;VK11070933;Delta;;14/03/2017;15/05/2017;Prepayment;Yes;Additional note
C001072986;VK11070933;Rounding;;14/03/2017;15/05/2017;Prepayment;Yes;Additional note
C000833421;VK11074400;Delta;;1/05/2017;31/05/2017;Prepayment;Yes;Additional note
C000833422;VK11074400;Delta Gas;;1/05/2017;31/05/2017;Prepayment;Yes;Additional note
C000833422;VK11074400;Rounding;;1/05/2017;31/05/2017;Prepayment;Yes;Additional note
C000147059;VK11074404;Delta;;20/06/2017;19/07/2017;Prepayment;Yes;Additional note
C000147062;VK11074404;Delta Gas;;20/06/2017;19/07/2017;Prepayment;Yes;Additional note
C001109215;VK11074415;Delta;;24/08/2017;23/09/2017;Prepayment;Yes;Additional note
C000313157;VK11074418;Delta;;15/11/2016;14/12/2016;Prepayment;Yes;Additional note
C000313157;VK11074418;Rounding;;15/11/2016;14/12/2016;Prepayment;Yes;Additional note
C000313158;VK11074418;Delta Gas;;11/11/2016;10/12/2016;Prepayment;Yes;Additional note
C001099002;VK11074430;Delta;;1/08/2017;31/08/2017;Prepayment;Yes;Additional note
C001117234;VK11074441;Delta Gas;;15/06/2017;14/07/2017;Prepayment;Yes;Additional note
C001009800;VK11074443;Delta;;16/11/2016;15/12/2016;Prepayment;Yes;Additional note
C000679686;VK11074451;Delta;;20/06/2016;19/07/2016;Prepayment;Yes;Additional note
C000679687;VK11074451;Delta Gas;;20/06/2016;19/07/2016;Prepayment;Yes;Additional note
C001242987;VK11074454;Delta Gas;;15/06/2017;14/07/2017;Prepayment;Yes;Additional note
C001080282;VK11074470;Delta;;2/03/2017;1/04/2017;Prepayment;Yes;Additional note
C001080283;VK11074470;Delta Gas;;2/03/2017;1/04/2017;Prepayment;Yes;Additional note
C001192414;VK11074473;Delta;;14/07/2017;13/08/2017;Prepayment;Yes;Additional note
C001192414;VK11074473;Rounding;;14/07/2017;13/08/2017;Prepayment;Yes;Additional note
C001192415;VK11074473;Delta Gas;;14/07/2017;13/08/2017;Prepayment;Yes;Additional note
C001192415;VK11074473;Rounding;;14/07/2017;13/08/2017;Prepayment;Yes;Additional note
C000268914;VK11074478;Delta;;9/10/2016;8/11/2016;Prepayment;Yes;Additional note
C000268914;VK11074478;Rounding;;9/10/2016;8/11/2016;Prepayment;Yes;Additional note
So I need to write every VKxxxxxxxx once to $output.
Most lines in my code (especially the ones that might be the problem) have a comment that explains my thinking there.
First version of my code:
use strict;
use warnings;
use autodie;

my $input  = 'D:/Some/Specific/Path/To/Input.CSV';
my $output = 'D:/Some/Specific/Path/To/Output.CSV';

open IN,$input;
binmode(IN);
open OUT,'>'.$output;

my $count = 0;

while (my $line = <IN>) {
    chomp $line; # good practice?
    next if $. < 2; # do not need first line
    my @fields = split ";" , $line; # define input-lines
    my @vk = $fields[1]; # extract all VK-codes
    my %seen; # declare hash to 'memorise' which VK has already been pushed
    my @uniq; # declare array to store unique values
    for my $vk (@vk) { # need to 'check' all VK-codes in @vk
        push (@uniq, $vk) unless $seen{$vk}; # only push VK-codes that are not yet pushed
        for my $uvk (@uniq) { # uvk = unique VK
            if ($count != 10) { # we want 10 VK-codes per line -- works as intended
                print OUT $uvk.';';
                ++ $count;
            }
            else {
                print OUT "\n";
                $count = 0;
            }
        }
    }
}

close IN;
close OUT;
exit 0;
After seeing it writes double values (a VK that appears 3 times in $input gets printed 3 times in $output) I stared at this code for a while and realised it might be because I'm printing to OUT inside the while-block and each $line is unique, so I tweaked the code and ended up with version 2:
use strict;
use warnings;
use autodie;

my $input  = 'D:/Some/Specific/Path/To/Input.CSV';
my $output = 'D:/Some/Specific/Path/To/Output.CSV';

open IN,$input;
binmode(IN);
open OUT,'>'.$output;

my $count = 0;
my @vk; # declare array to store all unique VK-codes
my %seen; # declare hash to 'memorise' which VK has already been pushed

while (my $line = <IN>) {
    chomp $line; # good practice?
    next if $. < 2; # do not need first line
    my @fields = split ";" , $line; # define input-lines
    push (@vk,$fields[1]) unless $seen{$fields[1]}; # extract all and only unique VK-codes
}

close IN; # do not need this anymore

for my $vk (@vk) { # we need to print each one
    if ($count != 10) { # we want 10 VK-codes per line -- works as intended
        print OUT $vk.';';
        ++ $count;
    }
    else {
        print OUT "\n";
        $count = 0;
    }
}

close OUT;
exit 0;
Both versions do exactly the same thing: they print duplicate values, which is not what I need.
The output from both versions (for the posted input):
VK11070778;VK11070778;VK11070778;VK11070783;VK11070844;VK11070869;VK11070869;VK11070869;VK11070873;VK11070913;
VK11070920;VK11070920;VK11070923;VK11070923;VK11070933;VK11070933;VK11074400;VK11074400;VK11074400;VK11074404;
VK11074415;VK11074418;VK11074418;VK11074418;VK11074430;VK11074441;VK11074443;VK11074451;VK11074451;VK11074454;
VK11074470;VK11074473;VK11074473;VK11074473;VK11074473;VK11074478;VK11074478;
So printing inside the while-block is not the (only) mistake here, but I can't seem to find my other mistake(s).
Any tips on how to make either version work will of course be appreciated.
Thank you in advance!
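For reference, a minimal sketch of the usual %seen idiom applied to this data. It assumes the second ';'-separated field is the VK code and that the header line has already been dropped; the helper names unique_second_fields and format_rows are made up for illustration. Two things differ from the versions above: the hash is actually written to ($seen{...}++), which neither version does, and the row-breaking step never skips a value, whereas the else branch above prints only a newline and so drops every 11th code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the second ';'-separated field of each line,
# in first-seen order, with duplicates removed.
sub unique_second_fields {
    my @lines = @_;
    my (%seen, @uniq);
    for my $line (@lines) {
        chomp(my $copy = $line);
        my @fields = split /;/, $copy;
        next unless defined $fields[1] && length $fields[1];
        # $seen{...}++ is false the first time a value appears, true afterwards
        push @uniq, $fields[1] unless $seen{ $fields[1] }++;
    }
    return @uniq;
}

# Hypothetical helper: group values into rows of at most $per_row entries,
# each entry followed by ';' as in the output shown above.
sub format_rows {
    my ($per_row, @values) = @_;
    my @rows;
    while (my @chunk = splice @values, 0, $per_row) {
        push @rows, join '', map { "$_;" } @chunk;
    }
    return @rows;
}

# A few data lines from the posted extract (header line not included).
my @sample = (
    "C000004923;VK11070778;Delta;;16/08/2017;15/09/2017;Prepayment;Yes;Additional note\n",
    "C000004923;VK11070778;Rounding;;16/08/2017;15/09/2017;Prepayment;Yes;Additional note\n",
    "C000004924;VK11070778;Delta Gas;;16/08/2017;15/09/2017;Prepayment;Yes;Additional note\n",
    "C000858948;VK11070783;Delta;;3/01/2017;2/02/2017;Prepayment;Yes;Additional note\n",
);

print "$_\n" for format_rows(10, unique_second_fields(@sample));
# prints: VK11070778;VK11070783;
```

In a real script the lines would come from the input filehandle instead of @sample, and the rows would be printed to the output filehandle. If the file can contain quoted fields, a CSV module such as Text::CSV is probably safer than a plain split.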